Parameters of H2OPCA
Affected Class
- ai.h2o.sparkling.ml.features.H2OPCA
Parameters
- Each parameter also has a corresponding getter and setter method (e.g., label -> getLabel(), setLabel(...)).
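The naming rule above can be sketched in a few lines of plain Python. The helper `accessor_names` is hypothetical and not part of Sparkling Water; it only illustrates how the accessor names follow from a parameter name by capitalising its first letter, as in the label example.

```python
from typing import Tuple


def accessor_names(param: str) -> Tuple[str, str]:
    """Return the (getter, setter) names implied by a parameter name.

    Hypothetical helper for illustration only: it capitalises the first
    letter of the parameter and prefixes it with "get" / "set".
    """
    suffix = param[0].upper() + param[1:]
    return "get" + suffix, "set" + suffix


# Parameter names taken from the list on this page.
for name in ("k", "pcaMethod", "ignoredCols"):
    getter, setter = accessor_names(name)
    print(f"{name}: {getter}(), {setter}(...)")
```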
- ignoredCols
- Names of columns to ignore for training. Scala default value: null; Python default value: None. Also available on the trained model.
- outputCol
- Output column name. Default value: "H2OPCA_79551496103b__output". Also available on the trained model.
- columnsToCategorical
- List of columns to convert to categorical before modelling. Scala default value: Array(); Python default value: [].
- computeMetrics
- Whether to compute metrics on the training data. Scala default value: true; Python default value: True. Also available on the trained model.
- convertInvalidNumbersToNa
- If set to ‘true’, the model converts invalid numbers to NA when making predictions. Scala default value: false; Python default value: False. Also available on the trained model.
- convertUnknownCategoricalLevelsToNa
- If set to ‘true’, the model converts unknown categorical levels to NA when making predictions. Scala default value: false; Python default value: False. Also available on the trained model.
- dataFrameSerializer
- The full name of the serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam. Default value: "ai.h2o.sparkling.utils.JSONDataFrameSerializer". Also available on the trained model.
- exportCheckpointsDir
- Automatically export generated models to this directory. Scala default value: null; Python default value: None. Also available on the trained model.
- ignoreConstCols
- Ignore constant columns. Scala default value: true; Python default value: True. Also available on the trained model.
- imputeMissing
- Whether to impute missing entries with the column mean. Scala default value: false; Python default value: False. Also available on the trained model.
- inputCols
- The array of input columns. Scala default value: Array(); Python default value: []. Also available on the trained model.
- k
- Rank of matrix approximation. Default value: 1. Also available on the trained model.
- keepBinaryModels
- If set to true, all binary models created during execution of the fit method will be kept in the DKV of the H2O-3 cluster. Scala default value: false; Python default value: False.
- maxIterations
- Maximum training iterations. Default value: 1000. Also available on the trained model.
- maxRuntimeSecs
- Maximum allowed runtime in seconds for model training. Use 0 to disable. Default value: 0.0. Also available on the trained model.
- modelId
- Destination id for this model; auto-generated if not specified. Scala default value: null; Python default value: None.
- pcaImpl
- Specify the implementation to use for computing PCA (via SVD or EVD): MTJ_EVD_DENSEMATRIX - eigenvalue decompositions for dense matrix using MTJ; MTJ_EVD_SYMMMATRIX - eigenvalue decompositions for symmetric matrix using MTJ; MTJ_SVD_DENSEMATRIX - singular-value decompositions for dense matrix using MTJ; JAMA - eigenvalue decompositions for dense matrix using JAMA. References: JAMA - http://math.nist.gov/javanumerics/jama/; MTJ - https://github.com/fommil/matrix-toolkits-java/. Possible values are "MTJ_EVD_DENSEMATRIX", "MTJ_EVD_SYMMMATRIX", "MTJ_SVD_DENSEMATRIX", "JAMA". Default value: "MTJ_EVD_SYMMMATRIX". Also available on the trained model.
- pcaMethod
- Specify the algorithm to use for computing the principal components: GramSVD - uses a distributed computation of the Gram matrix, followed by a local SVD; Power - computes the SVD using the power iteration method (experimental); Randomized - uses randomized subspace iteration method; GLRM - fits a generalized low-rank model with L2 loss function and no regularization and solves for the SVD using local matrix algebra (experimental). Possible values are "GramSVD", "Power", "Randomized", "GLRM". Default value: "GramSVD". Also available on the trained model.
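To make the Gram matrix and power iteration ideas mentioned for pcaMethod concrete, here is a minimal single-machine sketch in plain Python. It is not H2O's implementation: the functions `gram` and `power_iteration`, the starting vector, and the stopping tolerance are all assumptions chosen for illustration, and the toy data is already centred so its columns have zero mean.

```python
import math


def gram(rows):
    """Return the Gram matrix X^T X of a row-major matrix (list of rows)."""
    n_cols = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n_cols)]
            for i in range(n_cols)]


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


def power_iteration(matrix, max_iterations=1000, tol=1e-12):
    """Approximate the dominant eigenpair of a symmetric matrix.

    Illustrative only: starts from a uniform unit vector and stops when
    successive iterates differ by less than `tol` in every component.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(max_iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


# Centred toy data: most of the variance lies along the first column.
data = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]]
eigenvalue, direction = power_iteration(gram(data))
print(eigenvalue, direction)
```

The returned direction plays the role of the first principal component of the toy data; a rank-k approximation (the k parameter) would need further components, which this sketch does not compute.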
- scoreEachIteration
- Whether to score during each iteration of model training. Scala default value: false; Python default value: False. Also available on the trained model.
- seed
- RNG seed for initialization. Scala default value: -1L; Python default value: -1. Also available on the trained model.
- splitRatio
- Accepts values in the range [0, 1.0] that determine how much of the dataset is used for training and how much for validation. For example, 0.8 means 80% training and 20% validation. This parameter is ignored when validationDataFrame is set. Default value: 1.0.
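The splitRatio semantics can be illustrated with a small plain-Python sketch. How Sparkling Water actually partitions rows is not described here, so the shuffle-and-cut strategy, the function name `split_rows`, and its arguments are assumptions; only the 80/20 arithmetic and the rule that an explicit validation set takes precedence come from the description above.

```python
import random


def split_rows(rows, split_ratio=1.0, validation_rows=None, seed=-1):
    """Split rows into (training, validation) lists.

    Hypothetical helper: if validation_rows is given, split_ratio is
    ignored, mirroring the documented precedence of validationDataFrame.
    """
    if validation_rows is not None:
        return list(rows), list(validation_rows)
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be in the range [0, 1.0]")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * split_ratio)
    return shuffled[:cut], shuffled[cut:]


train, valid = split_rows(range(10), split_ratio=0.8, seed=42)
print(len(train), len(valid))
```

With the default ratio of 1.0 every row lands in the training part and the validation part is empty.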
- transform
- Transformation of training data. Possible values are "NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE". Default value: "NONE". Also available on the trained model.
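The page lists the transform values without defining them. The sketch below shows one common reading of each name applied to a single numeric column; the formulas are assumptions (in particular the use of the sample standard deviation and of max minus min as the range for NORMALIZE), and the function `transform_column` is hypothetical, so check the H2O-3 documentation before relying on the exact definitions.

```python
import statistics


def transform_column(values, mode="NONE"):
    """Apply an assumed per-column transformation to a list of numbers.

    Assumed meanings (not confirmed by this page):
      NONE         leave values unchanged
      DEMEAN       subtract the column mean
      DESCALE      divide by the sample standard deviation
      STANDARDIZE  subtract the mean, then divide by the standard deviation
      NORMALIZE    subtract the mean, then divide by the range (max - min)
    Constant columns would divide by zero here and are not handled.
    """
    mean = statistics.fmean(values)
    if mode == "NONE":
        return list(values)
    if mode == "DEMEAN":
        return [x - mean for x in values]
    if mode == "DESCALE":
        sd = statistics.stdev(values)
        return [x / sd for x in values]
    if mode == "STANDARDIZE":
        sd = statistics.stdev(values)
        return [(x - mean) / sd for x in values]
    if mode == "NORMALIZE":
        span = max(values) - min(values)
        return [(x - mean) / span for x in values]
    raise ValueError(f"unknown transform: {mode}")


column = [1.0, 2.0, 3.0]
for mode in ("NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE"):
    print(mode, transform_column(column, mode))
```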
- useAllFactorLevels
- Whether first factor level is included in each categorical expansion. Scala default value: false; Python default value: False. Also available on the trained model.
- validationDataFrame
- A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the ‘splitRatio’ parameter. The parameter is not serializable! Scala default value: null; Python default value: None.