.. _parameters_H2OPCA:

Parameters of H2OPCA
--------------------

Affected Class
##############

- ``ai.h2o.sparkling.ml.features.H2OPCA``

Parameters
##########

- *Each parameter has also a corresponding getter and setter method.*
  *(E.g.:* ``label`` *->* ``getLabel()`` *,* ``setLabel(...)`` *)*

ignoredCols
  Names of columns to ignore for training.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  
  *Also available on the trained model.*

outputCol
  Output column name

  *Default value:* ``"H2OPCA_3d1756b7caed__output"``
  
  *Also available on the trained model.*

columnsToCategorical
  List of columns to convert to categorical before modelling

  *Scala default value:* ``Array()`` *; Python default value:* ``[]``
  

computeMetrics
  Whether to compute metrics on the training data.

  *Scala default value:* ``true`` *; Python default value:* ``True``
  
  *Also available on the trained model.*

convertInvalidNumbersToNa
  If set to 'true', the model converts invalid numbers to NA during making predictions.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

convertUnknownCategoricalLevelsToNa
  If set to 'true', the model converts unknown categorical levels to NA during making predictions.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

dataFrameSerializer
  A full name of a serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam.

  *Default value:* ``"ai.h2o.sparkling.utils.JSONDataFrameSerializer"``
  
  *Also available on the trained model.*

exportCheckpointsDir
  Automatically export generated models to this directory.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  
  *Also available on the trained model.*

ignoreConstCols
  Ignore constant columns.

  *Scala default value:* ``true`` *; Python default value:* ``True``
  
  *Also available on the trained model.*

imputeMissing
  Whether to impute missing entries with the column mean.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

inputCols
  The array of input columns

  *Scala default value:* ``Array()`` *; Python default value:* ``[]``
  
  *Also available on the trained model.*

k
  Rank of matrix approximation.

  *Default value:* ``1``
  
  *Also available on the trained model.*

keepBinaryModels
  If set to true, all binary models created during execution of the ``fit`` method will be kept in DKV of H2O-3 cluster.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  

maxIterations
  Maximum training iterations.

  *Default value:* ``1000``
  
  *Also available on the trained model.*

maxRuntimeSecs
  Maximum allowed runtime in seconds for model training. Use 0 to disable.

  *Default value:* ``0.0``
  
  *Also available on the trained model.*

modelId
  Destination id for this model; auto-generated if not specified.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  

pcaImpl
  Specify the implementation to use for computing PCA (via SVD or EVD): MTJ_EVD_DENSEMATRIX - eigenvalue decompositions for dense matrix using MTJ; MTJ_EVD_SYMMMATRIX - eigenvalue decompositions for symmetric matrix using MTJ; MTJ_SVD_DENSEMATRIX - singular-value decompositions for dense matrix using MTJ; JAMA - eigenvalue decompositions for dense matrix using JAMA. References: JAMA - http://math.nist.gov/javanumerics/jama/; MTJ - https://github.com/fommil/matrix-toolkits-java/. Possible values are ``"MTJ_EVD_DENSEMATRIX"``, ``"MTJ_EVD_SYMMMATRIX"``, ``"MTJ_SVD_DENSEMATRIX"``, ``"JAMA"``.

  *Default value:* ``"MTJ_EVD_SYMMMATRIX"``
  
  *Also available on the trained model.*

pcaMethod
  Specify the algorithm to use for computing the principal components: GramSVD - uses a distributed computation of the Gram matrix, followed by a local SVD; Power - computes the SVD using the power iteration method (experimental); Randomized - uses randomized subspace iteration method; GLRM - fits a generalized low-rank model with L2 loss function and no regularization and solves for the SVD using local matrix algebra (experimental). Possible values are ``"GramSVD"``, ``"Power"``, ``"Randomized"``, ``"GLRM"``.

  *Default value:* ``"GramSVD"``
  
  *Also available on the trained model.*

scoreEachIteration
  Whether to score during each iteration of model training.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

seed
  RNG seed for initialization.

  *Scala default value:* ``-1L`` *; Python default value:* ``-1``
  
  *Also available on the trained model.*

splitRatio
  Accepts values in range [0, 1.0] which determine how large part of dataset is used for training and for validation. For example, 0.8 -> 80% training 20% validation. This parameter is ignored when validationDataFrame is set.

  *Default value:* ``1.0``
  

transform
  Transformation of training data. Possible values are ``"NONE"``, ``"STANDARDIZE"``, ``"NORMALIZE"``, ``"DEMEAN"``, ``"DESCALE"``.

  *Default value:* ``"NONE"``
  
  *Also available on the trained model.*

useAllFactorLevels
  Whether first factor level is included in each categorical expansion.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

validationDataFrame
  A data frame dedicated for a validation of the trained model. If the parameters is not set,a validation frame created via the 'splitRatio' parameter. The parameter is not serializable!

  *Scala default value:* ``null`` *; Python default value:* ``None``