Parameters of H2OGLRM

Affected Class

ai.h2o.sparkling.ml.features.H2OGLRM

Parameters

Each parameter also has a corresponding getter and setter method (e.g., label -> getLabel(), setLabel(...)).
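The naming convention can be sketched in a few lines of plain Python. The helper `accessor_names` is hypothetical and not part of Sparkling Water; it only derives the method names that the convention above implies for a given parameter.

```python
from typing import Tuple


def accessor_names(param_name: str) -> Tuple[str, str]:
    """Return the (getter, setter) method names implied by a parameter name.

    Hypothetical helper for illustration; it does not call Sparkling Water.
    """
    if not param_name:
        raise ValueError("parameter name must not be empty")
    # Upper-case only the first character; the rest of the camelCase name is kept.
    capitalized = param_name[0].upper() + param_name[1:]
    return f"get{capitalized}", f"set{capitalized}"


print(accessor_names("maxScoringIterations"))
print(accessor_names("gammaX"))
```

Under this convention, `maxScoringIterations` maps to `getMaxScoringIterations()` and `setMaxScoringIterations(...)`.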

maxScoringIterations

The maximum number of iterations used in MOJO scoring to update X

Default value: 100

Also available on the trained model.

reconstructedCol

Reconstructed column name. This column contains reconstructed input values (A_hat=X*Y instead of just X).

Default value: "H2OGLRM_00f851ff1c7e__reconstructed"

Also available on the trained model.
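To make the A_hat=X*Y relation concrete, here is a minimal sketch using nested Python lists. It assumes X holds one row of k values per input row and Y holds k rows with one value per reconstructed feature; the function `reconstruct` is illustrative and is not how the MOJO computes the column.

```python
from typing import List

Matrix = List[List[float]]


def reconstruct(x: Matrix, y: Matrix) -> Matrix:
    """Return the matrix product X*Y for nested-list matrices."""
    k = len(y)
    if k == 0:
        raise ValueError("Y must have at least one row")
    if any(len(row) != k for row in x):
        raise ValueError("each row of X must have as many entries as Y has rows")
    n_cols = len(y[0])
    # Entry (r, j) of A_hat is the dot product of row r of X and column j of Y.
    return [
        [sum(row[i] * y[i][j] for i in range(k)) for j in range(n_cols)]
        for row in x
    ]


x = [[1.0, 2.0], [0.0, 1.0]]            # 2 rows, rank k = 2
y = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # k = 2 rows, 3 features
print(reconstruct(x, y))  # [[1.0, 2.0, 3.0], [0.0, 1.0, 1.0]]
```

The output column holds the low-rank representation X, while the reconstructed column holds the product back in the original feature space.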

withReconstructedCol

A flag indicating whether a column with reconstructed input values is produced.

Scala default value: false ; Python default value: False

Also available on the trained model.

lossByColNames

Column names for which the loss function will be overridden by the ‘lossByCol’ parameter.

Scala default value: null ; Python default value: None

outputCol

Output column name

Default value: "H2OGLRM_00f851ff1c7e__output"

Also available on the trained model.

userX

User-specified initial matrix X.

Scala default value: null ; Python default value: None

userY

User-specified initial matrix Y.

Scala default value: null ; Python default value: None

columnsToCategorical

List of columns to convert to categorical before modelling

Scala default value: Array() ; Python default value: []

convertInvalidNumbersToNa

If set to ‘true’, the model converts invalid numbers to NA when making predictions.

Scala default value: false ; Python default value: False

Also available on the trained model.

convertUnknownCategoricalLevelsToNa

If set to ‘true’, the model converts unknown categorical levels to NA when making predictions.

Scala default value: false ; Python default value: False

Also available on the trained model.
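The effect of this flag can be illustrated with a small stand-alone sketch. It assumes that NA is represented by `None` and that the set of levels seen during training is known; `replace_unknown_levels` is a hypothetical helper, not a Sparkling Water function.

```python
from typing import Iterable, List, Optional, Set


def replace_unknown_levels(
    values: Iterable[str], known_levels: Set[str]
) -> List[Optional[str]]:
    """Replace every level that was not seen during training with None (NA)."""
    return [value if value in known_levels else None for value in values]


training_levels = {"red", "blue"}
print(replace_unknown_levels(["red", "teal", "blue"], training_levels))
# ['red', None, 'blue']
```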

dataFrameSerializer

The fully qualified name of the serializer used to serialize and deserialize Spark DataFrames to and from a JSON value within NullableDataFrameParam.

Default value: "ai.h2o.sparkling.utils.JSONDataFrameSerializer"

Also available on the trained model.

expandUserY

Expand categorical columns in user-specified initial Y.

Scala default value: true ; Python default value: True

Also available on the trained model.

exportCheckpointsDir

Automatically export generated models to this directory.

Scala default value: null ; Python default value: None

Also available on the trained model.

gammaX

Regularization weight on X matrix.

Default value: 0.0

Also available on the trained model.

gammaY

Regularization weight on Y matrix.

Default value: 0.0

Also available on the trained model.

ignoreConstCols

Ignore constant columns.

Scala default value: true ; Python default value: True

Also available on the trained model.

ignoredCols

Names of columns to ignore for training.

Scala default value: null ; Python default value: None

Also available on the trained model.

imputeOriginal

Reconstruct original training data by reversing transform.

Scala default value: false ; Python default value: False

Also available on the trained model.

init

Initialization mode. Possible values are "Random", "SVD", "PlusPlus", "User", "Power".

Default value: "PlusPlus"

Also available on the trained model.

initStepSize

Initial step size.

Default value: 1.0

Also available on the trained model.

inputCols

The array of input columns

Scala default value: Array() ; Python default value: []

Also available on the trained model.

k

Rank of matrix approximation.

Default value: 1

Also available on the trained model.

keepBinaryModels

If set to true, all binary models created during execution of the fit method will be kept in the DKV of the H2O-3 cluster.

Scala default value: false ; Python default value: False

loadingName

[Deprecated] Use representationName instead. Frame key to save resulting X.

Scala default value: null ; Python default value: None

Also available on the trained model.

loss

Numeric loss function. Possible values are "Quadratic", "Absolute", "Huber", "Poisson", "Periodic(0)", "Logistic", "Hinge", "Categorical", "Ordinal".

Default value: "Quadratic"

Also available on the trained model.
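For orientation, three of the listed numeric losses can be written out per entry as follows. These are the common textbook definitions, with a Huber threshold of 1 assumed; the exact constants and scaling used by H2O may differ, so treat this as a sketch rather than the library's implementation.

```python
def quadratic_loss(u: float, a: float) -> float:
    """Squared error between approximation u and observed value a."""
    return (u - a) ** 2


def absolute_loss(u: float, a: float) -> float:
    """Absolute error between approximation u and observed value a."""
    return abs(u - a)


def huber_loss(u: float, a: float, delta: float = 1.0) -> float:
    """Quadratic near zero, linear for residuals larger than delta (assumed 1.0)."""
    residual = abs(u - a)
    if residual <= delta:
        return 0.5 * residual * residual
    return delta * (residual - 0.5 * delta)


print(quadratic_loss(3.0, 1.0))  # 4.0
print(absolute_loss(3.0, 1.0))   # 2.0
print(huber_loss(4.0, 1.0))      # 2.5
```

Quadratic loss penalizes large residuals most heavily, whereas the absolute and Huber losses grow linearly and are therefore less sensitive to outliers.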

lossByCol

Loss function by column (override). Possible values are "Quadratic", "Absolute", "Huber", "Poisson", "Periodic(0)", "Logistic", "Hinge", "Categorical", "Ordinal".

Scala default value: null ; Python default value: None

Also available on the trained model.
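One way to picture how ‘lossByColNames’ and ‘lossByCol’ work together is shown below. The sketch assumes the two lists are paired by position and, for simplicity, applies a single default loss to every other column (ignoring the separate ‘multiLoss’ default for categorical columns). The function `effective_losses` is hypothetical.

```python
from typing import Dict, List, Optional


def effective_losses(
    columns: List[str],
    default_loss: str,
    loss_by_col_names: Optional[List[str]] = None,
    loss_by_col: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Map each column to its loss, applying per-column overrides by position."""
    names = loss_by_col_names or []
    overrides = loss_by_col or []
    if len(names) != len(overrides):
        raise ValueError("override names and losses must have the same length")
    unknown = [name for name in names if name not in columns]
    if unknown:
        raise ValueError(f"unknown columns in override list: {unknown}")
    # Start from the default loss, then overwrite the overridden columns.
    result = {column: default_loss for column in columns}
    result.update(zip(names, overrides))
    return result


print(effective_losses(["age", "income", "clicks"], "Quadratic", ["clicks"], ["Poisson"]))
# {'age': 'Quadratic', 'income': 'Quadratic', 'clicks': 'Poisson'}
```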

maxIterations

Maximum number of iterations.

Default value: 1000

Also available on the trained model.

maxRuntimeSecs

Maximum allowed runtime in seconds for model training. Use 0 to disable.

Default value: 0.0

Also available on the trained model.

maxUpdates

Maximum number of updates; defaults to 2*maxIterations.

Default value: 2000

Also available on the trained model.

minStepSize

Minimum step size.

Scala default value: 1.0e-4 ; Python default value: 1.0E-4

Also available on the trained model.

modelId

Destination id for this model; auto-generated if not specified.

Scala default value: null ; Python default value: None

multiLoss

Categorical loss function. Possible values are "Quadratic", "Absolute", "Huber", "Poisson", "Periodic(0)", "Logistic", "Hinge", "Categorical", "Ordinal".

Default value: "Categorical"

Also available on the trained model.

period

Length of period (only used with periodic loss function).

Default value: 1

Also available on the trained model.

recoverSvd

Recover singular values and eigenvectors of XY.

Scala default value: false ; Python default value: False

Also available on the trained model.

regularizationX

Regularization function for X matrix. Possible values are "None", "Quadratic", "L2", "L1", "NonNegative", "OneSparse", "UnitOneSparse", "Simplex".

Default value: "None"

Also available on the trained model.

regularizationY

Regularization function for Y matrix. Possible values are "None", "Quadratic", "L2", "L1", "NonNegative", "OneSparse", "UnitOneSparse", "Simplex".

Default value: "None"

Also available on the trained model.

representationName

Frame key to save resulting X.

Scala default value: null ; Python default value: None

Also available on the trained model.

scoreEachIteration

Whether to score during each iteration of model training.

Scala default value: false ; Python default value: False

Also available on the trained model.

seed

RNG seed for initialization.

Scala default value: -1L ; Python default value: -1

Also available on the trained model.

splitRatio

Accepts values in the range [0, 1.0], which determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 -> 80% training, 20% validation. This parameter is ignored when validationDataFrame is set.

Default value: 1.0
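The arithmetic behind the ratio can be sketched as follows. The helper `expected_split_sizes` is hypothetical and only computes the expected row counts; the actual split may be randomized, in which case the real counts would only approximate these numbers.

```python
from typing import Tuple


def expected_split_sizes(n_rows: int, split_ratio: float) -> Tuple[int, int]:
    """Return the expected (training, validation) row counts for a split ratio."""
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must lie in [0, 1.0]")
    n_train = round(n_rows * split_ratio)
    return n_train, n_rows - n_train


print(expected_split_sizes(1000, 0.8))  # (800, 200)
print(expected_split_sizes(1000, 1.0))  # (1000, 0)
```

With the default of 1.0, all rows go to training and no validation part is created.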

svdMethod

Method for computing SVD during initialization (Caution: Randomized is currently experimental and unstable). Possible values are "GramSVD", "Power", "Randomized".

Default value: "Randomized"

Also available on the trained model.

transform

Transformation of training data. Possible values are "NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE".

Default value: "NONE"

Also available on the trained model.
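The sketch below shows the conventional meaning of three of these options on a single numeric column. It assumes DEMEAN subtracts the column mean, DESCALE divides by the standard deviation, and STANDARDIZE does both; the sample standard deviation is used here, and H2O's exact estimator may differ. NORMALIZE is left out because its definition is not given in this section. `transform_column` is a hypothetical helper.

```python
import statistics
from typing import List


def transform_column(values: List[float], mode: str) -> List[float]:
    """Apply an assumed NONE / DEMEAN / DESCALE / STANDARDIZE transform to one column."""
    if mode == "NONE":
        return list(values)
    if mode == "DEMEAN":
        mean = statistics.mean(values)
        return [v - mean for v in values]
    if mode in ("DESCALE", "STANDARDIZE"):
        std = statistics.stdev(values)  # sample standard deviation (assumption)
        if std == 0:
            raise ValueError("cannot scale a constant column")
        shift = statistics.mean(values) if mode == "STANDARDIZE" else 0.0
        return [(v - shift) / std for v in values]
    raise ValueError(f"unsupported mode in this sketch: {mode}")


column = [2.0, 4.0, 6.0]
print(transform_column(column, "DEMEAN"))       # [-2.0, 0.0, 2.0]
print(transform_column(column, "DESCALE"))      # [1.0, 2.0, 3.0]
print(transform_column(column, "STANDARDIZE"))  # [-1.0, 0.0, 1.0]
```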

validationDataFrame

A data frame dedicated to validating the trained model. If this parameter is not set, a validation frame is created via the ‘splitRatio’ parameter. This parameter is not serializable.

Scala default value: null ; Python default value: None