.. _parameters_H2OGAM:

Parameters of H2OGAM
--------------------

Affected Classes
################

- ``ai.h2o.sparkling.ml.algos.H2OGAM``
- ``ai.h2o.sparkling.ml.algos.classification.H2OGAMClassifier``
- ``ai.h2o.sparkling.ml.algos.regression.H2OGAMRegressor``

Parameters
##########

- *Each parameter also has a corresponding getter and setter method.*
*(E.g.:* ``label`` *->* ``getLabel()`` *,* ``setLabel(...)`` *)*
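The naming convention above can be sketched in plain Python. The helper ``accessor_names`` is hypothetical and only illustrates how the accessor names are derived from a parameter name; it is not part of the Sparkling Water API.

```python
def accessor_names(parameter):
    """Return the (getter, setter) method names for a parameter name.

    Hypothetical helper: it only mirrors the documented convention
    of capitalizing the first letter and adding a get/set prefix.
    """
    capitalized = parameter[0].upper() + parameter[1:]
    return "get" + capitalized, "set" + capitalized


getter, setter = accessor_names("alphaValue")
print(getter, setter)
```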

betaConstraints
Data frame of beta constraints that sets special conditions on the model coefficients.

*Scala default value:* ``null`` *; Python default value:* ``None``

ignoredCols
Names of columns to ignore for training.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

alphaValue
Distribution of regularization between the L1 (Lasso) and L2 (Ridge) penalties. A value of 1 for alpha represents Lasso regression, a value of 0 produces Ridge regression, and anything in between specifies the amount of mixing between the two. Default value of alpha is 0 when SOLVER = 'L-BFGS'; 0.5 otherwise.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*
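As an illustration of how alpha mixes the two penalties, the sketch below assumes the usual elastic-net form ``lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)``. The function name and the single scalar ``alpha`` and ``lambda_`` arguments are assumptions made for the example, not library code.

```python
def elastic_net_penalty(beta, alpha, lambda_):
    """Elastic-net penalty for coefficients `beta` (intercept excluded).

    Assumed form: lambda * (alpha * L1 + (1 - alpha) / 2 * L2^2).
    alpha = 1 gives the pure Lasso penalty, alpha = 0 pure Ridge.
    """
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lambda_ * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


coefficients = [3.0, -4.0]
for alpha in (0.0, 0.5, 1.0):
    print(alpha, elastic_net_penalty(coefficients, alpha, 0.1))
```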

aucType
Set default multinomial AUC type. Possible values are ``"AUTO"``, ``"NONE"``, ``"MACRO_OVR"``, ``"WEIGHTED_OVR"``, ``"MACRO_OVO"``, ``"WEIGHTED_OVO"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

balanceClasses
Balance training data class counts via over/under-sampling (for imbalanced data).

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

betaEpsilon
Converge if beta changes less (using L-infinity norm) than beta epsilon. ONLY applies to the IRLSM solver.

*Scala default value:* ``1.0e-4`` *; Python default value:* ``1.0E-4``

*Also available on the trained model.*
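The convergence test described for ``betaEpsilon`` can be sketched as follows. This is a simplified illustration of an L-infinity check between two successive coefficient vectors; the function name is hypothetical and the solver's actual implementation may differ.

```python
def beta_converged(previous, current, beta_epsilon=1e-4):
    """Return True when the largest absolute coefficient change
    (the L-infinity norm of the difference) is below beta_epsilon."""
    change = max(abs(c - p) for p, c in zip(previous, current))
    return change < beta_epsilon


print(beta_converged([1.0, 2.0], [1.00001, 2.00002]))
print(beta_converged([1.0, 2.0], [1.0, 2.1]))
```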

bs
Basis function type for each GAM predictor: 0 for cr, 1 for thin plate regression with knots, 2 for monotone I-splines, 3 for NBSplineTypeI M-splines (refer to the doc here: https://github.com/h2oai/h2o-3/issues/6926). If specified, must be the same size as gam_columns.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

classSamplingFactors
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

coldStart
Only applicable to multiple alpha/lambda values when calling GLM from GAM. If false, build the model for the next set of alpha/lambda values starting from the values provided by the current model. If true, start the GLM model from scratch.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

columnsToCategorical
List of columns to convert to categorical before modelling.

*Scala default value:* ``Array()`` *; Python default value:* ``[]``

computePValues
Request p-values computation. P-values work only with the IRLSM solver and no regularization.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

convertInvalidNumbersToNa
If set to 'true', the model converts invalid numbers to NA when making predictions.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

convertUnknownCategoricalLevelsToNa
If set to 'true', the model converts unknown categorical levels to NA when making predictions.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

dataFrameSerializer
A full name of a serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam.

*Default value:* ``"ai.h2o.sparkling.utils.JSONDataFrameSerializer"``

*Also available on the trained model.*

detailedPredictionCol
Column containing additional prediction details, its content depends on the model type.

*Default value:* ``"detailed_prediction"``

*Also available on the trained model.*

earlyStopping
Stop early when there is no more relative improvement on train or validation (if provided).

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

exportCheckpointsDir
Automatically export generated models to this directory.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

family
Family. Use binomial for classification with logistic regression, others are for regression problems. Possible values are ``"AUTO"``, ``"gaussian"``, ``"binomial"``, ``"fractionalbinomial"``, ``"quasibinomial"``, ``"poisson"``, ``"gamma"``, ``"multinomial"``, ``"tweedie"``, ``"ordinal"``, ``"negativebinomial"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

featuresCols
Names of feature columns.

*Scala default value:* ``Array()`` *; Python default value:* ``[]``

*Also available on the trained model.*

foldAssignment
Cross-validation fold assignment scheme, if fold_column is not specified. The 'Stratified' option will stratify the folds based on the response variable, for classification problems. Possible values are ``"AUTO"``, ``"Random"``, ``"Modulo"``, ``"Stratified"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

foldCol
Column with cross-validation fold index assignment per observation.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

gainsliftBins
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.

*Default value:* ``-1``

*Also available on the trained model.*

gamCols
Arrays of predictor column names for GAM smoothers, each using a single predictor or multiple predictors, like {{'c1'},{'c2','c3'},{'c4'},...}

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*
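Several per-smoother parameters (``bs``, ``numKnots``, ``scale``) must have one entry per column set in ``gamCols``. The sketch below is a hypothetical pre-flight check written for illustration; ``check_gam_settings`` is not a Sparkling Water function, and the basis codes are taken from the ``bs`` description in this section.

```python
# Basis codes as listed in the description of the bs parameter.
BASIS_TYPES = {0: "cr", 1: "thin plate", 2: "monotone I-spline", 3: "M-spline"}


def check_gam_settings(gam_cols, bs=None, num_knots=None, scale=None):
    """Return a list of problems found in per-smoother settings.

    Each optional list must contain one entry per gam column set.
    """
    problems = []
    expected = len(gam_cols)
    for name, values in (("bs", bs), ("numKnots", num_knots), ("scale", scale)):
        if values is not None and len(values) != expected:
            problems.append(
                f"{name} has {len(values)} entries, expected {expected}"
            )
    for code in bs or []:
        if code not in BASIS_TYPES:
            problems.append(f"unknown basis type {code}")
    return problems


# Two smoothers: one on 'c1', one on the pair ('c2', 'c3').
gam_cols = [["c1"], ["c2", "c3"]]
print(check_gam_settings(gam_cols, bs=[0, 1], num_knots=[5, 10], scale=[0.1, 0.1]))
print(check_gam_settings(gam_cols, bs=[0]))
```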

gradientEpsilon
Converge if objective changes less (using L-infinity norm) than this, ONLY applies to L-BFGS solver. Default indicates: If lambda_search is set to False and lambda is equal to zero, the default value of gradient_epsilon is equal to .000001, otherwise the default value is .0001. If lambda_search is set to True, the conditional values above are 1E-8 and 1E-6 respectively.

*Default value:* ``-1.0``

*Also available on the trained model.*
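One reading of the default rule for ``gradientEpsilon`` is sketched below. The function is hypothetical and treats lambda as a single number for simplicity; it only restates the conditions in the description and is not taken from the H2O source.

```python
def effective_gradient_epsilon(gradient_epsilon, lambda_search, lambda_value):
    """Resolve the -1.0 default of gradientEpsilon to a concrete value."""
    if gradient_epsilon != -1.0:
        return gradient_epsilon  # an explicit user setting always wins
    if lambda_search:
        return 1e-8 if lambda_value == 0 else 1e-6
    return 1e-6 if lambda_value == 0 else 1e-4


print(effective_gradient_epsilon(-1.0, lambda_search=False, lambda_value=0.0))
print(effective_gradient_epsilon(-1.0, lambda_search=True, lambda_value=0.5))
```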

ignoreConstCols
Ignore constant columns.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

interactions
A list of predictor column indices to interact. All pairwise combinations will be computed for the list.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

intercept
Include constant term in the model.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

keepBinaryModels
If set to true, all binary models created during execution of the ``fit`` method will be kept in the DKV of the H2O-3 cluster.

*Scala default value:* ``false`` *; Python default value:* ``False``

keepCrossValidationFoldAssignment
Whether to keep the cross-validation fold assignment.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

keepCrossValidationModels
Whether to keep the cross-validation models.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

keepCrossValidationPredictions
Whether to keep the predictions of the cross-validation models.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

keepGamCols
Save keys of model matrix.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

knotIds
Array storing frame keys of knots. One for each gam column set specified in gam_columns.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

labelCol
Response variable column.

*Default value:* ``"label"``

*Also available on the trained model.*

lambdaSearch
Use lambda search starting at lambda max, given lambda is then interpreted as lambda min.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

lambdaValue
Regularization strength.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

link
Link function. Possible values are ``"family_default"``, ``"identity"``, ``"logit"``, ``"log"``, ``"inverse"``, ``"tweedie"``, ``"multinomial"``, ``"ologit"``, ``"oprobit"``, ``"ologlog"``.

*Default value:* ``"family_default"``

*Also available on the trained model.*

maxActivePredictors
Maximum number of active predictors during computation. Use as a stopping criterion to prevent expensive model building with many predictors. Default indicates: If the IRLSM solver is used, the value of max_active_predictors is set to 5000; otherwise, it is set to 100000000.

*Default value:* ``-1``

*Also available on the trained model.*

maxAfterBalanceSize
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.

*Scala default value:* ``5.0f`` *; Python default value:* ``5.0``

*Also available on the trained model.*

maxConfusionMatrixSize
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the Logs.

*Default value:* ``20``

*Also available on the trained model.*

maxIterations
Maximum number of iterations.

*Default value:* ``-1``

*Also available on the trained model.*

maxRuntimeSecs
Maximum allowed runtime in seconds for model training. Use 0 to disable.

*Default value:* ``0.0``

*Also available on the trained model.*

missingValuesHandling
Handling of missing values. Either MeanImputation, Skip or PlugValues. Possible values are ``"MeanImputation"``, ``"PlugValues"``, ``"Skip"``.

*Default value:* ``"MeanImputation"``

*Also available on the trained model.*

modelId
Destination id for this model; auto-generated if not specified.

*Scala default value:* ``null`` *; Python default value:* ``None``

nfolds
Number of folds for K-fold cross-validation (0 to disable or >= 2).

*Default value:* ``0``

*Also available on the trained model.*

nlambdas
Number of lambdas to be used in a search. Default indicates: If alpha is zero, with lambda search set to True, the value of nlambdas is set to 30 (fewer lambdas are needed for ridge regression); otherwise, it is set to 100.

*Default value:* ``-1``

*Also available on the trained model.*

nonNegative
Restrict coefficients (not intercept) to be non-negative.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

numKnots
Number of knots for GAM predictors. If specified, must specify one for each GAM predictor. For monotone I-splines, minimum = 2; for cs spline, minimum = 3. For thin plate, minimum is size of polynomial basis + 2.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

objReg
Likelihood divider in objective value computation, default is 1/nobs.

*Default value:* ``-1.0``

*Also available on the trained model.*

objectiveEpsilon
Converge if objective value changes less than this. Default indicates: If lambda_search is set to True the value of objective_epsilon is set to .0001. If the lambda_search is set to False and lambda is equal to zero, the value of objective_epsilon is set to .000001, for any other value of lambda the default value of objective_epsilon is set to .0001.

*Default value:* ``-1.0``

*Also available on the trained model.*
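The default rule for ``objectiveEpsilon`` can be written out as a small decision function. This is a hypothetical sketch that restates the description with lambda simplified to a single number; it is not library code.

```python
def effective_objective_epsilon(objective_epsilon, lambda_search, lambda_value):
    """Resolve the -1.0 default of objectiveEpsilon to a concrete value."""
    if objective_epsilon != -1.0:
        return objective_epsilon  # explicit setting is used as-is
    if lambda_search:
        return 1e-4
    return 1e-6 if lambda_value == 0 else 1e-4


print(effective_objective_epsilon(-1.0, lambda_search=False, lambda_value=0.0))
```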

offsetCol
Offset column. This will be added to the combination of columns before applying the link function.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

predictionCol
Prediction column name

*Default value:* ``"prediction"``

*Also available on the trained model.*

prior
Prior probability for y==1. To be used only for logistic regression iff the data has been sampled and the mean of response does not reflect reality.

*Default value:* ``-1.0``

*Also available on the trained model.*

removeCollinearCols
In case of linearly dependent columns, remove some of the dependent columns.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

scale
Smoothing parameter for gam predictors. If specified, must be of the same length as gam_columns.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

scaleTpPenaltyMat
Scale penalty matrix for tp (thin plate) smoothers as in R.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

scoreEachIteration
Whether to score during each iteration of model training.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

seed
Seed for pseudo random number generator (if applicable).

*Scala default value:* ``-1L`` *; Python default value:* ``-1``

solver
AUTO will set the solver based on the given data and the other parameters. IRLSM is fast on problems with a small number of predictors and for lambda search with L1 penalty; L_BFGS scales better for datasets with many columns. Possible values are ``"AUTO"``, ``"IRLSM"``, ``"L_BFGS"``, ``"COORDINATE_DESCENT_NAIVE"``, ``"COORDINATE_DESCENT"``, ``"GRADIENT_DESCENT_LH"``, ``"GRADIENT_DESCENT_SQERR"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

splineOrders
Order of I-splines or NBSplineTypeI M-splines used for gam predictors. If specified, must be the same size as gam_columns. For I-splines, the spline_orders will be the same as the polynomials used to generate the splines. For M-splines, the polynomials used to generate the splines will be spline_order-1. Values for bs=0 or 1 will be ignored.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

splinesNonNegative
Valid for I-spline (bs=2) only. True if the I-splines are monotonically increasing (and monotonically non-decreasing) and False if the I-splines are monotonically decreasing (and monotonically non-increasing). If specified, must be the same size as gam_columns. Values for other spline types will be ignored. Defaults to true.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

splitRatio
Accepts values in the range [0, 1.0], which determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 -> 80% training, 20% validation. This parameter is ignored when validationDataFrame is set.

*Default value:* ``1.0``
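The arithmetic behind ``splitRatio`` is illustrated below. The helper is hypothetical and returns expected row counts only; the actual split may assign rows randomly, so real sizes can differ slightly.

```python
def expected_split_sizes(total_rows, split_ratio):
    """Return the expected (training, validation) row counts."""
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be within [0, 1.0]")
    training = round(total_rows * split_ratio)
    return training, total_rows - training


print(expected_split_sizes(1000, 0.8))
# With the default of 1.0, no rows are held out for validation.
print(expected_split_sizes(1000, 1.0))
```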

standardize
Standardize numeric columns to have zero mean and unit variance.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

standardizeTpGamCols
Standardize tp (thin plate) predictor columns.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

startval
Double array to initialize coefficients for GAM.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

stoppingMetric
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are ``"AUTO"``, ``"deviance"``, ``"logloss"``, ``"MSE"``, ``"RMSE"``, ``"MAE"``, ``"RMSLE"``, ``"AUC"``, ``"AUCPR"``, ``"lift_top_group"``, ``"misclassification"``, ``"mean_per_class_error"``, ``"anomaly_score"``, ``"AUUC"``, ``"ATE"``, ``"ATT"``, ``"ATC"``, ``"qini"``, ``"custom"``, ``"custom_increasing"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

stoppingRounds
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable).

*Default value:* ``0``

*Also available on the trained model.*

stoppingTolerance
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much).

*Default value:* ``0.001``

*Also available on the trained model.*
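How ``stoppingRounds`` and ``stoppingTolerance`` interact can be sketched with a simplified moving-average check. The code below assumes a positive metric where lower is better (such as logloss or deviance); it is an approximation written for illustration, not the exact H2O stopping logic.

```python
def should_stop(history, stopping_rounds, stopping_tolerance):
    """Decide whether to stop, given the metric at each scoring event.

    Compares the best recent moving average of length k against the
    moving average that ended k scoring events ago.
    """
    k = stopping_rounds
    if k == 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet

    def moving_average(end):
        return sum(history[end - k:end]) / k

    n = len(history)
    reference = moving_average(n - k)
    best_recent = min(moving_average(end) for end in range(n - k + 1, n + 1))
    # Stop when the relative improvement is smaller than the tolerance.
    return best_recent > reference * (1.0 - stopping_tolerance)


print(should_stop([1.0, 0.9, 0.8, 0.7], 2, 0.001))
print(should_stop([1.0, 0.9, 0.9, 0.9, 0.9], 2, 0.001))
```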

storeKnotLocations
If set to true, will return knot locations as a double[][] array for the gam column names found in knots_for_gam. Defaults to false.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

theta
Theta.

*Default value:* ``0.0``

*Also available on the trained model.*

tweedieLinkPower
Tweedie link power.

*Default value:* ``0.0``

*Also available on the trained model.*

tweedieVariancePower
Tweedie variance power.

*Default value:* ``0.0``

*Also available on the trained model.*

validationDataFrame
A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the 'splitRatio' parameter. The parameter is not serializable!

*Scala default value:* ``null`` *; Python default value:* ``None``

weightCol
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*
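The equivalence between weights and repeated rows can be checked with a weighted mean, the simplest statistic that uses observation weights. This is a toy illustration in plain Python with a hypothetical helper, not how the model consumes ``weightCol``.

```python
def weighted_mean(values, weights):
    """Mean of `values` where each value counts `weight` times."""
    if any(w < 0 for w in weights):
        raise ValueError("negative weights are not allowed")
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)


# Weight 2 on the second row and weight 0 on the third row ...
with_weights = weighted_mean([1.0, 2.0, 3.0], [1, 2, 0])
# ... matches repeating the second row and dropping the third.
expanded = sum([1.0, 2.0, 2.0]) / 3
print(with_weights, expanded)
```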

withContributions
Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values of original features.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

withLeafNodeAssignments
Enables or disables computation of leaf node assignments.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

withStageResults
Enables or disables computation of stage results.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*