.. _parameters_H2ODeepLearning:

Parameters of H2ODeepLearning
-----------------------------

Affected Classes
################

- ``ai.h2o.sparkling.ml.algos.H2ODeepLearning``
- ``ai.h2o.sparkling.ml.algos.classification.H2ODeepLearningClassifier``
- ``ai.h2o.sparkling.ml.algos.regression.H2ODeepLearningRegressor``

Parameters
##########

- *Each parameter also has a corresponding getter and setter method.*
*(E.g.:* ``label`` *->* ``getLabel()`` *,* ``setLabel(...)`` *)*
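
The naming pattern in the note above can be sketched in plain Python. ``accessor_names`` is a hypothetical helper written only for illustration; it is not part of Sparkling Water.

.. code-block:: python

    def accessor_names(parameter):
        """Return the (getter, setter) method names derived from a parameter name."""
        # Only the first character is upper-cased; the rest of the camelCase name is kept.
        suffix = parameter[:1].upper() + parameter[1:]
        return "get" + suffix, "set" + suffix


    print(accessor_names("hidden"))    # ('getHidden', 'setHidden')
    print(accessor_names("labelCol"))  # ('getLabelCol', 'setLabelCol')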

activation
Activation function. Possible values are ``"Tanh"``, ``"TanhWithDropout"``, ``"Rectifier"``, ``"RectifierWithDropout"``, ``"Maxout"``, ``"MaxoutWithDropout"``, ``"ExpRectifier"``, ``"ExpRectifierWithDropout"``.

*Default value:* ``"Rectifier"``

*Also available on the trained model.*

adaptiveRate
Adaptive learning rate.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

ignoredCols
Names of columns to ignore for training.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

initialBiases
An array of weight vectors to be used for bias initialization of every network layer. If this parameter is set, the parameter 'initialWeights' has to be set as well.

*Scala default value:* ``null`` *; Python default value:* ``None``

initialWeights
An array of weight matrices to be used for initialization of the neural network. If this parameter is set, the parameter 'initialBiases' has to be set as well.

*Scala default value:* ``null`` *; Python default value:* ``None``
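
The rule that ``initialWeights`` and ``initialBiases`` must be set together can be expressed as a small pre-flight check. This is a sketch only: ``check_initial_parameters`` is a hypothetical helper, and how Sparkling Water itself reports the problem is not described here.

.. code-block:: python

    def check_initial_parameters(initial_weights, initial_biases):
        """Raise ValueError when only one of the two parameters is provided."""
        weights_set = initial_weights is not None
        biases_set = initial_biases is not None
        if weights_set != biases_set:
            raise ValueError(
                "initialWeights and initialBiases must be set together or both left unset"
            )


    # Both unset (the documented defaults) passes the check.
    check_initial_parameters(None, None)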

aucType
Set default multinomial AUC type. Possible values are ``"AUTO"``, ``"NONE"``, ``"MACRO_OVR"``, ``"WEIGHTED_OVR"``, ``"MACRO_OVO"``, ``"WEIGHTED_OVO"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

averageActivation
Average activation for sparse auto-encoder. #Experimental.

*Default value:* ``0.0``

*Also available on the trained model.*

balanceClasses
Balance training data class counts via over/under-sampling (for imbalanced data).

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

calculateFeatureImportances
Compute variable importances for input features (Gedeon method) - can be slow for large networks.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

categoricalEncoding
Encoding scheme for categorical features. Possible values are ``"AUTO"``, ``"OneHotInternal"``, ``"OneHotExplicit"``, ``"Enum"``, ``"Binary"``, ``"Eigen"``, ``"LabelEncoder"``, ``"SortByResponse"``, ``"EnumLimited"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

classSamplingFactors
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

classificationStop
Stopping criterion for classification error fraction on training data (-1 to disable).

*Default value:* ``0.0``

*Also available on the trained model.*

columnsToCategorical
List of columns to convert to categorical before modelling.

*Scala default value:* ``Array()`` *; Python default value:* ``[]``

convertInvalidNumbersToNa
If set to 'true', the model converts invalid numbers to NA when making predictions.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

convertUnknownCategoricalLevelsToNa
If set to 'true', the model converts unknown categorical levels to NA when making predictions.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

dataFrameSerializer
A full name of a serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam.

*Default value:* ``"ai.h2o.sparkling.utils.JSONDataFrameSerializer"``

*Also available on the trained model.*

detailedPredictionCol
Column containing additional prediction details; its content depends on the model type.

*Default value:* ``"detailed_prediction"``

*Also available on the trained model.*

diagnostics
Enable diagnostics for hidden layers.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

distribution
Distribution function. Possible values are ``"AUTO"``, ``"bernoulli"``, ``"quasibinomial"``, ``"modified_huber"``, ``"multinomial"``, ``"ordinal"``, ``"gaussian"``, ``"poisson"``, ``"gamma"``, ``"tweedie"``, ``"huber"``, ``"laplace"``, ``"quantile"``, ``"fractionalbinomial"``, ``"negativebinomial"``, ``"custom"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

elasticAveraging
Elastic averaging between compute nodes can improve distributed model convergence. #Experimental.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

elasticAveragingMovingRate
Elastic averaging moving rate (only if elastic averaging is enabled).

*Default value:* ``0.9``

*Also available on the trained model.*

elasticAveragingRegularization
Elastic averaging regularization strength (only if elastic averaging is enabled).

*Default value:* ``0.001``

*Also available on the trained model.*

epochs
How many times the dataset should be iterated (streamed), can be fractional.

*Default value:* ``10.0``

*Also available on the trained model.*

epsilon
Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress).

*Scala default value:* ``1.0e-8`` *; Python default value:* ``1.0E-8``

*Also available on the trained model.*

exportCheckpointsDir
Automatically export generated models to this directory.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

exportWeightsAndBiases
Whether to export Neural Network weights and biases to H2O Frames.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

fastMode
Enable fast mode (minor approximation in back-propagation).

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

featuresCols
Names of feature columns.

*Scala default value:* ``Array()`` *; Python default value:* ``[]``

*Also available on the trained model.*

foldAssignment
Cross-validation fold assignment scheme, if fold_column is not specified. The 'Stratified' option will stratify the folds based on the response variable, for classification problems. Possible values are ``"AUTO"``, ``"Random"``, ``"Modulo"``, ``"Stratified"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

foldCol
Column with cross-validation fold index assignment per observation.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

forceLoadBalance
Force extra load balancing to increase training speed for small datasets (to keep all cores busy).

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

hidden
Hidden layer sizes (e.g. [100, 100]).

*Scala default value:* ``Array(200, 200)`` *; Python default value:* ``[200, 200]``

*Also available on the trained model.*

hiddenDropoutRatios
Hidden layer dropout ratios (can improve generalization), specify one value per hidden layer, defaults to 0.5.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*
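
The "one value per hidden layer" rule can be sketched as follows. ``resolve_hidden_dropout_ratios`` is a hypothetical helper, and the sketch assumes that the stated default of 0.5 is applied to every hidden layer when the parameter is left unset.

.. code-block:: python

    def resolve_hidden_dropout_ratios(hidden, hidden_dropout_ratios=None, default=0.5):
        """Return one dropout ratio per hidden layer."""
        if hidden_dropout_ratios is None:
            # Assumption: an unset parameter means `default` for each layer.
            return [default] * len(hidden)
        if len(hidden_dropout_ratios) != len(hidden):
            raise ValueError("specify exactly one dropout ratio per hidden layer")
        return list(hidden_dropout_ratios)


    print(resolve_hidden_dropout_ratios([200, 200]))
    print(resolve_hidden_dropout_ratios([64, 32, 16], [0.2, 0.3, 0.4]))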

huberAlpha
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).

*Default value:* ``0.9``

*Also available on the trained model.*

ignoreConstCols
Ignore constant columns.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

initialWeightDistribution
Initial weight distribution. Possible values are ``"UniformAdaptive"``, ``"Uniform"``, ``"Normal"``.

*Default value:* ``"UniformAdaptive"``

*Also available on the trained model.*

initialWeightScale
Uniform: -value...value, Normal: stddev.

*Default value:* ``1.0``

*Also available on the trained model.*
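
How ``initialWeightScale`` is read for the ``"Uniform"`` and ``"Normal"`` distributions can be illustrated with the standard library. This is a sketch, not H2O-3's initializer: a zero mean for ``"Normal"`` is an assumption, and ``"UniformAdaptive"`` is not covered because its scaling is not described here.

.. code-block:: python

    import random


    def sample_initial_weight(distribution, scale, rng):
        """Draw one initial weight for the given distribution and scale."""
        if distribution == "Uniform":
            # Uniform: values between -scale and +scale.
            return rng.uniform(-scale, scale)
        if distribution == "Normal":
            # Normal: scale is the standard deviation (zero mean assumed).
            return rng.gauss(0.0, scale)
        raise ValueError("unsupported distribution in this sketch: " + distribution)


    rng = random.Random(42)
    uniform_weights = [sample_initial_weight("Uniform", 0.5, rng) for _ in range(5)]
    normal_weights = [sample_initial_weight("Normal", 0.5, rng) for _ in range(5)]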

inputDropoutRatio
Input layer dropout ratio (can improve generalization, try 0.1 or 0.2).

*Default value:* ``0.0``

*Also available on the trained model.*

keepBinaryModels
If set to true, all binary models created during execution of the ``fit`` method will be kept in DKV of H2O-3 cluster.

*Scala default value:* ``false`` *; Python default value:* ``False``

keepCrossValidationFoldAssignment
Whether to keep the cross-validation fold assignment.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

keepCrossValidationModels
Whether to keep the cross-validation models.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

keepCrossValidationPredictions
Whether to keep the predictions of the cross-validation models.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

l1
L1 regularization (can add stability and improve generalization, causes many weights to become 0).

*Default value:* ``0.0``

*Also available on the trained model.*

l2
L2 regularization (can add stability and improve generalization, causes many weights to be small).

*Default value:* ``0.0``

*Also available on the trained model.*

labelCol
Response variable column.

*Default value:* ``"label"``

*Also available on the trained model.*

loss
Loss function. Possible values are ``"Automatic"``, ``"Quadratic"``, ``"CrossEntropy"``, ``"ModifiedHuber"``, ``"Huber"``, ``"Absolute"``, ``"Quantile"``.

*Default value:* ``"Automatic"``

*Also available on the trained model.*

maxAfterBalanceSize
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.

*Scala default value:* ``5.0f`` *; Python default value:* ``5.0``

*Also available on the trained model.*

maxCategoricalFeatures
Max. number of categorical features, enforced via hashing. #Experimental.

*Default value:* ``2147483647``

*Also available on the trained model.*

maxRuntimeSecs
Maximum allowed runtime in seconds for model training. Use 0 to disable.

*Default value:* ``0.0``

*Also available on the trained model.*

maxW2
Constraint for squared sum of incoming weights per unit (e.g. for Rectifier).

*Scala default value:* ``3.402823e38f`` *; Python default value:* ``3.402823E38``

*Also available on the trained model.*

miniBatchSize
Mini-batch size (smaller leads to better fit, larger can speed up and generalize better).

*Default value:* ``1``

*Also available on the trained model.*

missingValuesHandling
Handling of missing values. Possible values are ``"MeanImputation"``, ``"Skip"``.

*Default value:* ``"MeanImputation"``

*Also available on the trained model.*

modelId
Destination id for this model; auto-generated if not specified.

*Scala default value:* ``null`` *; Python default value:* ``None``

momentumRamp
Number of training samples for which momentum increases.

*Default value:* ``1000000.0``

*Also available on the trained model.*

momentumStable
Final momentum after the ramp is over (try 0.99).

*Default value:* ``0.0``

*Also available on the trained model.*

momentumStart
Initial momentum at the beginning of training (try 0.5).

*Default value:* ``0.0``

*Also available on the trained model.*
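
The interplay of ``momentumStart``, ``momentumRamp`` and ``momentumStable`` can be sketched as a schedule over the number of processed training samples. The linear shape of the ramp is an assumption made for this illustration, and ``momentum_at`` is a hypothetical helper.

.. code-block:: python

    def momentum_at(samples, momentum_start, momentum_stable, momentum_ramp):
        """Return the momentum after `samples` training samples."""
        if samples >= momentum_ramp:
            # The ramp is over: the final momentum applies.
            return momentum_stable
        # Assumption: momentum moves linearly from start to stable during the ramp.
        fraction = samples / momentum_ramp
        return momentum_start + (momentum_stable - momentum_start) * fraction


    for samples in (0, 500000, 1000000, 2000000):
        print(samples, momentum_at(samples, 0.5, 0.99, 1000000.0))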

nesterovAcceleratedGradient
Use Nesterov accelerated gradient (recommended).

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

nfolds
Number of folds for K-fold cross-validation (0 to disable or >= 2).

*Default value:* ``0``

*Also available on the trained model.*

offsetCol
Offset column. This will be added to the combination of columns before applying the link function.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*

overwriteWithBestModel
If enabled, override the final model with the best model found during training.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

predictionCol
Prediction column name.

*Default value:* ``"prediction"``

*Also available on the trained model.*

quantileAlpha
Desired quantile for Quantile regression, must be between 0 and 1.

*Default value:* ``0.5``

*Also available on the trained model.*

quietMode
Enable quiet mode for less output to standard output.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

rate
Learning rate (higher => less stable, lower => slower convergence).

*Default value:* ``0.005``

*Also available on the trained model.*

rateAnnealing
Learning rate annealing: rate / (1 + rate_annealing * samples).

*Scala default value:* ``1.0e-6`` *; Python default value:* ``1.0E-6``

*Also available on the trained model.*
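
The annealing formula given for ``rateAnnealing`` translates directly into code. ``annealed_rate`` is a hypothetical helper used only to show how the effective learning rate shrinks as more samples are processed.

.. code-block:: python

    def annealed_rate(rate, rate_annealing, samples):
        """Effective learning rate: rate / (1 + rate_annealing * samples)."""
        return rate / (1.0 + rate_annealing * samples)


    # With the documented defaults, the rate is roughly halved after a million samples.
    for samples in (0, 1000000, 10000000):
        print(samples, annealed_rate(0.005, 1.0e-6, samples))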

rateDecay
Learning rate decay factor between layers (N-th layer: rate * rate_decay ^ (N - 1)).

*Default value:* ``1.0``

*Also available on the trained model.*
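
The per-layer formula for ``rateDecay`` can be checked with a few lines of Python. ``layer_rate`` is a hypothetical helper; layers are counted from 1, as in the formula above.

.. code-block:: python

    def layer_rate(rate, rate_decay, layer):
        """Learning rate of the given layer: rate * rate_decay ** (layer - 1)."""
        return rate * rate_decay ** (layer - 1)


    # With the default rate_decay of 1.0, every layer uses the same rate.
    print([layer_rate(0.005, 1.0, n) for n in (1, 2, 3)])
    # A factor below 1.0 lowers the rate for deeper layers.
    print([layer_rate(0.005, 0.5, n) for n in (1, 2, 3)])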

regressionStop
Stopping criterion for regression error (MSE) on training data (-1 to disable).

*Scala default value:* ``1.0e-6`` *; Python default value:* ``1.0E-6``

*Also available on the trained model.*

replicateTrainingData
Replicate the entire training dataset onto every node for faster training on small datasets.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

reproducible
Force reproducibility on small data (will be slow - only uses 1 thread).

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

rho
Adaptive learning rate time decay factor (similarity to prior updates).

*Default value:* ``0.99``

*Also available on the trained model.*
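
H2O-3's own documentation describes the adaptive learning rate as ADADELTA. Assuming that, the roles of ``rho`` and ``epsilon`` can be illustrated with a textbook ADADELTA update for a single weight. This is a sketch of the published algorithm, not H2O-3's implementation, and ``adadelta_step`` is a hypothetical helper.

.. code-block:: python

    import math


    def adadelta_step(gradient, avg_sq_grad, avg_sq_update, rho=0.99, epsilon=1.0e-8):
        """Return (update, new_avg_sq_grad, new_avg_sq_update) for one weight."""
        # rho controls how strongly earlier gradients are remembered.
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * gradient ** 2
        # epsilon keeps the ratio finite and lets the first update be non-zero.
        step_size = math.sqrt(avg_sq_update + epsilon) / math.sqrt(avg_sq_grad + epsilon)
        update = -step_size * gradient
        avg_sq_update = rho * avg_sq_update + (1.0 - rho) * update ** 2
        return update, avg_sq_grad, avg_sq_update


    # Minimize f(x) = x ** 2, whose gradient is 2 * x, starting from x = 1.
    x, g2, u2 = 1.0, 0.0, 0.0
    for _ in range(100):
        update, g2, u2 = adadelta_step(2.0 * x, g2, u2)
        x += update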

scoreDutyCycle
Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring).

*Default value:* ``0.1``

*Also available on the trained model.*

scoreEachIteration
Whether to score during each iteration of model training.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

scoreInterval
Shortest time interval (in seconds) between model scoring.

*Default value:* ``5.0``

*Also available on the trained model.*

scoreTrainingSamples
Number of training set samples for scoring (0 for all).

*Scala default value:* ``10000L`` *; Python default value:* ``10000``

*Also available on the trained model.*

scoreValidationSamples
Number of validation set samples for scoring (0 for all).

*Scala default value:* ``0L`` *; Python default value:* ``0``

*Also available on the trained model.*

scoreValidationSampling
Method used to sample validation dataset for scoring. Possible values are ``"Uniform"``, ``"Stratified"``.

*Default value:* ``"Uniform"``

*Also available on the trained model.*

seed
Seed for random numbers (affects sampling) - Note: only reproducible when running single threaded.

*Scala default value:* ``-1L`` *; Python default value:* ``-1``

*Also available on the trained model.*

shuffleTrainingData
Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes).

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

singleNodeMode
Run on a single node for fine-tuning of model parameters.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

sparse
Sparse data handling (more efficient for data with lots of 0 values).

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

sparsityBeta
Sparsity regularization. #Experimental.

*Default value:* ``0.0``

*Also available on the trained model.*

splitRatio
Accepts values in the range [0, 1.0], which determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 -> 80% training, 20% validation. This parameter is ignored when validationDataFrame is set.

*Default value:* ``1.0``
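
The effect of ``splitRatio`` on dataset sizes can be estimated as below. The sketch only computes expected row counts; how individual rows are assigned to either part is not described here, so the real split may differ slightly. ``expected_split_sizes`` is a hypothetical helper.

.. code-block:: python

    def expected_split_sizes(row_count, split_ratio):
        """Return the expected (training, validation) row counts."""
        if not 0.0 <= split_ratio <= 1.0:
            raise ValueError("splitRatio must be in the range [0, 1.0]")
        training = round(row_count * split_ratio)
        return training, row_count - training


    print(expected_split_sizes(1000, 0.8))  # (800, 200)
    print(expected_split_sizes(1000, 1.0))  # (1000, 0)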

standardize
If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

stoppingMetric
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are ``"AUTO"``, ``"deviance"``, ``"logloss"``, ``"MSE"``, ``"RMSE"``, ``"MAE"``, ``"RMSLE"``, ``"AUC"``, ``"AUCPR"``, ``"lift_top_group"``, ``"misclassification"``, ``"mean_per_class_error"``, ``"anomaly_score"``, ``"custom"``, ``"custom_increasing"``.

*Default value:* ``"AUTO"``

*Also available on the trained model.*

stoppingRounds
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable).

*Default value:* ``5``

*Also available on the trained model.*

stoppingTolerance
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much).

*Default value:* ``0.0``

*Also available on the trained model.*
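
One possible reading of how ``stoppingRounds`` and ``stoppingTolerance`` interact is sketched below for a metric where lower is better. It is not H2O-3's actual implementation: the requirement of ``2 * k`` scoring events and the choice of reference window are assumptions, and ``should_stop`` is a hypothetical helper.

.. code-block:: python

    def should_stop(history, stopping_rounds, stopping_tolerance=0.0):
        """Return True when a lower-is-better metric has stopped improving."""
        k = stopping_rounds
        # 0 disables early stopping; 2 * k events are assumed necessary to compare windows.
        if k == 0 or len(history) < 2 * k:
            return False
        recent = history[-2 * k:]
        # k + 1 simple moving averages of length k over the recent scoring events.
        averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]
        reference, best = averages[0], min(averages[1:])
        improvement = reference - best
        required = stopping_tolerance * abs(reference)
        # Stop on no improvement, or on a relative improvement below the tolerance.
        return improvement <= 0 or improvement < required


    print(should_stop([1.0, 0.9, 0.8, 0.7, 0.6, 0.5], stopping_rounds=3))  # False
    print(should_stop([0.5, 0.5, 0.5, 0.5, 0.5, 0.5], stopping_rounds=3))  # True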

targetRatioCommToComp
Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning).

*Default value:* ``0.05``

*Also available on the trained model.*

trainSamplesPerIteration
Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic.

*Scala default value:* ``-2L`` *; Python default value:* ``-2``

*Also available on the trained model.*

tweediePower
Tweedie power for Tweedie regression, must be between 1 and 2.

*Default value:* ``1.5``

*Also available on the trained model.*

useAllFactorLevels
Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder.

*Scala default value:* ``true`` *; Python default value:* ``True``

*Also available on the trained model.*

validationDataFrame
A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the 'splitRatio' parameter. The parameter is not serializable!

*Scala default value:* ``null`` *; Python default value:* ``None``

weightCol
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.

*Scala default value:* ``null`` *; Python default value:* ``None``

*Also available on the trained model.*
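
The equivalences described for ``weightCol`` (a weight of 2 acts like a repeated row, a weight of 0 like a removed row) can be verified on a weighted loss. Squared error is used here purely for illustration; the loss applied during training depends on ``loss`` and ``distribution``. ``weighted_mse`` is a hypothetical helper.

.. code-block:: python

    def weighted_mse(targets, predictions, weights):
        """Mean squared error where each row counts according to its weight."""
        weighted_errors = sum(
            w * (t - p) ** 2 for t, p, w in zip(targets, predictions, weights)
        )
        return weighted_errors / sum(weights)


    # Row 2 has weight 2 and row 3 has weight 0 ...
    weighted = weighted_mse([1.0, 2.0, 4.0], [1.5, 2.0, 3.0], [1.0, 2.0, 0.0])
    # ... which matches repeating row 2 and dropping row 3 with unit weights.
    repeated = weighted_mse([1.0, 2.0, 2.0], [1.5, 2.0, 2.0], [1.0, 1.0, 1.0])
    print(abs(weighted - repeated) < 1e-12)  # True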

withContributions
Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values of original features.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

withLeafNodeAssignments
Enables or disables computation of leaf node assignments.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*

withStageResults
Enables or disables computation of stage results.

*Scala default value:* ``false`` *; Python default value:* ``False``

*Also available on the trained model.*