Parameters of H2ODeepLearning
Affected Classes
- ai.h2o.sparkling.ml.algos.H2ODeepLearning
- ai.h2o.sparkling.ml.algos.classification.H2ODeepLearningClassifier
- ai.h2o.sparkling.ml.algos.regression.H2ODeepLearningRegressor
Parameters
Each parameter also has a corresponding getter and setter method (e.g., label -> getLabel(), setLabel(...)).
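The naming convention can be sketched in plain Python. The helper `accessor_names` is hypothetical and only illustrates how a parameter name maps to its getter and setter names; it is not part of Sparkling Water.

```python
from typing import Tuple


def accessor_names(param: str) -> Tuple[str, str]:
    """Return the (getter, setter) method names for a parameter name."""
    # Capitalize only the first character; the rest of the camelCase name is kept.
    suffix = param[0].upper() + param[1:]
    return f"get{suffix}", f"set{suffix}"


print(accessor_names("label"))     # ('getLabel', 'setLabel')
print(accessor_names("labelCol"))  # ('getLabelCol', 'setLabelCol')
```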
- activation
Activation function. Possible values are
"Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout", "Maxout", "MaxoutWithDropout", "ExpRectifier", "ExpRectifierWithDropout".
Default value: "Rectifier"
Also available on the trained model.
- adaptiveRate
Adaptive learning rate.
Scala default value: true; Python default value: True
Also available on the trained model.
- ignoredCols
Names of columns to ignore for training.
Scala default value: null; Python default value: None
Also available on the trained model.
- initialBiases
An array of weight vectors to be used for bias initialization of every network layer. If this parameter is set, the parameter ‘initialWeights’ has to be set as well.
Scala default value: null; Python default value: None
- initialWeights
An array of weight matrices to be used for initialization of the neural network. If this parameter is set, the parameter ‘initialBiases’ has to be set as well.
Scala default value: null; Python default value: None
- aucType
Set default multinomial AUC type. Possible values are
"AUTO", "NONE", "MACRO_OVR", "WEIGHTED_OVR", "MACRO_OVO", "WEIGHTED_OVO".
Default value: "AUTO"
Also available on the trained model.
- averageActivation
Average activation for sparse auto-encoder. #Experimental.
Default value: 0.0
Also available on the trained model.
- balanceClasses
Balance training data class counts via over/under-sampling (for imbalanced data).
Scala default value: false; Python default value: False
Also available on the trained model.
- calculateFeatureImportances
Compute variable importances for input features (Gedeon method) - can be slow for large networks.
Scala default value: true; Python default value: True
Also available on the trained model.
- categoricalEncoding
Encoding scheme for categorical features. Possible values are
"AUTO", "OneHotInternal", "OneHotExplicit", "Enum", "Binary", "Eigen", "LabelEncoder", "SortByResponse", "EnumLimited".
Default value: "AUTO"
Also available on the trained model.
- classSamplingFactors
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.
Scala default value: null; Python default value: None
Also available on the trained model.
- classificationStop
Stopping criterion for classification error fraction on training data (-1 to disable).
Default value: 0.0
Also available on the trained model.
- columnsToCategorical
List of columns to convert to categorical before modelling.
Scala default value: Array(); Python default value: []
- convertInvalidNumbersToNa
If set to ‘true’, the model converts invalid numbers to NA when making predictions.
Scala default value: false; Python default value: False
Also available on the trained model.
- convertUnknownCategoricalLevelsToNa
If set to ‘true’, the model converts unknown categorical levels to NA when making predictions.
Scala default value: false; Python default value: False
Also available on the trained model.
- dataFrameSerializer
The full name of the serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam.
Default value: "ai.h2o.sparkling.utils.JSONDataFrameSerializer"
Also available on the trained model.
- detailedPredictionCol
Column containing additional prediction details; its content depends on the model type.
Default value: "detailed_prediction"
Also available on the trained model.
- diagnostics
Enable diagnostics for hidden layers.
Scala default value: true; Python default value: True
Also available on the trained model.
- distribution
Distribution function. Possible values are
"AUTO", "bernoulli", "quasibinomial", "modified_huber", "multinomial", "ordinal", "gaussian", "poisson", "gamma", "tweedie", "huber", "laplace", "quantile", "fractionalbinomial", "negativebinomial", "custom".
Default value: "AUTO"
Also available on the trained model.
- elasticAveraging
Elastic averaging between compute nodes can improve distributed model convergence. #Experimental.
Scala default value: false; Python default value: False
Also available on the trained model.
- elasticAveragingMovingRate
Elastic averaging moving rate (only if elastic averaging is enabled).
Default value: 0.9
Also available on the trained model.
- elasticAveragingRegularization
Elastic averaging regularization strength (only if elastic averaging is enabled).
Default value: 0.001
Also available on the trained model.
- epochs
How many times the dataset should be iterated (streamed); can be fractional.
Default value: 10.0
Also available on the trained model.
- epsilon
Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress).
Scala default value: 1.0e-8; Python default value: 1.0E-8
Also available on the trained model.
- exportCheckpointsDir
Automatically export generated models to this directory.
Scala default value: null; Python default value: None
Also available on the trained model.
- exportWeightsAndBiases
Whether to export Neural Network weights and biases to H2O Frames.
Scala default value: false; Python default value: False
Also available on the trained model.
- fastMode
Enable fast mode (minor approximation in back-propagation).
Scala default value: true; Python default value: True
Also available on the trained model.
- featuresCols
Names of feature columns.
Scala default value: Array(); Python default value: []
Also available on the trained model.
- foldAssignment
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems. Possible values are
"AUTO", "Random", "Modulo", "Stratified".
Default value: "AUTO"
Also available on the trained model.
- foldCol
Column with cross-validation fold index assignment per observation.
Scala default value: null; Python default value: None
Also available on the trained model.
- forceLoadBalance
Force extra load balancing to increase training speed for small datasets (to keep all cores busy).
Scala default value: true; Python default value: True
Also available on the trained model.
- hidden
Hidden layer sizes (e.g. [100, 100]).
Scala default value: Array(200, 200); Python default value: [200, 200]
Also available on the trained model.
- hiddenDropoutRatios
Hidden layer dropout ratios (can improve generalization), specify one value per hidden layer, defaults to 0.5.
Scala default value: null; Python default value: None
Also available on the trained model.
- huberAlpha
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).
Default value: 0.9
Also available on the trained model.
- ignoreConstCols
Ignore constant columns.
Scala default value: true; Python default value: True
Also available on the trained model.
- initialWeightDistribution
Initial weight distribution. Possible values are
"UniformAdaptive", "Uniform", "Normal".
Default value: "UniformAdaptive"
Also available on the trained model.
- initialWeightScale
Uniform: -value…value, Normal: stddev.
Default value: 1.0
Also available on the trained model.
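As a rough illustration of how initialWeightScale is interpreted for the "Uniform" and "Normal" distributions, here is a minimal sketch in plain Python. The function `sample_initial_weight` is hypothetical, the zero mean of the normal distribution is an assumption, and "UniformAdaptive" is left out because its scaling rule is not described here.

```python
import random


def sample_initial_weight(distribution: str, scale: float, rng: random.Random) -> float:
    """Draw one initial weight according to the parameter description."""
    if distribution == "Uniform":
        # Uniform: value drawn from the interval [-scale, scale].
        return rng.uniform(-scale, scale)
    if distribution == "Normal":
        # Normal: zero mean (an assumption), standard deviation equal to scale.
        return rng.gauss(0.0, scale)
    raise ValueError(f"not covered by this sketch: {distribution}")


rng = random.Random(42)
weights = [sample_initial_weight("Uniform", 0.5, rng) for _ in range(1000)]
print(min(weights) >= -0.5 and max(weights) <= 0.5)  # True
```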
- inputDropoutRatio
Input layer dropout ratio (can improve generalization, try 0.1 or 0.2).
Default value: 0.0
Also available on the trained model.
- keepBinaryModels
If set to true, all binary models created during execution of the fit method will be kept in the DKV of the H2O-3 cluster.
Scala default value: false; Python default value: False
- keepCrossValidationFoldAssignment
Whether to keep the cross-validation fold assignment.
Scala default value: false; Python default value: False
Also available on the trained model.
- keepCrossValidationModels
Whether to keep the cross-validation models.
Scala default value: true; Python default value: True
Also available on the trained model.
- keepCrossValidationPredictions
Whether to keep the predictions of the cross-validation models.
Scala default value: false; Python default value: False
Also available on the trained model.
- l1
L1 regularization (can add stability and improve generalization, causes many weights to become 0).
Default value: 0.0
Also available on the trained model.
- l2
L2 regularization (can add stability and improve generalization, causes many weights to be small).
Default value: 0.0
Also available on the trained model.
- labelCol
Response variable column.
Default value: "label"
Also available on the trained model.
- loss
Loss function. Possible values are
"Automatic", "Quadratic", "CrossEntropy", "ModifiedHuber", "Huber", "Absolute", "Quantile".
Default value: "Automatic"
Also available on the trained model.
- maxAfterBalanceSize
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.
Scala default value: 5.0f; Python default value: 5.0
Also available on the trained model.
- maxCategoricalFeatures
Max. number of categorical features, enforced via hashing. #Experimental.
Default value: 2147483647
Also available on the trained model.
- maxRuntimeSecs
Maximum allowed runtime in seconds for model training. Use 0 to disable.
Default value: 0.0
Also available on the trained model.
- maxW2
Constraint for squared sum of incoming weights per unit (e.g. for Rectifier).
Scala default value: 3.402823e38f; Python default value: 3.402823E38
Also available on the trained model.
- miniBatchSize
Mini-batch size (smaller leads to better fit, larger can speed up and generalize better).
Default value: 1
Also available on the trained model.
- missingValuesHandling
Handling of missing values. Either MeanImputation or Skip. Possible values are
"MeanImputation", "Skip".
Default value: "MeanImputation"
Also available on the trained model.
- modelId
Destination id for this model; auto-generated if not specified.
Scala default value: null; Python default value: None
- momentumRamp
Number of training samples for which momentum increases.
Default value: 1000000.0
Also available on the trained model.
- momentumStable
Final momentum after the ramp is over (try 0.99).
Default value: 0.0
Also available on the trained model.
- momentumStart
Initial momentum at the beginning of training (try 0.5).
Default value: 0.0
Also available on the trained model.
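The three momentum parameters (momentumStart, momentumRamp, momentumStable) can be read as a ramp schedule. The sketch below assumes the ramp is linear in the number of processed training samples, which is an assumption of this example rather than something the parameter descriptions state; the function name `momentum_at` is hypothetical.

```python
def momentum_at(samples: float, start: float, stable: float, ramp: float) -> float:
    """Momentum after `samples` training samples, assuming a linear ramp."""
    if samples >= ramp:
        # The ramp is over: use the final momentum.
        return stable
    # Interpolate linearly between the initial and the final momentum.
    return start + (stable - start) * samples / ramp


# Values suggested in the descriptions: start at 0.5, settle at 0.99.
for n in (0, 500_000, 1_000_000, 2_000_000):
    print(n, momentum_at(n, start=0.5, stable=0.99, ramp=1_000_000.0))
```

With the default values (momentumStart and momentumStable both 0.0) the schedule is flat and no momentum is applied.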
- nesterovAcceleratedGradient
Use Nesterov accelerated gradient (recommended).
Scala default value: true; Python default value: True
Also available on the trained model.
- nfolds
Number of folds for K-fold cross-validation (0 to disable or >= 2).
Default value: 0
Also available on the trained model.
- offsetCol
Offset column. This will be added to the combination of columns before applying the link function.
Scala default value: null; Python default value: None
Also available on the trained model.
- overwriteWithBestModel
If enabled, override the final model with the best model found during training.
Scala default value: true; Python default value: True
Also available on the trained model.
- predictionCol
Prediction column name.
Default value: "prediction"
Also available on the trained model.
- quantileAlpha
Desired quantile for Quantile regression, must be between 0 and 1.
Default value: 0.5
Also available on the trained model.
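For orientation, quantile regression is usually trained with the pinball (quantile) loss, in which quantileAlpha sets the asymmetry between under- and over-prediction. The sketch below shows that standard loss in plain Python; it is not taken from the H2O implementation, and `pinball_loss` is a hypothetical helper.

```python
def pinball_loss(actual: float, predicted: float, alpha: float) -> float:
    """Standard pinball loss for a single observation and quantile alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    error = actual - predicted
    # Under-prediction (error > 0) is weighted by alpha,
    # over-prediction (error < 0) by (1 - alpha).
    return alpha * error if error >= 0 else (alpha - 1.0) * error


# With alpha = 0.5 both directions cost the same (half the absolute error).
print(pinball_loss(10.0, 8.0, 0.5), pinball_loss(8.0, 10.0, 0.5))  # 1.0 1.0
```

With alpha = 0.9, under-predicting by 2 costs about 1.8 while over-predicting by 2 costs about 0.2, which pushes the fitted value toward the 90th percentile.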
- quietMode
Enable quiet mode for less output to standard output.
Scala default value: false; Python default value: False
Also available on the trained model.
- rate
Learning rate (higher => less stable, lower => slower convergence).
Default value: 0.005
Also available on the trained model.
- rateAnnealing
Learning rate annealing: rate / (1 + rate_annealing * samples).
Scala default value: 1.0e-6; Python default value: 1.0E-6
Also available on the trained model.
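The annealing formula in the rateAnnealing description can be evaluated directly. The following is a minimal sketch in plain Python; the function name `annealed_rate` is hypothetical and the code only restates the formula, it is not the H2O implementation.

```python
def annealed_rate(rate: float, rate_annealing: float, samples: float) -> float:
    """Learning rate after `samples` training samples have been processed."""
    return rate / (1.0 + rate_annealing * samples)


# With the defaults (rate = 0.005, rate_annealing = 1.0e-6) the rate is
# roughly halved once one million samples have been seen.
for samples in (0, 1_000_000, 10_000_000):
    print(samples, annealed_rate(0.005, 1.0e-6, samples))
```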
- rateDecay
Learning rate decay factor between layers (N-th layer: rate * rate_decay ^ (N - 1)).
Default value: 1.0
Also available on the trained model.
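The per-layer formula in the rateDecay description can be illustrated with a short sketch in plain Python. The helper `layer_rates` is hypothetical and simply applies the stated expression to layers numbered from 1.

```python
def layer_rates(rate: float, rate_decay: float, n_layers: int) -> list:
    """Per-layer learning rates: layer N uses rate * rate_decay ** (N - 1)."""
    return [rate * rate_decay ** (n - 1) for n in range(1, n_layers + 1)]


# Three layers with a decay factor of 0.5: each layer learns half as fast
# as the one before it.
print(layer_rates(0.005, 0.5, 3))
```

With the default rateDecay of 1.0, every layer uses the same learning rate.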
- regressionStop
Stopping criterion for regression error (MSE) on training data (-1 to disable).
Scala default value: 1.0e-6; Python default value: 1.0E-6
Also available on the trained model.
- replicateTrainingData
Replicate the entire training dataset onto every node for faster training on small datasets.
Scala default value: true; Python default value: True
Also available on the trained model.
- reproducible
Force reproducibility on small data (will be slow - only uses 1 thread).
Scala default value: false; Python default value: False
Also available on the trained model.
- rho
Adaptive learning rate time decay factor (similarity to prior updates).
Default value: 0.99
Also available on the trained model.
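H2O's Deep Learning documentation names ADADELTA as the adaptive learning rate method, with rho as the decay factor and epsilon as the smoothing term. The sketch below is the textbook ADADELTA update for a single weight, written in plain Python; it is not the H2O implementation, and the function name `adadelta_step` is hypothetical.

```python
import math


def adadelta_step(grad, avg_sq_grad, avg_sq_delta, rho=0.99, epsilon=1.0e-8):
    """One ADADELTA update for a single weight; returns (delta, new averages)."""
    # Exponentially decaying average of squared gradients.
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
    # Step size is the ratio of the RMS of past updates to the RMS of gradients;
    # epsilon keeps the ratio finite and lets the first step make progress.
    delta = -math.sqrt(avg_sq_delta + epsilon) / math.sqrt(avg_sq_grad + epsilon) * grad
    # Exponentially decaying average of squared updates.
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
    return delta, avg_sq_grad, avg_sq_delta


# Minimize f(w) = w ** 2 (gradient 2 * w) for a few steps.
w = 1.0
avg_sq_grad = avg_sq_delta = 0.0
for _ in range(100):
    delta, avg_sq_grad, avg_sq_delta = adadelta_step(2.0 * w, avg_sq_grad, avg_sq_delta)
    w += delta
print(w)
```

A larger rho makes the running averages remember more of the past, while epsilon mainly matters early in training when both averages are close to zero.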
- scoreDutyCycle
Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring).
Default value: 0.1
Also available on the trained model.
- scoreEachIteration
Whether to score during each iteration of model training.
Scala default value: false; Python default value: False
Also available on the trained model.
- scoreInterval
Shortest time interval (in seconds) between model scoring.
Default value: 5.0
Also available on the trained model.
- scoreTrainingSamples
Number of training set samples for scoring (0 for all).
Scala default value: 10000L; Python default value: 10000
Also available on the trained model.
- scoreValidationSamples
Number of validation set samples for scoring (0 for all).
Scala default value: 0L; Python default value: 0
Also available on the trained model.
- scoreValidationSampling
Method used to sample validation dataset for scoring. Possible values are
"Uniform", "Stratified".
Default value: "Uniform"
Also available on the trained model.
- seed
Seed for random numbers (affects sampling) - Note: only reproducible when running single threaded.
Scala default value: -1L; Python default value: -1
Also available on the trained model.
- shuffleTrainingData
Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes).
Scala default value: false; Python default value: False
Also available on the trained model.
- singleNodeMode
Run on a single node for fine-tuning of model parameters.
Scala default value: false; Python default value: False
Also available on the trained model.
- sparse
Sparse data handling (more efficient for data with lots of 0 values).
Scala default value: false; Python default value: False
Also available on the trained model.
- sparsityBeta
Sparsity regularization. #Experimental.
Default value: 0.0
Also available on the trained model.
- splitRatio
Accepts values in the range [0, 1.0], which determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 -> 80% training, 20% validation. This parameter is ignored when validationDataFrame is set.
Default value: 1.0
- standardize
If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data.
Scala default value: true; Python default value: True
Also available on the trained model.
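If standardize is disabled, the input has to be scaled beforehand. One common way to do that is z-score scaling, sketched below in plain Python. The helper `standardize_column` is hypothetical, and the use of the population standard deviation is an assumption of this example; the exact scaling that H2O applies internally is not described here.

```python
import statistics


def standardize_column(values):
    """Scale a numeric column to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0.0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / stdev for v in values]


print(standardize_column([1.0, 2.0, 3.0]))
```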
- stoppingMetric
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are
"AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE", "RMSLE", "AUC", "AUCPR", "lift_top_group", "misclassification", "mean_per_class_error", "anomaly_score", "custom", "custom_increasing".
Default value: "AUTO"
Also available on the trained model.
- stoppingRounds
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable).
Default value: 5
Also available on the trained model.
- stoppingTolerance
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much).
Default value: 0.0
Also available on the trained model.
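The interplay of stoppingRounds and stoppingTolerance can be sketched as a moving-average check over the scoring history. The code below is a simplified illustration in plain Python for a positive metric where lower is better (such as logloss); the function `should_stop` and the exact comparison it makes are assumptions of this example, not the H2O implementation.

```python
def should_stop(history, stopping_rounds, stopping_tolerance=0.0):
    """Decide on early stopping for a metric where lower is better."""
    k = stopping_rounds
    if k == 0 or len(history) < 2 * k:
        # Disabled, or not enough scoring events to compare two windows.
        return False
    # Simple moving averages of length k over the scoring history.
    averages = [sum(history[i:i + k]) / k for i in range(len(history) - k + 1)]
    reference = averages[-k - 1]      # moving average from k scoring events ago
    best_recent = min(averages[-k:])  # best of the k most recent moving averages
    # Stop unless the best recent average beats the reference by more than
    # the relative tolerance.
    return best_recent >= reference * (1.0 - stopping_tolerance)


print(should_stop([1.0, 0.9, 0.8, 0.7, 0.6, 0.5], stopping_rounds=2))  # False
print(should_stop([1.0, 0.5, 0.5, 0.5, 0.5, 0.5], stopping_rounds=2))  # True
```

Raising stoppingTolerance makes the check stricter about what counts as an improvement, so training stops earlier on slowly improving metrics.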
- targetRatioCommToComp
Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning).
Default value: 0.05
Also available on the trained model.
- trainSamplesPerIteration
Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic.
Scala default value: -2L; Python default value: -2
Also available on the trained model.
- tweediePower
Tweedie power for Tweedie regression, must be between 1 and 2.
Default value: 1.5
Also available on the trained model.
- useAllFactorLevels
Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder.
Scala default value: true; Python default value: True
Also available on the trained model.
- validationDataFrame
A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the ‘splitRatio’ parameter. The parameter is not serializable!
Scala default value: null; Python default value: None
- weightCol
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.
Scala default value: null; Python default value: None
Also available on the trained model.
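The equivalence between row weights and row repetition described for weightCol can be checked on a small example. The sketch below uses a weighted mean squared error in plain Python; the choice of MSE as the loss and the helper `weighted_mse` are assumptions made for illustration.

```python
def weighted_mse(actual, predicted, weights):
    """Mean squared error in which each row counts `weight` times."""
    total = sum(w * (a - p) ** 2 for a, p, w in zip(actual, predicted, weights))
    return total / sum(weights)


actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]

# Row 0 is excluded (weight 0) and row 2 counts twice (weight 2) ...
with_weights = weighted_mse(actual, predicted, [0.0, 1.0, 2.0])
# ... which matches an unweighted dataset with row 0 removed and row 2 repeated.
expanded = weighted_mse([2.0, 4.0, 4.0], [2.0, 3.0, 3.0], [1.0, 1.0, 1.0])
print(with_weights == expanded)  # True
```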
- withContributions
Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values of original features.
Scala default value: false; Python default value: False
Also available on the trained model.
- withLeafNodeAssignments
Enables or disables computation of leaf node assignments.
Scala default value: false; Python default value: False
Also available on the trained model.
- withStageResults
Enables or disables computation of stage results.
Scala default value: false; Python default value: False
Also available on the trained model.