Parameters of H2OAutoEncoder
Affected Class
- ai.h2o.sparkling.ml.features.H2OAutoEncoder
Parameters
- Each parameter also has a corresponding getter and setter method (e.g., label -> getLabel(), setLabel(...)).
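The naming rule above can be sketched in a few lines of Python. The helper `accessor_names` is hypothetical and not part of Sparkling Water; it assumes the rule generalizes from the `label` example, that is, the first letter of the parameter name is capitalized and prefixed with `get` or `set`.

```python
from typing import Tuple


def accessor_names(param: str) -> Tuple[str, str]:
    """Return the assumed (getter, setter) names for a camelCase parameter.

    Hypothetical helper: it only mirrors the label -> getLabel/setLabel
    pattern stated in the text and does not query the library.
    """
    if not param:
        raise ValueError("parameter name must not be empty")
    capitalized = param[0].upper() + param[1:]
    return "get" + capitalized, "set" + capitalized


if __name__ == "__main__":
    for name in ("label", "hidden", "withMSECol"):
        print(name, "->", accessor_names(name))
```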
- activation
- Activation function. Possible values are "Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout", "Maxout", "MaxoutWithDropout", "ExpRectifier", "ExpRectifierWithDropout". Default value: "Rectifier". Also available on the trained model.
- adaptiveRate
- Adaptive learning rate. Scala default value: true; Python default value: True. Also available on the trained model.
- mseCol
- MSE column name. This column contains the mean squared error calculated from the original and output values. Default value: "H2OAutoEncoder_28043ca1df97__mse". Also available on the trained model.
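As a rough illustration of what the MSE column holds, the sketch below computes a mean squared error between one row's original values and its reconstructed output values. The function `row_mse` is hypothetical, and the sketch assumes a plain average of squared differences; any scaling or encoding the library applies internally before comparing values is not modeled here.

```python
from typing import Sequence


def row_mse(original: Sequence[float], output: Sequence[float]) -> float:
    """Mean squared error between original and reconstructed values.

    Hypothetical helper for illustration only; it assumes both vectors
    have the same, non-zero length.
    """
    if len(original) != len(output) or len(original) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    squared = [(o - r) ** 2 for o, r in zip(original, output)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Squared differences are 0.0, 0.25 and 1.0, averaged over three values.
    print(row_mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0]))
```

A row that the network reconstructs poorly gets a larger value, which is why this column is commonly inspected when looking for unusual rows.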
- originalCol
- Original column name. This column contains the input values to the neural network of the autoencoder. Default value: "H2OAutoEncoder_28043ca1df97__original". Also available on the trained model.
- withMSECol
- A flag indicating whether a column with the mean squared error will be produced. Scala default value: false; Python default value: False. Also available on the trained model.
- withOriginalCol
- A flag indicating whether a column with the input values to the neural network will be produced. Scala default value: false; Python default value: False. Also available on the trained model.
- ignoredCols
- Names of columns to ignore for training. Scala default value: null; Python default value: None. Also available on the trained model.
- initialBiases
- An array of weight vectors to be used for bias initialization of every network layer. If this parameter is set, the parameter ‘initialWeights’ has to be set as well. Scala default value: null; Python default value: None.
- initialWeights
- An array of weight matrices to be used for initialization of the neural network. If this parameter is set, the parameter ‘initialBiases’ has to be set as well. Scala default value: null; Python default value: None.
- outputCol
- Output column name. Default value: "H2OAutoEncoder_28043ca1df97__output". Also available on the trained model.
- averageActivation
- Average activation for sparse auto-encoder. #Experimental. Default value: 0.0. Also available on the trained model.
- calculateFeatureImportances
- Compute variable importances for input features (Gedeon method); can be slow for large networks. Scala default value: true; Python default value: True. Also available on the trained model.
- categoricalEncoding
- Encoding scheme for categorical features. Possible values are "AUTO", "OneHotInternal", "OneHotExplicit", "Enum", "Binary", "Eigen", "LabelEncoder", "SortByResponse", "EnumLimited". Default value: "AUTO". Also available on the trained model.
- columnsToCategorical
- List of columns to convert to categorical before modelling. Scala default value: Array(); Python default value: [].
- convertInvalidNumbersToNa
- If set to ‘true’, the model converts invalid numbers to NA when making predictions. Scala default value: false; Python default value: False. Also available on the trained model.
- convertUnknownCategoricalLevelsToNa
- If set to ‘true’, the model converts unknown categorical levels to NA when making predictions. Scala default value: false; Python default value: False. Also available on the trained model.
- dataFrameSerializer
- The full name of a serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam. Default value: "ai.h2o.sparkling.utils.JSONDataFrameSerializer". Also available on the trained model.
- diagnostics
- Enable diagnostics for hidden layers. Scala default value: true; Python default value: True. Also available on the trained model.
- elasticAveraging
- Elastic averaging between compute nodes can improve distributed model convergence. #Experimental. Scala default value: false; Python default value: False. Also available on the trained model.
- elasticAveragingMovingRate
- Elastic averaging moving rate (only if elastic averaging is enabled). Default value: 0.9. Also available on the trained model.
- elasticAveragingRegularization
- Elastic averaging regularization strength (only if elastic averaging is enabled). Default value: 0.001. Also available on the trained model.
- epochs
- How many times the dataset should be iterated (streamed); can be fractional. Default value: 10.0. Also available on the trained model.
- epsilon
- Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress). Scala default value: 1.0e-8; Python default value: 1.0E-8. Also available on the trained model.
- exportCheckpointsDir
- Automatically export generated models to this directory. Scala default value: null; Python default value: None. Also available on the trained model.
- exportWeightsAndBiases
- Whether to export Neural Network weights and biases to H2O Frames. Scala default value: false; Python default value: False. Also available on the trained model.
- fastMode
- Enable fast mode (minor approximation in back-propagation). Scala default value: true; Python default value: True. Also available on the trained model.
- forceLoadBalance
- Force extra load balancing to increase training speed for small datasets (to keep all cores busy). Scala default value: true; Python default value: True. Also available on the trained model.
- hidden
- Hidden layer sizes (e.g. [100, 100]). Scala default value: Array(200, 200); Python default value: [200, 200]. Also available on the trained model.
- hiddenDropoutRatios
- Hidden layer dropout ratios (can improve generalization); specify one value per hidden layer, defaults to 0.5. Scala default value: null; Python default value: None. Also available on the trained model.
- ignoreConstCols
- Ignore constant columns. Scala default value: true; Python default value: True. Also available on the trained model.
- initialWeightDistribution
- Initial weight distribution. Possible values are "UniformAdaptive", "Uniform", "Normal". Default value: "UniformAdaptive". Also available on the trained model.
- initialWeightScale
- Uniform: -value…value, Normal: stddev. Default value: 1.0. Also available on the trained model.
- inputCols
- The array of input columns. Scala default value: Array(); Python default value: []. Also available on the trained model.
- inputDropoutRatio
- Input layer dropout ratio (can improve generalization, try 0.1 or 0.2). Default value: 0.0. Also available on the trained model.
- keepBinaryModels
- If set to true, all binary models created during execution of the fit method will be kept in the DKV of the H2O-3 cluster. Scala default value: false; Python default value: False.
- l1
- L1 regularization (can add stability and improve generalization, causes many weights to become 0). Default value: 0.0. Also available on the trained model.
- l2
- L2 regularization (can add stability and improve generalization, causes many weights to be small). Default value: 0.0. Also available on the trained model.
- loss
- Loss function. Possible values are "Automatic", "Quadratic", "CrossEntropy", "ModifiedHuber", "Huber", "Absolute", "Quantile". Default value: "Automatic". Also available on the trained model.
- maxCategoricalFeatures
- Max. number of categorical features, enforced via hashing. #Experimental. Default value: 2147483647. Also available on the trained model.
- maxRuntimeSecs
- Maximum allowed runtime in seconds for model training. Use 0 to disable. Default value: 0.0. Also available on the trained model.
- maxW2
- Constraint for squared sum of incoming weights per unit (e.g. for Rectifier). Scala default value: 3.402823e38f; Python default value: 3.402823E38. Also available on the trained model.
- miniBatchSize
- Mini-batch size (smaller leads to better fit, larger can speed up and generalize better). Default value: 1. Also available on the trained model.
- missingValuesHandling
- Handling of missing values. Possible values are "MeanImputation", "Skip". Default value: "MeanImputation". Also available on the trained model.
- modelId
- Destination id for this model; auto-generated if not specified. Scala default value: null; Python default value: None.
- momentumRamp
- Number of training samples for which momentum increases. Default value: 1000000.0. Also available on the trained model.
- momentumStable
- Final momentum after the ramp is over (try 0.99). Default value: 0.0. Also available on the trained model.
- momentumStart
- Initial momentum at the beginning of training (try 0.5). Default value: 0.0. Also available on the trained model.
- nesterovAcceleratedGradient
- Use Nesterov accelerated gradient (recommended). Scala default value: true; Python default value: True. Also available on the trained model.
- overwriteWithBestModel
- If enabled, override the final model with the best model found during training. Scala default value: true; Python default value: True. Also available on the trained model.
- quietMode
- Enable quiet mode for less output to standard output. Scala default value: false; Python default value: False. Also available on the trained model.
- rate
- Learning rate (higher => less stable, lower => slower convergence). Default value: 0.005. Also available on the trained model.
- rateAnnealing
- Learning rate annealing: rate / (1 + rate_annealing * samples). Scala default value: 1.0e-6; Python default value: 1.0E-6. Also available on the trained model.
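The annealing formula quoted above, rate / (1 + rate_annealing * samples), can be evaluated directly to see how quickly the learning rate shrinks. The function name `annealed_rate` is hypothetical; the sketch assumes `samples` means the number of training samples processed so far and uses the default values listed for `rate` and `rateAnnealing`.

```python
def annealed_rate(rate: float, rate_annealing: float, samples: float) -> float:
    """Evaluate rate / (1 + rate_annealing * samples).

    Hypothetical helper that only restates the formula from the
    parameter description; it does not call the library.
    """
    if samples < 0:
        raise ValueError("samples must be non-negative")
    return rate / (1.0 + rate_annealing * samples)


if __name__ == "__main__":
    # Defaults from this page: rate = 0.005, rateAnnealing = 1.0e-6.
    for samples in (0, 100_000, 1_000_000, 10_000_000):
        print(samples, annealed_rate(0.005, 1.0e-6, samples))
```

With these defaults the rate is halved once 1 / rate_annealing = 1,000,000 samples have been processed, because the denominator then equals 2.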
- rateDecay
- Learning rate decay factor between layers (n-th layer: rate * rate_decay ^ (n - 1)). Default value: 1.0. Also available on the trained model.
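The per-layer formula above, rate * rate_decay ^ (n - 1), gives each layer its own learning rate. The sketch below lists those rates for a small network. The function `layer_rates` is hypothetical, layers are assumed to be numbered from 1, and the decay value of 0.5 is chosen only to make the effect visible (the default of 1.0 leaves every layer at the same rate).

```python
from typing import List


def layer_rates(rate: float, rate_decay: float, n_layers: int) -> List[float]:
    """Return rate * rate_decay ** (n - 1) for layers n = 1..n_layers.

    Hypothetical helper restating the formula from the description.
    """
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    return [rate * rate_decay ** (n - 1) for n in range(1, n_layers + 1)]


if __name__ == "__main__":
    print(layer_rates(0.005, 1.0, 3))  # default decay: identical rates
    print(layer_rates(0.005, 0.5, 3))  # each layer gets half the previous rate
```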
- replicateTrainingData
- Replicate the entire training dataset onto every node for faster training on small datasets. Scala default value: true; Python default value: True. Also available on the trained model.
- reproducible
- Force reproducibility on small data (will be slow - only uses 1 thread). Scala default value: false; Python default value: False. Also available on the trained model.
- rho
- Adaptive learning rate time decay factor (similarity to prior updates). Default value: 0.99. Also available on the trained model.
- scoreDutyCycle
- Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring). Default value: 0.1. Also available on the trained model.
- scoreEachIteration
- Whether to score during each iteration of model training. Scala default value: false; Python default value: False. Also available on the trained model.
- scoreInterval
- Shortest time interval (in seconds) between model scoring. Default value: 5.0. Also available on the trained model.
- scoreTrainingSamples
- Number of training set samples for scoring (0 for all). Scala default value: 10000L; Python default value: 10000. Also available on the trained model.
- scoreValidationSamples
- Number of validation set samples for scoring (0 for all). Scala default value: 0L; Python default value: 0. Also available on the trained model.
- scoreValidationSampling
- Method used to sample validation dataset for scoring. Possible values are "Uniform", "Stratified". Default value: "Uniform". Also available on the trained model.
- seed
- Seed for random numbers (affects sampling). Note: only reproducible when running single-threaded. Scala default value: -1L; Python default value: -1. Also available on the trained model.
- shuffleTrainingData
- Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes). Scala default value: false; Python default value: False. Also available on the trained model.
- singleNodeMode
- Run on a single node for fine-tuning of model parameters. Scala default value: false; Python default value: False. Also available on the trained model.
- sparse
- Sparse data handling (more efficient for data with lots of 0 values). Scala default value: false; Python default value: False. Also available on the trained model.
- sparsityBeta
- Sparsity regularization. #Experimental. Default value: 0.0. Also available on the trained model.
- splitRatio
- Accepts values in the range [0, 1.0], which determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 -> 80% training, 20% validation. This parameter is ignored when validationDataFrame is set. Default value: 1.0.
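To make the splitRatio arithmetic concrete, the sketch below converts a ratio into approximate training and validation row counts. The function `split_counts` is hypothetical; it assumes simple rounding, whereas the split actually performed by the library may be randomized and only approximately match these numbers.

```python
from typing import Tuple


def split_counts(n_rows: int, split_ratio: float) -> Tuple[int, int]:
    """Return approximate (training, validation) row counts.

    Hypothetical helper: split_ratio is the training fraction and the
    remainder goes to validation, as in the 0.8 -> 80% / 20% example.
    """
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be in the range [0, 1.0]")
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    training = int(round(n_rows * split_ratio))
    return training, n_rows - training


if __name__ == "__main__":
    print(split_counts(1000, 0.8))  # the example from the description
    print(split_counts(1000, 1.0))  # default: every row is used for training
```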
- standardize
- If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data. Scala default value: true; Python default value: True. Also available on the trained model.
- stoppingMetric
- Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are "AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE", "RMSLE", "AUC", "AUCPR", "lift_top_group", "misclassification", "mean_per_class_error", "anomaly_score", "custom", "custom_increasing". Default value: "AUTO". Also available on the trained model.
- stoppingRounds
- Early stopping based on convergence of stopping_metric. Stop if the simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable). Default value: 5. Also available on the trained model.
- stoppingTolerance
- Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much). Default value: 0.0. Also available on the trained model.
- targetRatioCommToComp
- Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning). Default value: 0.05. Also available on the trained model.
- trainSamplesPerIteration
- Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic. Scala default value: -2L; Python default value: -2. Also available on the trained model.
- useAllFactorLevels
- Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder. Scala default value: true; Python default value: True. Also available on the trained model.
- validationDataFrame
- A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the ‘splitRatio’ parameter. The parameter is not serializable! Scala default value: null; Python default value: None.
- weightCol
- Column with observation weights. Giving an observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0. Scala default value: null; Python default value: None. Also available on the trained model.
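The equivalence stated for weightCol (weight 0 excludes a row, weight 2 acts like repeating it twice) can be checked on a toy weighted average of per-row losses. The function `weighted_mean_loss` and the loss values are hypothetical and only illustrate the statement; this is not the loss computation used by the library.

```python
from typing import Sequence


def weighted_mean_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-row losses.

    Hypothetical helper: each loss is multiplied by its row weight and
    the sum is divided by the total weight.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have equal length")
    if any(w < 0 for w in weights):
        raise ValueError("negative weights are not allowed")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return sum(l * w for l, w in zip(losses, weights)) / total


if __name__ == "__main__":
    # Rows with weights 1, 2 and 0 ...
    weighted = weighted_mean_loss([1.0, 4.0, 9.0], [1, 2, 0])
    # ... compared with a dataset where row 2 is repeated and row 3 removed.
    expanded = weighted_mean_loss([1.0, 4.0, 4.0], [1, 1, 1])
    print(weighted, expanded)
```

Both calls produce the same value, which mirrors the claim that weights change how much each row matters without changing the size of the data frame.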