DeepLearningV3.DeepLearningParametersV3 (h2o-algos version 3.2.0.1 API)

java.lang.Object
- water.Iced
- - water.api.Schema<P,S>
  - - water.api.ModelParametersSchema<DeepLearningParameters,DeepLearningV3.DeepLearningParametersV3>
    - - hex.schemas.DeepLearningV3.DeepLearningParametersV3

All Implemented Interfaces: java.io.Externalizable, java.io.Serializable, java.lang.Cloneable, water.Freezable

Enclosing class: DeepLearningV3

public static final class DeepLearningV3.DeepLearningParametersV3
extends water.api.ModelParametersSchema<DeepLearningParameters,DeepLearningV3.DeepLearningParametersV3>

See Also: Serialized Form

Nested Class Summary
- Nested classes/interfaces inherited from class water.api.Schema
  water.api.Schema.Meta

Field Summary

Fields
Modifier and Type	Field and Description
`DeepLearningParameters.Activation`	`activation` The activation function (non-linearity) to be used by the neurons in the hidden layers.
`boolean`	`adaptive_rate` The implemented adaptive learning rate algorithm (ADADELTA) automatically combines the benefits of learning rate annealing and momentum training to avoid slow convergence.
`boolean`	`autoencoder`
`double`	`average_activation`
`boolean`	`balance_classes` For imbalanced data, balance training data class counts via over/under-sampling.
`float[]`	`class_sampling_factors` Desired over/under-sampling ratios per class (lexicographic order).
`double`	`classification_stop` The stopping criterion in terms of classification error (1-accuracy) on the training data scoring dataset.
`boolean`	`col_major`
`boolean`	`diagnostics` Gather diagnostics for hidden layers, such as mean and RMS values of learning rate, momentum, weights and biases.
`hex.Distribution.Family`	`distribution`
`double`	`epochs` The number of passes over the training dataset to be carried out.
`double`	`epsilon` The second of two hyper parameters for adaptive learning rate (ADADELTA).
`boolean`	`export_weights_and_biases`
`boolean`	`fast_mode` Enable fast mode (minor approximation in back-propagation), should not affect results significantly.
`static java.lang.String[]`	`fields`
`boolean`	`force_load_balance` Increase training speed on small datasets by splitting it into many chunks to allow utilization of all cores.
`int[]`	`hidden` The number and size of each hidden layer in the model.
`double[]`	`hidden_dropout_ratios` A fraction of the inputs for each hidden layer to be omitted from training in order to improve generalization.
`DeepLearningParameters.InitialWeightDistribution`	`initial_weight_distribution` The distribution from which initial weights are to be drawn.
`double`	`initial_weight_scale` The scale of the distribution function for Uniform or Normal distributions.
`double`	`input_dropout_ratio` A fraction of the features for each training row to be omitted from training in order to improve generalization (dimension sampling).
`double`	`l1` A regularization method that constrains the absolute value of the weights and has the net effect of dropping some weights (setting them to zero) from a model to reduce complexity and avoid overfitting.
`double`	`l2` A regularization method that constrains the sum of the squared weights.
`DeepLearningParameters.Loss`	`loss` The loss (error) function to be minimized by the model.
`float`	`max_after_balance_size` When classes are balanced, limit the resulting dataset size to the specified multiple of the original dataset size.
`int`	`max_categorical_features`
`int`	`max_confusion_matrix_size` For classification models, the maximum size (in terms of classes) of the confusion matrix for it to be printed.
`int`	`max_hit_ratio_k` The maximum number (top K) of predictions to use for hit ratio computation (for multi-class only, 0 to disable)
`float`	`max_w2` A maximum on the sum of the squared incoming weights into any one neuron.
`DeepLearningParameters.MissingValuesHandling`	`missing_values_handling`
`double`	`momentum_ramp` The momentum_ramp parameter controls the amount of learning for which momentum increases (assuming momentum_stable is larger than momentum_start).
`double`	`momentum_stable` The momentum_stable parameter controls the final momentum value reached after momentum_ramp training samples.
`double`	`momentum_start` The momentum_start parameter controls the amount of momentum at the beginning of training.
`boolean`	`nesterov_accelerated_gradient` The Nesterov accelerated gradient descent method is a modification to traditional gradient descent for convex functions.
`boolean`	`overwrite_with_best_model` If enabled, store the best model under the destination key of this model at the end of training.
`boolean`	`quiet_mode` Enable quiet mode for less output to standard output.
`double`	`rate` When adaptive learning rate is disabled, the magnitude of the weight updates is determined by the user-specified learning rate (potentially annealed) and is a function of the difference between the predicted value and the target value.
`double`	`rate_annealing` Learning rate annealing reduces the learning rate to "freeze" into local minima in the optimization landscape.
`double`	`rate_decay` The learning rate decay parameter controls the change of learning rate across layers.
`double`	`regression_stop` The stopping criterion in terms of regression error (MSE) on the training data scoring dataset.
`boolean`	`replicate_training_data` Replicate the entire training dataset onto every node for faster training on small datasets.
`boolean`	`reproducible`
`double`	`rho` The first of two hyper parameters for adaptive learning rate (ADADELTA).
`double`	`score_duty_cycle` Maximum fraction of wall clock time spent on model scoring on training and validation samples, and on diagnostics such as computation of feature importances (i.e., not on training).
`double`	`score_interval` The minimum time (in seconds) to elapse between model scoring.
`long`	`score_training_samples` The number of training dataset points to be used for scoring.
`long`	`score_validation_samples` The number of validation dataset points to be used for scoring.
`DeepLearningParameters.ClassSamplingMethod`	`score_validation_sampling` Method used to sample the validation dataset for scoring, see Score Validation Samples above.
`long`	`seed` The random seed controls sampling and initialization.
`boolean`	`shuffle_training_data` Enable shuffling of training data (on each node).
`boolean`	`single_node_mode` Run on a single node for fine-tuning of model parameters.
`boolean`	`sparse`
`double`	`sparsity_beta`
`double`	`target_ratio_comm_to_comp`
`long`	`train_samples_per_iteration` The number of training data rows to be processed per iteration.
`double`	`tweedie_power`
`boolean`	`use_all_factor_levels`
`boolean`	`variable_importances` Whether to compute variable importances for input features.

Fields inherited from class water.api.ModelParametersSchema
checkpoint, fold_assignment, fold_column, ignore_const_cols, ignored_columns, keep_cross_validation_predictions, model_id, nfolds, offset_column, response_column, score_each_iteration, training_frame, validation_frame, weights_column

Constructor Summary

Constructors
Constructor and Description
`DeepLearningV3.DeepLearningParametersV3()`

Method Summary
- Methods inherited from class water.api.ModelParametersSchema
  append_field_arrays, fields, fillFromImpl, fillImpl, writeParametersJSON
- Methods inherited from class water.api.Schema
  createAndFillImpl, createImpl, fillFromParms, fillFromParms, get__meta, getExperimentalVersion, getHighestSupportedVersion, getImplClass, getImplClass, getLatestVersion, getSchemaVersion, markdown, markdown, markdown, newInstance, newInstance, registerAllSchemasIfNecessary, schema, schema, schemaClass, schemas, setField
- Methods inherited from class water.Iced
  clone, frozenType, read_impl, read, readExternal, readJSON_impl, readJSON, toJsonString, write_impl, write, writeExternal, writeJSON_impl, writeJSON
- Methods inherited from class java.lang.Object
  equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - fields
```
public static java.lang.String[] fields
```
  - balance_classes
```
@API(help="Balance training data class counts via over/under-sampling (for imbalanced data).",
     level=secondary,
     direction=INOUT,
     gridable=true)
public boolean balance_classes
```
    For imbalanced data, balance training data class counts via over/under-sampling. This can result in improved predictive accuracy.
  - class_sampling_factors
```
@API(help="Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.",
     level=expert,
     direction=INOUT,
     gridable=true)
public float[] class_sampling_factors
```
    Desired over/under-sampling ratios per class (lexicographic order). Only when balance_classes is enabled. If not specified, they will be automatically computed to obtain class balance during training.
  - max_after_balance_size
```
@API(help="Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.",
     level=expert,
     direction=INOUT,
     gridable=true)
public float max_after_balance_size
```
    When classes are balanced, limit the resulting dataset size to the specified multiple of the original dataset size.
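    A minimal sketch of how per-class sampling factors and the size cap could interact. The class `BalanceSketch`, the method `samplingFactors`, and the strategy itself (oversample every class to the size of the largest one, then shrink all factors uniformly to respect the cap) are assumptions made for illustration; the descriptions above do not say how H2O computes the factors.

```java
final class BalanceSketch {
    /**
     * Hypothetical helper: returns one sampling factor per class.
     * classCounts[i] is the number of training rows in class i.
     */
    static double[] samplingFactors(long[] classCounts, double maxAfterBalanceSize) {
        long total = 0;
        long largest = 0;
        for (long c : classCounts) {
            if (c <= 0) throw new IllegalArgumentException("every class needs at least one row");
            total += c;
            largest = Math.max(largest, c);
        }
        // Assumed strategy: bring every class up to the size of the largest class.
        double[] factors = new double[classCounts.length];
        double balancedSize = 0.0;
        for (int i = 0; i < classCounts.length; i++) {
            factors[i] = (double) largest / classCounts[i];
            balancedSize += factors[i] * classCounts[i];
        }
        // Enforce the cap: balanced size <= maxAfterBalanceSize * original size.
        double cap = maxAfterBalanceSize * total;
        if (balancedSize > cap) {
            double shrink = cap / balancedSize;
            for (int i = 0; i < factors.length; i++) factors[i] *= shrink;
        }
        return factors;
    }
}
```

    With a cap of 1.0 the factors in this sketch fall below 1 for large classes (under-sampling) and stay above 1 for small ones (over-sampling), so the balanced dataset is no larger than the original.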
  - max_confusion_matrix_size
```
@API(help="Maximum size (# classes) for confusion matrices to be printed in the Logs",
     level=secondary,
     direction=INOUT,
     gridable=true)
public int max_confusion_matrix_size
```
    For classification models, the maximum size (in terms of classes) of the confusion matrix for it to be printed. This option is meant to avoid printing extremely large confusion matrices.
  - max_hit_ratio_k
```
@API(help="Max. number (top K) of predictions to use for hit ratio computation (for multi-class only, 0 to disable)",
     level=secondary,
     direction=INOUT,
     gridable=true)
public int max_hit_ratio_k
```
    The maximum number (top K) of predictions to use for hit ratio computation (for multi-class only, 0 to disable)
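    As an illustration of what a top-K hit ratio measures, the sketch below counts a row as a hit when its actual class is among the K classes with the highest predicted probability. The class and method names are hypothetical, and the tie handling (ties count in favor of the actual class) is an assumption, not a statement about H2O's implementation.

```java
final class HitRatioSketch {
    /**
     * Fraction of rows whose actual label is within the top-k predicted classes.
     * classProbabilities[row][c] is the predicted probability of class c.
     */
    static double hitRatio(double[][] classProbabilities, int[] actualLabels, int k) {
        if (classProbabilities.length != actualLabels.length) {
            throw new IllegalArgumentException("one label per row is required");
        }
        if (actualLabels.length == 0) return 0.0;
        int hits = 0;
        for (int row = 0; row < actualLabels.length; row++) {
            double[] p = classProbabilities[row];
            double actualProb = p[actualLabels[row]];
            // Count the classes ranked strictly above the actual class.
            int better = 0;
            for (double v : p) {
                if (v > actualProb) better++;
            }
            if (better < k) hits++;
        }
        return (double) hits / actualLabels.length;
    }
}
```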
  - overwrite_with_best_model
```
@API(help="If enabled, override the final model with the best model found during training",
     level=expert,
     direction=INOUT)
public boolean overwrite_with_best_model
```
    If enabled, store the best model under the destination key of this model at the end of training. Only applicable if training is not cancelled.
  - autoencoder
```
@API(help="Auto-Encoder",
     level=secondary,
     direction=INOUT)
public boolean autoencoder
```
  - use_all_factor_levels
```
@API(help="Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder.",
     level=secondary,
     direction=INOUT,
     gridable=true)
public boolean use_all_factor_levels
```
  - activation
```
@API(help="Activation function",
     values={"Tanh","TanhWithDropout","Rectifier","RectifierWithDropout","Maxout","MaxoutWithDropout"},
     level=critical,
     direction=INOUT,
     gridable=true)
public DeepLearningParameters.Activation activation
```
    The activation function (non-linearity) to be used by the neurons in the hidden layers. Tanh: Hyperbolic tangent function (same as scaled and shifted sigmoid). Rectifier: Chooses the maximum of (0, x) where x is the input value. Maxout: Choose the maximum coordinate of the input vector. With Dropout: Zero out a random user-given fraction of the incoming weights to each hidden layer during training, for each training row. This effectively trains exponentially many models at once, and can improve generalization.
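    The three base functions named above can be written down directly. This is only a sketch of the mathematical definitions; the class `ActivationSketch` is hypothetical and is not part of the H2O API.

```java
final class ActivationSketch {
    /** Tanh: hyperbolic tangent of the input. */
    static double tanh(double x) {
        return Math.tanh(x);
    }

    /** Rectifier: the maximum of 0 and the input value. */
    static double rectifier(double x) {
        return Math.max(0.0, x);
    }

    /** Maxout: the largest coordinate of the input vector. */
    static double maxout(double[] inputs) {
        if (inputs.length == 0) throw new IllegalArgumentException("empty input vector");
        double best = inputs[0];
        for (double v : inputs) {
            best = Math.max(best, v);
        }
        return best;
    }
}
```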
  - hidden
```
@API(help="Hidden layer sizes (e.g. 100,100).",
     level=critical,
     direction=INOUT,
     gridable=true)
public int[] hidden
```
    The number and size of each hidden layer in the model. For example, if a user specifies "100,200,100" a model with 3 hidden layers will be produced, and the middle hidden layer will have 200 neurons.
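    To make the example concrete, the sketch below turns a string such as "100,200,100" into the `int[]` shape that `hidden` has. The parser `parseHidden` is a hypothetical helper written for this page, not an H2O method.

```java
final class HiddenLayersSketch {
    /** Parses a comma-separated list of layer sizes, e.g. "100,200,100". */
    static int[] parseHidden(String spec) {
        String[] parts = spec.split(",");
        int[] sizes = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            sizes[i] = Integer.parseInt(parts[i].trim());
            if (sizes[i] <= 0) {
                throw new IllegalArgumentException("layer sizes must be positive");
            }
        }
        return sizes;
    }
}
```

    The array length is the number of hidden layers, and each element is the number of neurons in that layer.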
  - epochs
```
@API(help="How many times the dataset should be iterated (streamed), can be fractional",
     level=critical,
     direction=INOUT,
     gridable=true)
public double epochs
```
    The number of passes over the training dataset to be carried out. It is recommended to start with lower values for initial grid searches. This value can be modified during checkpoint restarts and allows continuation of selected models.
  - train_samples_per_iteration
```
@API(help="Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic",
     level=secondary,
     direction=INOUT,
     gridable=true)
public long train_samples_per_iteration
```
    The number of training data rows to be processed per iteration. Note that independent of this parameter, each row is used immediately to update the model with (online) stochastic gradient descent. This parameter controls the synchronization period between nodes in a distributed environment and the frequency at which scoring and model cancellation can happen. For example, if it is set to 10,000 on H2O running on 4 nodes, then each node will process 2,500 rows per iteration, sampling randomly from its local data. Then, model averaging between the nodes takes place, and scoring can happen (dependent on scoring interval and duty cycle). Special values are 0 for one epoch per iteration, -1 for processing the maximum amount of data per iteration (if **replicate training data** is enabled, N epochs will be trained per iteration on N nodes, otherwise one epoch). Special value of -2 turns on automatic mode (auto-tuning).
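    The arithmetic and the special values described above can be sketched as follows. The class and method names are hypothetical, and the auto-tuning mode (-2) is deliberately left unmodeled because its behavior is not described here.

```java
final class IterationSizeSketch {
    /** Resolves the setting to a global number of training rows per iteration. */
    static long globalSamplesPerIteration(long setting, long datasetRows,
                                          int nodes, boolean replicateTrainingData) {
        if (setting == -2) {
            throw new UnsupportedOperationException("auto-tuning is not modeled in this sketch");
        }
        if (setting < -2) throw new IllegalArgumentException("unknown special value");
        if (setting == 0) return datasetRows;              // one epoch per iteration
        if (setting == -1) {
            // Maximum amount of data: N epochs on N nodes if the data is replicated.
            return replicateTrainingData ? datasetRows * nodes : datasetRows;
        }
        return setting;                                    // explicit row count
    }

    /** Rows each node processes before model averaging takes place. */
    static long rowsPerNode(long globalSamples, int nodes) {
        return globalSamples / nodes;
    }
}
```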
  - target_ratio_comm_to_comp
```
@API(help="Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration=-2 (auto-tuning)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double target_ratio_comm_to_comp
```
  - seed
```
@API(help="Seed for random numbers (affects sampling) - Note: only reproducible when running single threaded",
     level=expert,
     direction=INOUT,
     gridable=true)
public long seed
```
    The random seed controls sampling and initialization. Reproducible results are only expected with single-threaded operation (i.e., when running on one node, turning off load balancing and providing a small dataset that fits in one chunk). In general, the multi-threaded asynchronous updates to the model parameters will result in (intentional) race conditions and non-reproducible results. Note that deterministic sampling and initialization might still lead to some weak sense of determinism in the model.
  - adaptive_rate
```
@API(help="Adaptive learning rate",
     level=secondary,
     direction=INOUT,
     gridable=true)
public boolean adaptive_rate
```
    The implemented adaptive learning rate algorithm (ADADELTA) automatically combines the benefits of learning rate annealing and momentum training to avoid slow convergence. Specification of only two parameters (rho and epsilon) simplifies hyper parameter search. In some cases, manually controlled (non-adaptive) learning rate and momentum specifications can lead to better results, but require the specification (and hyper parameter search) of up to 7 parameters. If the model is built on a topology with many local minima or long plateaus, it is possible for a constant learning rate to produce sub-optimal results. Learning rate annealing allows digging deeper into local minima, while rate decay allows specification of different learning rates per layer. When the gradient is being estimated in a long valley in the optimization landscape, a large learning rate can cause the gradient to oscillate and move in the wrong direction. When the gradient is computed on a relatively flat surface with small learning rates, the model can converge far slower than necessary.
  - rho
```
@API(help="Adaptive learning rate time decay factor (similarity to prior updates)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double rho
```
    The first of two hyper parameters for adaptive learning rate (ADADELTA). It is similar to momentum and relates to the memory of prior weight updates. Typical values are between 0.9 and 0.999. This parameter is only active if adaptive learning rate is enabled.
  - epsilon
```
@API(help="Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double epsilon
```
    The second of two hyper parameters for adaptive learning rate (ADADELTA). It is similar to learning rate annealing during initial training and momentum at later stages where it allows forward progress. Typical values are between 1e-10 and 1e-4. This parameter is only active if adaptive learning rate is enabled.
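    To show where rho and epsilon enter, here is a sketch of the ADADELTA update rule as published by Zeiler, applied to a single weight. It is a textbook formulation under that assumption; H2O's implementation may differ in detail, and the class `AdadeltaSketch` is hypothetical.

```java
final class AdadeltaSketch {
    private final double rho;      // decay factor for the running averages
    private final double epsilon;  // smoothing term, avoids division by zero
    private double avgSquaredGradient = 0.0; // running average of g^2
    private double avgSquaredUpdate = 0.0;   // running average of dx^2

    AdadeltaSketch(double rho, double epsilon) {
        this.rho = rho;
        this.epsilon = epsilon;
    }

    /** Returns the update to add to the weight for the given gradient. */
    double step(double gradient) {
        avgSquaredGradient = rho * avgSquaredGradient + (1.0 - rho) * gradient * gradient;
        double update = -Math.sqrt(avgSquaredUpdate + epsilon)
                / Math.sqrt(avgSquaredGradient + epsilon) * gradient;
        avgSquaredUpdate = rho * avgSquaredUpdate + (1.0 - rho) * update * update;
        return update;
    }
}
```

    In this formulation epsilon is what lets the first update be non-zero, since both running averages start at zero.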
  - rate
```
@API(help="Learning rate (higher => less stable, lower => slower convergence)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double rate
```
    When adaptive learning rate is disabled, the magnitude of the weight updates is determined by the user-specified learning rate (potentially annealed) and is a function of the difference between the predicted value and the target value. That difference, generally called delta, is only available at the output layer. To correct the output at each hidden layer, back propagation is used. Momentum modifies back propagation by allowing prior iterations to influence the current update. Using the momentum parameter can aid in avoiding local minima and the associated instability. Too much momentum can lead to instabilities, which is why the momentum is best ramped up slowly. This parameter is only active if adaptive learning rate is disabled.
  - rate_annealing
```
@API(help="Learning rate annealing: rate / (1 + rate_annealing * samples)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double rate_annealing
```
    Learning rate annealing reduces the learning rate to "freeze" into local minima in the optimization landscape. The annealing rate is the inverse of the number of training samples it takes to cut the learning rate in half (e.g., 1e-6 means that it takes 1e6 training samples to halve the learning rate). This parameter is only active if adaptive learning rate is disabled.
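    The formula in the help text, rate / (1 + rate_annealing * samples), can be checked with a few lines of code. The class `RateAnnealingSketch` is a hypothetical helper for illustration only.

```java
final class RateAnnealingSketch {
    /** Effective learning rate after the given number of training samples. */
    static double annealedRate(double rate, double rateAnnealing, double samples) {
        return rate / (1.0 + rateAnnealing * samples);
    }
}
```

    With rate_annealing set to 1e-6, the denominator reaches 2 after 1e6 samples, which is the halving described above.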
  - rate_decay
```
@API(help="Learning rate decay factor between layers (N-th layer: rate*alpha^(N-1))",
     level=expert,
     direction=INOUT,
     gridable=true)
public double rate_decay
```
    The learning rate decay parameter controls the change of learning rate across layers. For example, assume the rate parameter is set to 0.01, and the rate_decay parameter is set to 0.5. Then the learning rate for the weights connecting the input and first hidden layer will be 0.01, the learning rate for the weights connecting the first and the second hidden layer will be 0.005, and the learning rate for the weights connecting the second and third hidden layer will be 0.0025, etc. This parameter is only active if adaptive learning rate is disabled.
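    The worked example above follows from the formula in the help text, where the N-th layer uses rate * rate_decay^(N-1). A small sketch, with a hypothetical class name:

```java
final class RateDecaySketch {
    /**
     * Learning rate for the weights feeding the given layer.
     * layer is 1-based: 1 = weights between the input and the first hidden layer.
     */
    static double layerRate(double rate, double rateDecay, int layer) {
        if (layer < 1) throw new IllegalArgumentException("layer is 1-based");
        return rate * Math.pow(rateDecay, layer - 1);
    }
}
```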
  - momentum_start
```
@API(help="Initial momentum at the beginning of training (try 0.5)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double momentum_start
```
    The momentum_start parameter controls the amount of momentum at the beginning of training. This parameter is only active if adaptive learning rate is disabled.
  - momentum_ramp
```
@API(help="Number of training samples for which momentum increases",
     level=expert,
     direction=INOUT)
public double momentum_ramp
```
    The momentum_ramp parameter controls the amount of learning for which momentum increases (assuming momentum_stable is larger than momentum_start). The ramp is measured in the number of training samples. This parameter is only active if adaptive learning rate is disabled.
  - momentum_stable
```
@API(help="Final momentum after the ramp is over (try 0.99)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double momentum_stable
```
    The momentum_stable parameter controls the final momentum value reached after momentum_ramp training samples. The momentum used for training will remain the same for training beyond reaching that point. This parameter is only active if adaptive learning rate is disabled.
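    The three momentum parameters together define a schedule. The sketch below assumes the increase from momentum_start to momentum_stable is linear in the number of training samples; the descriptions above do not state the shape of the ramp, so treat that as an assumption. The class name is hypothetical.

```java
final class MomentumScheduleSketch {
    /** Momentum in effect after the given number of training samples. */
    static double momentum(double momentumStart, double momentumStable,
                           double momentumRamp, double samples) {
        // After the ramp (or with no ramp at all) the momentum stays constant.
        if (momentumRamp <= 0.0 || samples >= momentumRamp) {
            return momentumStable;
        }
        // Assumed linear interpolation between the start and stable values.
        return momentumStart + (momentumStable - momentumStart) * (samples / momentumRamp);
    }
}
```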
  - nesterov_accelerated_gradient
```
@API(help="Use Nesterov accelerated gradient (recommended)",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean nesterov_accelerated_gradient
```
    The Nesterov accelerated gradient descent method is a modification to traditional gradient descent for convex functions. The method relies on gradient information at various points to build a polynomial approximation that minimizes the residuals in fewer iterations of the descent. This parameter is only active if adaptive learning rate is disabled.
  - input_dropout_ratio
```
@API(help="Input layer dropout ratio (can improve generalization, try 0.1 or 0.2)",
     level=secondary,
     direction=INOUT,
     gridable=true)
public double input_dropout_ratio
```
    A fraction of the features for each training row to be omitted from training in order to improve generalization (dimension sampling).
  - hidden_dropout_ratios
```
@API(help="Hidden layer dropout ratios (can improve generalization), specify one value per hidden layer, defaults to 0.5",
     level=secondary,
     direction=INOUT,
     gridable=true)
public double[] hidden_dropout_ratios
```
    A fraction of the inputs for each hidden layer to be omitted from training in order to improve generalization. Defaults to 0.5 for each hidden layer if omitted.
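    Both dropout parameters describe the same operation applied at different layers: each incoming value is omitted with the given probability for each training row. A minimal sketch, assuming independent per-value sampling and no rescaling of the surviving values (neither detail is specified above); the class `DropoutSketch` is hypothetical.

```java
import java.util.Random;

final class DropoutSketch {
    /** Returns a copy of inputs in which each value is zeroed with probability ratio. */
    static double[] applyDropout(double[] inputs, double ratio, Random rng) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be within [0, 1]");
        }
        double[] out = new double[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            // nextDouble() is uniform in [0, 1), so ratio 0 keeps everything.
            out[i] = rng.nextDouble() < ratio ? 0.0 : inputs[i];
        }
        return out;
    }
}
```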
  - l1
```
@API(help="L1 regularization (can add stability and improve generalization, causes many weights to become 0)",
     level=secondary,
     direction=INOUT,
     gridable=true)
public double l1
```
    A regularization method that constrains the absolute value of the weights and has the net effect of dropping some weights (setting them to zero) from a model to reduce complexity and avoid overfitting.
  - l2
```
@API(help="L2 regularization (can add stability and improve generalization, causes many weights to be small",
     level=secondary,
     direction=INOUT,
     gridable=true)
public double l2
```
    A regularization method that constrains the sum of the squared weights. This method introduces bias into parameter estimates, but frequently produces substantial gains in modeling as estimate variance is reduced.
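    The two penalties can be written as terms added to the loss. The sketch below uses the plain forms l1 * sum(|w|) and l2 * sum(w^2); any additional scaling (for example a factor of 1/2 on the L2 term) is an implementation detail not given here, and the class name is hypothetical.

```java
final class RegularizationSketch {
    /** L1 penalty: l1 times the sum of absolute weights. */
    static double l1Penalty(double[] weights, double l1) {
        double sum = 0.0;
        for (double w : weights) sum += Math.abs(w);
        return l1 * sum;
    }

    /** L2 penalty: l2 times the sum of squared weights. */
    static double l2Penalty(double[] weights, double l2) {
        double sum = 0.0;
        for (double w : weights) sum += w * w;
        return l2 * sum;
    }
}
```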
  - max_w2
```
@API(help="Constraint for squared sum of incoming weights per unit (e.g. for Rectifier)",
     level=expert,
     direction=INOUT,
     gridable=true)
public float max_w2
```
    A maximum on the sum of the squared incoming weights into any one neuron. This tuning parameter is especially useful for unbounded activation functions such as Maxout or Rectifier.
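    One common way to enforce such a constraint is to rescale a neuron's incoming weights whenever their squared sum exceeds the limit. The sketch below shows that approach as an assumption; it is not taken from the H2O source, and the class name is hypothetical.

```java
final class MaxW2Sketch {
    /** Rescales incoming weights in place so that their squared sum is at most maxW2. */
    static void constrain(double[] incomingWeights, double maxW2) {
        double sumSquares = 0.0;
        for (double w : incomingWeights) sumSquares += w * w;
        if (sumSquares > maxW2) {
            // Scaling every weight by s scales the squared sum by s^2.
            double scale = Math.sqrt(maxW2 / sumSquares);
            for (int i = 0; i < incomingWeights.length; i++) {
                incomingWeights[i] *= scale;
            }
        }
    }
}
```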
  - initial_weight_distribution
```
@API(help="Initial Weight Distribution",
     values={"UniformAdaptive","Uniform","Normal"},
     level=expert,
     direction=INOUT,
     gridable=true)
public DeepLearningParameters.InitialWeightDistribution initial_weight_distribution
```
    The distribution from which initial weights are to be drawn. The default option is an optimized initialization that considers the size of the network. The "uniform" option uses a uniform distribution with a mean of 0 and a given interval. The "normal" option draws weights from a normal distribution with a mean of 0 and a given standard deviation.
  - initial_weight_scale
```
@API(help="Uniform: -value...value, Normal: stddev)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double initial_weight_scale
```
    The scale of the distribution function for Uniform or Normal distributions. For Uniform, the values are drawn uniformly from -initial_weight_scale...initial_weight_scale. For Normal, the values are drawn from a Normal distribution with a standard deviation of initial_weight_scale.
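    The two scaled options can be illustrated with `java.util.Random`. This sketch only shows how initial_weight_scale parameterizes each distribution; the class `InitialWeightSketch` is hypothetical, and the UniformAdaptive option is not covered because its formula is not given here.

```java
import java.util.Random;

final class InitialWeightSketch {
    /** Uniform: a value between -scale and scale. */
    static double uniform(double scale, Random rng) {
        return (2.0 * rng.nextDouble() - 1.0) * scale;
    }

    /** Normal: mean 0 and standard deviation equal to scale. */
    static double normal(double scale, Random rng) {
        return rng.nextGaussian() * scale;
    }
}
```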
  - loss
```
@API(help="Loss function",
     values={"Automatic","CrossEntropy","MeanSquare","Huber","Absolute"},
     required=false,
     level=secondary,
     direction=INOUT,
     gridable=true)
public DeepLearningParameters.Loss loss
```
    The loss (error) function to be minimized by the model. CrossEntropy loss is used when the model output consists of independent hypotheses, and the outputs can be interpreted as the probability that each hypothesis is true. Cross entropy is the recommended loss function when the target values are class labels, and especially for imbalanced data. It strongly penalizes error in the prediction of the actual class label. MeanSquare loss is used when the model outputs are continuous real values, but can be used for classification as well (where it emphasizes the error on all output classes, not just for the actual class).
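    The contrast drawn above, that cross entropy looks only at the actual class while mean square looks at every output, shows up clearly for a single classification row. The sketch uses the standard definitions with a one-hot target; the class name and the averaging over classes in `meanSquare` are assumptions for illustration.

```java
final class LossSketch {
    /** Cross entropy for one row: depends only on the probability of the actual class. */
    static double crossEntropy(double[] predicted, int actualClass) {
        return -Math.log(predicted[actualClass]);
    }

    /** Mean square error for one row against a one-hot target: every class contributes. */
    static double meanSquare(double[] predicted, int actualClass) {
        double sum = 0.0;
        for (int c = 0; c < predicted.length; c++) {
            double target = (c == actualClass) ? 1.0 : 0.0;
            double diff = predicted[c] - target;
            sum += diff * diff;
        }
        return sum / predicted.length;
    }
}
```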
  - distribution
```
@API(help="Distribution function",
     values={"AUTO","bernoulli","multinomial","gaussian","poisson","gamma","tweedie"},
     level=secondary,
     gridable=true)
public hex.Distribution.Family distribution
```
  - tweedie_power
```
@API(help="Tweedie Power",
     level=secondary)
public double tweedie_power
```
  - score_interval
```
@API(help="Shortest time interval (in secs) between model scoring",
     level=secondary,
     direction=INOUT,
     gridable=true)
public double score_interval
```
    The minimum time (in seconds) to elapse between model scoring. The actual interval is determined by the number of training samples per iteration and the scoring duty cycle.
  - score_training_samples
```
@API(help="Number of training set samples for scoring (0 for all)",
     level=secondary,
     direction=INOUT,
     gridable=true)
public long score_training_samples
```
    The number of training dataset points to be used for scoring. Will be randomly sampled. Use 0 for selecting the entire training dataset.
  - score_validation_samples
```
@API(help="Number of validation set samples for scoring (0 for all)",
     level=secondary,
     direction=INOUT,
     gridable=true)
public long score_validation_samples
```
    The number of validation dataset points to be used for scoring. Can be randomly sampled or stratified (if "balance classes" is set and "score validation sampling" is set to stratify). Use 0 for selecting the entire validation dataset.
  - score_duty_cycle
```
@API(help="Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring).",
     level=secondary,
     direction=INOUT,
     gridable=true)
public double score_duty_cycle
```
    Maximum fraction of wall clock time spent on model scoring on training and validation samples, and on diagnostics such as computation of feature importances (i.e., not on training).
  - classification_stop
```
@API(help="Stopping criterion for classification error fraction on training data (-1 to disable)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double classification_stop
```
    The stopping criterion in terms of classification error (1-accuracy) on the training data scoring dataset. When the error is at or below this threshold, training stops.
  - regression_stop
```
@API(help="Stopping criterion for regression error (MSE) on training data (-1 to disable)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double regression_stop
```
    The stopping criterion in terms of regression error (MSE) on the training data scoring dataset. When the error is at or below this threshold, training stops.
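    Both stopping parameters follow the same rule: training stops once the error is at or below the threshold, and -1 disables the check. A sketch of that rule, with a hypothetical class name and treating any negative threshold as disabled (an assumption that generalizes the documented -1):

```java
final class StopCriterionSketch {
    /** True if training should stop for the given training error and threshold. */
    static boolean shouldStop(double trainingError, double stopThreshold) {
        // A negative threshold (the documented value is -1) disables the criterion.
        return stopThreshold >= 0.0 && trainingError <= stopThreshold;
    }
}
```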
  - quiet_mode
```
@API(help="Enable quiet mode for less output to standard output",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean quiet_mode
```
    Enable quiet mode for less output to standard output.
  - score_validation_sampling
```
@API(help="Method used to sample validation dataset for scoring",
     values={"Uniform","Stratified"},
     level=expert,
     direction=INOUT,
     gridable=true)
public DeepLearningParameters.ClassSamplingMethod score_validation_sampling
```
    Method used to sample the validation dataset for scoring, see Score Validation Samples above.
  - diagnostics
```
@API(help="Enable diagnostics for hidden layers",
     level=expert,
     direction=INOUT)
public boolean diagnostics
```
    Gather diagnostics for hidden layers, such as mean and RMS values of learning rate, momentum, weights and biases.
  - variable_importances
```
@API(help="Compute variable importances for input features (Gedeon method) - can be slow for large networks",
     direction=INOUT,
     gridable=true)
public boolean variable_importances
```
    Whether to compute variable importances for input features. The implemented method (by Gedeon) considers the weights connecting the input features to the first two hidden layers.
  - fast_mode
```
@API(help="Enable fast mode (minor approximation in back-propagation)",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean fast_mode
```
    Enable fast mode (minor approximation in back-propagation), should not affect results significantly.
  - force_load_balance
```
@API(help="Force extra load balancing to increase training speed for small datasets (to keep all cores busy)",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean force_load_balance
```
    Increase training speed on small datasets by splitting it into many chunks to allow utilization of all cores.
  - replicate_training_data
```
@API(help="Replicate the entire training dataset onto every node for faster training on small datasets",
     level=secondary,
     direction=INOUT,
     gridable=true)
public boolean replicate_training_data
```
    Replicate the entire training dataset onto every node for faster training on small datasets.
  - single_node_mode
```
@API(help="Run on a single node for fine-tuning of model parameters",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean single_node_mode
```
    Run on a single node for fine-tuning of model parameters. Can be useful for checkpoint resumes after training on multiple nodes for fast initial convergence.
  - shuffle_training_data
```
@API(help="Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, of if using balance_classes)",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean shuffle_training_data
```
    Enable shuffling of training data (on each node). This option is recommended if training data is replicated on N nodes, and the number of training samples per iteration is close to N times the dataset size, where all nodes train with (almost) all the data. It is automatically enabled if the number of training samples per iteration is set to -1 (or to N times the dataset size or larger).
  - missing_values_handling
```
@API(help="Handling of missing values. Either Skip or MeanImputation.",
     values={"Skip","MeanImputation"},
     level=expert,
     direction=INOUT,
     gridable=true)
public DeepLearningParameters.MissingValuesHandling missing_values_handling
```
  - sparse
```
@API(help="Sparse data handling (Deprecated).",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean sparse
```
  - col_major
```
@API(help="Use a column major weight matrix for input layer. Can speed up forward propagation, but might slow down backpropagation (Deprecated).",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean col_major
```
  - average_activation
```
@API(help="Average activation for sparse auto-encoder (Experimental)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double average_activation
```
  - sparsity_beta
```
@API(help="Sparsity regularization (Experimental)",
     level=expert,
     direction=INOUT,
     gridable=true)
public double sparsity_beta
```
  - max_categorical_features
```
@API(help="Max. number of categorical features, enforced via hashing (Experimental)",
     level=expert,
     direction=INOUT,
     gridable=true)
public int max_categorical_features
```
  - reproducible
```
@API(help="Force reproducibility on small data (will be slow - only uses 1 thread)",
     level=expert,
     direction=INOUT,
     gridable=true)
public boolean reproducible
```
  - export_weights_and_biases
```
@API(help="Whether to export Neural Network weights and biases to H2O Frames",
     level=expert,
     direction=INOUT)
public boolean export_weights_and_biases
```
- Constructor Detail
  - DeepLearningV3.DeepLearningParametersV3
```
public DeepLearningV3.DeepLearningParametersV3()
```
