Parameters of H2OIsolationForest

Affected Class



  • Each parameter also has a corresponding getter and setter method (e.g. label -> getLabel(), setLabel(...)).
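
The naming convention above can be sketched in a few lines of plain Python. The helper `accessor_names` is hypothetical and is not part of the library; it only shows how a getter and setter name follow from a parameter name, assuming that only the first character is capitalized.

```python
from typing import Tuple


def accessor_names(param: str) -> Tuple[str, str]:
    """Return the (getter, setter) method names for a parameter name.

    Hypothetical helper for illustration only; not a library function.
    """
    # Capitalize only the first character so camelCase names stay intact.
    capitalized = param[0].upper() + param[1:]
    return "get" + capitalized, "set" + capitalized


print(accessor_names("label"))
print(accessor_names("validationDataFrame"))
```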


Calibration frame for Platt Scaling. To enable usage of the data frame, set the parameter calibrateModel to True.

Scala default value: null ; Python default value: None


Run on one node only; this avoids network overhead but uses fewer CPUs. Suitable for small datasets.

Scala default value: false ; Python default value: False

Also available on the trained model.


Encoding scheme for categorical features. Possible values are "AUTO", "OneHotInternal", "OneHotExplicit", "Enum", "Binary", "Eigen", "LabelEncoder", "SortByResponse", "EnumLimited".

Default value: "AUTO"

Also available on the trained model.
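
As a small illustration, the listed values can be checked before they are passed to a setter. This is a minimal sketch: `check_encoding` is a hypothetical helper, and exact, case-sensitive matching is an assumption, since the text does not say how the library compares values.

```python
# Values copied from the parameter description above.
CATEGORICAL_ENCODINGS = {
    "AUTO", "OneHotInternal", "OneHotExplicit", "Enum", "Binary",
    "Eigen", "LabelEncoder", "SortByResponse", "EnumLimited",
}


def check_encoding(value: str) -> str:
    """Return the value unchanged if it is a listed encoding, else raise.

    Hypothetical helper; exact (case-sensitive) matching is an assumption.
    """
    if value not in CATEGORICAL_ENCODINGS:
        allowed = ", ".join(sorted(CATEGORICAL_ENCODINGS))
        raise ValueError(f"Unknown encoding {value!r}; expected one of: {allowed}")
    return value


print(check_encoding("OneHotExplicit"))
```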


Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0).

Default value: 1.0

Also available on the trained model.


Column sample rate per tree (from 0.0 to 1.0).

Default value: 1.0

Also available on the trained model.


List of columns to convert to categorical before modelling.

Scala default value: Array() ; Python default value: []


Contamination ratio: the proportion of anomalies in the input dataset. If undefined (-1), the predict function does not mark observations as anomalies and returns only the anomaly score.

Default value: -1.0

Also available on the trained model.
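
One way to picture the effect of the contamination ratio is the sketch below. It is not the library's implementation: it assumes that higher scores mean "more anomalous", that the flagged count is the rounded product of the ratio and the row count, and that ties are broken by position. `mark_anomalies` is a hypothetical name.

```python
def mark_anomalies(scores, contamination=-1.0):
    """Flag the top `contamination` fraction of scores as anomalies.

    Returns None when contamination is -1 (undefined), mirroring the
    description above: only scores are available, no anomaly flags.
    Assumption: a higher score means a more anomalous observation.
    """
    if contamination == -1.0:
        return None
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be -1 or within [0, 1]")

    # Assumption: the number of flagged rows is the rounded expected count.
    n_flagged = round(contamination * len(scores))
    # Rank row indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flagged])
    return [i in flagged for i in range(len(scores))]


scores = [0.1, 0.9, 0.2, 0.8, 0.3]
print(mark_anomalies(scores))        # contamination undefined
print(mark_anomalies(scores, 0.4))   # flag the top 40% of rows
```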


If set to true, the model converts invalid numbers to NA when making predictions.

Scala default value: false ; Python default value: False

Also available on the trained model.


If set to true, the model converts unknown categorical levels to NA when making predictions.

Scala default value: false ; Python default value: False

Also available on the trained model.


Column containing additional prediction details; its content depends on the model type.

Default value: "detailed_prediction"

Also available on the trained model.


Automatically export generated models to this directory.

Scala default value: null ; Python default value: None

Also available on the trained model.


Names of feature columns.

Scala default value: Array() ; Python default value: []

Also available on the trained model.


Ignore constant columns.

Scala default value: true ; Python default value: True

Also available on the trained model.


Names of columns to ignore for training.

Scala default value: null ; Python default value: None

Also available on the trained model.


Maximum tree depth (0 for unlimited).

Default value: 8

Also available on the trained model.


Maximum allowed runtime in seconds for model training. Use 0 to disable.

Default value: 0.0

Also available on the trained model.


Fewest allowed (weighted) observations in a leaf.

Default value: 1.0

Also available on the trained model.


Destination id for this model; auto-generated if not specified.

Scala default value: null ; Python default value: None


Number of variables randomly sampled as candidates at each split. If set to -1, defaults to (number of predictors)/3.

Default value: -1

Also available on the trained model.


MOJO output is stored in properly named columns rather than in an array.

Scala default value: true ; Python default value: True

Also available on the trained model.


Number of trees.

Default value: 50

Also available on the trained model.


Prediction column name

Default value: "prediction"

Also available on the trained model.


Rate of randomly sampled observations used to train each Isolation Forest tree. Must be in the range 0.0 to 1.0. If set to -1, sample_rate is disabled and sample_size is used instead.

Default value: -1.0

Also available on the trained model.


Number of randomly sampled observations used to train each Isolation Forest tree. Only one of the parameters sample_size and sample_rate should be defined. If sample_rate is defined, sample_size is ignored.

Scala default value: 256L ; Python default value: 256

Also available on the trained model.
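
The interaction between sample_rate and sample_size can be sketched as follows. This is an illustration of the precedence described above, not the library's code: the function name `rows_per_tree`, the truncation to an integer, and the cap at the dataset size are all assumptions.

```python
def rows_per_tree(n_rows, sample_rate=-1.0, sample_size=256):
    """Estimate how many observations each tree is trained on.

    sample_rate takes precedence when defined; -1 disables it so that
    sample_size applies. Truncation and the cap at n_rows are assumptions.
    """
    if sample_rate == -1.0:
        # sample_rate disabled: fall back to the absolute sample_size.
        return min(sample_size, n_rows)
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be -1 or within [0.0, 1.0]")
    # sample_rate defined: sample_size is ignored.
    return int(sample_rate * n_rows)


print(rows_per_tree(10_000))                   # defaults: sample_size applies
print(rows_per_tree(10_000, sample_rate=0.5))  # sample_rate applies
```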


Whether to score during each iteration of model training.

Scala default value: false ; Python default value: False

Also available on the trained model.


Score the model after every so many trees. Disabled if set to 0.

Default value: 0

Also available on the trained model.
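
A rough sketch of what "after every so many trees" means is shown below. The function `scoring_points` and its argument names are hypothetical, and the sketch assumes scoring happens exactly at multiples of the interval; whether the finished model is scored in addition is not covered here.

```python
def scoring_points(n_trees, score_tree_interval):
    """List the tree counts at which interval scoring would take place.

    Assumption: scoring happens at exact multiples of the interval.
    An interval of 0 disables interval scoring.
    """
    if score_tree_interval == 0:
        return []
    return list(range(score_tree_interval, n_trees + 1, score_tree_interval))


print(scoring_points(50, 10))
print(scoring_points(50, 0))
```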


Seed for pseudo random number generator (if applicable).

Scala default value: -1L ; Python default value: -1

Also available on the trained model.


Accepts values in the range [0, 1.0] that determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 means 80% training and 20% validation. This parameter is ignored when validationDataFrame is set.

Default value: 1.0
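
The arithmetic behind the split ratio can be sketched as below. `expected_split` is a hypothetical helper that returns expected row counts only; if the actual split is done by random assignment of rows, the real counts may differ slightly.

```python
def expected_split(n_rows, split_ratio=1.0):
    """Return the expected (training, validation) row counts.

    Illustration only: a randomized split would produce counts that
    are close to, but not necessarily equal to, these numbers.
    """
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be within [0, 1.0]")
    n_train = round(n_rows * split_ratio)
    return n_train, n_rows - n_train


print(expected_split(1000, 0.8))  # 80% training, 20% validation
print(expected_split(1000))       # default 1.0: everything is used for training
```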


Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are "AUTO", "deviance", "logloss", "MSE", "RMSE", "MAE", "RMSLE", "AUC", "AUCPR", "lift_top_group", "misclassification", "mean_per_class_error", "anomaly_score", "custom", "custom_increasing".

Default value: "AUTO"

Also available on the trained model.


Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable).

Default value: 0

Also available on the trained model.


Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much).

Default value: 0.01

Also available on the trained model.
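
The stopping rule described by the stopping rounds and the relative tolerance can be approximated with the sketch below. It is a simplified reading of the text, not the library's exact algorithm: `should_stop` is a hypothetical name, the comparison of the oldest moving average against the best of the k most recent ones is an assumption, and so is the `lower_is_better` flag.

```python
def should_stop(history, stopping_rounds, stopping_tolerance, lower_is_better=True):
    """Decide whether to stop, given one metric value per scoring event.

    Simplified sketch: compares moving averages of length k over the
    last 2k scoring events, where k = stopping_rounds.
    """
    k = stopping_rounds
    if k == 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet

    # Build k + 1 moving averages of length k over the last 2k events.
    recent = history[-2 * k:]
    averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]

    reference = averages[0]   # oldest moving average in the window
    later = averages[1:]      # the k most recent moving averages
    best = min(later) if lower_is_better else max(later)

    gain = (reference - best) if lower_is_better else (best - reference)
    scale = abs(reference) or 1.0  # avoid dividing by zero
    # Stop when the relative improvement is below the tolerance.
    return gain / scale < stopping_tolerance


print(should_stop([1.0, 0.5, 0.25, 0.125], 2, 0.01))  # still improving
print(should_stop([0.5, 0.5, 0.5, 0.5], 2, 0.01))     # flat metric
```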


A data frame dedicated to validation of the trained model. If this parameter is not set, a validation frame is created via the splitRatio parameter.

Scala default value: null ; Python default value: None


(experimental) Name of the label column in the validation data frame. The label column should be a string column with two distinct values indicating the anomaly. The negative value must be alphabetically smaller than the positive value (e.g. '0'/'1', 'False'/'True').

Default value: "label"
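
The ordering rule for the two label values can be checked with a short sketch. `negative_and_positive` is a hypothetical helper, and it assumes that Python's default string ordering (by code point) matches what the text calls alphabetical order.

```python
def negative_and_positive(labels):
    """Return (negative, positive) for a label column with two distinct values.

    Assumption: "alphabetically smaller" corresponds to Python's default
    string ordering.
    """
    distinct = sorted(set(labels))
    if len(distinct) != 2:
        raise ValueError(f"expected exactly two distinct labels, got {len(distinct)}")
    # The smaller value is treated as negative, the larger as positive.
    return distinct[0], distinct[1]


print(negative_and_positive(["True", "False", "False"]))
print(negative_and_positive(["1", "0", "0", "1"]))
```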


Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values.

Scala default value: false ; Python default value: False

Also available on the trained model.


Enables or disables computation of leaf node assignments.

Scala default value: false ; Python default value: False

Also available on the trained model.


Enables or disables computation of stage results.

Scala default value: false ; Python default value: False

Also available on the trained model.