.. _parameters_H2OIsolationForest:

Parameters of H2OIsolationForest
--------------------------------

Affected Class
##############

- ``ai.h2o.sparkling.ml.algos.H2OIsolationForest``

Parameters
##########

- *Each parameter has also a corresponding getter and setter method.*
  *(E.g.:* ``label`` *->* ``getLabel()`` *,* ``setLabel(...)`` *)*

calibrationDataFrame
  Calibration frame for Platt Scaling. To enable usage of the data frame, set the parameter calibrateModel to True.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  

buildTreeOneNode
  Run on one node only; no network overhead but fewer cpus used. Suitable for small datasets.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

categoricalEncoding
  Encoding scheme for categorical features. Possible values are ``"AUTO"``, ``"OneHotInternal"``, ``"OneHotExplicit"``, ``"Enum"``, ``"Binary"``, ``"Eigen"``, ``"LabelEncoder"``, ``"SortByResponse"``, ``"EnumLimited"``.

  *Default value:* ``"AUTO"``
  
  *Also available on the trained model.*

colSampleRateChangePerLevel
  Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0).

  *Default value:* ``1.0``
  
  *Also available on the trained model.*

colSampleRatePerTree
  Column sample rate per tree (from 0.0 to 1.0).

  *Default value:* ``1.0``
  
  *Also available on the trained model.*

columnsToCategorical
  List of columns to convert to categorical before modelling

  *Scala default value:* ``Array()`` *; Python default value:* ``[]``
  

contamination
  Contamination ratio - the proportion of anomalies in the input dataset. If undefined (-1) the predict function will not mark observations as anomalies and only anomaly score will be returned. Defaults to -1 (undefined).

  *Default value:* ``-1.0``
  
  *Also available on the trained model.*

convertInvalidNumbersToNa
  If set to 'true', the model converts invalid numbers to NA during making predictions.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

convertUnknownCategoricalLevelsToNa
  If set to 'true', the model converts unknown categorical levels to NA during making predictions.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

dataFrameSerializer
  A full name of a serializer used for serialization and deserialization of Spark DataFrames to a JSON value within NullableDataFrameParam.

  *Default value:* ``"ai.h2o.sparkling.utils.JSONDataFrameSerializer"``
  
  *Also available on the trained model.*

detailedPredictionCol
  Column containing additional prediction details, its content depends on the model type.

  *Default value:* ``"detailed_prediction"``
  
  *Also available on the trained model.*

exportCheckpointsDir
  Automatically export generated models to this directory.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  
  *Also available on the trained model.*

featuresCols
  Name of feature columns

  *Scala default value:* ``Array()`` *; Python default value:* ``[]``
  
  *Also available on the trained model.*

ignoreConstCols
  Ignore constant columns.

  *Scala default value:* ``true`` *; Python default value:* ``True``
  
  *Also available on the trained model.*

ignoredCols
  Names of columns to ignore for training.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  
  *Also available on the trained model.*

keepBinaryModels
  If set to true, all binary models created during execution of the ``fit`` method will be kept in DKV of H2O-3 cluster.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  

maxDepth
  Maximum tree depth (0 for unlimited).

  *Default value:* ``8``
  
  *Also available on the trained model.*

maxRuntimeSecs
  Maximum allowed runtime in seconds for model training. Use 0 to disable.

  *Default value:* ``0.0``
  
  *Also available on the trained model.*

minRows
  Fewest allowed (weighted) observations in a leaf.

  *Default value:* ``1.0``
  
  *Also available on the trained model.*

modelId
  Destination id for this model; auto-generated if not specified.

  *Scala default value:* ``null`` *; Python default value:* ``None``
  

mtries
  Number of variables randomly sampled as candidates at each split. If set to -1, defaults (number of predictors)/3.

  *Default value:* ``-1``
  
  *Also available on the trained model.*

namedMojoOutputColumns
  Mojo Output is not stored in the array but in the properly named columns

  *Scala default value:* ``true`` *; Python default value:* ``True``
  
  *Also available on the trained model.*

ntrees
  Number of trees.

  *Default value:* ``50``
  
  *Also available on the trained model.*

predictionCol
  Prediction column name

  *Default value:* ``"prediction"``
  
  *Also available on the trained model.*

sampleRate
  Rate of randomly sampled observations used to train each Isolation Forest tree. Needs to be in range from 0.0 to 1.0. If set to -1, sample_rate is disabled and sample_size will be used instead.

  *Default value:* ``-1.0``
  
  *Also available on the trained model.*

sampleSize
  Number of randomly sampled observations used to train each Isolation Forest tree. Only one of parameters sample_size and sample_rate should be defined. If sample_rate is defined, sample_size will be ignored.

  *Scala default value:* ``256L`` *; Python default value:* ``256``
  
  *Also available on the trained model.*

scoreEachIteration
  Whether to score during each iteration of model training.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

scoreTreeInterval
  Score the model after every so many trees. Disabled if set to 0.

  *Default value:* ``0``
  
  *Also available on the trained model.*

seed
  Seed for pseudo random number generator (if applicable).

  *Scala default value:* ``-1L`` *; Python default value:* ``-1``
  
  *Also available on the trained model.*

splitRatio
  Accepts values in range [0, 1.0] which determine how large part of dataset is used for training and for validation. For example, 0.8 -> 80% training 20% validation. This parameter is ignored when validationDataFrame is set.

  *Default value:* ``1.0``
  

stoppingMetric
  Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anonomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are ``"AUTO"``, ``"deviance"``, ``"logloss"``, ``"MSE"``, ``"RMSE"``, ``"MAE"``, ``"RMSLE"``, ``"AUC"``, ``"AUCPR"``, ``"lift_top_group"``, ``"misclassification"``, ``"mean_per_class_error"``, ``"anomaly_score"``, ``"custom"``, ``"custom_increasing"``.

  *Default value:* ``"AUTO"``
  
  *Also available on the trained model.*

stoppingRounds
  Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable).

  *Default value:* ``0``
  
  *Also available on the trained model.*

stoppingTolerance
  Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much).

  *Default value:* ``0.01``
  
  *Also available on the trained model.*

validationDataFrame
  A data frame dedicated for a validation of the trained model. If the parameters is not set,a validation frame created via the 'splitRatio' parameter. The parameter is not serializable!

  *Scala default value:* ``null`` *; Python default value:* ``None``
  

validationLabelCol
  (experimental) Name of the label column in the validation data frame. The label column should be a string column with two distinct values indicating the anomaly. The negative value must be alphabetically smaller than the positive value. (E.g. '0'/'1', 'False'/'True'

  *Default value:* ``"label"``
  

withContributions
  Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

withLeafNodeAssignments
  Enables or disables computation of leaf node assignments.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*

withStageResults
  Enables or disables computation of stage results.

  *Scala default value:* ``false`` *; Python default value:* ``False``
  
  *Also available on the trained model.*