Parameters of H2ORuleFit
Affected Classes
ai.h2o.sparkling.ml.algos.H2ORuleFit
ai.h2o.sparkling.ml.algos.classification.H2ORuleFitClassifier
ai.h2o.sparkling.ml.algos.regression.H2ORuleFitRegressor
Parameters
Each parameter also has a corresponding getter and setter method (e.g., label -> getLabel(), setLabel(...)).
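The naming convention above can be sketched in a few lines of Python. The helper `accessor_names` is hypothetical and is not part of Sparkling Water; it only illustrates how a parameter name maps to its getter and setter names.

```python
from typing import Tuple


def accessor_names(param_name: str) -> Tuple[str, str]:
    """Return the (getter, setter) method names for a parameter name.

    Hypothetical helper for illustration only; not a Sparkling Water API.
    """
    if not param_name:
        raise ValueError("parameter name must not be empty")
    # Upper-case only the first character; the rest of the camelCase name is kept.
    capitalized = param_name[0].upper() + param_name[1:]
    return f"get{capitalized}", f"set{capitalized}"


# Print the accessor names for a few of the parameters documented below.
for name in ("labelCol", "maxRuleLength", "seed"):
    getter, setter = accessor_names(name)
    print(f"{name} -> {getter}(), {setter}(...)")
```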
- ignoredCols
Names of columns to ignore for training.
Scala default value: null; Python default value: None
Also available on the trained model.
- algorithm
The algorithm to use to generate rules. Possible values are "DRF", "GBM", "AUTO".
Default value: "AUTO"
Also available on the trained model.
- aucType
Sets the default multinomial AUC type. Possible values are "AUTO", "NONE", "MACRO_OVR", "WEIGHTED_OVR", "MACRO_OVO", "WEIGHTED_OVO".
Default value: "AUTO"
Also available on the trained model.
- columnsToCategorical
List of columns to convert to categorical before modelling.
Scala default value: Array(); Python default value: []
- convertInvalidNumbersToNa
If set to true, the model converts invalid numbers to NA when making predictions.
Scala default value: false; Python default value: False
Also available on the trained model.
- convertUnknownCategoricalLevelsToNa
If set to true, the model converts unknown categorical levels to NA when making predictions.
Scala default value: false; Python default value: False
Also available on the trained model.
- dataFrameSerializer
The full class name of the serializer used to serialize and deserialize Spark DataFrames to and from a JSON value within NullableDataFrameParam.
Default value: "ai.h2o.sparkling.utils.JSONDataFrameSerializer"
Also available on the trained model.
- detailedPredictionCol
Column containing additional prediction details; its content depends on the model type.
Default value: "detailed_prediction"
Also available on the trained model.
- distribution
Distribution function. Possible values are "AUTO", "bernoulli", "quasibinomial", "modified_huber", "multinomial", "ordinal", "gaussian", "poisson", "gamma", "tweedie", "huber", "laplace", "quantile", "fractionalbinomial", "negativebinomial", "custom".
Default value: "AUTO"
Also available on the trained model.
- featuresCols
Names of the feature columns.
Scala default value: Array(); Python default value: []
Also available on the trained model.
- keepBinaryModels
If set to true, all binary models created during execution of the fit method are kept in the DKV of the H2O-3 cluster.
Scala default value: false; Python default value: False
- labelCol
Response variable column.
Default value: "label"
Also available on the trained model.
- lambdaValue
Lambda for LASSO regressor.
Scala default value: null; Python default value: None
Also available on the trained model.
- maxNumRules
The maximum number of rules to return. Defaults to -1, which means the number of rules is selected by diminishing returns in model deviance.
Default value: -1
Also available on the trained model.
- maxRuleLength
Maximum length of rules. Defaults to 3.
Default value: 3
Also available on the trained model.
- minRuleLength
Minimum length of rules. Defaults to 3.
Default value: 3
Also available on the trained model.
- modelId
Destination id for this model; auto-generated if not specified.
Scala default value: null; Python default value: None
- modelType
Specifies the type of base learners in the ensemble. Possible values are "RULES", "RULES_AND_LINEAR", "LINEAR".
Default value: "RULES_AND_LINEAR"
Also available on the trained model.
- namedMojoOutputColumns
If set to true, the MOJO output is stored in properly named columns rather than in an array.
Scala default value: true; Python default value: True
Also available on the trained model.
- predictionCol
Prediction column name.
Default value: "prediction"
Also available on the trained model.
- removeDuplicates
Whether to remove rules which are identical to an earlier rule. Defaults to true.
Scala default value: true; Python default value: True
Also available on the trained model.
- ruleGenerationNtrees
Specifies the number of trees to build in the tree model. Defaults to 50.
Default value: 50
Also available on the trained model.
- seed
Seed for pseudo random number generator (if applicable).
Scala default value: -1L; Python default value: -1
Also available on the trained model.
- splitRatio
Accepts a value in the range [0, 1.0] that determines what fraction of the dataset is used for training; the remainder is used for validation. For example, 0.8 means 80% training and 20% validation. This parameter is ignored when validationDataFrame is set.
Default value: 1.0
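As a rough illustration of the splitRatio arithmetic, the following sketch computes the expected number of training and validation rows for a given ratio. The function `expected_split_sizes` is hypothetical and only does the arithmetic; it is an assumption that the actual split in Sparkling Water may be randomized, in which case real row counts would only approximate these numbers.

```python
from typing import Tuple


def expected_split_sizes(n_rows: int, split_ratio: float) -> Tuple[int, int]:
    """Return the expected (training, validation) row counts.

    Hypothetical helper for illustration; not a Sparkling Water API.
    """
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be in the range [0, 1.0]")
    n_train = round(n_rows * split_ratio)
    # Whatever is not used for training is left for validation.
    return n_train, n_rows - n_train


train_rows, valid_rows = expected_split_sizes(1000, 0.8)
print(f"training: {train_rows}, validation: {valid_rows}")
```

With the default value of 1.0, the same arithmetic leaves no rows for validation.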
- validationDataFrame
A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the splitRatio parameter. The parameter is not serializable!
Scala default value: null; Python default value: None
- weightCol
Column with observation weights. Giving an observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. A weight is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more because of the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero, which is incorrect. To get an accurate prediction, remove all rows with weight == 0.
Scala default value: null; Python default value: None
Also available on the trained model.
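The equivalences described for weightCol can be checked numerically with a small sketch. It uses a weighted sum of squared errors as a generic stand-in for a training loss; this is an assumption made for illustration and is not necessarily the loss that RuleFit optimizes. All data values are made up.

```python
def weighted_squared_error(targets, predictions, weights):
    """Sum of squared errors where each row is scaled by its weight (the pre-factor)."""
    return sum(w * (y - p) ** 2 for y, p, w in zip(targets, predictions, weights))


# Made-up example data.
targets = [1.0, 2.0, 4.0]
predictions = [1.5, 2.0, 3.0]

# A weight of 2 on the last row versus physically repeating that row.
weighted = weighted_squared_error(targets, predictions, [1.0, 1.0, 2.0])
repeated = weighted_squared_error(targets + [4.0], predictions + [3.0], [1.0] * 4)

# A weight of 0 on the first row versus dropping that row.
zeroed = weighted_squared_error(targets, predictions, [0.0, 1.0, 1.0])
dropped = weighted_squared_error(targets[1:], predictions[1:], [1.0, 1.0])

print(weighted == repeated, zeroed == dropped)
```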
- withContributions
Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values.
Scala default value: false; Python default value: False
Also available on the trained model.
- withLeafNodeAssignments
Enables or disables computation of leaf node assignments.
Scala default value: false; Python default value: False
Also available on the trained model.
- withStageResults
Enables or disables computation of stage results.
Scala default value: false; Python default value: False
Also available on the trained model.
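The parameters above might be combined as in the following sketch. The chosen values are hypothetical, and the import path `pysparkling.ml` and the keyword-argument constructor are assumptions about the PySparkling package rather than something stated on this page. The import is deferred into the function so that the snippet loads even where PySparkling is not installed.

```python
# Hypothetical settings; every key is a parameter name documented above.
RULEFIT_PARAMS = {
    "labelCol": "label",
    "algorithm": "GBM",
    "modelType": "RULES_AND_LINEAR",
    "minRuleLength": 2,
    "maxRuleLength": 4,
    "maxNumRules": 50,
    "ruleGenerationNtrees": 50,
    "splitRatio": 0.8,
    "seed": 42,
}


def build_rulefit():
    """Create an H2ORuleFit estimator configured with RULEFIT_PARAMS.

    Assumes PySparkling is installed and that the estimator accepts its
    parameters as keyword arguments.
    """
    from pysparkling.ml import H2ORuleFit  # assumed import path

    return H2ORuleFit(**RULEFIT_PARAMS)
```

Assuming a Spark session with a started H2OContext, `build_rulefit().fit(training_df)` would then train a model, where `training_df` is a hypothetical Spark DataFrame containing the label column.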