Modeling In H2O
H2OEstimator
class h2o.estimators.estimator_base.H2OEstimator
Bases: h2o.model.model_base.ModelBase
H2O Estimators
H2O Estimators implement the following methods for model construction:
- start - Top-level user-facing API for asynchronous model builds.
- join - Top-level user-facing API for blocking on an asynchronous model build.
- train - Top-level user-facing API for model building.
- fit - Used by scikit-learn.
Because H2OEstimator instances are instances of ModelBase, these objects can use the H2O model API.
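The difference between the blocking and asynchronous entry points can be sketched as follows. The helper names `build_blocking` and `build_async` and the `while_waiting` callback are hypothetical, introduced only for illustration; the sketch assumes `estimator` is any H2OEstimator subclass instance and `frame` is an H2OFrame on a running cluster.

```python
def build_blocking(estimator, x, y, frame):
    """Call train(), which returns only once the model has been built."""
    estimator.train(x=x, y=y, training_frame=frame)
    return estimator


def build_async(estimator, x, y, frame, while_waiting=None):
    """Call start(), optionally do other client-side work, then block on join()."""
    estimator.start(x=x, y=y, training_frame=frame)
    if while_waiting is not None:
        # Hypothetical hook: runs on the client while the build is in progress.
        while_waiting()
    estimator.join()
    return estimator
```

Both helpers return the estimator itself, so the H2O model API is available on the result in either case.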
fit(X, y=None, **params)
Fit an H2O model as part of a scikit-learn pipeline or grid search.
A warning will be issued if a caller other than sklearn attempts to use this method.
Parameters: X : H2OFrame
An H2OFrame consisting of the predictor variables.
- y : H2OFrame, optional
An H2OFrame consisting of the response variable.
- params : optional
Extra arguments.
Returns: The current instance of H2OEstimator for method chaining.
get_params(deep=True)
Return the parameters of this estimator. Used primarily by sklearn Pipelines and sklearn grid search.
Parameters: deep : bool, optional
If True, return parameters of all sub-objects that are estimators.
Returns: A dict of parameters
set_params(**parms)
Used by sklearn for updating parameters during grid search.
Parameters: parms : dict
A dictionary of parameters that will be set on this model.
Returns: self, the current estimator object with the given parameters set.
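The get_params / set_params contract that scikit-learn relies on can be illustrated without a cluster. `ToyEstimator` below is a hypothetical stand-in, not H2O code; it only mirrors the documented behavior (get_params returns a dict, set_params returns self) so the round trip is easy to see.

```python
class ToyEstimator:
    """Hypothetical stand-in, not H2O code: keeps its parameters in a dict."""

    def __init__(self, **parms):
        self._parms = dict(parms)

    def get_params(self, deep=True):
        # `deep` is accepted for sklearn compatibility; this toy class has
        # no nested estimators, so it is ignored. A copy is returned so that
        # callers cannot mutate the estimator by accident.
        return dict(self._parms)

    def set_params(self, **parms):
        self._parms.update(parms)
        return self  # returning self allows method chaining


base = ToyEstimator(ntrees=50, max_depth=5)

# Roughly what a grid search does for each candidate: copy, then override.
candidates = [
    ToyEstimator(**base.get_params()).set_params(max_depth=depth)
    for depth in (3, 5, 7)
]
depths = [c.get_params()["max_depth"] for c in candidates]
print(depths)  # [3, 5, 7]
```

Because each candidate is rebuilt from a copy of the parameters, `base` keeps its original settings.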
start(x, y=None, training_frame=None, offset_column=None, fold_column=None, weights_column=None, validation_frame=None, **params)
Start an asynchronous model build by specifying the predictor columns, response column, and any additional frame-specific values.
To block for results, call join.
Parameters: x : list
A list of column names or indices indicating the predictor columns.
- y : str
An index or a column name indicating the response column.
- training_frame : H2OFrame
The H2OFrame having the columns indicated by x and y (as well as any additional columns specified by fold, offset, and weights).
- offset_column : str, optional
The name or index of the column in training_frame that holds the offsets.
- fold_column : str, optional
The name or index of the column in training_frame that holds the per-row fold assignments.
- weights_column : str, optional
The name or index of the column in training_frame that holds the per-row weights.
- validation_frame : H2OFrame, optional
H2OFrame with validation data to be scored on while training.
train(x, y=None, training_frame=None, offset_column=None, fold_column=None, weights_column=None, validation_frame=None, max_runtime_secs=None, **params)
Train the H2O model by specifying the predictor columns, response column, and any additional frame-specific values.
Parameters: x : list
A list of column names or indices indicating the predictor columns.
- y : str
An index or a column name indicating the response column.
- training_frame : H2OFrame
The H2OFrame having the columns indicated by x and y (as well as any additional columns specified by fold, offset, and weights).
- offset_column : str, optional
The name or index of the column in training_frame that holds the offsets.
- fold_column : str, optional
The name or index of the column in training_frame that holds the per-row fold assignments.
- weights_column : str, optional
The name or index of the column in training_frame that holds the per-row weights.
- validation_frame : H2OFrame, optional
H2OFrame with validation data to be scored on while training.
- max_runtime_secs : float
Maximum allowed runtime in seconds for model training. Use 0 to disable.
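As a sketch of how the column arguments fit together, the helpers below derive the predictor list by excluding the response and any weights, fold, or offset column, then pass everything to train. The helper names are hypothetical, and the sketch assumes the frame exposes its column names through a `names` attribute.

```python
def predictor_columns(names, response, *special):
    """Return every column name that is neither the response nor a special column."""
    excluded = {response} | {s for s in special if s is not None}
    return [name for name in names if name not in excluded]


def train_with_columns(estimator, frame, response, weights=None, fold=None,
                       offset=None, max_runtime_secs=None):
    """Call estimator.train() with predictors derived from the frame's columns."""
    # Assumption: `frame.names` lists the column names of the training frame.
    x = predictor_columns(frame.names, response, weights, fold, offset)
    kwargs = dict(x=x, y=response, training_frame=frame,
                  weights_column=weights, fold_column=fold, offset_column=offset)
    if max_runtime_secs is not None:
        kwargs["max_runtime_secs"] = max_runtime_secs
    estimator.train(**kwargs)
    return estimator


print(predictor_columns(["age", "income", "w", "label"], "label", "w"))
# ['age', 'income']
```

Keeping the column selection in a separate pure function makes it possible to check the predictor list before any model is built.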
H2ODeepLearningEstimator
class h2o.estimators.deeplearning.H2ODeepLearningEstimator(**kwargs)
Bases: h2o.estimators.estimator_base.H2OEstimator
Examples
>>> import h2o as ml
>>> from h2o.estimators.deeplearning import H2ODeepLearningEstimator
>>> ml.init()
>>> rows = [[1,2,3,4,0], [2,1,2,4,1], [2,1,4,2,1], [0,1,2,34,1], [2,3,4,1,0]] * 50
>>> fr = ml.H2OFrame(rows)
>>> fr[4] = fr[4].asfactor()
>>> model = H2ODeepLearningEstimator()
>>> model.train(x=list(range(4)), y=4, training_frame=fr)
H2OAutoEncoderEstimator
class h2o.estimators.deeplearning.H2OAutoEncoderEstimator(**kwargs)
Bases: h2o.estimators.deeplearning.H2ODeepLearningEstimator
Examples
>>> import h2o as ml
>>> from h2o.estimators.deeplearning import H2OAutoEncoderEstimator
>>> ml.init()
>>> rows = [[1,2,3,4,0]*50, [2,1,2,4,1]*50, [2,1,4,2,1]*50, [0,1,2,34,1]*50, [2,3,4,1,0]*50]
>>> fr = ml.H2OFrame(rows)
>>> fr[4] = fr[4].asfactor()
>>> model = H2OAutoEncoderEstimator()
>>> model.train(x=list(range(4)), training_frame=fr)
H2ORandomForestEstimator
H2OGradientBoostingEstimator
H2OGeneralizedLinearEstimator
class h2o.estimators.glm.H2OGeneralizedLinearEstimator(**kwargs)
Bases: h2o.estimators.estimator_base.H2OEstimator
Returns: A subclass of ModelBase. The specific subclass depends on the machine learning task at hand:
binomial classification returns an H2OBinomialModel, and regression returns an H2ORegressionModel.
The default print-out of the model is shown, but further GLM-specific information can be queried
from the object. Upon completion of the GLM, the resulting object has coefficients, normalized
coefficients, residual/null deviance, AIC, and a host of model metrics including MSE, AUC (for
logistic regression), degrees of freedom, and confusion matrices.
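A minimal sketch of querying some of these quantities from a trained GLM is shown below. It assumes the model object exposes the accessor methods `coef()`, `coef_norm()`, `null_deviance()`, `residual_deviance()`, and `aic()`; the helper names `summarize_glm` and `largest_coefficients`, and the `"Intercept"` key, are assumptions made for illustration.

```python
def summarize_glm(model):
    """Collect a few GLM-specific results from a trained model into one dict."""
    return {
        "coefficients": model.coef(),
        "normalized_coefficients": model.coef_norm(),
        "null_deviance": model.null_deviance(),
        "residual_deviance": model.residual_deviance(),
        "aic": model.aic(),
    }


def largest_coefficients(coefs, n=3, skip=("Intercept",)):
    """Return the n (name, value) pairs with the largest absolute value."""
    # Assumption: the intercept is stored under the key "Intercept".
    pairs = [(name, value) for name, value in coefs.items() if name not in skip]
    return sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)[:n]


print(largest_coefficients({"Intercept": 4.0, "a": -2.0, "b": 0.5, "c": 1.0}, n=2))
# [('a', -2.0), ('c', 1.0)]
```

Ranking by the normalized coefficients rather than the raw ones is usually the fairer comparison when the predictors are on different scales.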
Lambda
[DEPRECATED] Use self.lambda_ instead
static getGLMRegularizationPath(model)
Extract the full regularization path explored during lambda search from a GLM model.
Parameters: model
The source lambda search model.
lambda_
[DEPRECATED] Use self.lambda_ instead
static makeGLMModel(model, coefs, threshold=0.5)
Create a custom GLM model using the given coefficients. A source model trained on the dataset must be passed in so that the dataset information can be extracted from it.
Parameters: model
The source model, used for extracting dataset information.
- coefs : dict
A dictionary containing the model coefficients.
- threshold : float, optional
The decision threshold used for classification (binomial models only).
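One way makeGLMModel might be used is to take the coefficients of a trained model, adjust them, and build a new model from the result. The sketch below is illustrative only: `shrink_coefficients` and `rebuild_glm` are hypothetical helpers, and it assumes the source model's `coef()` returns a dict keyed by coefficient name with the intercept under `"Intercept"`.

```python
def shrink_coefficients(coefs, factor, keep=("Intercept",)):
    """Scale every coefficient by `factor`, leaving the names in `keep` untouched."""
    return {name: (value if name in keep else value * factor)
            for name, value in coefs.items()}


def rebuild_glm(glm_class, source_model, factor=0.5, threshold=0.5):
    """Build a custom GLM from scaled copies of the source model's coefficients."""
    # Assumption: source_model.coef() returns {coefficient_name: value}.
    new_coefs = shrink_coefficients(source_model.coef(), factor)
    return glm_class.makeGLMModel(model=source_model, coefs=new_coefs,
                                  threshold=threshold)


print(shrink_coefficients({"Intercept": 1.0, "x": 2.0}, 0.5))
# {'Intercept': 1.0, 'x': 1.0}
```

With a running cluster, `glm_class` would be H2OGeneralizedLinearEstimator and `source_model` a GLM already trained on the dataset of interest.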
-