Modeling In H2O
H2OEstimator
- class h2o.estimators.estimator_base.H2OEstimator
Bases: h2o.model.model_base.ModelBase
H2O Estimators
- H2O Estimators implement the following methods for model construction:
- start - Top-level user-facing API for starting an asynchronous model build.
- join - Top-level user-facing API for blocking until an asynchronous model build finishes.
- train - Top-level user-facing API for model building.
- fit - Used by scikit-learn.
Because H2OEstimator instances are instances of ModelBase, these objects can use the H2O model API.
Attributes
full_parameters, model_id, params, type, xvals
Methods
aic, auc, biases, build_model, catoffsets, coef, coef_norm, deepfeatures, download_pojo, fit, get_params, get_xval_models, giniCoef, is_cross_validated, join, logloss, mean_residual_deviance, mixin, model_performance, mse, next, normmul, normsub, null_degrees_of_freedom, null_deviance, pprint_coef, predict, r2, residual_degrees_of_freedom, residual_deviance, respmul, respsub, score_history, scoring_history, set_params, show, start, summary, train, varimp, weights, xval_keys
- fit(X, y=None, **params)
Fit an H2O model as part of a scikit-learn pipeline or grid search.
A warning will be issued if a caller other than sklearn attempts to use this method.
Parameters: X : H2OFrame
An H2OFrame consisting of the predictor variables.
- y : H2OFrame, optional
An H2OFrame consisting of the response variable.
- params : optional
Extra arguments.
Returns: The current instance of H2OEstimator for method chaining.
- get_params(deep=True)
Get the parameters for this estimator. Used primarily for sklearn Pipelines and sklearn grid search.
Parameters: deep : bool, optional
If True, return parameters of all sub-objects that are estimators.
Returns: A dict of parameters.
- set_params(**parms)
Used by sklearn for updating parameters during grid search.
Parameters: parms : dict
A dictionary of parameters that will be set on this model.
Returns: self, the current estimator object with the given parameters set.
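As a rough illustration of the contract that fit, get_params, and set_params follow, the sketch below uses a hypothetical stand-in class, ToyEstimator, rather than a real H2O estimator (which would need a running H2O cluster). The class name, its ntrees and max_depth parameters, and the hand-rolled grid loop are assumptions made for illustration only; they are not H2O's implementation.

```python
from itertools import product


class ToyEstimator:
    """Hypothetical stand-in that mimics the sklearn-facing methods."""

    def __init__(self, ntrees=50, max_depth=5):
        self.ntrees = ntrees
        self.max_depth = max_depth
        self.fitted = False

    def get_params(self, deep=True):
        # Return a plain dict so a caller such as a grid search can inspect it.
        # `deep` is accepted for signature compatibility; this toy class has
        # no nested estimators, so it has no effect here.
        return {"ntrees": self.ntrees, "max_depth": self.max_depth}

    def set_params(self, **parms):
        # Update the given parameters and return self for method chaining.
        for name, value in parms.items():
            if name not in self.get_params():
                raise ValueError("unknown parameter: %s" % name)
            setattr(self, name, value)
        return self

    def fit(self, X, y=None, **params):
        # A real estimator would train here; the toy only records the call.
        self.fitted = True
        return self


# A hand-rolled grid loop making the same calls a grid search would make.
grid = {"ntrees": [10, 20], "max_depth": [3, 6]}
tried = []
for values in product(*grid.values()):
    candidate = dict(zip(grid.keys(), values))
    est = ToyEstimator().set_params(**candidate).fit(X=[[0], [1]], y=[0, 1])
    tried.append(est.get_params())

print(tried)
```

Because set_params and fit both return the estimator itself, the calls can be chained, which is what lets a pipeline or grid search reconfigure and refit one object repeatedly.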
- start(x, y=None, training_frame=None, offset_column=None, fold_column=None, weights_column=None, validation_frame=None, **params)
Asynchronous model build by specifying the predictor columns, response column, and any additional frame-specific values.
To block for results, call join.
Parameters: x : list
A list of column names or indices indicating the predictor columns.
- y : str
An index or a column name indicating the response column.
- training_frame : H2OFrame
The H2OFrame containing the columns indicated by x and y (as well as any additional columns specified by fold_column, offset_column, and weights_column).
- offset_column : str, optional
The name or index of the column in training_frame that holds the offsets.
- fold_column : str, optional
The name or index of the column in training_frame that holds the per-row fold assignments.
- weights_column : str, optional
The name or index of the column in training_frame that holds the per-row weights.
- validation_frame : H2OFrame, optional
H2OFrame with validation data to be scored on while training.
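The start and join pairing can be sketched as follows. The helper name build_in_background and its while_waiting callback are hypothetical; the only calls made on the estimator are start and join, with the parameters documented above. Any H2OEstimator subclass could be passed in, assuming an H2O cluster is running and the frame is already loaded.

```python
def build_in_background(estimator, frame, predictors, response,
                        while_waiting=None, **start_kwargs):
    """Kick off an asynchronous build, do other work, then block on join."""
    # start() launches the build asynchronously; extra keyword arguments
    # (for example fold_column or weights_column) are passed straight through.
    estimator.start(x=predictors, y=response, training_frame=frame,
                    **start_kwargs)

    # Optional client-side work that overlaps with the model build.
    side_result = while_waiting() if while_waiting is not None else None

    # join() blocks until the model build has finished.
    estimator.join()
    return estimator, side_result
```

With a real estimator, for instance H2OGradientBoostingEstimator from h2o.estimators.gbm, a call might look like build_in_background(H2OGradientBoostingEstimator(ntrees=20), frame, ["x1", "x2"], "y", fold_column="fold"), where the column names are placeholders.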
- train(x, y=None, training_frame=None, offset_column=None, fold_column=None, weights_column=None, validation_frame=None, max_runtime_secs=None, **params)
Train the H2O model by specifying the predictor columns, response column, and any additional frame-specific values.
Parameters: x : list
A list of column names or indices indicating the predictor columns.
y : str
An index or a column name indicating the response column.
training_frame : H2OFrame
The H2OFrame containing the columns indicated by x and y (as well as any additional columns specified by fold_column, offset_column, and weights_column).
offset_column : str, optional
The name or index of the column in training_frame that holds the offsets.
fold_column : str, optional
The name or index of the column in training_frame that holds the per-row fold assignments.
weights_column : str, optional
The name or index of the column in training_frame that holds the per-row weights.
validation_frame : H2OFrame, optional
H2OFrame with validation data to be scored on while training.
max_runtime_secs : float, optional
Maximum allowed runtime in seconds for model training. Use 0 to disable.
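A minimal sketch of a train call is shown below. The helpers predictor_columns and train_with_budget are hypothetical names, not part of H2O; the sketch assumes only that the frame exposes its column names through a names attribute (as H2OFrame does) and that the estimator accepts the train parameters listed above. Excluding the weights column from the predictors is a precaution taken by this sketch, not documented behavior.

```python
def predictor_columns(column_names, response, exclude=()):
    """Return every column name that is neither the response nor excluded."""
    skip = {response} | set(exclude)
    return [name for name in column_names if name not in skip]


def train_with_budget(estimator, frame, response, weights_column=None,
                      validation_frame=None, max_runtime_secs=0):
    """Train on all remaining columns; max_runtime_secs=0 disables the limit."""
    exclude = [weights_column] if weights_column is not None else []
    x = predictor_columns(frame.names, response, exclude)
    estimator.train(x=x, y=response, training_frame=frame,
                    weights_column=weights_column,
                    validation_frame=validation_frame,
                    max_runtime_secs=max_runtime_secs)
    return estimator


# The column names here are placeholders for illustration.
print(predictor_columns(["age", "income", "w", "label"], "label", ["w"]))
# prints ['age', 'income']
```

Unlike start, train blocks until the build completes, so the returned estimator can be used immediately through the model API (for example predict or model_performance).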