Modeling In H2O

H2OEstimator

class h2o.estimators.estimator_base.H2OEstimator[source]

Bases: h2o.model.model_base.ModelBase

H2O Estimators

H2O Estimators implement the following methods for model construction:
  • start - Top-level user-facing API for asynchronous model building.
  • join - Top-level user-facing API for blocking until an asynchronous model build completes.
  • train - Top-level user-facing API for model building.
  • fit - Used by scikit-learn.

Because H2OEstimator instances are instances of ModelBase, these objects can use the H2O model API.

Attributes

full_parameters
model_id
params
type
xvals

Methods

aic
auc
biases
build_model
catoffsets
coef
coef_norm
deepfeatures
download_pojo
fit
get_params
get_xval_models
giniCoef
is_cross_validated
join
logloss
mean_residual_deviance
mixin
model_performance
mse
next
normmul
normsub
null_degrees_of_freedom
null_deviance
pprint_coef
predict
r2
residual_degrees_of_freedom
residual_deviance
respmul
respsub
score_history
scoring_history
set_params
show
start
summary
train
varimp
weights
xval_keys

build_model(algo_params)[source]

fit(X, y=None, **params)[source]

Fit an H2O model as part of a scikit-learn pipeline or grid search.

A warning will be issued if a caller other than sklearn attempts to use this method.

Parameters:

X : H2OFrame

An H2OFrame consisting of the predictor variables.

y : H2OFrame, optional

An H2OFrame consisting of the response variable.

params : optional

Extra arguments.

Returns:

The current instance of H2OEstimator for method chaining.
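Because fit returns the estimator itself, calls can be chained. The sketch below illustrates that convention with a hypothetical stand-in class, ToyEstimator, written in plain Python; it is not H2O's implementation, and unlike the real method it warns on every call rather than only for non-sklearn callers.

```python
import warnings


class ToyEstimator:
    """Hypothetical stand-in used only to illustrate the return-self convention."""

    def __init__(self):
        self.fitted_ = False
        self.n_rows_ = 0

    def fit(self, X, y=None, **params):
        # Assumption: always warn, to mimic the documented warning in the simplest way.
        warnings.warn("fit() is intended for scikit-learn; prefer train()", UserWarning)
        self.n_rows_ = len(X)
        self.fitted_ = True
        return self  # returning self is what makes chaining possible

    def summary(self):
        return {"fitted": self.fitted_, "rows": self.n_rows_}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # fit(...) returns the estimator, so summary() can be called on the result directly.
    result = ToyEstimator().fit([[1.0], [2.0], [3.0]], y=[0, 1, 0]).summary()
```

Returning the instance is also what lets scikit-learn pipelines treat the fitted object and the estimator as the same thing.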

get_params(deep=True)[source]

Return the parameters of this estimator. Used primarily by sklearn Pipelines and sklearn grid search.

Parameters:

deep : bool, optional

If True, return parameters of all sub-objects that are estimators.

Returns:

A dict of parameters

join()[source]

static mixin(obj, cls)[source]

set_params(**parms)[source]

Used by sklearn for updating parameters during grid search.

Parameters:

parms : dict

The parameters to set on this model, passed as keyword arguments.

Returns:

The current estimator object (self), with the given parameters set.
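The get_params and set_params pair follows the scikit-learn convention: a search tool reads the current parameters as a dict, overrides some of them, and writes them back or builds a fresh candidate. The following is a minimal sketch of that round trip using a hypothetical stand-in class; ToyEstimator, clone_with, and the parameter names ntrees and max_depth are illustrative assumptions, not taken from H2O's source.

```python
class ToyEstimator:
    """Hypothetical stand-in that stores its parameters in a dict."""

    def __init__(self, ntrees=50, max_depth=5):
        self._parms = {"ntrees": ntrees, "max_depth": max_depth}

    def get_params(self, deep=True):
        # Return a copy so callers cannot mutate internal state by accident.
        # `deep` is accepted for signature compatibility; there are no sub-estimators here.
        return dict(self._parms)

    def set_params(self, **parms):
        self._parms.update(parms)
        return self  # same object, updated in place


def clone_with(estimator, **overrides):
    """Build a new candidate from an existing estimator's parameters plus overrides."""
    params = estimator.get_params()
    params.update(overrides)
    return type(estimator)(**params)


base = ToyEstimator(ntrees=50, max_depth=5)
candidate = clone_with(base, max_depth=10)   # new object, base is untouched
updated = base.set_params(ntrees=100)        # same object as base
```

Here clone_with leaves base unchanged, while set_params mutates base and hands it back, which is why the return value can be ignored or chained.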

start(x, y=None, training_frame=None, offset_column=None, fold_column=None, weights_column=None, validation_frame=None, **params)[source]

Start an asynchronous model build by specifying the predictor columns, response column, and any additional frame-specific values.

To block until the build completes, call join.

Parameters:

x : list

A list of column names or indices indicating the predictor columns.

y : str

An index or a column name indicating the response column.

training_frame : H2OFrame

The H2OFrame containing the columns indicated by x and y (as well as any additional columns specified by fold_column, offset_column, and weights_column).

offset_column : str, optional

The name or index of the column in training_frame that holds the offsets.

fold_column : str, optional

The name or index of the column in training_frame that holds the per-row fold assignments.

weights_column : str, optional

The name or index of the column in training_frame that holds the per-row weights.

validation_frame : H2OFrame, optional

H2OFrame with validation data to be scored on while training.
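The start/join split is a common pattern for long-running work: start returns immediately, and join blocks until the result is ready. The sketch below imitates that pattern with a background thread. ToyAsyncEstimator and the column names are hypothetical, and a real H2O build runs on the cluster rather than in a local thread, so treat this only as an illustration of the calling sequence.

```python
import threading
import time


class ToyAsyncEstimator:
    """Hypothetical stand-in showing the start-then-join calling sequence."""

    def __init__(self):
        self._thread = None
        self.model = None

    def _build(self, x, y):
        time.sleep(0.05)  # pretend the build takes a while
        self.model = {"predictors": list(x), "response": y}

    def start(self, x, y=None):
        # Kick off the build and return immediately.
        self._thread = threading.Thread(target=self._build, args=(x, y))
        self._thread.start()

    def join(self):
        # Block until the background build has finished.
        self._thread.join()


est = ToyAsyncEstimator()
est.start(x=["age", "income"], y="churned")
# The caller is free to do other work here while the build runs.
est.join()
trained = est.model  # only safe to read after join()
```

Reading the model before join returns would race with the build, which is why the documentation pairs the two calls.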

train(x, y=None, training_frame=None, offset_column=None, fold_column=None, weights_column=None, validation_frame=None, max_runtime_secs=None, **params)[source]

Train the H2O model by specifying the predictor columns, response column, and any additional frame-specific values.

Parameters:

x : list

A list of column names or indices indicating the predictor columns.

y : str

An index or a column name indicating the response column.

training_frame : H2OFrame

The H2OFrame containing the columns indicated by x and y (as well as any additional columns specified by fold_column, offset_column, and weights_column).

offset_column : str, optional

The name or index of the column in training_frame that holds the offsets.

fold_column : str, optional

The name or index of the column in training_frame that holds the per-row fold assignments.

weights_column : str, optional

The name or index of the column in training_frame that holds the per-row weights.

validation_frame : H2OFrame, optional

H2OFrame with validation data to be scored on while training.

max_runtime_secs : float, optional

Maximum allowed runtime in seconds for model training. Use 0 to disable.
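In practice, x is often derived from the frame's column names by dropping the response and any special-purpose columns. The helper below is a minimal sketch of that bookkeeping in plain Python; train_arguments and the column names are hypothetical, and excluding the offset, fold, and weights columns from x is an assumption about what the caller wants rather than a documented requirement.

```python
def train_arguments(columns, response, offset_column=None,
                    fold_column=None, weights_column=None):
    """Assemble keyword arguments matching the train() signature documented above."""
    special = {response, offset_column, fold_column, weights_column}
    # Predictors are every column that has no special role.
    predictors = [c for c in columns if c not in special]
    kwargs = {"x": predictors, "y": response}
    optional = (
        ("offset_column", offset_column),
        ("fold_column", fold_column),
        ("weights_column", weights_column),
    )
    for name, value in optional:
        if value is not None:
            kwargs[name] = value
    return kwargs


columns = ["age", "income", "exposure", "fold", "churned"]
args = train_arguments(columns, "churned",
                       fold_column="fold", weights_column="exposure")

# With a running H2O cluster, an estimator `model`, and an H2OFrame `frame`
# (none of which are created here), the call would look like:
#   model.train(training_frame=frame, max_runtime_secs=60, **args)
```

Keeping the argument assembly separate from the train call makes it easy to check which columns end up as predictors before any training starts.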