Provides a set of functions to launch a grid search and get its results.
h2o.grid(algorithm, grid_id, x, y, training_frame, ..., hyper_params = list(), is_supervised = NULL, do_hyper_params_check = FALSE, search_criteria = NULL)
Argument | Description |
---|---|
algorithm | Name of algorithm to use in grid search (gbm, randomForest, kmeans, glm, deeplearning, naivebayes, pca). |
grid_id | (Optional) ID for resulting grid search. If it is not specified then it is autogenerated. |
x | (Optional) A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except y are used. |
y | The name or column index of the response variable in the data. The response must be either a numeric or a categorical/factor variable. If the response is numeric, then a regression model will be trained, otherwise it will train a classification model. |
training_frame | ID of the training data frame. |
... | Arguments describing parameters to use with the algorithm (e.g., x, y, training_frame). See the specific algorithm function (h2o.gbm, h2o.glm, h2o.kmeans, h2o.deeplearning) for available parameters. |
hyper_params | List of lists of hyper parameters (i.e., |
is_supervised | (Optional) If specified then override the default heuristic which decides if the given algorithm name and parameters specify a supervised or unsupervised algorithm. |
do_hyper_params_check | Perform a client-side check of the specified hyperparameters. This can be time-consuming for a large hyperparameter space. |
search_criteria | (Optional) List of control parameters for smarter hyperparameter search. The default
strategy 'Cartesian' covers the entire space of hyperparameter combinations. Specify the
'RandomDiscrete' strategy to search a random sample of your hyperparameter combinations instead. RandomDiscrete
should usually be combined with at least one early stopping criterion,
max_models and/or max_runtime_secs, e.g. |
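The difference between the two strategies can be sketched as follows. This is a minimal illustration, not the package's own example: the hyperparameter values, the `max_models` cap, and the seed are all assumptions chosen for the demonstration, and the code assumes a local H2O cluster can be started with `h2o.init()`.

```r
library(h2o)

# Hypothetical hyperparameter space for a GBM (values chosen for illustration).
hyper_params <- list(
  ntrees     = c(10, 50, 100),
  max_depth  = c(3, 5, 7),
  learn_rate = c(0.01, 0.1)
)

# The Cartesian strategy trains one model per combination: 3 * 3 * 2 = 18.
n_combinations <- prod(lengths(hyper_params))
print(n_combinations)

# RandomDiscrete samples from the same space; max_models caps how many
# models are built, and seed makes the sampling reproducible.
search_criteria <- list(
  strategy   = "RandomDiscrete",
  max_models = 5,
  seed       = 1234
)

h2o.init()
iris_hf <- as.h2o(iris)

random_grid <- h2o.grid("gbm",
                        x = 1:4, y = 5,
                        training_frame = iris_hf,
                        hyper_params = hyper_params,
                        search_criteria = search_criteria)

# Number of models actually built (at most max_models).
print(length(random_grid@model_ids))
```

With a cap of 5 models, the random search builds far fewer models than the 18 a Cartesian search over the same space would train. `max_runtime_secs` can be added to `search_criteria` in the same way to bound wall-clock time.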
Launch grid search with given algorithm and parameters.
# NOT RUN {
library(h2o)
library(jsonlite)
h2o.init()
iris_hf <- as.h2o(iris)
grid <- h2o.grid("gbm", x = c(1:4), y = 5, training_frame = iris_hf,
                 hyper_params = list(ntrees = c(1, 2, 3)))
# Get grid summary
summary(grid)
# Fetch grid models
model_ids <- grid@model_ids
models <- lapply(model_ids, function(id) { h2o.getModel(id) })
# }
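To get the results of a grid ranked by a metric, one option is `h2o.getGrid`. The sketch below reruns the same small grid as the example and sorts it. The choice of `logloss` as the sort metric is an assumption that suits the multiclass iris response; other problems would use a different metric.

```r
library(h2o)
h2o.init()

iris_hf <- as.h2o(iris)
grid <- h2o.grid("gbm", x = 1:4, y = 5, training_frame = iris_hf,
                 hyper_params = list(ntrees = c(1, 2, 3)))

# Retrieve the grid sorted by logloss, lowest (best) first.
sorted_grid <- h2o.getGrid(grid@grid_id, sort_by = "logloss", decreasing = FALSE)

# The first model ID in the sorted grid is the best model under that metric.
best_model <- h2o.getModel(sorted_grid@model_ids[[1]])
print(best_model@model_id)
```

Because no validation frame or cross-validation is specified here, the ranking reflects training metrics only. For model selection in practice, a `validation_frame` or `nfolds` argument passed through `...` would give a less optimistic ranking.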