max_hit_ratio_k

  • Available in: GBM, DRF, Deep Learning, Naïve-Bayes
  • Hyperparameter: no

Description

Hit ratios can be used to evaluate the performance of a model. The hit ratio is the percentage of instances where the model correctly predicts the actual class of an observation. The max_hit_ratio_k option specifies the maximum number of predictions to consider for the hit ratio computation.

Note that this option is available for multiclass problems only and is set to 0 (disabled) by default.

Example

  • r
  • python
library(h2o)
h2o.init()

# import the covtype dataset:
# this dataset is used to classify the correct forest cover type
# original dataset can be found at https://archive.ics.uci.edu/ml/datasets/Covertype
covtype <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/covtype/covtype.20k.data")

# convert response column to a factor
covtype[,55] <- as.factor(covtype[,55])

# set the predictor names and the response column name
predictors <- colnames(covtype[1:54])
response <- 'C55'

# split into train and validation sets
covtype.splits <- h2o.splitFrame(data =  covtype, ratios = .8, seed = 1234)
train <- covtype.splits[[1]]
valid <- covtype.splits[[2]]

# try using the max_hit_ratio_k parameter:
# max_hit_ratio_k does not affect the actual model fit, and is for information
# and inner-H2O calculations
cov_gbm <- h2o.gbm(x = predictors, y = response, training_frame = train,
                   validation_frame = valid, max_hit_ratio_k = 3, seed = 1234)

# print out model results to see the max_hite_ratio_k table
cov_gbm