The variable importance heatmap shows variable importance across multiple models. Some models in H2O (e.g. Deep Learning, XGBoost) return variable importance for one-hot (binary indicator) encoded versions of categorical columns. So that the variable importance of categorical columns can be compared across all model types, we summarize the variable importance across all one-hot encoded features and return a single variable importance for the original categorical feature. By default, the models and variables are ordered by their similarity.
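The summarization of one-hot encoded importances described above can be illustrated with a minimal base R sketch. It assumes the summary is a plain sum and that encoded features are named `<column>.<level>`; the text here does not state which summary H2O uses internally, so both are assumptions. The helper `consolidate_varimp` and the importance values are hypothetical and are not part of the h2o package.

```r
# Hypothetical per-feature importances, as a model that one-hot encodes
# categorical columns might report them (values are made up for illustration).
varimp <- c(alcohol = 0.40, type.red = 0.15, type.white = 0.10, pH = 0.35)

# Names of the original (pre-encoding) columns.
original_cols <- c("alcohol", "type", "pH")

# Collapse "<column>.<level>" entries onto their original column.
# Assumption: the summary is a sum over the encoded features.
consolidate_varimp <- function(varimp, original_cols) {
  sapply(original_cols, function(col) {
    is_match <- names(varimp) == col |
      startsWith(names(varimp), paste0(col, "."))
    sum(varimp[is_match])
  })
}

consolidated <- consolidate_varimp(varimp, original_cols)
print(consolidated)
```

After this step the categorical column `type` has a single importance value, so it can sit in the same heatmap row as the value reported by a model that handles categoricals natively. Prefix matching of this kind would misattribute features if one original column name were a dotted prefix of another, which a real implementation would need to guard against.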
h2o.varimp_heatmap(object, top_n = 20, num_of_features = 20)
Argument | Description |
---|---|
object | A list of H2O models, an H2O AutoML instance, or an H2OFrame with a 'model_id' column (e.g. H2OAutoML leaderboard). |
top_n | Integer specifying the number of models shown in the heatmap (based on leaderboard ranking). Defaults to 20. |
num_of_features | Integer specifying the number of features shown in the heatmap based on the maximum variable importance across the models. Use NULL for unlimited. Defaults to 20. |
A ggplot2 object.
# NOT RUN {
library(h2o)
h2o.init()

# Import the wine dataset into H2O:
f <- "https://h2o-public-test-data.s3.amazonaws.com/smalldata/wine/winequality-redwhite-no-BOM.csv"
df <- h2o.importFile(f)

# Set the response
response <- "quality"

# Split the dataset into a train and test set:
splits <- h2o.splitFrame(df, ratios = 0.8, seed = 1)
train <- splits[[1]]
test <- splits[[2]]

# Build and train the model:
aml <- h2o.automl(y = response,
                  training_frame = train,
                  max_models = 10,
                  seed = 1)

# Create the variable importance heatmap
varimp_heatmap <- h2o.varimp_heatmap(aml)
print(varimp_heatmap)
# }