R Tutorial

This tutorial provides a sample workflow for new users of H2O’s R API. Readers will learn the basic syntax of H2O, including importing and parsing files, specifying a model, and obtaining model output.

New H2O users should refer to the quick start guide for additional instructions on how to run H2O. The following tutorial assumes that H2O is installed in R.

Getting Started

R uses a REST API to send functions to H2O, so R needs a reference object for the H2O instance. You can start H2O outside of R and connect to it, or launch it directly from R. If you launch from R, closing the R session also shuts down the H2O instance. The client object is used to direct R to datasets and models located in H2O.

Launch From R

By default, if the argument max_mem_size is not specified when running h2o.init(), the heap size of an H2O instance running on 32-bit Java is 1g. On 64-bit Java, the heap size is 1/4 of the total memory available on the machine. On 32-bit Java, the function also runs a check and suggests an upgrade.

> library(h2o)
> localH2O <- h2o.init(ip = 'localhost', port = 54321, max_mem_size = '4g')
 Successfully connected to http://localhost:54321
       R is connected to H2O cluster:
   H2O cluster uptime:         11 minutes 35 seconds
   H2O cluster version:        2.7.0.1497
   H2O cluster name:           H2O_started_from_R
   H2O cluster total nodes:    1
   H2O cluster total memory:   3.56 GB
   H2O cluster total cores:    8
   H2O cluster allowed cores:  8
   H2O cluster healthy:        TRUE
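
The default sizing rule described in this section can be restated as a small base-R helper. This is only a sketch of the rule as the text states it: the function name default_heap_gb and its arguments are hypothetical and are not part of the h2o package.

```r
# Hypothetical helper (not part of the h2o package): restates the default
# heap rule described in the text.
default_heap_gb <- function(total_memory_gb, java_bits = 64) {
  if (java_bits == 32) {
    1                      # 32-bit Java: fixed 1g heap
  } else {
    total_memory_gb / 4    # 64-bit Java: one quarter of machine memory
  }
}

default_heap_gb(16)                  # 64-bit Java on a 16 GB machine
default_heap_gb(16, java_bits = 32)  # 32-bit Java
```

If the default is too small for your data, pass max_mem_size explicitly, as in the h2o.init() call above.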

Launch From Command Line

Follow one of the deployment tutorials to launch an instance from the command line:

on your desktop
on ec2 instances
on Hadoop servers

After launching the H2O cluster, initialize the connection by choosing one node in the cluster and running h2o.init with that node’s IP address and port as arguments. Note that the IP address must be on your local machine. In the following example, change 192.168.1.161 to the IP address of your local host.

> library(h2o)
> localH2O <- h2o.init(ip = '192.168.1.161', port = 54321)

Warning

If the version of the current H2O instance is not the same as the package version loaded in R, a “version mismatch” warning message displays. To fix this issue, update the R package or launch an H2O instance using the jar file from the installed package.

Error in h2o.init():
Version mismatch! H2O is running version # but R package is version #
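
The comparison behind this warning can be sketched in base R. The helper name versions_match is hypothetical, and the second version string below is made up for illustration; the sketch compares two version strings and does not show how the h2o package performs its own check.

```r
# Hypothetical helper: compare the version reported by the cluster with the
# version of the installed R package. Both are plain version strings here.
versions_match <- function(cluster_version, package_version_string) {
  package_version(cluster_version) == package_version(package_version_string)
}

same  <- versions_match("2.7.0.1497", "2.7.0.1497")  # identical versions
stale <- versions_match("2.7.0.1497", "2.6.0.11")    # made-up older package
```

In a live session, the installed package version is available from packageVersion("h2o"), and the cluster version appears in the h2o.init() output.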

Cluster Info

To check the status and health of the H2O cluster, use h2o.clusterInfo() to display an easy-to-read summary of information about the cluster.

> library(h2o)
> localH2O = h2o.init(ip = 'localhost', port = 54321)
> h2o.clusterInfo(localH2O)
R is connected to H2O cluster:
  H2O cluster uptime:         43 minutes 43 seconds
  H2O cluster version:        2.7.0.1497
  H2O cluster name:           H2O_started_from_R
  H2O cluster total nodes:    1
  H2O cluster total memory:   3.56 GB
  H2O cluster total cores:    8
  H2O cluster allowed cores:  8
  H2O cluster healthy:        TRUE

Importing Data

Import File

The H2O package consolidates the various supported import functions into h2o.importFile. Although h2o.importFolder and h2o.importHDFS still work, they are deprecated, and code that calls them should be updated to use h2o.importFile.

## To import small iris data file from H2O's package
> irisPath = system.file("extdata", "iris.csv", package="h2o")
> iris.hex = h2o.importFile(localH2O, path = irisPath, key = "iris.hex")
      |=================================================| 100%

## To import an entire folder of files as one data object
> pathToFolder = "/Users/Amy/0xdata/data/airlines/"
> airlines.hex = h2o.importFile(localH2O, path = pathToFolder, key = "airlines.hex")
      |=================================================| 100%

## To import from HDFS
> pathToData = "hdfs://mr-0xd6.0xdata.loc/datasets/airlines_all.csv"
> airlines.hex = h2o.importFile(localH2O, path = pathToData, key = "airlines.hex")
      |=================================================| 100%

Upload File

To upload a file from your local disk, h2o.importFile is recommended. However, h2o.uploadFile is still available.

> irisPath = system.file("extdata", "iris.csv", package="h2o")
> iris.hex = h2o.uploadFile(localH2O, path = irisPath, key = "iris.hex")
|====================================================| 100%

Data Manipulation and Description

Any Factor

Determine if any column in a data set is a factor.

> irisPath = system.file("extdata", "iris_wheader.csv", package="h2o")
> iris.hex = h2o.importFile(localH2O, path = irisPath)
|===================================================| 100%
> h2o.anyFactor(iris.hex)
[1] TRUE

As Data Frame

Convert an H2O parsed data object into an R data frame that can be manipulated using R calls. While this can be very useful, be careful with as.data.frame when converting H2O Parsed Data objects. Data sets that are easily and quickly handled by H2O are often too large to be treated equivalently well in R.

 > prosPath <- system.file("extdata", "prostate.csv", package="h2o")
 > prostate.hex = h2o.importFile(localH2O, path = prosPath)
  |===================================================| 100%

 > prostate.data.frame<- as.data.frame(prostate.hex)
 > summary(prostate.data.frame)
       ID            CAPSULE            AGE             RACE
Min.   :  1.00   Min.   :0.0000   Min.   :43.00   Min.   :0.000
1st Qu.: 95.75   1st Qu.:0.0000   1st Qu.:62.00   1st Qu.:1.000
       ....
 > head(prostate.data.frame)
         ID CAPSULE AGE RACE DPROS DCAPS  PSA  VOL GLEASON
       1  1       0  65    1     2     1  1.4  0.0       6
       2  2       0  72    1     3     2  6.7  0.0       7
               ....
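
One way to act on the caution above is to check the row count before converting. The sketch below uses base R only; the function name to_local_frame and the max_rows threshold are hypothetical, and the built-in iris data frame stands in for an H2O parsed data object.

```r
# Hypothetical guard: convert to a local data frame only when the object is
# small enough. Works on any object that supports nrow() and as.data.frame().
to_local_frame <- function(x, max_rows = 1e6) {
  if (nrow(x) > max_rows) {
    stop("Refusing to convert: ", nrow(x), " rows exceeds max_rows = ", max_rows)
  }
  as.data.frame(x)
}

small <- to_local_frame(iris)   # 150 rows, under the limit: converted

# With a lower limit the conversion is refused and the message is captured
too_big <- tryCatch(to_local_frame(iris, max_rows = 100),
                    error = function(e) conditionMessage(e))
```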

As Factor

Convert an integer into a non-ordered factor (also called an enum or categorical).

> prosPath = system.file("extdata", "prostate.csv", package="h2o")
> prostate.hex = h2o.importFile(localH2O, path = prosPath)
|===================================================| 100%
> prostate.hex[,4] = as.factor(prostate.hex[,4])
> summary(prostate.hex)
      ID               CAPSULE          AGE             RACE    DPROS
      Min.   :  1.00   Min.   :0.0000   Min.   :43.00   1 :341  Min.   :1.000
      1st Qu.: 95.75   1st Qu.:0.0000   1st Qu.:62.00   2 : 36  1st Qu.:1.000
              ....
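
The effect of the conversion on summary output can be seen in plain R as well. The values below are made up for illustration and are not taken from the prostate data.

```r
# Base-R illustration with made-up values: the same integer codes summarized
# as numbers and then as a factor.
race <- c(1, 1, 2, 1, 0, 2, 1)

summary(race)               # numeric summary: min, quartiles, mean, max

race.factor <- as.factor(race)
levels(race.factor)         # the distinct codes, stored as level labels
table(race.factor)          # counts per level instead of quartiles
```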

As H2O

Pass a data frame from inside the R environment to the H2O instance.

> data(iris)
> summary(iris)
      Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
       Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
      1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
              ....
> iris.r <- iris
> iris.h2o <- as.h2o(localH2O, iris.r, key="iris.h2o")
  |===================================================| 100%
> class(iris.h2o)
      [1] "H2OParsedData"
      attr(,"package")
      [1] "h2o"

Assign H2O

Create a hex key on the server running H2O for data sets manipulated in R. For instance, in the example below, the prostate data set is uploaded to the H2O instance and filtered to select outliers. To save the new data set on the H2O server so that it can subsequently be analyzed with H2O without overwriting the original data set, use h2o.assign.

> prosPath = system.file("extdata", "prostate.csv", package="h2o")
> prostate.hex = h2o.importFile(localH2O, path = prosPath)
    |===================================================| 100%
> prostate.qs = quantile(prostate.hex$PSA)
> PSA.outliers = prostate.hex[prostate.hex$PSA <= prostate.qs[2] |
   prostate.hex$PSA >= prostate.qs[10],]
> PSA.outliers = h2o.assign(PSA.outliers, "PSA.outliers")
> nrow(prostate.hex)
[1] 380
> nrow(PSA.outliers)
[1] 380
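
The same filter-then-keep-both pattern can be tried in base R without a cluster. This sketch uses simulated values and explicitly chosen 10% and 90% cutoffs, which are assumptions for illustration; it does not reproduce the quantile positions used in the H2O example.

```r
# Base-R sketch on simulated data; the column name PSA mirrors the example,
# but the values are randomly generated, not the prostate data.
set.seed(1)
df <- data.frame(ID = 1:200, PSA = rexp(200, rate = 0.1))

# Explicit cutoffs (an assumption for this sketch): 10th and 90th percentiles
qs <- quantile(df$PSA, probs = c(0.1, 0.9))

# Select rows in either tail; the result is a new object
tails <- df[df$PSA <= qs[1] | df$PSA >= qs[2], ]

nrow(df)      # the original data frame is unchanged
nrow(tails)   # only the rows in the two tails
```

In H2O, h2o.assign plays the role of giving the filtered result its own key so that both data sets remain available on the server.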

Colnames

Obtain a list of the column names in a data set.

> irisPath = system.file("extdata", "iris.csv", package="h2o")
> iris.hex = h2o.importFile(localH2O, path = irisPath, key = "iris.hex")
      |===================================================| 100%
> colnames(iris.hex)
[1] "C1" "C2" "C3" "C4" "C5"

Extremes

Obtain the maximum and minimum values in real-valued columns.

> ausPath = system.file("extdata", "australia.csv", package="h2o")
> australia.hex = h2o.importFile(localH2O, path = ausPath, key = "australia.hex")
 |===================================================| 100%
> min(australia.hex)
[1] 0
> min(c(-1, 0.5, 0.2), FALSE, australia.hex[,1:4])
[1] -1

Quantile

Request quantiles for an H2O parsed data set. To request quantiles for a single numeric column, use the column name (for example, $AGE). When you request quantiles for a full parsed data set, quantile() returns a matrix that displays quantile information for all numeric columns in the data set.

> prosPath = system.file("extdata", "prostate.csv", package="h2o")
> prostate.hex = h2o.importFile(localH2O, path = prosPath)
   |===================================================| 100%
> quantile(prostate.hex$AGE)
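
The difference between the single-column and full-data-set forms can be illustrated in base R with the built-in iris data frame. This is a sketch of the two output shapes, not of H2O's implementation.

```r
# Base-R sketch using the built-in iris data frame
quantile(iris$Sepal.Length)    # quantiles for a single numeric column

# Quantiles for every numeric column, collected into a matrix
numeric.cols <- sapply(iris, is.numeric)
q.matrix <- sapply(iris[, numeric.cols], quantile)
dim(q.matrix)                  # one row per quantile, one column per variable
```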

Summary

Generate an R-like summary for each of the columns in a data set. For continuous real-valued columns, this produces a summary that includes information on quartiles, min, max, and mean. For factors, this produces information about counts of elements within each factor level. For information on the Summary algorithm, see Summary.

  > prosPath = system.file("extdata", "prostate.csv", package="h2o")
  > prostate.hex = h2o.importFile(localH2O, path = prosPath)
       |===================================================| 100%
  > summary(prostate.hex)
          ID            CAPSULE            AGE             RACE
Min.   :  1.00   Min.   :0.0000   Min.   :43.00   Min.   :0.000
1st Qu.: 95.75   1st Qu.:0.0000   1st Qu.:62.00   1st Qu.:1.000
       ....
  > summary(prostate.hex$GLEASON)
  GLEASON
Min.   :0.000
1st Qu.:6.000
Median :6.000
Mean   :6.384
3rd Qu.:7.000
Max.   :9.000
  > summary(prostate.hex[,4:6])
  RACE            DPROS           DCAPS
Min.   :0.000   Min.   :1.000   Min.   :1.000
1st Qu.:1.000   1st Qu.:1.000   1st Qu.:1.000
Median :1.000   Median :2.000   Median :1.000
Mean   :1.087   Mean   :2.271   Mean   :1.108
3rd Qu.:1.000   3rd Qu.:3.000   3rd Qu.:1.000
Max.   :2.000   Max.   :4.000   Max.   :2.000

H2O Table

Summarize information in data. Because H2O handles such large data sets, it is possible to generate tables that are larger than R’s capacity. To minimize this risk and enable uninterrupted work, call h2o.table inside head() or tail(). Within head() and tail(), you can specify the number of rows of the table to return.

> head(h2o.table(prostate.hex[,3]))
         row.names Count
      43     1
      47     1
      50     2
      51     3
      52     2
      53     4

> head(h2o.table(prostate.hex[,c(3,4)]))
        row.names X0 X1 X2
      43  1  0  0
      47  0  1  0
      50  0  2  0
      51  0  3  0
      52  0  2  0
      53  0  3  1
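
The same habit of wrapping a table in head() or tail() works in base R. The sketch below uses simulated ages, which are made up for illustration.

```r
# Base-R sketch with simulated ages; the values are made up for illustration
set.seed(42)
age <- sample(40:80, size = 1000, replace = TRUE)

age.counts <- table(age)        # one cell per distinct age
length(age.counts)              # size of the full table

first.six  <- head(age.counts, 6)   # only the first 6 cells
last.three <- tail(age.counts, 3)   # only the last 3 cells
```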

Generate Random Uniformly Distributed Numbers

h2o.runif() appends a column of random numbers to an H2O data frame and facilitates creating testing/training data splits for analysis and validation in H2O.

> prosPath = system.file("extdata", "prostate.csv", package="h2o")
> prostate.hex = h2o.importFile(localH2O, path = prosPath, key = "prostate.hex")
     |===================================================| 100%
> s = h2o.runif(prostate.hex)
> summary(s)
rnd
      Min.   :0.001434
  1st Qu.:0.241275
  Median :0.496995
  Mean   :0.489468
  3rd Qu.:0.740592
  Max.   :0.994894

> prostate.train = prostate.hex[s <= 0.8,]
> prostate.train = h2o.assign(prostate.train, "prostate.train")
> prostate.test = prostate.hex[s > 0.8,]
> prostate.test = h2o.assign(prostate.test, "prostate.test")
> nrow(prostate.train) + nrow(prostate.test)
[1] 380
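
The logic of the split can be checked in base R, where runif() plays the role of h2o.runif(). This sketch uses the built-in iris data frame and a fixed seed, both chosen only for illustration.

```r
# Base-R sketch on the built-in iris data; the 0.8 cutoff mirrors the example
set.seed(123)
s <- runif(nrow(iris))          # one uniform draw per row

iris.train <- iris[s <= 0.8, ]  # roughly 80% of rows
iris.test  <- iris[s > 0.8, ]   # the remaining rows

# Every row lands in exactly one of the two subsets
nrow(iris.train) + nrow(iris.test) == nrow(iris)
```

Because each row is assigned by its own random draw, the training share is only approximately 80%.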

Split Frame

Generate two subsets from an existing H2O data set, according to user-specified ratios, that can be used as testing/training sets. This is the preferred method of splitting a data frame because it’s faster and more stable than running runif across the entire data set. However, runif can be used for customized frame splitting.

> prosPath = system.file("extdata", "prostate.csv", package="h2o")
> prostate.hex = h2o.importFile(localH2O, path = prosPath, key = "prostate.hex")
       |===================================================| 100%
> prostate.split = h2o.splitFrame(data = prostate.hex , ratios = 0.75)
> prostate.train = prostate.split[1]
> prostate.test = prostate.split[2]
> summary(prostate.train)
   Length Class         Mode
      [1,] 9      H2OParsedData S4
> summary(prostate.test)
  Length Class         Mode
      [1,] 9      H2OParsedData S4

Running Models

GBM

Generate Gradient Boosted Models (GBM), which are used to develop forward-learning ensembles. For information on the GBM algorithm, see Gradient Boosted Regression and Classification.

> ausPath = system.file("extdata", "australia.csv", package="h2o")
> australia.hex = h2o.importFile(localH2O, path = ausPath)
   |===================================================| 100%
> independent <- c("premax", "salmax","minairtemp", "maxairtemp",
"maxsst", "maxsoilmoist", "Max_czcs")
> dependent <- "runoffnew"
> h2o.gbm(y = dependent, x = independent, data = australia.hex,
   n.trees = 10, interaction.depth = 3,
   n.minobsinnode = 2, shrinkage = 0.2, distribution= "gaussian")
     |======================================================| 100%
      IP Address: 127.0.0.1
      Port      : 54321
      Parsed Data Key: australia1.hex

      GBM Model Key: GBM_a3ae2edf5dfadbd9ba5dc2e9560c405d

      Mean-squared Error by tree:
       [1] 230760.11 166957.80 124904.30  94031.17  72367.01  57180.17  47092.85
      [8]  39168.05  34456.00  31095.86  28397.10

Run multinomial classification GBM on australia data

To generate a classification model that uses labels, use a multinomial distribution.

> h2o.gbm(y = dependent, x = independent, data = australia.hex,
   n.trees = 15, interaction.depth = 5,
   n.minobsinnode = 2, shrinkage = 0.01, distribution= "multinomial")
      IP Address: 127.0.0.1
      Port      : 54321
      Parsed Data Key: australia1.hex

      GBM Model Key: GBM_8e4591a9b413407b983d73fbd9eb44cf

      Confusion matrix:
      Reported on australia1.hex
      Predicted
      Actual     0 3 6 7 14 16 17 19 20 25 38 43 61 75 82 107 138 150 167 191 200
        0      115 0 0 0  0  0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0
        3        0 1 0 0  0  0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0
        6        0 0 1 0  0  0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0
        7        0 0 0 2  0  0  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0
      ....
       Totals 120 1 1 2  1  2  2  2  2 31  1  1  1  6  1   1   1   6   1   1   1
      Predicted
      Actual   210 245 300 343 396 400 462 480 514 533 545 600 750 764 840 933 960
       0        0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
       3        0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
       6        0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
       7        0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
       14       0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
       16       0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
      ....
       Totals   1   1  20   1   1   1   1   1   1   1   1   8   1   1   1   1   1
      Predicted
      Actual   1154 1200 2000 2400 Error
        0         0    0    0    0 0.000
        3         0    0    0    0 0.000
        6         0    0    0    0 0.000
        7         0    0    0    0 0.000
      ....
      Mean-squared Error by tree:
       [1] 0.9529478 0.9337646 0.9157476 0.8985756 0.8818316 0.8654845 0.8497011
      [8] 0.8341974 0.8187867 0.8036760 0.7887764 0.7741757 0.7594546 0.7452223
      [15] 0.7309634 0.7168317

GLM

Generate Generalized Linear Models (GLM), which are used to develop linear models for exponential family distributions. Regularization can be applied. For information on the GLM algorithm, see Generalized Linear Model.

> prostate.hex = h2o.importFile(localH2O, path =
"https://raw.github.com/0xdata/h2o/master/smalldata/logreg/prostate.csv",
key = "prostate.hex")
       |===================================================| 100%
> h2o.glm(y = "CAPSULE", x = c("AGE","RACE","PSA","DCAPS"), data =
prostate.hex, family = "binomial", nfolds = 10, alpha = 0.5)
 |=====================================================================| 100%
      IP Address: 127.0.0.1
      Port      : 54321
      Parsed Data Key: prostate.hex

      GLM2 Model Key: GLMModel__a2fdb4e3fdd92e0325141cdbd1bd43e1

      Coefficients:
    AGE      RACE     DCAPS       PSA Intercept
       -0.01104  -0.63136   1.31888   0.04713  -1.10896

      Normalized Coefficients:
    AGE      RACE     DCAPS       PSA Intercept
       -0.07208  -0.19495   0.40972   0.94253  -0.33707

      Degrees of Freedom: 379 Total (i.e. Null);  375 Residual
      Null Deviance:     514.9
      Residual Deviance: 461.3  AIC: 471.3
      Deviance Explained: 0.10404
      AUC: 0.68875  Best Threshold: 0.328

      Confusion Matrix:
      Predicted
      Actual   false true Error
      false    127  100 0.441
      true      51  102 0.333
      Totals   178  202 0.397

      Cross-Validation Models:
       Nonzeros       AUC Deviance Explained
      Model 1         4 0.6532738        0.048419803
      Model 2         4 0.6316527       -0.006414532
      Model 3         4 0.7100840        0.087779178
      Model 4         4 0.8268698        0.243020554
      Model 5         4 0.6354167        0.153190735
      Model 6         4 0.6888889        0.041892118
      Model 7         4 0.7366071        0.164717509
      Model 8         4 0.6711310        0.004897310
      Model 9         4 0.7803571        0.200384622
      Model 10        4 0.7435897        0.114548543

> myX = setdiff(colnames(prostate.hex), c("ID", "DPROS", "DCAPS", "VOL"))
> h2o.glm(y = "VOL", x = myX, data = prostate.hex, family = "gaussian", nfolds = 5,
alpha = 0.1)
       |=========================================================| 100%

      IP Address: 127.0.0.1
      Port      : 54321
      Parsed Data Key: prostate.hex

      GLM2 Model Key: GLMModel__b8339af00fbe8951ba0871611c9e42eb

      Coefficients:
      CAPSULE       AGE      RACE       PSA   GLEASON Intercept
      -4.29014   0.29787   4.35557   0.04946  -0.51274  -4.35359

      Normalized Coefficients:
      CAPSULE       AGE      RACE       PSA   GLEASON Intercept
      -2.10678   1.94424   1.34488   0.98908  -0.55989  15.81292

      Degrees of Freedom: 379 Total (i.e. Null);  374 Residual
      Null Deviance:     126623.9
      Residual Deviance: 127402  AIC: 11059.1
      Deviance Explained: -0.00615

      Cross-Validation Models:
      Nonzeros      AIC Deviance Explained
      Model 1        5 685.6101        -0.02827868
      Model 2        5 660.3719        -0.15397511
      Model 3        5 658.0768         0.05826293
      Model 4        5 665.8665         0.05117173
      Model 5        5 683.6276         0.01333543

K-Means

Generate a K-means model, which is a clustering algorithm that allows users to characterize data. This algorithm does not rely on a dependent variable. For information on the K-Means algorithm, see K-Means.

> prosPath = system.file("extdata", "prostate.csv", package="h2o")
> prostate.hex = h2o.importFile(localH2O, path = prosPath)
|=========================================================| 100%
> prostate.km = h2o.kmeans(data = prostate.hex, centers = 10,
cols = c("AGE", "RACE", "VOL", "GLEASON"))
  |=========================================================| 100%
> print(prostate.km)
IP Address: 127.0.0.1
Port      : 54321
Parsed Data Key: prostate6.hex

K-Means Model Key: KMeans2_99fea55be4a22f741df74532d7844bb4

K-means clustering with 10 clusters of sizes 41, 27, 59, 17, 21, 47, 26, 61, 47, 34

      Cluster means:
      AGE     RACE         VOL  GLEASON
      1  69.73171 1.024390 37.99756098 6.512195
      2  54.48148 1.111111  0.32222222 6.518519
      3  62.59322 1.067797  0.19322034 5.966102
      .....
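
For comparison, the same kind of output (cluster sizes and cluster means) can be produced locally with stats::kmeans. This sketch uses the built-in iris data with 3 centers, both chosen for illustration; it is not the H2O implementation, and its results will not match the H2O output above.

```r
# Base-R sketch using stats::kmeans on the numeric columns of iris
set.seed(7)
iris.km <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

iris.km$size      # number of rows assigned to each cluster
iris.km$centers   # cluster means, one row per cluster
```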

Principal Components Analysis

Map a set of variables onto a subspace using linear transformations. Principal Components Analysis (PCA) is the first step in Principal Components Regression. For more information on PCA, see Principal Components Analysis.

> ausPath = system.file("extdata", "australia.csv", package="h2o")
> australia.hex = h2o.importFile(localH2O, path = ausPath)
  |=========================================================| 100%
> australia.pca = h2o.prcomp(data = australia.hex, standardize = TRUE)
  |=========================================================| 100%
> print(australia.pca)
IP Address: 127.0.0.1
 Port      : 54321
 Parsed Data Key: australia2.hex

 PCA Model Key: PCA_90d7162c6d4855392ba1272c2f314bec

 Standard deviations:
 1.750703 1.512142 1.031181 0.8283127 0.6083786 0.5481364 0.4181621 0.2314953
      ....

> summary(australia.pca)
Importance of components:
....
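
The standard deviations and the importance-of-components summary have direct counterparts in base R's stats::prcomp. The sketch below uses the built-in USArrests data, chosen for illustration; it shows what a standardized PCA reports, not H2O's implementation.

```r
# Base-R sketch: PCA on standardized columns of the built-in USArrests data
arrests.pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

arrests.pca$sdev       # standard deviation of each component
summary(arrests.pca)   # proportion of variance explained per component
```

With standardized inputs, the squared standard deviations sum to the number of columns, which is why they can be read as shares of total variance.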

Principal Components Regression

Map a set of variables to a new set of linearly independent variables. The new variables are linearly independent linear combinations of the original variables and lie in a subspace of lower dimension. This transformation is then applied ahead of a regression model, often improving results. For more information on PCA, see Principal Components Analysis.

> prostate.hex = h2o.importFile(localH2O, path =
  "https://raw.github.com/0xdata/h2o/master/smalldata/logreg/prostate.csv",
key = "prostate.hex")
    |=========================================================| 100%
> h2o.pcr(x = c("AGE","RACE","PSA","DCAPS"), y = "CAPSULE", data =
prostate.hex, family = "binomial",
nfolds = 10, alpha = 0.5, ncomp = 3)
 |==========================================================| 100%

      IP Address: 127.0.0.1
      Port      : 54321
      Parsed Data Key: PCAPredict_80069467adfe441c92282ac766f9de7e

      GLM2 Model Key: GLMModel__a1454a5b8a212d1069376356543a4887

      Coefficients:
    PC0       PC1       PC2 Intercept
      3.76219   1.26824  -1.35455  -0.36271
      ....
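
The two stages of principal components regression can be sketched in base R. This is an illustration under stated assumptions: it uses the built-in mtcars data and an ordinary linear regression rather than the binomial family shown above, and it does not reproduce how h2o.pcr works internally.

```r
# Base-R sketch of principal components regression on the built-in mtcars data
predictors <- mtcars[, c("wt", "hp", "disp", "qsec")]

# Stage 1: PCA on the standardized predictors
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)

# Stage 2: regress the response on the first ncomp component scores
ncomp <- 3
pcr.data <- data.frame(pca$x[, 1:ncomp], mpg = mtcars$mpg)
pcr.fit <- lm(mpg ~ ., data = pcr.data)

coef(pcr.fit)   # intercept plus one coefficient per retained component
```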

Obtaining Predictions

Predict

Apply an H2O model to a holdout set to obtain predictions based on model results. In the examples below, models are generated first, and then the predictions for that model are displayed.

> prostate.hex = h2o.importFile(localH2O, path =
  "https://raw.github.com/0xdata/h2o/master/smalldata/logreg/prostate.csv",
  key = "prostate.hex")
  |==========================================================| 100%
> prostate.glm = h2o.glm(y = "CAPSULE", x =
c("AGE","RACE","PSA","DCAPS"), data = prostate.hex,
family = "binomial", nfolds = 10, alpha = 0.5)
    |==========================================================| 100%

> prostate.fit = h2o.predict(object = prostate.glm, newdata = prostate.hex)
> (prostate.fit)
IP Address: 127.0.0.1
      Port      : 54321
      Parsed Data Key: GLM2Predict_8b6890653fa743be9eb3ab1668c5a6e9

predict        X0        X1
      1       0 0.7452267 0.2547732
      2       1 0.3969807 0.6030193
      3       1 0.4120950 0.5879050
      4       1 0.3726134 0.6273866
      5       1 0.6465137 0.3534863
      6       1 0.4331880 0.5668120

Other Useful Functions

Get Frame

For users who alternate between the web interface and the R API, or for multiple users accessing the same H2O instance, this function creates a reference object for a data frame sitting in H2O (assuming there’s a prostate.hex in the KV store).

> prostate.hex = h2o.getFrame(h2o = localH2O, key = "prostate.hex")

Get Model

For users who alternate between the web interface and the R API, this function creates a reference object for a model sitting in H2O (assuming there’s a GLMModel__ba724fe4f6d6d5b8b6370f776df94e47 model in the KV store).

> glm.model = h2o.getModel(h2o = localH2O, key = "GLMModel__ba724fe4f6d6d5b8b6370f776df94e47")
> glm.model

List all H2O Objects

Generate a list of all H2O objects generated during a work session, along with each object’s byte size.

> prostate.hex = h2o.importFile(localH2O, path = prosPath, key = "prostate.hex")
    |==========================================================| 100%
> prostate.split = h2o.splitFrame(prostate.hex, ratios = 0.8)
> prostate.train = prostate.split[[1]]
> prostate.train = h2o.assign(prostate.train, "prostate.train")
> h2o.ls(localH2O)
                                                   Key Bytesize
      1               GBM_8e4591a9b413407b983d73fbd9eb44cf    40617
      2               GBM_a3ae2edf5dfadbd9ba5dc2e9560c405d     1516

Remove an H2O object from the server where H2O is running

When you remove an H2O object from the server and an object in the R environment refers to it, we recommend also removing that object from the R environment.

 > h2o.ls(localH2O)
                Key Bytesize
1      Last.value.39      448
2      Last.value.42       73
3       prostate.hex     4874
4     prostate.train     4028
5 prostate_part0.hex     4028
6 prostate_part1.hex     1432

> h2o.rm(object = localH2O, keys = "prostate.train")
> h2o.ls(localH2O)
                Key Bytesize
1      Last.value.39      448
2      Last.value.42       73
3       prostate.hex     4874
4 prostate_part0.hex     4028
5 prostate_part1.hex     1432