Combining Rows from Two Datasets

Use the rbind function to append the rows of one dataset to another, producing a single, larger dataset. For example, you can combine a validation dataset with its training or testing dataset.

Note that when using rbind, the two datasets must have the same set of columns.
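Because mismatched columns are the usual reason a row bind fails, it can help to compare column names first. The following is a minimal sketch in plain Python, not part of the H2O API: the helper name column_mismatch and the sample column names are hypothetical, and it checks names only (not column types or order).

```python
def column_mismatch(names_a, names_b):
    """Return the column names found in only one of the two datasets.

    Hypothetical helper for illustration; it compares names as sets,
    so it does not look at column order or column types.
    """
    set_a, set_b = set(names_a), set(names_b)
    only_in_a = sorted(set_a - set_b)
    only_in_b = sorted(set_b - set_a)
    return only_in_a, only_in_b


# Made-up column names, for illustration only
train_cols = ["C1", "C2", "C3"]
test_cols = ["C1", "C2", "C4"]

only_train, only_test = column_mismatch(train_cols, test_cols)
print(only_train, only_test)  # ['C3'] ['C4']
```

Two empty lists mean both datasets share the same column names. With H2O frames in Python, the name lists would typically come from each frame's names attribute.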

> library(h2o)
> h2o.init()

# Import an existing training dataset
> ecg1Path <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/anomaly/ecg_discord_train.csv"
> ecg1.hex <- h2o.importFile(path=ecg1Path, destination_frame="ecg1.hex")
> print(dim(ecg1.hex))
[1] 20 210

# Import an existing testing dataset
> ecg2Path <- "http://h2o-public-test-data.s3.amazonaws.com/smalldata/anomaly/ecg_discord_test.csv"
> ecg2.hex <- h2o.importFile(path=ecg2Path, destination_frame="ecg2.hex")
> print(dim(ecg2.hex))
[1] 23 210

# Combine the two datasets into a single, larger dataset
> ecgCombine.hex <- h2o.rbind(ecg1.hex, ecg2.hex)
> print(dim(ecgCombine.hex))
[1] 43 210
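
The same steps can be sketched in Python. This assumes the h2o Python package is installed, a local H2O cluster can be started, and the two URLs above are reachable. The function name combine_ecg_frames is hypothetical; the h2o import is placed inside the function only so that the snippet can be loaded without a running cluster.

```python
TRAIN_URL = "http://h2o-public-test-data.s3.amazonaws.com/smalldata/anomaly/ecg_discord_train.csv"
TEST_URL = "http://h2o-public-test-data.s3.amazonaws.com/smalldata/anomaly/ecg_discord_test.csv"


def combine_ecg_frames(train_url=TRAIN_URL, test_url=TEST_URL):
    """Import both ECG datasets and return them combined row-wise."""
    # Imported here (an illustrative choice) so that defining this
    # function does not require h2o to be installed.
    import h2o

    # Connect to a running H2O cluster, or start a local one
    h2o.init()

    # Import the existing training and testing datasets
    ecg1 = h2o.import_file(path=train_url)
    ecg2 = h2o.import_file(path=test_url)

    # Append the rows of the testing frame to the training frame
    return ecg1.rbind(ecg2)
```

Calling combine_ecg_frames() and printing the dim attribute of the result should report the same dimensions as the combined frame in the R session above.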