Combining Columns from Two Datasets
The cbind function allows you to combine datasets by adding the columns from one dataset to another. When using cbind, the two datasets must have the same number of rows. In addition, if the datasets contain common column names, H2O appends 0 to the name of the joined column.
library(h2o)
h2o.init()
# Create two simple, two-column R data frames by inputting values,
# ensuring that both have a common column (in this case, "fruit").
left <- data.frame(fruit = c('apple','orange','banana','lemon','strawberry','blueberry'),
                   color = c('red','orange','yellow','yellow','red','blue'))
right <- data.frame(fruit = c('apple','orange','banana','lemon','strawberry','watermelon'),
                    citrus = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE))
# Create the H2O data frames from the inputted data.
l.hex <- as.h2o(left)
print(l.hex)
fruit color
1 apple red
2 orange orange
3 banana yellow
4 lemon yellow
5 strawberry red
6 blueberry blue
[6 rows x 2 columns]
r.hex <- as.h2o(right)
print(r.hex)
fruit citrus
1 apple FALSE
2 orange TRUE
3 banana FALSE
4 lemon TRUE
5 strawberry FALSE
6 watermelon FALSE
[6 rows x 2 columns]
# Combine the l.hex and r.hex datasets into a single dataset.
# The columns from r.hex are appended to the right side of the final dataset,
# and rows are paired by position. Because both datasets include a "fruit"
# column, H2O appends "0" to the name of the second "fruit" column.
# Note that this differs from h2o.merge, which combines rows from two datasets
# based on the values in their commonly named columns.
columns.hex <- h2o.cbind(l.hex, r.hex)
print(columns.hex)
fruit color fruit0 citrus
1 apple red apple FALSE
2 orange orange orange TRUE
3 banana yellow banana FALSE
4 lemon yellow lemon TRUE
5 strawberry red strawberry FALSE
6 blueberry blue watermelon FALSE
[6 rows x 4 columns]
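The two rules shown above (both datasets need the same number of rows, and a duplicated column name receives a 0 suffix) can be modeled in a few lines of plain Python. This is only an illustrative sketch and not H2O's implementation: the cbind and row_count helpers and the dict-of-lists table layout are hypothetical, and repeating the suffix until the name is unique is an assumption, since the example above only covers a single name collision.

```python
def row_count(table):
    """Return the number of rows in a table stored as {column_name: list_of_values}."""
    lengths = {len(values) for values in table.values()}
    if len(lengths) > 1:
        raise ValueError("all columns in a table must have the same length")
    return lengths.pop() if lengths else 0


def cbind(left, right):
    """Place the columns of `right` to the right of the columns of `left`."""
    # Rule 1: both tables must have the same number of rows.
    if row_count(left) != row_count(right):
        raise ValueError("cbind requires both tables to have the same number of rows")

    combined = {name: list(values) for name, values in left.items()}
    for name, values in right.items():
        new_name = name
        # Rule 2: a name that is already taken gets "0" appended.
        # Assumption of this sketch: repeat until the name is unique.
        while new_name in combined:
            new_name += "0"
        combined[new_name] = list(values)
    return combined


left = {
    "fruit": ["apple", "orange", "banana", "lemon", "strawberry", "blueberry"],
    "color": ["red", "orange", "yellow", "yellow", "red", "blue"],
}
right = {
    "fruit": ["apple", "orange", "banana", "lemon", "strawberry", "watermelon"],
    "citrus": [False, True, False, True, False, False],
}

columns = cbind(left, right)
print(list(columns))                              # column names after binding
print([columns[name][-1] for name in columns])    # values in the last row
```

Because rows are paired by position and not by key, the last row of the result places blueberry next to watermelon, just as in the H2O output above. A key-based join would be needed to match rows on the fruit column instead.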