Drops duplicated rows.

Drops duplicated rows across specified columns.

h2o.drop_duplicates(frame, columns, keep = "first")

Arguments

frame	An H2OFrame object to drop duplicates on.
columns	Columns to compare during the duplicate detection process.
keep	Which row to keep from each set of duplicates. The "first" value (default) keeps the first row and deletes the rest. The "last" value keeps the last row and deletes the rest.

Examples

# NOT RUN {
library(h2o)
h2o.init()

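# Upload the built-in iris data set to the H2O cluster
data <- as.h2o(iris)
# Rows count as duplicates when they match on both Species and Sepal.Length;
# keep the first row of each such group
deduplicated_data <- h2o.drop_duplicates(data, c("Species", "Sepal.Length"), keep = "first")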
# }
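
The difference between keep = "first" and keep = "last" can be illustrated without an H2O cluster. The following is a minimal base-R sketch of the same idea on an ordinary data.frame. It is an analogy only: the data frame df and its columns are made up for illustration, and nothing here is meant to describe how h2o.drop_duplicates works internally.

```r
# Base-R analogy only: this does not call H2O and makes no claim about
# the internals of h2o.drop_duplicates.
# Hypothetical data: rows 1, 2 and 5 match on (group, value), as do rows 3 and 4.
df <- data.frame(
  id    = 1:5,
  group = c("a", "a", "b", "b", "a"),
  value = c(10, 10, 20, 20, 10),
  stringsAsFactors = FALSE
)

# Columns compared during duplicate detection
cols <- c("group", "value")

# Analogue of keep = "first": retain the first row of each duplicate group
keep_first <- df[!duplicated(df[, cols]), ]

# Analogue of keep = "last": retain the last row of each duplicate group
keep_last <- df[!duplicated(df[, cols], fromLast = TRUE), ]

print(keep_first$id)
print(keep_last$id)
```

In both cases one row survives per distinct combination of the compared columns. Only the choice of which row survives changes, which matters when the columns not used for comparison (here, id) differ between the duplicates.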
