Trains a Support Vector Machine model on an H2O dataset

Alpha version. Supports only binomial classification problems.

h2o.psvm(
  x,
  y,
  training_frame,
  model_id = NULL,
  validation_frame = NULL,
  ignore_const_cols = TRUE,
  hyper_param = 1,
  kernel_type = c("gaussian"),
  gamma = -1,
  rank_ratio = -1,
  positive_weight = 1,
  negative_weight = 1,
  disable_training_metrics = TRUE,
  sv_threshold = 1e-04,
  fact_threshold = 1e-05,
  feasible_threshold = 0.001,
  surrogate_gap_threshold = 0.001,
  mu_factor = 10,
  max_iterations = 200,
  seed = -1
)

Arguments

x	(Optional) A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except y are used.
y	The name or column index of the response variable in the data. The response must be either a binary categorical/factor variable or a numeric variable with values -1/1 (for compatibility with SVMlight format).
training_frame	Id of the training data frame.
model_id	Destination id for this model; auto-generated if not specified.
validation_frame	Id of the validation data frame.
ignore_const_cols	`Logical`. Ignore constant columns. Defaults to TRUE.
hyper_param	Penalty parameter C of the error term. Defaults to 1.
kernel_type	Type of kernel to use. Must be one of: "gaussian". Defaults to gaussian.
gamma	Coefficient of the kernel (currently RBF gamma for the gaussian kernel; -1 means 1/#features). Defaults to -1.
rank_ratio	Desired rank of the ICF matrix, expressed as a ratio of the number of input rows (-1 means use sqrt(#rows)). Defaults to -1.
positive_weight	Weight of the positive (+1) class of observations. Defaults to 1.
negative_weight	Weight of the negative (-1) class of observations. Defaults to 1.
disable_training_metrics	`Logical`. Disable calculating training metrics (expensive on large datasets). Defaults to TRUE.
sv_threshold	Threshold for accepting a candidate observation into the set of support vectors. Defaults to 0.0001.
fact_threshold	Convergence threshold of the Incomplete Cholesky Factorization (ICF). Defaults to 1e-05.
feasible_threshold	Convergence threshold for primal-dual residuals in the IPM iteration. Defaults to 0.001.
surrogate_gap_threshold	Feasibility criterion of the surrogate duality gap (eta). Defaults to 0.001.
mu_factor	Increasing factor mu. Defaults to 10.
max_iterations	Maximum number of iterations of the algorithm. Defaults to 200.
seed	Seed for random numbers (affects the parts of the algorithm that are stochastic, which may or may not be enabled by default). Defaults to -1 (time-based random number).
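
The -1 sentinels for gamma and rank_ratio can be read as follows. This is a minimal sketch in plain R, not H2O's internal code: the helper names `effective_gamma` and `effective_rank` are hypothetical, the row and feature counts are made-up illustration values, and any rounding H2O applies to the rank is not modelled here.

```r
# Hypothetical helpers (not part of the h2o package) that mirror the
# documented meaning of the -1 defaults.

# gamma = -1 means 1 / (number of features); otherwise use the value as given.
effective_gamma <- function(gamma, n_features) {
  if (gamma == -1) 1 / n_features else gamma
}

# rank_ratio = -1 means sqrt(number of rows); otherwise the rank is the
# given ratio of the number of input rows.
effective_rank <- function(rank_ratio, n_rows) {
  if (rank_ratio == -1) sqrt(n_rows) else rank_ratio * n_rows
}

# Illustration values only (assumed, not taken from any real dataset).
n_rows <- 10000
n_features <- 50

effective_gamma(-1, n_features)    # 1 / 50 = 0.02
effective_gamma(0.01, n_features)  # explicit value is kept: 0.01
effective_rank(-1, n_rows)         # sqrt(10000) = 100
effective_rank(0.1, n_rows)        # 0.1 * 10000 = 1000
```

A smaller rank_ratio yields a lower-rank ICF approximation of the kernel matrix, so it is the argument to adjust first when trading accuracy against cost on large frames.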

Examples

# NOT RUN {
library(h2o)
h2o.init()

# Import the splice dataset
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/splice/splice.svm"
splice <- h2o.importFile(f)

# Train the Support Vector Machine model
svm_model <- h2o.psvm(gamma = 0.01, rank_ratio = 0.1,
                      y = "C1", training_frame = splice,
                      disable_training_metrics = FALSE)
# }
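
A held-out frame can also be passed through validation_frame. The following is a sketch only, assuming a running H2O cluster and network access to the same splice dataset; the 80/20 split ratio, the seed values, and the variable names are arbitrary choices for illustration, not recommendations from the package authors.

```r
library(h2o)
h2o.init()

# Import the splice dataset (same file as in the example above)
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/splice/splice.svm"
splice <- h2o.importFile(f)

# Split into training and validation frames (80/20 is an arbitrary choice)
parts <- h2o.splitFrame(splice, ratios = 0.8, seed = 42)
train <- parts[[1]]
valid <- parts[[2]]

# Train with a validation frame; fix the seed for repeatability
svm_model <- h2o.psvm(gamma = 0.01, rank_ratio = 0.1,
                      y = "C1", training_frame = train,
                      validation_frame = valid,
                      seed = 1234)

# Score the held-out rows and compute metrics on them
pred <- h2o.predict(svm_model, valid)
perf <- h2o.performance(svm_model, newdata = valid)
print(perf)
```

Training metrics stay disabled here (the default), so only the validation frame is scored.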
