Trains a Cox Proportional Hazards (CoxPH) model on an H2O dataset.

h2o.coxph(
  x,
  event_column,
  training_frame,
  model_id = NULL,
  start_column = NULL,
  stop_column = NULL,
  weights_column = NULL,
  offset_column = NULL,
  stratify_by = NULL,
  ties = c("efron", "breslow"),
  init = 0,
  lre_min = 9,
  max_iterations = 20,
  interactions = NULL,
  interaction_pairs = NULL,
  interactions_only = NULL,
  use_all_factor_levels = FALSE,
  export_checkpoints_dir = NULL,
  single_node_mode = FALSE
)

Arguments

x

(Optional) A vector containing the names or indices of the predictor variables to use in building the model. If x is missing, then all columns except event_column, start_column and stop_column are used.

event_column

The name of the binary data column in the training frame that indicates whether an event occurred.

training_frame

Id of the training data frame.

model_id

Destination id for this model; auto-generated if not specified.

start_column

Name of the start time column.

stop_column

Name of the stop time column.

weights_column

Column with observation weights. Giving an observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. A weight is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more because of the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame contains zero at that row, which is incorrect. To get accurate predictions, remove all rows with weight == 0.
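The equivalence between weights and repeated rows can be illustrated in base R, without H2O. This is only a sketch of the stated semantics for integer weights; the toy data frame and the column name w are hypothetical and not part of the package.

```r
# Hypothetical toy data; `w` plays the role of weights_column.
df <- data.frame(time = c(5, 8, 12), event = c(1, 0, 1), w = c(2, 0, 1))

# A weight of 0 excludes the row, so drop zero-weight rows first.
kept <- df[df$w > 0, ]

# An integer weight of k corresponds to repeating the row k times.
expanded <- kept[rep(seq_len(nrow(kept)), times = kept$w), c("time", "event")]
rownames(expanded) <- NULL

print(expanded)
```

Passing the weight column keeps the frame at its original size, whereas the expanded frame grows with the weights; non-integer weights have no such row-repetition counterpart.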

offset_column

Offset column. This will be added to the combination of columns before applying the link function.

stratify_by

List of columns to use for stratification.

ties

Method for handling ties. Must be one of: "efron", "breslow". Defaults to "efron".

init

Coefficient starting value. Defaults to 0.

lre_min

Minimum log-relative error. Defaults to 9.
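As a rough illustration of what a log-relative error threshold means, the sketch below assumes the common definition LRE = -log10(|old - new| / |new|) applied to successive log-likelihood values. This definition, the helper log_relative_error, and the numbers are assumptions made for illustration; the page does not state how H2O computes the quantity.

```r
# Assumed definition of log-relative error between two successive
# log-likelihood values (not confirmed by this page).
log_relative_error <- function(old_loglik, new_loglik) {
  -log10(abs((old_loglik - new_loglik) / new_loglik))
}

# Hypothetical log-likelihoods from two consecutive iterations.
lre <- log_relative_error(-298.1214, -298.1213)

# Under this assumption, iteration would stop once lre reaches lre_min.
lre_min <- 9
converged <- lre >= lre_min
print(converged)
```

Under this reading, a larger lre_min demands that successive iterations agree to more significant digits before training stops.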

max_iterations

Maximum number of iterations. Defaults to 20.

interactions

A list of predictor column indices to interact. All pairwise combinations will be computed for the list.
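To see which pairs "all pairwise combinations" implies, the following base R sketch enumerates them for three columns. It only illustrates the combinatorics and does not show how H2O builds the interaction terms; the column names are hypothetical.

```r
# Hypothetical predictor columns to interact.
interaction_cols <- c("age", "year", "surgery")

# Every unordered pair of the listed columns.
pairs <- combn(interaction_cols, 2, simplify = FALSE)

# Label each pair for display.
pair_labels <- vapply(pairs, paste, character(1), collapse = ":")
print(pair_labels)
```

A list of n columns yields n * (n - 1) / 2 pairs, so three columns produce three interaction terms.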

interaction_pairs

A list of pairwise (first order) column interactions.

interactions_only

A list of columns that should be used only to create interactions and should not themselves participate in model training.

use_all_factor_levels

Logical. (Internal. For development only!) Indicates whether to use all factor levels. Defaults to FALSE.

export_checkpoints_dir

Automatically export generated models to this directory.

single_node_mode

Logical. Run on a single node to reduce the effect of network overhead (for smaller datasets). Defaults to FALSE.

Examples

# NOT RUN {
library(h2o)
h2o.init()

# Import the heart dataset
f <- "https://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv"
heart <- h2o.importFile(f)

# Set the predictor and the event (response) column
predictor <- "age"
response <- "event"

# Train a Cox Proportional Hazards model
heart_coxph <- h2o.coxph(x = predictor, training_frame = heart,
                         event_column = response,
                         start_column = "start",
                         stop_column = "stop")
# }