stratify_by

  • Available in: CoxPH
  • Hyperparameter: no

Description

In a CoxPH model, stratification is useful as a diagnostic for checking the proportional hazards assumption, as it allows for as many different hazard functions as there are strata. For example, when attempting to predict X, you can include a secondary categorical predictor, Z, that can be adjusted for when making inferences about X’s relationship to the time-to-event endpoint.

Use the `stratify_by parameter to specify a list of columns to use for stratification when building a CoxPH model.

Example

library(h2o)
h2o.init()

# import the heart dataset
heart <- h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/coxph_test/heart.csv")

# set the predictor and response column
x <- "age"
y <- "event"

# set the start and stop columns
start <- "start"
stop <- "stop"

# convert the age column to a factor
heart["age"] <- as.factor(heart["age"])

# train your model
coxph.h2o <- h2o.coxph(x=c("year", x),
                       event_column=y,
                       start_column=start,
                       stop_column=stop,
                       stratify_by=x,
                       training_frame=heart)

# view the model details
coxph.h2o
Model Details:
==============

H2OCoxPHModel: coxph
Model ID:  CoxPH_model_R_1570209287520_5
Call:
Surv(start, stop, event) ~ year + strata(age)

        coef    exp(coef) se(coef)  z      p
year    4.734   113.717   8973.421  0.001  1

Likelihood ratio test=1.39  on 1 df, p=0.239
n= 172, number of events= 75