Quantiles

This function retrieves and displays quantiles for H2O-3 parsed data.

Note

The quantile results in Flow are computed lazily on-demand and cached. It’s a fast approximation (max - min / 1024) that’s very accurate for most use cases. If the distribution is skewed, the quantile results may not be as accurate as the results obtained using h2o.quantile in R or H2OFrame.quantile in Python.

Quantile parameters

  • combine_method: Specify the method for combining quantiles for even sample sizes. Abbreviations for average, low, and high are acceptable (avg, lo, hi). The default is to do linear interpolation (e.g. if method is “lo”, then it will take the lo value of the quantile). Available methods include:

    • average

    • high

    • interpolate

    • low

  • h2oFrame: Specify the H2OFrame.

  • prob: Specify a list of probabilities with values in the range [0,1]. By default, the following probabilities are returned:

    • Python: 0.01, 0.1, 0.25, 0.333, 0.5, 0.667, 0.75, 0.9, 0.99

    • R: 0.001, 0.01, 0.1, 0.25, 0.33, 0.5, 0.667, 0.75, 0.9, 0.99, 0.999

  • weights_column: A string name identifying the obsevation weights column in this frame or a single-column, separate H2OFrame of observation weights. If this option isn’t specified, then all rows are assumed to have equal importance.

Examples

# Import the prostate dataset:
prostate = h2o.import_file("http://s3.amazonaws.com/h2o-public-test-data/smalldata/prostate/prostate.csv")

# Request quantiles for the parsed dataset
prostate.quantile()

# Request quantiles for the AGE column:
prostate["AGE"].quantile()

# Request quantiles for probabilities 0.001 and 0.01 for the AGE column
prostate["AGE"].quantile(prob=[0.001, 0.01])
# Import the prostate dataset:
prostate <- h2o.importFile("http://s3.amazonaws.com/h2o-public-test-data/smalldata/prostate/prostate.csv")

# Request quantiles for the parsed dataset
quantile(prostate)

# Request quantiles for the AGE column:
quantile(prostate[, 3])

# Request quantiles for probabilities 0.001 and 0.01 for the AGE column
quantile(prostate[, 3], prob=c(0.001, 0.01))