H2O 3.38.0.4

REST API Reference

GET /3/About

Return information about this H2O cluster.

InputAboutV3
OutputAboutV3

GET /3/Capabilities

List of all registered capabilities

InputCapabilitiesV3
OutputCapabilitiesV3

GET /3/Capabilities/API

List of all registered REST API capabilities

InputCapabilitiesV3
OutputCapabilitiesV3

GET /3/Capabilities/Core

List registered core capabilities

InputCapabilitiesV3
OutputCapabilitiesV3

GET /3/Cloud

Determine the status of the nodes in the H2O cloud.

InputCloudV3
OutputCloudV3
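
A client might reduce the response of this endpoint to a short health summary. The sketch below is illustrative only: the keys `cloud_size`, `cloud_healthy`, `nodes` and the per-node `healthy` flag are assumed field names that this reference does not list, and the sample payload is hand-written rather than captured from a real cluster.

```python
import json

def summarize_cloud(payload):
    """Summarize a decoded GET /3/Cloud response.

    The keys "cloud_size", "cloud_healthy" and "nodes" (each node with a
    "healthy" flag) are assumed field names, not taken from this reference.
    """
    nodes = payload.get("nodes", [])
    # Collect the positions of nodes that do not report themselves as healthy.
    unhealthy = [i for i, node in enumerate(nodes) if not node.get("healthy", False)]
    return {
        "size": payload.get("cloud_size", len(nodes)),
        "healthy": bool(payload.get("cloud_healthy", False)) and not unhealthy,
        "unhealthy_node_indexes": unhealthy,
    }

# Hypothetical payload, hand-written for illustration only.
sample = json.loads(
    '{"cloud_size": 2, "cloud_healthy": true,'
    ' "nodes": [{"healthy": true}, {"healthy": false}]}'
)
summary = summarize_cloud(sample)
print(summary)
```

The summary treats the cluster as healthy only when the cluster-level flag is set and no individual node reports a problem, which is a deliberately conservative choice for this sketch.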

HEAD /3/Cloud

Determine the status of the nodes in the H2O cloud.

InputCloudV3
OutputCloudV3

POST /3/CloudLock

Lock the cloud.

InputCloudLockV3
OutputCloudLockV3

GET /3/ComputeGram

Get weighted gram matrix

InputGramV3
OutputGramV3

POST /3/CreateFrame

Create a synthetic H2O Frame with random data. You can specify the number of rows/columns, as well as column types: integer, real, boolean, time, string, categorical. The frame may also have a dedicated “response” column, and some of the entries in the dataset may be created as missing.

InputCreateFrameV3
OutputJobV3
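
As a rough illustration, a request for this endpoint could be assembled as shown below. This is a sketch under assumptions: the base URL `http://localhost:54321`, the form-encoded body, and the parameter names `rows`, `cols` and `seed` are not stated in this reference and should be checked against the CreateFrameV3 schema. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL; adjust to wherever the H2O node is listening.
BASE_URL = "http://localhost:54321"

def build_create_frame_request(rows, cols, seed=None):
    """Build (but do not send) a POST /3/CreateFrame request.

    The field names `rows`, `cols` and `seed` are assumptions; confirm
    them against the CreateFrameV3 schema before use.
    """
    params = {"rows": rows, "cols": cols}
    if seed is not None:
        params["seed"] = seed
    body = urlencode(params).encode("utf-8")
    return Request(
        BASE_URL + "/3/CreateFrame",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

req = build_create_frame_request(rows=1000, cols=10, seed=42)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it. Since the output schema listed for this endpoint is a job, the response would then be followed up through the Jobs endpoints rather than read as a finished Frame.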

DELETE /3/DKV

Remove all keys from the H2O distributed K/V store.

InputRemoveAllV3
OutputRemoveAllV3

DELETE /3/DKV/{key}

Remove an arbitrary key from the H2O distributed K/V store.

InputRemoveV3
OutputRemoveV3

POST /3/DataInfoFrame

Test only

InputDataInfoFrameV3
OutputDataInfoFrameV3

POST /3/DecryptionSetup

Install a decryption tool for parsing of encrypted data.

InputDecryptionSetupV3
OutputDecryptionSetupV3

GET /3/DownloadDataset

Download dataset as a CSV.

InputDownloadDataV3
OutputDownloadDataV3

GET /3/DownloadDataset.bin

Download dataset as a CSV.

InputDownloadDataV3
OutputDownloadDataV3

POST /3/FeatureInteraction

Fetch feature interaction data

InputFeatureInteractionV3
OutputFeatureInteractionV3

GET /3/Find

Find a value within a Frame.

InputFindV3
OutputFindV3

GET /3/FrameChunks/{frame_id}

Return information about chunks for a given frame.

InputFrameChunksV3
OutputFrameChunksV3

GET /3/Frames

Return all Frames in the H2O distributed K/V store.

InputFramesListV3
OutputFramesListV3

DELETE /3/Frames

Delete all Frames from the H2O distributed K/V store.

InputFramesV3
OutputFramesV3

POST /3/Frames/load

Load a frame from data on given path.

InputFrameLoadV3
OutputFrameLoadV3

GET /3/Frames/{frame_id}

Return the specified Frame.

InputFramesV3
OutputFramesV3

DELETE /3/Frames/{frame_id}

Delete the specified Frame from the H2O distributed K/V store.

InputFramesV3
OutputFramesV3

GET /3/Frames/{frame_id}/columns

Return all the columns from a Frame.

InputFramesV3
OutputFramesV3

GET /3/Frames/{frame_id}/columns/{column}

Return the specified column from a Frame.

InputFramesV3
OutputFramesV3

GET /3/Frames/{frame_id}/columns/{column}/domain

Return the domains for the specified categorical column (“null” if the column is not a categorical).

InputFramesV3
OutputFramesV3

GET /3/Frames/{frame_id}/columns/{column}/summary

Return the summary metrics for a column, e.g. min, max, mean, sigma, percentiles, etc.

InputFramesV3
OutputFramesV3
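
Paths such as the one above contain `{frame_id}` and `{column}` placeholders that a client has to fill in. A minimal sketch of one way to do that follows. The helper name `fill_path` and the frame key `my frame.hex` are hypothetical, and whether the server accepts percent-encoded slashes inside a key is not covered by this reference.

```python
from urllib.parse import quote

def fill_path(template, **params):
    """Substitute {name} placeholders in an endpoint path template.

    Each value is percent-encoded (including "/") so that keys with
    special characters stay inside a single path segment.
    """
    path = template
    for name, value in params.items():
        placeholder = "{" + name + "}"
        if placeholder not in path:
            raise KeyError(f"template has no placeholder {placeholder}")
        path = path.replace(placeholder, quote(str(value), safe=""))
    # Encoded values never contain "{", so any brace left is an unfilled slot.
    if "{" in path:
        raise ValueError(f"unfilled placeholder in {path}")
    return path

summary_path = fill_path(
    "/3/Frames/{frame_id}/columns/{column}/summary",
    frame_id="my frame.hex",
    column="age",
)
print(summary_path)
```

Failing loudly on unknown or unfilled placeholders keeps a typo in a parameter name from silently producing a request to the wrong path.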

POST /3/Frames/{frame_id}/export

Export a Frame to the given path with optional overwrite.

InputFramesV3
OutputFramesV3

GET /3/Frames/{frame_id}/export/{path}/overwrite/{force}

[DEPRECATED] Export a Frame to the given path with optional overwrite.

InputFramesV3
OutputFramesV3

GET /3/Frames/{frame_id}/light

Return basic info about a Frame to fill the client Rapids expression cache.

InputFramesV3
OutputFramesV3

POST /3/Frames/{frame_id}/save

Save frame data to the given path.

InputFrameSaveV3
OutputFrameSaveV3

GET /3/Frames/{frame_id}/summary

Return a Frame, including the histograms, after forcing computation of rollups.

InputFramesV3
OutputFramesV3

POST /3/FriedmansPopescusH

Fetch Friedman and Popescu's H statistic.

InputFriedmanPopescusHV3
OutputFriedmanPopescusHV3

POST /3/GarbageCollect

Explicitly call System.gc().

InputGarbageCollectV3
OutputGarbageCollectV3

GET /3/GetGLMRegPath

Get full regularization path

InputGLMRegularizationPathV3
OutputGLMRegularizationPathV3

POST /3/Grid.bin/import

Import previously saved grid model

InputGridImportV3
OutputGridKeyV3

POST /3/Grid.bin/{grid_id}/export

Export a Grid and its models.

InputGridExportV3
OutputGridKeyV3

GET /3/ImportFiles

[DEPRECATED] Import raw data files into a single-column H2O Frame.

InputImportFilesV3
OutputImportFilesV3

POST /3/ImportFiles

Import raw data files into a single-column H2O Frame.

InputImportFilesV3
OutputImportFilesV3

POST /3/ImportFilesMulti

Import raw data files from multiple directories (or different data sources) into a single-column H2O Frame.

InputImportFilesMultiV3
OutputImportFilesMultiV3

POST /3/ImportHiveTable

Import Hive table into an H2O Frame.

InputImportHiveTableV3
OutputJobV3

GET /3/InitID

Issue a new session ID.

InputInitIDV3
OutputInitIDV3

DELETE /3/InitID

End a session.

InputInitIDV3
OutputInitIDV3

POST /3/Interaction

Create interactions between categorical columns.

InputInteractionV3
OutputJobV3

GET /3/JStack

Report stack traces for all threads on all nodes.

InputJStackV3
OutputJStackV3

GET /3/Jobs

Get a list of all the H2O Jobs (long-running actions).

InputJobsV3
OutputJobsV3

GET /3/Jobs/{job_id}

Get the status of the given H2O Job (long-running action).

InputJobsV3
OutputJobsV3
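
Because jobs are long-running, a client typically polls this endpoint until the job finishes. The loop below is a sketch under assumptions: the status strings `DONE`, `FAILED` and `CANCELLED` and the `status` key are assumed response details to verify against the JobV3 schema, and `fetch_job` is a hypothetical caller-supplied function that performs the actual GET and returns the decoded job.

```python
import time

# Status strings assumed for illustration; verify against the JobV3 schema.
TERMINAL_STATUSES = {"DONE", "FAILED", "CANCELLED"}

def wait_for_job(fetch_job, job_id, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the job reaches a terminal status or polls run out.

    `fetch_job(job_id)` is a caller-supplied function that performs
    GET /3/Jobs/{job_id} and returns the decoded job as a dict with a
    "status" key (an assumed response shape).
    """
    for attempt in range(max_polls):
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Do not sleep after the final poll; just fall through to the timeout.
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")

# Usage with a stand-in fetcher that finishes on the third poll.
_responses = iter([{"status": "RUNNING"}, {"status": "RUNNING"}, {"status": "DONE"}])
result = wait_for_job(lambda _id: next(_responses), "job-1", sleep=lambda s: None)
print(result["status"])
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any particular HTTP client and lets it be exercised without a running cluster.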

POST /3/Jobs/{job_id}/cancel

Cancel a running job.

InputJobsV3
OutputJobsV3

GET /3/KillMinus3

Kill minus 3 on this node

InputKillMinus3V3
OutputKillMinus3V3

POST /3/LogAndEcho

Save a message to the H2O logfile.

InputLogAndEchoV3
OutputLogAndEchoV3

GET /3/Logs/nodes/{nodeidx}/files/{name}

Get named log file for a node.

InputLogsV3
OutputLogsV3

POST /3/MakeGLMModel

Make a new GLM model based on an existing one

InputMakeGLMModelV3
OutputGLMModelV3

GET /3/Metadata/endpoints

Return the list of (almost) all REST API endpoints.

InputMetadataV3
OutputMetadataV3

GET /3/Metadata/endpoints/{path}

Return the REST API endpoint metadata, including documentation, for the endpoint specified by path or index.

InputMetadataV3
OutputMetadataV3

GET /3/Metadata/schemaclasses/{classname}

Return the REST API schema metadata for specified schema class.

InputMetadataV3
OutputMetadataV3

GET /3/Metadata/schemas

Return list of all REST API schemas.

InputMetadataV3
OutputMetadataV3

GET /3/Metadata/schemas/{schemaname}

Return the REST API schema metadata for specified schema.

InputMetadataV3
OutputMetadataV3

POST /3/MissingInserter

Insert missing values.

InputMissingInserterV3
OutputJobV3

GET /3/ModelBuilders

Return the Model Builder metadata for all available algorithms.

InputModelBuildersV3
OutputModelBuildersV3

POST /3/ModelBuilders/anovaglm

Train an ANOVAGLM model.

InputANOVAGLMV3
OutputANOVAGLMV3

POST /3/ModelBuilders/anovaglm/parameters

Validate a set of ANOVAGLM model builder parameters.

InputANOVAGLMV3
OutputANOVAGLMV3

POST /3/ModelBuilders/coxph

Train a CoxPH model.

InputCoxPHV3
OutputCoxPHV3

POST /3/ModelBuilders/coxph/parameters

Validate a set of CoxPH model builder parameters.

InputCoxPHV3
OutputCoxPHV3

POST /3/ModelBuilders/deeplearning

Train a DeepLearning model.

InputDeepLearningV3
OutputDeepLearningV3

POST /3/ModelBuilders/deeplearning/parameters

Validate a set of DeepLearning model builder parameters.

InputDeepLearningV3
OutputDeepLearningV3

POST /3/ModelBuilders/drf

Train a DRF model.

InputDRFV3
OutputDRFV3

POST /3/ModelBuilders/drf/parameters

Validate a set of DRF model builder parameters.

InputDRFV3
OutputDRFV3

POST /3/ModelBuilders/extendedisolationforest

Train an ExtendedIsolationForest model.

InputExtendedIsolationForestV3
OutputExtendedIsolationForestV3

POST /3/ModelBuilders/extendedisolationforest/parameters

Validate a set of ExtendedIsolationForest model builder parameters.

InputExtendedIsolationForestV3
OutputExtendedIsolationForestV3

POST /3/ModelBuilders/gam

Train a GAM model.

InputGAMV3
OutputGAMV3

POST /3/ModelBuilders/gam/parameters

Validate a set of GAM model builder parameters.

InputGAMV3
OutputGAMV3

POST /3/ModelBuilders/gbm

Train a GBM model.

InputGBMV3
OutputGBMV3

POST /3/ModelBuilders/gbm/parameters

Validate a set of GBM model builder parameters.

InputGBMV3
OutputGBMV3

POST /3/ModelBuilders/generic

Train a Generic model.

InputGenericV3
OutputGenericV3

POST /3/ModelBuilders/generic/parameters

Validate a set of Generic model builder parameters.

InputGenericV3
OutputGenericV3

POST /3/ModelBuilders/glm

Train a GLM model.

InputGLMV3
OutputGLMV3

POST /3/ModelBuilders/glm/parameters

Validate a set of GLM model builder parameters.

InputGLMV3
OutputGLMV3

POST /3/ModelBuilders/glrm

Train a GLRM model.

InputGLRMV3
OutputGLRMV3

POST /3/ModelBuilders/glrm/parameters

Validate a set of GLRM model builder parameters.

InputGLRMV3
OutputGLRMV3

POST /3/ModelBuilders/infogram

Train an Infogram model.

InputInfogramV3
OutputInfogramV3

POST /3/ModelBuilders/infogram/parameters

Validate a set of Infogram model builder parameters.

InputInfogramV3
OutputInfogramV3

POST /3/ModelBuilders/isolationforest

Train an IsolationForest model.

InputIsolationForestV3
OutputIsolationForestV3

POST /3/ModelBuilders/isolationforest/parameters

Validate a set of IsolationForest model builder parameters.

InputIsolationForestV3
OutputIsolationForestV3

POST /3/ModelBuilders/isotonicregression

Train an IsotonicRegression model.

InputIsotonicRegressionV3
OutputIsotonicRegressionV3

POST /3/ModelBuilders/isotonicregression/parameters

Validate a set of IsotonicRegression model builder parameters.

InputIsotonicRegressionV3
OutputIsotonicRegressionV3

POST /3/ModelBuilders/kmeans

Train a KMeans model.

InputKMeansV3
OutputKMeansV3

POST /3/ModelBuilders/kmeans/parameters

Validate a set of KMeans model builder parameters.

InputKMeansV3
OutputKMeansV3

POST /3/ModelBuilders/modelselection

Train a ModelSelection model.

InputModelSelectionV3
OutputModelSelectionV3

POST /3/ModelBuilders/modelselection/parameters

Validate a set of ModelSelection model builder parameters.

InputModelSelectionV3
OutputModelSelectionV3

POST /3/ModelBuilders/naivebayes

Train a NaiveBayes model.

InputNaiveBayesV3
OutputNaiveBayesV3

POST /3/ModelBuilders/naivebayes/parameters

Validate a set of NaiveBayes model builder parameters.

InputNaiveBayesV3
OutputNaiveBayesV3

POST /3/ModelBuilders/pca

Train a PCA model.

InputPCAV3
OutputPCAV3

POST /3/ModelBuilders/pca/parameters

Validate a set of PCA model builder parameters.

InputPCAV3
OutputPCAV3

POST /3/ModelBuilders/psvm

Train a PSVM model.

InputPSVMV3
OutputPSVMV3

POST /3/ModelBuilders/psvm/parameters

Validate a set of PSVM model builder parameters.

InputPSVMV3
OutputPSVMV3

POST /3/ModelBuilders/rulefit

Train a RuleFit model.

InputRuleFitV3
OutputRuleFitV3

POST /3/ModelBuilders/rulefit/parameters

Validate a set of RuleFit model builder parameters.

InputRuleFitV3
OutputRuleFitV3

POST /3/ModelBuilders/targetencoder

Train a TargetEncoder model.

InputTargetEncoderV3
OutputTargetEncoderV3

POST /3/ModelBuilders/targetencoder/parameters

Validate a set of TargetEncoder model builder parameters.

InputTargetEncoderV3
OutputTargetEncoderV3

POST /3/ModelBuilders/upliftdrf

Train an UpliftDRF model.

InputUpliftDRFV3
OutputUpliftDRFV3

POST /3/ModelBuilders/upliftdrf/parameters

Validate a set of UpliftDRF model builder parameters.

InputUpliftDRFV3
OutputUpliftDRFV3

POST /3/ModelBuilders/word2vec

Train a Word2Vec model.

InputWord2VecV3
OutputWord2VecV3

POST /3/ModelBuilders/word2vec/parameters

Validate a set of Word2Vec model builder parameters.

InputWord2VecV3
OutputWord2VecV3

POST /3/ModelBuilders/xgboost

Train an XGBoost model.

InputXGBoostV3
OutputXGBoostV3

POST /3/ModelBuilders/xgboost/parameters

Validate a set of XGBoost model builder parameters.

InputXGBoostV3
OutputXGBoostV3

GET /3/ModelBuilders/{algo}

Return the Model Builder metadata for the specified algorithm.

InputModelBuildersV3
OutputModelBuildersV3

POST /3/ModelBuilders/{algo}/model_id

Return a new unique model_id for the specified algorithm.

InputModelBuildersV3
OutputModelIdV3

GET /3/ModelMetrics

Return all the saved scoring metrics.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

GET /3/ModelMetrics/frames/{frame}

Return the saved scoring metrics for the specified Frame.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

GET /3/ModelMetrics/frames/{frame}/models/{model}

Return the saved scoring metrics for the specified Model and Frame.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

DELETE /3/ModelMetrics/frames/{frame}/models/{model}

Delete the saved scoring metrics for the specified Model and Frame.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

GET /3/ModelMetrics/models/{model}

Return the saved scoring metrics for the specified Model.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

GET /3/ModelMetrics/models/{model}/frames/{frame}

Return the saved scoring metrics for the specified Model and Frame.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

DELETE /3/ModelMetrics/models/{model}/frames/{frame}

Delete the saved scoring metrics for the specified Model and Frame.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

POST /3/ModelMetrics/models/{model}/frames/{frame}

Return the scoring metrics for the specified Frame with the specified Model. If the Frame has already been scored with the Model then cached results will be returned; otherwise predictions for all rows in the Frame will be generated and the metrics will be returned.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

POST /3/ModelMetrics/predictions_frame/{predictions_frame}/actuals_frame/{actuals_frame}

Create a ModelMetrics object from the predicted and actual values, and a domain for classification problems or a distribution family for regression problems.

InputModelMetricsMakerSchemaV3
OutputModelMetricsMakerSchemaV3

GET /3/Models

Return all Models from the H2O distributed K/V store.

InputModelsV3
OutputModelsV3

DELETE /3/Models

Delete all Models from the H2O distributed K/V store.

InputModelsV3
OutputModelsV3

GET /3/Models.fetch.bin/{model_id}

Return the model in the binary format.

InputModelsV3
OutputStreamingSchema

GET /3/Models.java/{model_id}

[DEPRECATED] Return the stream containing model implementation in Java code.

InputModelsV3
OutputStreamingSchema

GET /3/Models.java/{model_id}/preview

Return potentially abridged model suitable for viewing in a browser (currently only used for java model code).

InputModelsV3
OutputStreamingSchema

GET /3/Models/{model_id}

Return the specified Model from the H2O distributed K/V store, optionally with the list of compatible Frames.

InputModelsV3
OutputModelsV3

DELETE /3/Models/{model_id}

Delete the specified Model from the H2O distributed K/V store.

InputModelsV3
OutputModelsV3

GET /3/Models/{model_id}/mojo

Return the model in the MOJO format. This format can then be interpreted by gen_model.jar in order to perform prediction / scoring. Currently works for GBM and DRF algos only.

InputModelsV3
OutputStreamingSchema

GET /3/NetworkTest

Run a network test to measure the performance of the cluster interconnect.

InputNetworkTestV3
OutputNetworkTestV3

GET /3/NodePersistentStorage/categories/{category}/exists

Return true or false.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

GET /3/NodePersistentStorage/categories/{category}/names/{name}/exists

Return true or false.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

GET /3/NodePersistentStorage/configured

Return true or false.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

POST /3/NodePersistentStorage/{category}

Store a value.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

GET /3/NodePersistentStorage/{category}

Return all keys stored for a given category.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

POST /3/NodePersistentStorage/{category}/{name}

Store a named value.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

GET /3/NodePersistentStorage/{category}/{name}

Return value for a given name.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

DELETE /3/NodePersistentStorage/{category}/{name}

Delete a key.

InputNodePersistentStorageV3
OutputNodePersistentStorageV3

POST /3/Parse

Parse a raw byte-oriented Frame into a useful columnar data Frame.

InputParseV3
OutputParseV3

POST /3/ParseSVMLight

Parse a raw byte-oriented Frame into a useful columnar data Frame.

InputParseSVMLightV3
OutputJobV3

POST /3/ParseSetup

Guess the parameters for parsing raw byte-oriented data into an H2O Frame.

InputParseSetupV3
OutputParseSetupV3

POST /3/PartialDependence/

Create data for partial dependence plot(s) for the specified model and frame.

InputPartialDependenceV3
OutputJobV3

GET /3/PartialDependence/{name}

Fetch partial dependence data.

InputPartialDependenceKeyV3
OutputPartialDependenceV3

POST /3/PersistS3

Set Amazon S3 credentials (Secret Key ID, Secret Access Key)

InputPersistS3CredentialsV3
OutputPersistS3CredentialsV3

DELETE /3/PersistS3

Remove stored Amazon S3 credentials

InputPersistS3CredentialsV3
OutputPersistS3CredentialsV3

GET /3/Ping

The endpoint used to let H2O know from external services that it should keep running.

InputPingV3
OutputPingV3

POST /3/Predictions/models/{model}/frames/{frame}

Score (generate predictions) for the specified Frame with the specified Model. Both the Frame of predictions and the metrics will be returned.

InputModelMetricsListSchemaV3
OutputModelMetricsListSchemaV3

GET /3/Profiler

Report real-time profiling information for all nodes (sorted, aggregated stack traces).

InputProfilerV3
OutputProfilerV3

POST /3/Recovery/resume

Recover stored state and resume interrupted job.

InputResumeV3
OutputResumeV3

POST /3/SaveToHiveTable

Save the contents of an H2O Frame into a Hive table.

InputSaveToHiveTableV3
OutputSaveToHiveTableV3

POST /3/SegmentModelsBuilders/anovaglm

Validate a set of ANOVAGLM model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/coxph

Validate a set of CoxPH model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/deeplearning

Validate a set of DeepLearning model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/drf

Validate a set of DRF model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/extendedisolationforest

Validate a set of ExtendedIsolationForest model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/gam

Validate a set of GAM model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/gbm

Validate a set of GBM model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/generic

Validate a set of Generic model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/glm

Validate a set of GLM model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/glrm

Validate a set of GLRM model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/infogram

Validate a set of Infogram model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/isolationforest

Validate a set of IsolationForest model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/isotonicregression

Validate a set of IsotonicRegression model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/kmeans

Validate a set of KMeans model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/modelselection

Validate a set of ModelSelection model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/naivebayes

Validate a set of NaiveBayes model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/pca

Validate a set of PCA model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/psvm

Validate a set of PSVM model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/rulefit

Validate a set of RuleFit model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/targetencoder

Validate a set of TargetEncoder model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/upliftdrf

Validate a set of UpliftDRF model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/word2vec

Validate a set of Word2Vec model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SegmentModelsBuilders/xgboost

Validate a set of XGBoost model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /3/SessionProperties

Set session property.

InputSessionPropertyV3
OutputSessionPropertyV3

GET /3/SessionProperties

Get session property.

InputSessionPropertyV3
OutputSessionPropertyV3

POST /3/Shutdown

Shut down the cluster.

InputShutdownV3
OutputShutdownV3

POST /3/SignificantRules

Fetch significant rules table.

InputSignificantRulesV3
OutputSignificantRulesV3

POST /3/SplitFrame

Split an H2O Frame.

InputSplitFrameV3
OutputSplitFrameV3

GET /3/SteamMetrics

Get metrics for Steam from H2O.

InputSteamMetricsV3
OutputSteamMetricsV3

GET /3/TargetEncoderTransform

Transform using the given TargetEncoderModel

InputTargetEncoderTransformParametersV3
OutputFrameKeyV3

GET /3/Timeline

Debugging tool that provides information on current communication between nodes.

InputTimelineV3
OutputTimelineV3

GET /3/Tree

Obtain a traversable representation of a specific tree

InputTreeV3
OutputTreeV3

GET /3/Typeahead/files

Typeahead handler for filename completion.

InputTypeaheadV3
OutputTypeaheadV3

POST /3/UnlockKeys

Unlock all keys in the H2O distributed K/V store, to attempt to recover from a crash.

InputUnlockKeysV3
OutputUnlockKeysV3

GET /3/WaterMeterCpuTicks/{nodeidx}

Return a CPU usage snapshot of all cores of all nodes in the H2O cluster.

InputWaterMeterCpuTicksV3
OutputWaterMeterCpuTicksV3

GET /3/WaterMeterIo

Return IO usage snapshot of all nodes in the H2O cluster.

InputWaterMeterIoV3
OutputWaterMeterIoV3

GET /3/WaterMeterIo/{nodeidx}

Return IO usage snapshot of all nodes in the H2O cluster.

InputWaterMeterIoV3
OutputWaterMeterIoV3

GET /3/Word2VecSynonyms

Find synonyms using a word2vec model

InputWord2VecSynonymsV3
OutputWord2VecSynonymsV3

GET /3/Word2VecTransform

Transform words to vectors using a word2vec model

InputWord2VecTransformV3
OutputWord2VecTransformV3

POST /3/XGBoostExecutor.cleanup

Remote XGBoost execution - cleanup

InputXGBoostExecReqV3
OutputXGBoostExecRespV3

POST /3/XGBoostExecutor.getBooster

Remote XGBoost execution - get booster

InputXGBoostExecReqV3
OutputStreamingSchema

POST /3/XGBoostExecutor.init

Remote XGBoost execution - init

InputXGBoostExecReqV3
OutputXGBoostExecRespV3

POST /3/XGBoostExecutor.setup

Remote XGBoost execution - setup

InputXGBoostExecReqV3
OutputStreamingSchema

POST /3/XGBoostExecutor.update

Remote XGBoost execution - update

InputXGBoostExecReqV3
OutputXGBoostExecRespV3

POST /4/Frames/$simple

Create a frame with random (uniformly distributed) data. You can specify how many columns of each type to make, and the desired range for each column type.

InputCreateFrameSimpleIV4
OutputJobV4

POST /4/Predictions/models/{model}/frames/{frame}

Score (generate predictions) for the specified Frame with the specified Model. Both the Frame of predictions and the metrics will be returned.

InputModelMetricsListSchemaV3
OutputJobV3

GET /4/endpoints

Returns the list of all REST API (v4) endpoints.

InputListRequestV4
OutputEndpointsListV4

GET /4/jobs/{job_id}

Retrieve information about the current state of a job.

InputJobIV4
OutputJobV4

GET /4/modelsinfo

Return basic information about all models available to train.

InputListRequestV4
OutputModelsInfoV4

POST /4/sessions

Start a new Rapids session, and return the session id.

InputInputSchemaV4
OutputSessionIdV4

DELETE /4/sessions/{session_key}

Close the Rapids session.

InputInitIDV3
OutputInitIDV3

POST /99/Assembly

Fit an assembly to an input frame

InputAssemblyV99
OutputAssemblyV99

GET /99/Assembly.java/{assembly_id}/{pojo_name}

Generate a Java POJO from the Assembly

InputAssemblyV99
OutputAssemblyV99

GET /99/AutoML/{automl_id}

Fetch the specified AutoML object.

InputAutoMLV99
OutputAutoMLV99

POST /99/AutoMLBuilder

Start an AutoML build process.

InputAutoMLBuildSpecV99
OutputAutoMLBuildSpecV99

POST /99/DCTTransformer

Row-by-row discrete cosine transforms in 1D, 2D and 3D.

InputDCTTransformerV3
OutputJobV3

POST /99/Grid/aggregator

Run grid search for Aggregator model.

InputAggregatorV99
OutputAggregatorV99

POST /99/Grid/aggregator/resume

Resume grid search for Aggregator model.

InputAggregatorV99
OutputAggregatorV99

POST /99/Grid/anovaglm

Run grid search for ANOVAGLM model.

InputANOVAGLMV3
OutputANOVAGLMV3

POST /99/Grid/anovaglm/resume

Resume grid search for ANOVAGLM model.

InputANOVAGLMV3
OutputANOVAGLMV3

POST /99/Grid/coxph

Run grid search for CoxPH model.

InputCoxPHV3
OutputCoxPHV3

POST /99/Grid/coxph/resume

Resume grid search for CoxPH model.

InputCoxPHV3
OutputCoxPHV3

POST /99/Grid/deeplearning

Run grid search for DeepLearning model.

InputDeepLearningV3
OutputDeepLearningV3

POST /99/Grid/deeplearning/resume

Resume grid search for DeepLearning model.

InputDeepLearningV3
OutputDeepLearningV3

POST /99/Grid/drf

Run grid search for DRF model.

InputDRFV3
OutputDRFV3

POST /99/Grid/drf/resume

Resume grid search for DRF model.

InputDRFV3
OutputDRFV3

POST /99/Grid/extendedisolationforest

Run grid search for ExtendedIsolationForest model.

InputExtendedIsolationForestV3
OutputExtendedIsolationForestV3

POST /99/Grid/extendedisolationforest/resume

Resume grid search for ExtendedIsolationForest model.

InputExtendedIsolationForestV3
OutputExtendedIsolationForestV3

POST /99/Grid/gam

Run grid search for GAM model.

InputGAMV3
OutputGAMV3

POST /99/Grid/gam/resume

Resume grid search for GAM model.

InputGAMV3
OutputGAMV3

POST /99/Grid/gbm

Run grid search for GBM model.

InputGBMV3
OutputGBMV3

POST /99/Grid/gbm/resume

Resume grid search for GBM model.

InputGBMV3
OutputGBMV3

POST /99/Grid/generic

Run grid search for Generic model.

InputGenericV3
OutputGenericV3

POST /99/Grid/generic/resume

Resume grid search for Generic model.

InputGenericV3
OutputGenericV3

POST /99/Grid/glm

Run grid search for GLM model.

InputGLMV3
OutputGLMV3

POST /99/Grid/glm/resume

Resume grid search for GLM model.

InputGLMV3
OutputGLMV3

POST /99/Grid/glrm

Run grid search for GLRM model.

InputGLRMV3
OutputGLRMV3

POST /99/Grid/glrm/resume

Resume grid search for GLRM model.

InputGLRMV3
OutputGLRMV3

POST /99/Grid/infogram

Run grid search for Infogram model.

InputInfogramV3
OutputInfogramV3

POST /99/Grid/infogram/resume

Resume grid search for Infogram model.

InputInfogramV3
OutputInfogramV3

POST /99/Grid/isolationforest

Run grid search for IsolationForest model.

InputIsolationForestV3
OutputIsolationForestV3

POST /99/Grid/isolationforest/resume

Resume grid search for IsolationForest model.

InputIsolationForestV3
OutputIsolationForestV3

POST /99/Grid/isotonicregression

Run grid search for IsotonicRegression model.

InputIsotonicRegressionV3
OutputIsotonicRegressionV3

POST /99/Grid/isotonicregression/resume

Resume grid search for IsotonicRegression model.

InputIsotonicRegressionV3
OutputIsotonicRegressionV3

POST /99/Grid/kmeans

Run grid search for KMeans model.

InputKMeansV3
OutputKMeansV3

POST /99/Grid/kmeans/resume

Resume grid search for KMeans model.

InputKMeansV3
OutputKMeansV3

POST /99/Grid/modelselection

Run grid search for ModelSelection model.

InputModelSelectionV3
OutputModelSelectionV3

POST /99/Grid/modelselection/resume

Resume grid search for ModelSelection model.

InputModelSelectionV3
OutputModelSelectionV3

POST /99/Grid/naivebayes

Run grid search for NaiveBayes model.

InputNaiveBayesV3
OutputNaiveBayesV3

POST /99/Grid/naivebayes/resume

Resume grid search for NaiveBayes model.

InputNaiveBayesV3
OutputNaiveBayesV3

POST /99/Grid/pca

Run grid search for PCA model.

InputPCAV3
OutputPCAV3

POST /99/Grid/pca/resume

Resume grid search for PCA model.

InputPCAV3
OutputPCAV3

POST /99/Grid/psvm

Run grid search for PSVM model.

InputPSVMV3
OutputPSVMV3

POST /99/Grid/psvm/resume

Resume grid search for PSVM model.

InputPSVMV3
OutputPSVMV3

POST /99/Grid/rulefit

Run grid search for RuleFit model.

InputRuleFitV3
OutputRuleFitV3

POST /99/Grid/rulefit/resume

Resume grid search for RuleFit model.

InputRuleFitV3
OutputRuleFitV3

POST /99/Grid/stackedensemble

Run grid search for StackedEnsemble model.

InputStackedEnsembleV99
OutputStackedEnsembleV99

POST /99/Grid/stackedensemble/resume

Resume grid search for StackedEnsemble model.

InputStackedEnsembleV99
OutputStackedEnsembleV99

POST /99/Grid/svd

Run grid search for SVD model.

InputSVDV99
OutputSVDV99

POST /99/Grid/svd/resume

Resume grid search for SVD model.

InputSVDV99
OutputSVDV99

POST /99/Grid/targetencoder

Run grid search for TargetEncoder model.

InputTargetEncoderV3
OutputTargetEncoderV3

POST /99/Grid/targetencoder/resume

Resume grid search for TargetEncoder model.

InputTargetEncoderV3
OutputTargetEncoderV3

POST /99/Grid/upliftdrf

Run grid search for UpliftDRF model.

InputUpliftDRFV3
OutputUpliftDRFV3

POST /99/Grid/upliftdrf/resume

Resume grid search for UpliftDRF model.

InputUpliftDRFV3
OutputUpliftDRFV3

POST /99/Grid/word2vec

Run grid search for Word2Vec model.

InputWord2VecV3
OutputWord2VecV3

POST /99/Grid/word2vec/resume

Resume grid search for Word2Vec model.

InputWord2VecV3
OutputWord2VecV3

POST /99/Grid/xgboost

Run grid search for XGBoost model.

InputXGBoostV3
OutputXGBoostV3

POST /99/Grid/xgboost/resume

Resume grid search for XGBoost model.

InputXGBoostV3
OutputXGBoostV3

GET /99/Grids

Return all grids from the H2O distributed K/V store.

InputGridsV99
OutputGridsV99

GET /99/Grids/{grid_id}

Return the specified grid search result.

InputGridSchemaV99
OutputGridSchemaV99

POST /99/ImportSQLTable

Import SQL table into an H2O Frame.

InputImportSQLTableV99
OutputJobV3

GET /99/Leaderboards

Return all the AutoML leaderboards.

InputLeaderboardsV99
OutputLeaderboardsV99

GET /99/Leaderboards/{project_name}

Return the AutoML leaderboard for the given project.

InputLeaderboardsV99
OutputLeaderboardV99

POST /99/ModelBuilders/aggregator

Train an Aggregator model.

InputAggregatorV99
OutputAggregatorV99

POST /99/ModelBuilders/aggregator/parameters

Validate a set of Aggregator model builder parameters.

InputAggregatorV99
OutputAggregatorV99

POST /99/ModelBuilders/stackedensemble

Train a StackedEnsemble model.

InputStackedEnsembleV99
OutputStackedEnsembleV99

POST /99/ModelBuilders/stackedensemble/parameters

Validate a set of StackedEnsemble model builder parameters.

InputStackedEnsembleV99
OutputStackedEnsembleV99

POST /99/ModelBuilders/svd

Train an SVD model.

InputSVDV99
OutputSVDV99

POST /99/ModelBuilders/svd/parameters

Validate a set of SVD model builder parameters.

InputSVDV99
OutputSVDV99

POST /99/Models.bin/{model_id}

Import given binary model into H2O.

InputModelImportV3
OutputModelsV3

GET /99/Models.bin/{model_id}

Export given model.

InputModelExportV3
OutputModelExportV3

GET /99/Models.mojo/{model_id}

Export given model as MOJO.

InputModelExportV3
OutputModelExportV3

POST /99/Models.upload.bin/{model_id}

Upload given binary model into H2O.

InputModelImportV3
OutputModelsV3

GET /99/Models/{model_id}/json

Export given model details in JSON format.

InputModelExportV3
OutputModelExportV3
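
Example (illustrative): the three GET export endpoints above share the model_id path parameter and differ only in the path shape. A minimal Python sketch that maps an export format to the matching path is shown below. The helper name model_export_path and the short format labels "bin", "mojo" and "json" are hypothetical names chosen for this example; only the paths themselves come from this reference.

```python
from urllib.parse import quote

# Path templates taken from the endpoint list above; the dictionary keys are
# labels invented for this sketch.
EXPORT_PATHS = {
    "bin": "/99/Models.bin/{model_id}",
    "mojo": "/99/Models.mojo/{model_id}",
    "json": "/99/Models/{model_id}/json",
}


def model_export_path(model_id, fmt):
    """Return the GET path that exports model_id in the requested format."""
    if fmt not in EXPORT_PATHS:
        raise ValueError(f"unknown export format: {fmt!r}")
    # Percent-encode the id so it stays a single path segment.
    return EXPORT_PATHS[fmt].format(model_id=quote(model_id, safe=""))
```

Any query parameters accepted by ModelExportV3 are defined in that schema and are deliberately left out of this sketch.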

POST /99/Rapids

Execute a Rapids AstRoot.

InputRapidsSchemaV3
OutputRapidsSchemaV3
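
Example (illustrative): a Rapids expression is sent to this endpoint as a POST request. The Python sketch below prepares such a request with the standard library but does not send it. Several details are assumptions not stated in this section: the base address http://localhost:54321, the form field name ast, form-encoded transport, and the sample expression (+ 1 2). Check the RapidsSchemaV3 entry for the authoritative field list before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a locally running H2O node; replace with your cluster address.
BASE_URL = "http://localhost:54321"


def rapids_request(expression, base_url=BASE_URL):
    """Prepare (but do not send) a POST /99/Rapids request."""
    # Assumption: the expression travels in a form field named "ast".
    body = urlencode({"ast": expression}).encode("ascii")
    return Request(
        f"{base_url}/99/Rapids",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

To execute the request against a running cluster, pass the returned object to urllib.request.urlopen and decode the JSON response.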

GET /99/Rapids/help

Produce help for Rapids AstRoot language.

InputSchemaV3
OutputRapidsHelpV3

GET /99/Sample

Example of an experimental endpoint. Call via /EXPERIMENTAL/Sample. Experimental endpoints can change at any moment.

InputCloudV3
OutputCloudV3

POST /99/SegmentModelsBuilders/aggregator

Validate a set of Aggregator model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /99/SegmentModelsBuilders/stackedensemble

Validate a set of StackedEnsemble model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /99/SegmentModelsBuilders/svd

Validate a set of SVD model builder parameters.

InputModelBuilderSchema
OutputModelBuilderSchema

POST /99/Tabulate

Tabulate one column vs another.

InputTabulateV3
OutputTabulateV3

REST API Schema Reference

ANOVAGLMModelOutputV3

coefficients_table
TwoDimTable[]
Table of CoefficientsIn
transformed_columns_key
Key
AnovaGLM transformed predictor frame key. For debugging purposes onlyIn
result_frame_key
Key
ANOVA table frame key containing Type III SS calculation, degrees of freedom, F-statistics and p-values. The content of this frame is repeated in the model summary.In
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

ANOVAGLMModelV3

model_id
Key
Model keyIn/Out
parameters
ANOVAGLMParameters
The build parameters for the model (e.g. K for KMeans).Out
output
ANOVAGLMModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

ANOVAGLMParametersV3

seed
long
Seed for pseudo random number generator (if applicable)In
standardize
boolean
Standardize numeric columns to have zero mean and unit varianceIn
family
enum
Family. Use binomial for classification with logistic regression, others are for regression problems.In
tweedie_variance_power
double
Tweedie variance powerIn
tweedie_link_power
double
Tweedie link powerIn
theta
double
ThetaIn
alpha
double[]
Distribution of regularization between the L1 (Lasso) and L2 (Ridge) penalties. A value of 1 for alpha represents Lasso regression, a value of 0 produces Ridge regression, and anything in between specifies the amount of mixing between the two. Default value of alpha is 0 when SOLVER = ‘L-BFGS’; 0.5 otherwise.In
lambda
double[]
Regularization strengthIn
lambda_search
boolean
Use lambda search starting at lambda max, given lambda is then interpreted as lambda minIn
solver
enum
AUTO will set the solver based on the given data and the other parameters. IRLSM is fast on problems with a small number of predictors and for lambda search with L1 penalty; L_BFGS scales better for datasets with many columns.In
plug_values
Key
Plug Values (a single-row frame containing values that will be used to impute missing values of the training/validation frame; use in conjunction with missing_values_handling = PlugValues)In
non_negative
boolean
Restrict coefficients (not intercept) to be non-negativeIn
compute_p_values
boolean
Request p-values computation, p-values work only with IRLSM solver and no regularizationIn
max_iterations
int
Maximum number of iterationsIn
link
enum
Link function.In
prior
double
Prior probability for y==1. To be used only for logistic regression iff the data has been sampled and the mean of response does not reflect reality.In
highest_interaction_term
int
Highest order of interaction terms: 2 means interactions between 2 columns only, 3 means interactions between three columns, and so on. Defaults to 2.In
type
int
Sum of squares (SS) type: 1, 2, 3, or 4. Only type 3 is currently supported.In
early_stopping
boolean
Stop early when there is no more relative improvement on train or validation (if provided).In
save_transformed_framekeys
boolean
Set to true to save the keys of the transformed predictors and interaction columns.In
nparallelism
int
Number of models to build in parallel. Defaults to 4. Adjust according to your system.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
missing_values_handling
enum
Handling of missing values. Either MeanImputation, Skip or PlugValues.In/Out
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
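
Example (illustrative): several of the parameters above carry range constraints in their descriptions (tweedie_power, quantile_alpha, huber_alpha, nfolds, type). The Python sketch below checks a parameter dictionary against those stated ranges before it is submitted. The function name check_parameters is hypothetical, and because the descriptions do not say whether the bounds are inclusive, this sketch assumes they are. The server-side validation remains authoritative.

```python
def check_parameters(params):
    """Return a list of problems found in a dict of ANOVAGLM parameters."""
    problems = []

    # "must be between 1 and 2" (assumed inclusive in this sketch)
    tweedie_power = params.get("tweedie_power")
    if tweedie_power is not None and not 1 <= tweedie_power <= 2:
        problems.append("tweedie_power must be between 1 and 2")

    # "must be between 0 and 1" (assumed inclusive in this sketch)
    for name in ("quantile_alpha", "huber_alpha"):
        value = params.get(name)
        if value is not None and not 0 <= value <= 1:
            problems.append(f"{name} must be between 0 and 1")

    # "0 to disable or >= 2"
    nfolds = params.get("nfolds")
    if nfolds is not None and not (nfolds == 0 or nfolds >= 2):
        problems.append("nfolds must be 0 (disabled) or >= 2")

    # Only SS type 3 is described as supported.
    if params.get("type", 3) != 3:
        problems.append("type: only SS type 3 is currently supported")

    return problems
```

An empty list means none of the checked constraints were violated; it does not guarantee that the builder accepts the parameters.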

ANOVAGLMV3

parameters
ANOVAGLMParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

AboutEntryV3

name
string
Property nameOut
value
string
Property valueOut

AboutV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
entries
Iced[]
List of properties about this running H2O instanceOut
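
Example (illustrative): the _exclude_fields input that appears on many schemas is passed as a query parameter holding a comma-separated list of JSON field paths. The Python sketch below reproduces the usage string quoted in the field description. The helper name with_exclude_fields is hypothetical.

```python
from urllib.parse import urlencode


def with_exclude_fields(path, fields):
    """Append an _exclude_fields query parameter to an endpoint path."""
    if not fields:
        return path
    # Keep "/" and "," readable, matching the example in the field description.
    query = urlencode({"_exclude_fields": ",".join(fields)}, safe="/,")
    return f"{path}?{query}"
```

For instance, with_exclude_fields("/3/Frames", ["frames/frame_id/URL", "__meta"]) yields the path shown in the description of _exclude_fields.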

AggregatorModelOutputV99

output_frame
Key
Aggregated Frame of ExemplarsIn
mapping_frame
Key
Aggregated Frame mapping to the rows in the original dataIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

AggregatorModelV99

model_id
Key
Model keyIn/Out
parameters
AggregatorParameters
The build parameters for the model (e.g. K for KMeans).Out
output
AggregatorOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

AggregatorParametersV99

transform
enum
Transformation of training dataIn
pca_method
enum
Method for computing PCA (Caution: GLRM is currently experimental and unstable)In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
k
int
Rank of matrix approximationIn/Out
max_iterations
int
Maximum number of iterations for PCAIn/Out
target_num_exemplars
int
Targeted number of exemplarsIn/Out
rel_tol_num_exemplars
double
Relative tolerance for number of exemplars (e.g., 0.5 is +/- 50 percent)In/Out
seed
long
RNG seed for initializationIn/Out
use_all_factor_levels
boolean
Whether first factor level is included in each categorical expansionIn/Out
save_mapping_frame
boolean
Whether to export the mapping of the aggregated frameIn/Out
num_iteration_without_new_exemplar
int
The number of iterations to run before aggregator exits if the number of exemplars collected didn’t changeIn/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
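
Example (illustrative): the weights_column description states that a weight of 2 is equivalent to repeating a row twice and a weight of 0 is equivalent to excluding the row. The Python sketch below demonstrates that equivalence for the simplest possible statistic, a mean. It is a worked illustration of the stated rule, not H2O's training code, and the function name weighted_mean is hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Mean of values where each value counts weights[i] times."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Weight 2 on the first row behaves like listing that row twice.
repeated = [1.0, 1.0, 3.0]
assert math.isclose(weighted_mean([1.0, 3.0], [2, 1]), sum(repeated) / len(repeated))

# Weight 0 on a row behaves like removing that row from the data.
assert math.isclose(weighted_mean([1.0, 3.0, 100.0], [1, 1, 0]), 2.0)
```

Non-integer weights work the same way in this sketch, which matches the note that non-integer values are supported.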

AggregatorV99

parameters
AggregatorParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

AssemblyKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

AssemblyV99

steps
string[]
A list of steps describing the assembly line.In
frame
Key
Input Frame for the assembly.In
pojo_name
string
The name of the file and generated classIn
assembly_id
string
The key of the Assembly object to retrieve from the DKV.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
result
Key
Output of the assembly line.Out
assembly
Key
A Key to the fit Assembly data structureOut

AutoMLBuildControlV99

stopping_criteria
AutoMLStoppingCriteria
Model performance based stopping criteria for the AutoML run.In
nfolds
int
Number of folds for k-fold cross-validation (defaults to -1 (AUTO), otherwise it must be >=2 or use 0 to disable). Disabling prevents Stacked Ensembles from being built.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (defaults to 5.0 and can be less than 1.0). Requires balance_classes.In
keep_cross_validation_predictions
boolean
Whether to keep the cross-validation predictions. This needs to be set to TRUE if running the same AutoML object for repeated runs because CV predictions are required to build additional Stacked Ensemble models in AutoML.In
keep_cross_validation_models
boolean
Whether to keep the cross-validated models. Keeping cross-validation models may consume significantly more memory in the H2O cluster.In
keep_cross_validation_fold_assignment
boolean
Whether to keep cross-validation assignments.In
export_checkpoints_dir
string
Path to a directory where every generated model will be stored.In
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
project_name
string
Optional project name used to group models from multiple AutoML runs into a single Leaderboard; derived from the training data name if not specified.In/Out
distribution
enum
Distribution function used by algorithms that support it; other algorithms use their defaults.In/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out

AutoMLBuildModelsV99

exclude_algos
enum[]
A list of algorithms to skip during the model-building phase.In
include_algos
enum[]
A list of algorithms to restrict to during the model-building phase.In
exploitation_ratio
double
The budget ratio (between 0 and 1) dedicated to the exploitation (vs exploration) phase.In
modeling_plan
StepDefinition[]
The list of modeling steps to be used by the AutoML engine (they may not all get executed, depending on other constraints).In
preprocessing
PreprocessingStepDefinition[]
The list of preprocessing steps to run. Only ‘target_encoding’ is currently supported.In
algo_parameters
Iced[]
Custom algorithm parameters.In
monotone_constraints
KeyValue[]
A mapping representing monotonic constraints. Use +1 to enforce an increasing constraint and -1 to specify a decreasing constraint.In

AutoMLBuildSpecV99

build_control
AutoMLBuildControl
Specification of overall controls for the AutoML build process.In
input_spec
AutoMLInput
Specification of the input data for the AutoML build process.In
build_models
AutoMLBuildModels
If present, specifies details of how to train models.In
job
Job
The AutoML Job keyOut
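
Example (illustrative): AutoMLBuildSpecV99 nests three input objects (build_control, input_spec, build_models), and build_control in turn nests stopping_criteria. The Python sketch below assembles that structure as a dictionary ready for JSON serialization. The function name build_automl_spec is hypothetical, and passing a frame Key as a plain ID string is an assumption. The response_column field is left out on purpose because the JSON shape of VecSpecifier is not described in this section.

```python
import json


def build_automl_spec(training_frame, max_models, seed,
                      project_name=None, exclude_algos=None,
                      ignored_columns=None):
    """Assemble a nested dict mirroring the AutoMLBuildSpecV99 inputs."""
    spec = {
        "build_control": {
            # AutoMLStoppingCriteriaV99 fields
            "stopping_criteria": {"max_models": max_models, "seed": seed},
        },
        # AutoMLInputV99 fields (assumption: a Key is given as its string ID)
        "input_spec": {"training_frame": training_frame},
    }
    if project_name is not None:
        spec["build_control"]["project_name"] = project_name
    if ignored_columns:
        spec["input_spec"]["ignored_columns"] = list(ignored_columns)
    if exclude_algos:
        # build_models is optional ("If present, ...")
        spec["build_models"] = {"exclude_algos": list(exclude_algos)}
    return spec


# The frame ID and algorithm name below are made-up illustration values.
payload = json.dumps(build_automl_spec("train.hex", max_models=10, seed=42,
                                       exclude_algos=["DeepLearning"]))
```

Setting both max_models and a seed other than -1 follows the reproducibility advice given in the AutoMLStoppingCriteriaV99 field descriptions.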

AutoMLCustomParameterV99

scope
enum
Scope of application of the parameter (specific algo, or any algo).In
name
string
Name of the model parameter.In
value
Polymorphic
Value of the model parameter.In

AutoMLInputV99

training_frame
Key
ID of the training data frame.In
response_column
VecSpecifier
Response columnIn
validation_frame
Key
ID of the validation data frame (used for early stopping in grid searches and for early stopping of the AutoML process itself).In
blending_frame
Key
ID of the H2OFrame used to train the metalearning algorithm in Stacked Ensembles (instead of relying on cross-validated predicted values). When provided, it is also recommended to disable cross validation by setting nfolds=0 and to provide a leaderboard frame for scoring purposes.In
leaderboard_frame
Key
ID of the leaderboard data frame (used to score models and rank them on the AutoML Leaderboard).In
fold_column
VecSpecifier
Fold column (contains fold IDs) in the training frame. These assignments are used to create the folds for cross-validation of the models.In
weights_column
VecSpecifier
Weights column in the training frame, which specifies the row weights used in model training.In
ignored_columns
string[]
Names of columns to ignore in the training frame when building models.In
sort_metric
enum
Metric used to sort leaderboardIn

AutoMLKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

AutoMLStoppingCriteriaV99

seed
long
Seed for random number generator; set to a value other than -1 for reproducibility.In
max_models
int
Maximum number of models to build (optional). Always set this parameter to ensure AutoML reproducibility: all models are then trained until convergence and none is constrained by a time budget.In
max_runtime_secs
double
This argument specifies the maximum time that the AutoML process will run for. If both max_runtime_secs and max_models are specified, then the AutoML run will stop as soon as it hits either of these limits. If neither max_runtime_secs nor max_models is specified, then max_runtime_secs defaults to 3600 seconds (1 hour).In
max_runtime_secs_per_model
double
Maximum time to spend on each individual model (optional). Note that models constrained by a time budget are not guaranteed reproducible.In
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression)In
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In
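
Example (illustrative): stopping_rounds, stopping_metric and stopping_tolerance together describe a moving-average convergence test. The Python sketch below is one plausible reading of that description, not H2O's actual implementation: it compares the simple moving average of length k taken k scoring events ago with the best of the k moving averages that follow it, and stops when the relative improvement is below the tolerance. The function name should_stop, the requirement of 2k scoring events, and the restriction to positive metric values are assumptions of this sketch.

```python
def should_stop(history, stopping_rounds, stopping_tolerance,
                lower_is_better=True):
    """Decide whether training should stop, given past metric values.

    history holds one positive metric value per scoring event, oldest first.
    """
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet

    recent = history[-2 * k:]
    # k + 1 simple moving averages of window length k over the last 2k events.
    averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]
    reference, later = averages[0], averages[1:]

    if lower_is_better:
        improved = min(later) < reference * (1 - stopping_tolerance)
    else:
        improved = max(later) > reference * (1 + stopping_tolerance)
    return not improved
```

With stopping_rounds set to 0 the function never stops, which matches the "0 to disable" note in the field description.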

AutoMLV99

automl_id
Key
Optional AutoML run ID; omitting this returns all runsIn
sort_metric
string
Metric used to sort leaderboardIn
verbosity
enum
Verbosity level of the returned event logIn/Out
project_name
string
Identifier for models that should be grouped together in the same leaderboardIn/Out
training_frame
Key
ID of the actual training frame for this AutoML run after any automatic splittingOut
validation_frame
Key
ID of the actual validation frame for this AutoML run after any automatic splittingOut
blending_frame
Key
ID of the actual blending frame used to train the Stacked Ensembles in blending modeOut
leaderboard_frame
Key
ID of the actual leaderboard frame for this AutoML run after any automatic splittingOut
leaderboard
Leaderboard
The leaderboard for this project, potentially including models from other AutoML runsOut
leaderboard_table
TwoDimTable
The leaderboard for this project, potentially including models from other AutoML runs, for easy renderingOut
event_log
EventLog
Event log of this AutoML runOut
event_log_table
TwoDimTable
Event log of this AutoML run, for easy renderingOut
modeling_steps
StepDefinition[]
The list of modeling steps effectively used during the AutoML runOut

CapabilitiesV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
capabilities
Iced[]
List of H2O capabilitiesOut

CapabilityEntryV3

name
string
Extension nameOut

CartesianSearchCriteriaV99

strategy
enum
Hyperparameter space search strategy.In/Out

CloudLockV3

reason
string
reasonIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

CloudV3

skip_ticks
boolean
skip_ticksIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
version
string
versionOut
branch_name
string
branch_nameOut
last_commit_hash
string
last_commit_hashOut
describe
string
describeOut
compiled_by
string
compiled_byOut
compiled_on
string
compiled_onOut
build_number
string
build_numberOut
build_age
string
build_ageOut
build_too_old
boolean
build_too_oldOut
node_idx
int
Node index number cloud status is collected from (zero-based)Out
cloud_name
string
cloud_nameOut
cloud_size
int
cloud_sizeOut
cloud_uptime_millis
long
cloud_uptime_millisOut
cloud_internal_timezone
string
Cloud internal timezoneOut
datafile_parser_timezone
string
Timezone used for parsing datetime columnsOut
cloud_healthy
boolean
cloud_healthyOut
bad_nodes
int
Nodes reporting unhealthyOut
consensus
boolean
Cloud voting is stableOut
locked
boolean
Cloud is accepting new members or notOut
is_client
boolean
Cloud is in client mode.Out
nodes
Iced[]
nodesOut
internal_security_enabled
boolean
internal_security_enabledOut
leader_idx
int
leader_idxOut
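
Example (illustrative): a client polling GET /3/Cloud typically inspects a handful of the CloudV3 output fields listed above before submitting work. The Python sketch below combines cloud_healthy, bad_nodes, consensus and cloud_size into a single readiness check on an already decoded JSON response. The function name cloud_is_ready and the choice of which fields to combine are assumptions of this example, and the sample dictionaries in the usage lines are made-up values.

```python
def cloud_is_ready(status, expected_size=None):
    """Return True if a decoded CloudV3 response looks ready for work."""
    # Healthy means the flag is set and no node reports itself unhealthy.
    healthy = bool(status.get("cloud_healthy")) and status.get("bad_nodes", 0) == 0
    # consensus: "Cloud voting is stable"
    stable = bool(status.get("consensus"))
    # Optionally require that all expected nodes have joined.
    sized = expected_size is None or status.get("cloud_size") == expected_size
    return healthy and stable and sized


# Made-up sample responses for illustration only.
ok = {"cloud_healthy": True, "bad_nodes": 0, "consensus": True, "cloud_size": 3}
forming = {"cloud_healthy": True, "bad_nodes": 0, "consensus": False, "cloud_size": 2}
print(cloud_is_ready(ok, expected_size=3), cloud_is_ready(forming, expected_size=3))
```

A caller would normally retry with a short delay until the check passes or a timeout expires.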

ClusteringModelBuilderSchema

parameters
Parameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

ClusteringModelParametersSchemaV3

distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
k
int
The max. number of clusters. If estimate_k is disabled, the model will find k centroids, otherwise it will find up to k centroids.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
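The weights_column description above states that a weight of 2 is equivalent to repeating a row twice, and a weight of 0 to excluding it. The sketch below illustrates that equivalence with a plain weighted mean. It is not H2O's training code, and the data values are made up for the example.

```python
def weighted_mean(values, weights):
    """Mean of values where each value counts `weight` times."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


values = [1.0, 4.0, 10.0]
weights = [1, 2, 0]          # second row counts twice, third row is excluded

# The same data with the weights applied by hand.
expanded = [1.0, 4.0, 4.0]

print(weighted_mean(values, weights))
print(sum(expanded) / len(expanded))
```

Both prints give the same number, which is the sense in which the two formulations agree.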

ColSpecifierV3

column_name
string
Name of the columnIn/Out
is_member_of_frames
string[]
List of fields which specify columns that must contain this columnIn/Out

ColV3

label
string
labelOut
missing_count
long
missingOut
zero_count
long
zerosOut
positive_infinity_count
long
positive infinitiesOut
negative_infinity_count
long
negative infinitiesOut
mins
double[]
minsOut
maxs
double[]
maxsOut
mean
double
meanOut
sigma
double
sigmaOut
type
string
datatype: {enum, string, int, real, time, uuid}Out
domain
string[]
domain; not-null for categorical columns onlyOut
domain_cardinality
int
cardinality of this column’s domain; not-null for categorical columns onlyOut
data
double[]
dataOut
string_data
string[]
string dataOut
precision
byte
decimal precision, -1 for all digitsOut
histogram_bins
long[]
Histogram bins; null if not computedOut
histogram_base
double
Start of histogram bin zeroOut
histogram_stride
double
Stride per binOut
percentiles
double[]
Percentile values, matching the default percentilesOut
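To make histogram_base and histogram_stride concrete, the sketch below reconstructs bin boundaries from them. It assumes bin i covers the interval from base + i * stride up to base + (i + 1) * stride, which is a natural reading of "start of bin zero" and "stride per bin" but is not spelled out in the schema. The helper name `bin_edges` is hypothetical.

```python
def bin_edges(histogram_base, histogram_stride, n_bins):
    """Return (low, high) boundaries for each histogram bin."""
    return [
        (histogram_base + i * histogram_stride,
         histogram_base + (i + 1) * histogram_stride)
        for i in range(n_bins)
    ]


# n_bins would normally be len(histogram_bins) from the same column.
print(bin_edges(0.0, 2.5, 3))
```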

ColumnSpecsBase

name
string
Column NameOut
type
string
Column TypeOut
format
string
Column Format (printf)Out
description
string
Column DescriptionOut

ColumnsMappingV3

from
string[]
Input column(s) from the same encoding group.In
to
string[]
Output column(s) generated by the application of target encoding to the from group.In

ConfusionMatrixV3

table
TwoDimTable
Annotated confusion matrixOut

CoxPHModelOutputV3

coefficients_table
TwoDimTable
Table of CoefficientsIn
var_coef
double[][]
var(coefficients)In
null_loglik
double
null log-likelihoodIn
loglik
double
log-likelihoodIn
loglik_test
double
log-likelihood test statIn
wald_test
double
Wald test statIn
score_test
double
Score test statIn
rsq
double
R-squareIn
maxrsq
double
Maximum R-squareIn
lre
double
log relative errorIn
iter
int
number of iterationsIn
x_mean_cat
double[][]
x weighted mean vector for categorical variablesIn
x_mean_num
double[][]
x weighted mean vector for numeric variablesIn
mean_offset
double[]
unweighted mean vector for numeric offsetsIn
offset_names
string[]
names of offsetsIn
n
long
nIn
n_missing
long
number of rows with missing valuesIn
total_event
long
total eventsIn
time
double[]
timeIn
n_risk
double[]
number at riskIn
n_event
double[]
number of eventsIn
n_censor
double[]
number of censored obsIn
cumhaz_0
double[]
baseline cumulative hazardIn
var_cumhaz_1
double[]
component of var(cumhaz)In
var_cumhaz_2
Key
component of var(cumhaz)In
baseline_hazard
Key
Baseline HazardIn
baseline_survival
Key
Baseline SurvivalIn
formula
string
formulaIn
ties
enum
tiesIn
concordance
double
concordanceIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
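For orientation, the sketch below shows how the two log-likelihood fields relate to a likelihood-ratio statistic under the textbook definition, 2 * (loglik - null_loglik). Whether loglik_test is computed exactly this way in H2O is an assumption here, and the numbers are invented for the example.

```python
def likelihood_ratio_stat(null_loglik, loglik):
    """Textbook likelihood-ratio statistic for nested models."""
    return 2.0 * (loglik - null_loglik)


# Made-up values standing in for null_loglik and loglik.
print(likelihood_ratio_stat(-120.0, -110.5))
```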

CoxPHModelV3

model_id
Key
Model keyIn/Out
parameters
CoxPHParameters
The build parameters for the model (e.g. K for KMeans).Out
output
CoxPHOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

CoxPHParametersV3

interactions_only
string[]
A list of columns that should only be used to create interactions but should not themselves participate in model training.In
interactions
string[]
A list of predictor column indices to interact. All pairwise combinations will be computed for the list.In
interaction_pairs
StringPair[]
A list of pairwise (first order) column interactions.In
use_all_factor_levels
boolean
(Internal. For development only!) Indicates whether to use all factor levels.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
start_column
VecSpecifier
Start Time Column.In/Out
stop_column
VecSpecifier
Stop Time Column.In/Out
stratify_by
string[]
List of columns to use for stratification.In/Out
ties
enum
Method for Handling Ties.In/Out
init
double
Coefficient starting value.In/Out
lre_min
double
Minimum log-relative error.In/Out
max_iterations
int
Maximum number of iterations.In/Out
single_node_mode
boolean
Run on a single node to reduce the effect of network overhead (for smaller datasets)In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
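The stopping_rounds and stopping_tolerance descriptions above can be read as the rule sketched below: average the metric over windows of length k, and stop when none of the last k windows beats the window before them by at least the relative tolerance. This is one plausible interpretation for a lower-is-better, positive metric, not H2O's actual implementation. The function name `should_stop` is hypothetical.

```python
def should_stop(history, stopping_rounds, stopping_tolerance=0.0):
    """Decide on early stopping from a list of metric values (lower is better)."""
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet
    # Simple moving averages of length k for the last k + 1 windows.
    averages = [
        sum(history[end - k:end]) / k
        for end in range(len(history) - k, len(history) + 1)
    ]
    reference, recent = averages[0], averages[1:]
    # Stop when even the best recent window did not improve enough.
    return min(recent) >= reference * (1.0 - stopping_tolerance)


print(should_stop([1.0, 0.9, 0.8, 0.7, 0.6, 0.5], stopping_rounds=2))
print(should_stop([1.0, 0.5, 0.5, 0.5, 0.5, 0.5], stopping_rounds=2,
                  stopping_tolerance=0.01))
```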

CoxPHV3

parameters
CoxPHParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

CreateFrameOriginalIV4

dest
Key
destination keyIn
rows
int
Number of rowsIn
cols
int
Number of data columns (in addition to the first response column)In
seed
long
Random number seed that determines the random valuesIn
randomize
boolean
Whether frame should be randomizedIn
value
long
Constant value (for randomize=false)In
real_range
double
Range for real variables (-range … range)In
categorical_fraction
double
Fraction of categorical columns (for randomize=true)In
factors
int
Factor levels for categorical variablesIn
integer_fraction
double
Fraction of integer columns (for randomize=true)In
integer_range
int
Range for integer variables (-range … range)In
binary_fraction
double
Fraction of binary columns (for randomize=true)In
binary_ones_fraction
double
Fraction of 1’s in binary columnsIn
time_fraction
double
Fraction of date/time columns (for randomize=true)In
string_fraction
double
Fraction of string columns (for randomize=true)In
missing_fraction
double
Fraction of missing valuesIn
has_response
boolean
Whether an additional response column should be generatedIn
response_factors
int
Number of factor levels of the first column (1=real, 2=binomial, N=multinomial)In
positive_response
boolean
For real-valued response variable: Whether the response should be positive only.In
_fields
string
Filter on the set of output fields: if you set _fields="foo,bar,baz", then only those fields will be included in the output; or you can specify _fields="-goo,gee" to include all fields except goo and gee. If the result contains nested data structures, then you can refer to the fields within those structures as well. For example if you specify _fields="foo(oof),bar(-rab)", then only fields foo and bar will be included, and within foo there will be only field oof, whereas within bar all fields except rab will be reported.In
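The include and exclude forms of _fields described above can be mimicked on the client side as follows. This is only an illustration of the semantics for flat, non-nested field lists; the nested form such as foo(oof) is not handled, and `filter_fields` is a hypothetical helper rather than part of any H2O client.

```python
def filter_fields(record, spec):
    """Apply a flat _fields-style filter to a dict."""
    exclude = spec.startswith("-")
    names = {name.strip() for name in spec.lstrip("-").split(",") if name.strip()}
    if exclude:
        # "-goo,gee": keep everything except the listed fields.
        return {k: v for k, v in record.items() if k not in names}
    # "foo,bar": keep only the listed fields.
    return {k: v for k, v in record.items() if k in names}


record = {"foo": 1, "bar": 2, "goo": 3, "gee": 4}
print(filter_fields(record, "foo,bar"))
print(filter_fields(record, "-goo,gee"))
```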

CreateFrameSimpleIV4

dest
Key
Id for the frame to be created.In
seed
long
Random number seed that determines the random values.In
nrows
int
Number of rows.In
ncols_real
int
Number of real-valued columns. Values in these columns will be uniformly distributed between real_lb and real_ub.In
ncols_int
int
Number of integer columns.In
ncols_enum
int
Number of enum (categorical) columns.In
ncols_bool
int
Number of boolean (binary) columns.In
ncols_str
int
Number of string columns.In
ncols_time
int
Number of time columns.In
real_lb
double
Lower bound for the range of the real-valued columns.In
real_ub
double
Upper bound for the range of the real-valued columns.In
int_lb
int
Lower bound for the range of integer columns.In
int_ub
int
Upper bound for the range of integer columns.In
enum_nlevels
int
Number of levels (categories) for the enum columns.In
bool_p
double
Fraction of ones in each boolean (binary) column.In
time_lb
long
Lower bound for the range of time columns (in ms since the epoch).In
time_ub
long
Upper bound for the range of time columns (in ms since the epoch).In
str_length
int
Length of generated strings in string columns.In
missing_fraction
double
Fraction of missing values.In
response_type
enum
Type of the response column to add.In
response_lb
double
Lower bound for the response variable (real/int/time types).In
response_ub
double
Upper bound for the response variable (real/int/time types).In
response_p
double
Frequency of 1s for the bool (binary) response column.In
response_nlevels
int
Number of categorical levels for the enum response column.In
_fields
string
Filter on the set of output fields: if you set _fields="foo,bar,baz", then only those fields will be included in the output; or you can specify _fields="-goo,gee" to include all fields except goo and gee. If the result contains nested data structures, then you can refer to the fields within those structures as well. For example if you specify _fields="foo(oof),bar(-rab)", then only fields foo and bar will be included, and within foo there will be only field oof, whereas within bar all fields except rab will be reported.In
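Several of the parameters above come in lower/upper bound pairs or are fractions. The sketch below runs client-side sanity checks before such a parameter set is sent. The rules (lower bound not above upper bound, fractions within 0 and 1) are assumptions drawn from the field descriptions, not documented server behavior, and `check_simple_frame_params` is a hypothetical helper.

```python
def check_simple_frame_params(params):
    """Return a list of human-readable problems found in the parameters."""
    problems = []
    bound_pairs = (
        ("real_lb", "real_ub"),
        ("int_lb", "int_ub"),
        ("time_lb", "time_ub"),
        ("response_lb", "response_ub"),
    )
    for lo_key, hi_key in bound_pairs:
        if lo_key in params and hi_key in params and params[lo_key] > params[hi_key]:
            problems.append(f"{lo_key} must not exceed {hi_key}")
    for key in ("missing_fraction", "bool_p", "response_p"):
        if key in params and not 0.0 <= params[key] <= 1.0:
            problems.append(f"{key} must lie in [0, 1]")
    return problems


print(check_simple_frame_params({"nrows": 100, "real_lb": -1.0, "real_ub": 1.0}))
print(check_simple_frame_params({"real_lb": 5.0, "real_ub": 1.0, "missing_fraction": 1.5}))
```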

CreateFrameV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
dest
Key
destination keyIn/Out
rows
long
Number of rowsIn/Out
cols
int
Number of data columns (in addition to the first response column)In/Out
seed
long
Random number seed that determines the random valuesIn/Out
seed_for_column_types
long
Random number seed for setting the column typesIn/Out
randomize
boolean
Whether frame should be randomizedIn/Out
value
long
Constant value (for randomize=false)In/Out
real_range
long
Range for real variables (-range … range)In/Out
categorical_fraction
double
Fraction of categorical columns (for randomize=true)In/Out
factors
int
Factor levels for categorical variablesIn/Out
integer_fraction
double
Fraction of integer columns (for randomize=true)In/Out
integer_range
long
Range for integer variables (-range … range)In/Out
binary_fraction
double
Fraction of binary columns (for randomize=true)In/Out
binary_ones_fraction
double
Fraction of 1’s in binary columnsIn/Out
time_fraction
double
Fraction of date/time columns (for randomize=true)In/Out
string_fraction
double
Fraction of string columns (for randomize=true)In/Out
missing_fraction
double
Fraction of missing valuesIn/Out
has_response
boolean
Whether an additional response column should be generatedIn/Out
response_factors
int
Number of factor levels of the first column (1=real, 2=binomial, N=multinomial or ordinal)In/Out
positive_response
boolean
For real-valued response variable: Whether the response should be positive only.In/Out
key
Key
Job KeyOut
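A request to POST /3/CreateFrame could be prepared along the lines below. The sketch only builds a form-encoded body from the parameters listed above and does not send anything. Two points are assumptions: that the endpoint accepts form-encoded parameters, and that the column-type fractions may sum to at most 1, with the remainder taken to be real-valued columns. The helper name and the sample values are hypothetical.

```python
from urllib.parse import urlencode

FRACTION_KEYS = (
    "categorical_fraction",
    "integer_fraction",
    "binary_fraction",
    "time_fraction",
    "string_fraction",
)


def create_frame_body(params):
    """Build a form-encoded body for a CreateFrame request."""
    total = sum(params.get(key, 0.0) for key in FRACTION_KEYS)
    if total > 1.0:
        raise ValueError(f"column-type fractions sum to {total}, expected at most 1")
    # Render booleans as lowercase true/false, leave other values as they are.
    encoded = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    return urlencode(encoded)


print(create_frame_body({
    "dest": "demo.hex",
    "rows": 100,
    "cols": 5,
    "randomize": True,
    "categorical_fraction": 0.2,
}))
```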

DCTTransformerV3

dataset
Key
DatasetIn
destination_frame
Key
Destination Frame IDIn
dimensions
int[]
Dimensions of the input array: Height, Width, Depth (Nx1x1 for 1D, NxMx1 for 2D)In
inverse
boolean
Whether to do the inverse transformIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
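The dimensions parameter above always has three entries, with unused trailing dimensions set to 1. A small helper such as the following can pad a 1D or 2D shape accordingly. The name `dct_dimensions` is hypothetical.

```python
def dct_dimensions(*sizes):
    """Pad a 1D, 2D or 3D shape to [height, width, depth]."""
    if not 1 <= len(sizes) <= 3:
        raise ValueError("expected between one and three sizes")
    return list(sizes) + [1] * (3 - len(sizes))


print(dct_dimensions(8))       # 1D input of length 8
print(dct_dimensions(8, 4))    # 2D input of 8 by 4
```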

DRFModelOutputV3

variable_importances
TwoDimTable
Variable ImportancesOut
init_f
double
The Intercept term, the initial model function value to which trees make adjustmentsOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

DRFModelV3

model_id
Key
Model keyIn/Out
parameters
DRFParameters
The build parameters for the model (e.g. K for KMeans).Out
output
DRFOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

DRFParametersV3

mtries
int
Number of variables randomly sampled as candidates at each split. If set to -1, defaults to sqrt(p) for classification and p/3 for regression (where p is the number of predictors).In
binomial_double_trees
boolean
For binary classification: Build 2x as many trees (one per class) - can lead to higher accuracy.In
sample_rate
double
Row sample rate per tree (from 0.0 to 1.0)In
ntrees
int
Number of trees.In
max_depth
int
Maximum tree depth (0 for unlimited).In
min_rows
double
Fewest allowed (weighted) observations in a leaf.In
nbins
int
For numerical columns (real/int), build a histogram of (at least) this many bins, then split at the best pointIn
nbins_top_level
int
For numerical columns (real/int), build a histogram of (at most) this many bins at the root level, then decrease by factor of two per levelIn
nbins_cats
int
For categorical columns (factors), build a histogram of this many bins, then split at the best point. Higher values can lead to more overfitting.In
r2_stopping
double
r2_stopping is no longer supported and will be ignored if set. Please use stopping_rounds, stopping_metric and stopping_tolerance instead. Previous versions of H2O would stop making trees when the R^2 metric equaled or exceeded this value.In
seed
long
Seed for pseudo random number generator (if applicable)In
build_tree_one_node
boolean
Run on one node only; no network overhead but fewer CPUs used. Suitable for small datasets.In
sample_rate_per_class
double[]
A list of row sample rates per class (relative fraction for each class, from 0.0 to 1.0), for each treeIn
col_sample_rate_per_tree
double
Column sample rate per tree (from 0.0 to 1.0)In
col_sample_rate_change_per_level
double
Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0)In
score_tree_interval
int
Score the model after every so many trees. Disabled if set to 0.In
min_split_improvement
double
Minimum relative improvement in squared error reduction for a split to happenIn
histogram_type
enum
What type of histogram to use for finding optimal split pointsIn
calibrate_model
boolean
Use Platt Scaling (default) or Isotonic Regression to calculate calibrated class probabilities. Calibration can provide more accurate estimates of class probabilities.In
in_training_checkpoints_dir
string
Write checkpoints to the specified directory while training is still running. After a cluster shutdown, such a checkpoint can be used to restart training.In
in_training_checkpoints_tree_interval
int
Checkpoint the model after every so many trees. Parameter is used only when in_training_checkpoints_dir is definedIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
calibration_frame
Key
Data for model calibrationIn/Out
calibration_method
enum
Calibration method to useIn/Out
check_constant_response
boolean
Check if response column is constant. If enabled, then an exception is thrown if the response column is a constant value. If disabled, then the model will train regardless of the response column being a constant value or not.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
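The mtries default described above depends on the number of predictors p. The sketch below computes it for both problem types. Rounding down and enforcing a minimum of 1 are assumptions of this example, since the description gives only sqrt(p) and p/3. The function name `default_mtries` is hypothetical.

```python
import math


def default_mtries(p, classification):
    """Default number of candidate columns per split when mtries is -1."""
    value = math.sqrt(p) if classification else p / 3
    # Assumed: round down, but always try at least one column.
    return max(1, int(value))


print(default_mtries(16, classification=True))
print(default_mtries(9, classification=False))
```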

DRFV3

parameters
DRFParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

DStackTraceV3

node
string
Node nameOut
time
long
Unix epoch timeOut
thread_traces
string[]
One trace per threadOut

DataInfoFrameV3

frame
Key
input frameIn
interactions
string[]
interactionsIn
use_all
boolean
use all factor levelsIn
standardize
boolean
standardizeIn
interactions_only
boolean
interactions only returnedIn
result
Key
output frameOut

DecryptionSetupV3

password
string
Key passwordIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
decrypt_tool_id
Key
Target key for the Decryption ToolIn/Out
decrypt_impl
string
Implementation of the Decryption ToolIn/Out
keystore_id
Key
Location of Java KeystoreIn/Out
keystore_type
string
Keystore typeIn/Out
key_alias
string
Key aliasIn/Out
cipher_spec
string
Specification of the cipher (and padding)In/Out

DecryptionToolKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

DeepLearningModelOutputV3

weights
Key[]
Frame keys for weight matricesIn
biases
Key[]
Frame keys for bias vectorsIn
normmul
double[]
Normalization/Standardization multipliers for numeric predictorsOut
normsub
double[]
Normalization/Standardization offsets for numeric predictorsOut
normrespmul
double[]
Normalization/Standardization multipliers for numeric responseOut
normrespsub
double[]
Normalization/Standardization offsets for numeric responseOut
catoffsets
int[]
Categorical offsets for one-hot encodingOut
variable_importances
TwoDimTable
Variable ImportancesOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
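
As a rough illustration of how the multiplier and offset arrays (normmul/normsub, and likewise normrespmul/normrespsub) might be applied, the sketch below assumes the conventional form (x - offset) * multiplier. That formula and the helper names are assumptions for illustration; the field descriptions above do not spell out the exact arithmetic.

```python
def standardize(values, normsub, normmul):
    """Apply per-column offsets and multipliers: (x - normsub) * normmul."""
    return [(x - sub) * mul for x, sub, mul in zip(values, normsub, normmul)]


def destandardize(values, normsub, normmul):
    """Invert standardize(): x / normmul + normsub."""
    return [x / mul + sub for x, sub, mul in zip(values, normsub, normmul)]


# Two numeric predictors with made-up offsets and multipliers.
scaled = standardize([10.0, 3.0], [4.0, 1.0], [0.5, 2.0])
restored = destandardize(scaled, [4.0, 1.0], [0.5, 2.0])
```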

DeepLearningModelV3

model_id
Key
Model keyIn/Out
parameters
DeepLearningParameters
The build parameters for the model (e.g. K for KMeans).Out
output
DeepLearningModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

DeepLearningParametersV3

distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the Logs.In/Out
activation
enum
Activation function.In/Out
hidden
int[]
Hidden layer sizes (e.g. [100, 100]).In/Out
epochs
double
How many times the dataset should be iterated (streamed), can be fractional.In/Out
train_samples_per_iteration
long
Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic.In/Out
target_ratio_comm_to_comp
double
Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning).In/Out
seed
long
Seed for random numbers (affects sampling) - Note: only reproducible when running single threaded.In/Out
adaptive_rate
boolean
Adaptive learning rate.In/Out
rho
double
Adaptive learning rate time decay factor (similarity to prior updates).In/Out
epsilon
double
Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress).In/Out
rate
double
Learning rate (higher => less stable, lower => slower convergence).In/Out
rate_annealing
double
Learning rate annealing: rate / (1 + rate_annealing * samples).In/Out
rate_decay
double
Learning rate decay factor between layers (N-th layer: rate * rate_decay ^ (n - 1)).In/Out
momentum_start
double
Initial momentum at the beginning of training (try 0.5).In/Out
momentum_ramp
double
Number of training samples for which momentum increases.In/Out
momentum_stable
double
Final momentum after the ramp is over (try 0.99).In/Out
nesterov_accelerated_gradient
boolean
Use Nesterov accelerated gradient (recommended).In/Out
input_dropout_ratio
double
Input layer dropout ratio (can improve generalization, try 0.1 or 0.2).In/Out
hidden_dropout_ratios
double[]
Hidden layer dropout ratios (can improve generalization), specify one value per hidden layer, defaults to 0.5.In/Out
l1
double
L1 regularization (can add stability and improve generalization, causes many weights to become 0).In/Out
l2
double
L2 regularization (can add stability and improve generalization, causes many weights to be small).In/Out
max_w2
float
Constraint for squared sum of incoming weights per unit (e.g. for Rectifier).In/Out
initial_weight_distribution
enum
Initial weight distribution.In/Out
initial_weight_scale
double
Uniform: -value…value, Normal: stddev.In/Out
initial_weights
Key[]
A list of H2OFrame ids to initialize the weight matrices of this model with.In/Out
initial_biases
Key[]
A list of H2OFrame ids to initialize the bias vectors of this model with.In/Out
loss
enum
Loss function.In/Out
score_interval
double
Shortest time interval (in seconds) between model scoring.In/Out
score_training_samples
long
Number of training set samples for scoring (0 for all).In/Out
score_validation_samples
long
Number of validation set samples for scoring (0 for all).In/Out
score_duty_cycle
double
Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring).In/Out
classification_stop
double
Stopping criterion for classification error fraction on training data (-1 to disable).In/Out
regression_stop
double
Stopping criterion for regression error (MSE) on training data (-1 to disable).In/Out
quiet_mode
boolean
Enable quiet mode for less output to standard output.In/Out
score_validation_sampling
enum
Method used to sample validation dataset for scoring.In/Out
overwrite_with_best_model
boolean
If enabled, override the final model with the best model found during training.In/Out
autoencoder
boolean
Auto-Encoder.In/Out
use_all_factor_levels
boolean
Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder.In/Out
standardize
boolean
If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data.In/Out
diagnostics
boolean
Enable diagnostics for hidden layers.In/Out
variable_importances
boolean
Compute variable importances for input features (Gedeon method) - can be slow for large networks.In/Out
fast_mode
boolean
Enable fast mode (minor approximation in back-propagation).In/Out
force_load_balance
boolean
Force extra load balancing to increase training speed for small datasets (to keep all cores busy).In/Out
replicate_training_data
boolean
Replicate the entire training dataset onto every node for faster training on small datasets.In/Out
single_node_mode
boolean
Run on a single node for fine-tuning of model parameters.In/Out
shuffle_training_data
boolean
Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes).In/Out
missing_values_handling
enum
Handling of missing values. Either MeanImputation or Skip.In/Out
sparse
boolean
Sparse data handling (more efficient for data with lots of 0 values).In/Out
col_major
boolean
#DEPRECATED Use a column major weight matrix for input layer. Can speed up forward propagation, but might slow down backpropagation.In/Out
average_activation
double
Average activation for sparse auto-encoder. #ExperimentalIn/Out
sparsity_beta
double
Sparsity regularization. #ExperimentalIn/Out
max_categorical_features
int
Max. number of categorical features, enforced via hashing. #ExperimentalIn/Out
reproducible
boolean
Force reproducibility on small data (will be slow - only uses 1 thread).In/Out
export_weights_and_biases
boolean
Whether to export Neural Network weights and biases to H2O Frames.In/Out
mini_batch_size
int
Mini-batch size (smaller leads to better fit, larger can speed up and generalize better).In/Out
elastic_averaging
boolean
Elastic averaging between compute nodes can improve distributed model convergence. #ExperimentalIn/Out
elastic_averaging_moving_rate
double
Elastic averaging moving rate (only if elastic averaging is enabled).In/Out
elastic_averaging_regularization
double
Elastic averaging regularization strength (only if elastic averaging is enabled).In/Out
pretrained_autoencoder
Key
Pretrained autoencoder model to initialize this model with.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
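
The two learning-rate formulas quoted in the rate_annealing and rate_decay descriptions can be worked through directly. The sketch below only evaluates those formulas; the function names are hypothetical and the layer index n is assumed to be 1-based.

```python
def annealed_rate(rate, rate_annealing, samples):
    """Learning rate after `samples` training samples: rate / (1 + rate_annealing * samples)."""
    return rate / (1 + rate_annealing * samples)


def layer_rate(rate, rate_decay, n):
    """Learning rate of the n-th layer (1-based): rate * rate_decay ^ (n - 1)."""
    return rate * rate_decay ** (n - 1)


# With rate_annealing = 1e-6 the rate halves after one million samples.
halved = annealed_rate(0.005, 1e-6, 1_000_000)
# With rate_decay = 0.5 the third layer trains at a quarter of the base rate.
third_layer = layer_rate(0.01, 0.5, 3)
```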

DeepLearningV3

parameters
DeepLearningParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

DownloadDataV3

frame_id
Key
Frame to downloadIn
hex_string
boolean
Emit double values in a machine readable lossless format with Double.toHexString().In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
csv
string
CSV StreamOut
filename
string
Suggested FilenameOut
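
A request for GET /3/DownloadDataset could be assembled roughly as follows. This is a sketch under stated assumptions: the base URL (http://localhost:54321), the frame id iris.hex, the lowercase "true"/"false" encoding of the boolean, and the helper name are all illustrative and not taken from this reference. The second half shows that Python's float.fromhex can read the hexadecimal notation produced by Double.toHexString().

```python
from urllib.parse import urlencode


def download_dataset_url(base_url, frame_id, hex_string=False):
    """Build a GET /3/DownloadDataset URL from the DownloadDataV3 input fields."""
    query = urlencode({
        "frame_id": frame_id,
        # Assumption: booleans are sent as lowercase "true"/"false".
        "hex_string": str(hex_string).lower(),
    })
    return f"{base_url}/3/DownloadDataset?{query}"


url = download_dataset_url("http://localhost:54321", "iris.hex", hex_string=True)

# Java's Double.toHexString(3.0) yields "0x1.8p1"; Python parses it losslessly.
value = float.fromhex("0x1.8p1")
```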

EndpointV4

url
string
Method+Url of the request; variable parts are enclosed in curly braces. For example: /4/schemas/{schema_name}In
description
string
Short description of the functionality provided by the endpoint.In
name
string
Unique name of the endpoint. These names can be used to look up the endpoint’s info via GET /4/endpoints/{name}.In
input_schema
string
Input schema.In
output_schema
string
Schema for the result returned by the endpoint.In
__schema
string
Url describing the schema of the current object.In

EndpointsListV4

endpoints
Route[]
List of endpoints in H2O REST API (v4).In
__schema
string
Url describing the schema of the current object.In

EventLogEntryV99

timestamp
long
Timestamp for this event, in milliseconds since Jan 1, 1970Out
level
enum
Importance of this log eventOut
stage
enum
Stage of the AutoML process for this log eventOut
message
string
Message for this eventOut
name
string
String identifier associated to this entryOut
value
string
Value associated to this entryOut
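
Because the timestamp field is expressed in milliseconds since Jan 1, 1970, a client typically has to divide by 1000 before handing it to a date library. A minimal sketch, with a hypothetical helper name:

```python
from datetime import datetime, timezone


def event_time(timestamp_ms):
    """Convert an EventLogEntry timestamp (milliseconds since epoch) to UTC."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


epoch = event_time(0)
sample = event_time(1_600_000_000_000).isoformat()
```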

EventLogV99

automl_id
Key
ID of the AutoML run for which the event log was recordedIn/Out
verbosity
enum
Verbosity level of the returned event logIn/Out
events
EventLogEntry[]
List of events produced during the AutoML runOut
table
TwoDimTable
A table representation of this event log, for easy renderingOut

EventV3

date
string
Time when the event was recorded. Format is hh:mm:ss:msIn
nanos
long
Time in nanosIn
type
enum
type of recorded eventIn

ExtendedIsolationForestModelOutputV3

names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

ExtendedIsolationForestModelV3

model_id
Key
Model keyIn/Out
parameters
ExtendedIsolationForestParameters
The build parameters for the model (e.g. K for KMeans).Out
output
ExtendedIsolationForestOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

ExtendedIsolationForestParametersV3

ntrees
int
Number of Extended Isolation Forest trees.In
sample_size
int
Number of randomly sampled observations used to train each Extended Isolation Forest tree.In
extension_level
int
Maximum is N - 1 (N = numCols). Minimum is 0. Extended Isolation Forest with extension_level = 0 behaves like Isolation Forest.In
seed
long
Seed for pseudo random number generator (if applicable)In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
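
The bounds stated for extension_level (minimum 0, maximum N - 1 where N is the number of columns) lend themselves to a client-side check before a build request is sent. The helper below is a hypothetical sketch of such a check, not part of the H2O API.

```python
def validate_extension_level(extension_level, num_cols):
    """Return extension_level if it lies in [0, num_cols - 1], else raise."""
    if not 0 <= extension_level <= num_cols - 1:
        raise ValueError(
            f"extension_level must be between 0 and {num_cols - 1}, "
            f"got {extension_level}"
        )
    return extension_level


# For a 5-column frame, 0 (plain Isolation Forest behaviour) through 4 are valid.
lowest = validate_extension_level(0, 5)
highest = validate_extension_level(4, 5)
```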

ExtendedIsolationForestV3

parameters
ExtendedIsolationForestParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

FeatureInteractionV3

model_id
Key
Model id of interestIn
max_interaction_depth
int
Maximum interaction depthIn
max_tree_depth
int
Maximum tree depthIn
max_deepening
int
Maximum deepeningIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
feature_interaction
TwoDimTable[]
Feature importance tableOut

FieldMetadataV3

schema_name
string
Schema name for this field, if it is_schema, or the name of the enum, if it’s an enum.In
name
string
Field name in the SchemaOut
type
string
Type for this fieldOut
is_schema
boolean
Type for this field is itself a Schema.Out
value
Polymorphic
Value for this fieldOut
help
string
A short help description to appear alongside the field in a UIOut
label
string
The label that should be displayed for the field if the name is insufficientOut
required
boolean
Is this field required, or is the default value generally sufficient?Out
level
enum
How important is this field? The web UI uses the level to do a slow reveal of the parametersOut
direction
enum
Is this field an input, output or inout?Out
is_inherited
boolean
Is the field inherited from the parent schema?Out
inherited_from
string
If this field is inherited from a class higher in the hierarchy, which one?Out
is_gridable
boolean
Is the field gridable (i.e., it can be used in grid call)Out
values
string[]
For enum-type fields the allowed values are specified using the values annotation; this is used in UIs to tell the user the allowed values, and for validationOut
json
boolean
Should this field be rendered in the JSON representation?Out
is_member_of_frames
string[]
For Vec-type fields this is the set of Frame-type fields which must contain the named column; for example, for a SupervisedModel the response_column must be in both the training_frame and (if it’s set) the validation_frameOut
is_mutually_exclusive_with
string[]
For Vec-type fields this is the set of other Vec-type fields which must contain mutually exclusive values; for example, for a SupervisedModel the response_column must be mutually exclusive with the weights_columnOut

FindV3

key
Frame
Frame to searchIn
column
string
Column, or null for allIn
row
long
Starting row for searchIn
match
string
Value to search for; leave blank for a search for missing valuesIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
prev
long
previous row with matching value, or -1Out
next
long
next row with matching value, or -1Out

FrameBaseV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
frame_id
Key
Frame IDIn/Out
byte_size
long
Total data size in bytesOut
is_text
boolean
Is this Frame raw unparsed data?Out

FrameChunkV3

chunk_id
int
An identifier unique in scope of a given frameOut
row_count
int
Number of rows represented by the chunkOut
node_idx
int
Index of the H2O node on which the chunk is locatedOut

FrameChunksV3

frame_id
Key
ID of a given frameIn/Out
chunks
Iced[]
Description of particular chunksOut

FrameKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

FrameLoadV3

frame_id
Key
Import frame under given key into DKV.In
dir
string
Source directory (hdfs, s3, local) containing serialized frameIn
force
boolean
Override existing frame in case it exists or throw exception if set to falseIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
job
Job
Job indicating progressOut

FrameSaveV3

frame_id
Key
Name of Frame of interestIn
dir
string
Destination directory (hdfs, s3, local)In
force
boolean
Overwrite destination file in case it exists or throw exception if set to false.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
job
Job
Job indicating progressOut

FrameSynopsisV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
frame_id
Key
Frame IDIn/Out
rows
long
Number of rows in the FrameOut
columns
long
Number of columns in the FrameOut
byte_size
long
Total data size in bytesOut
is_text
boolean
Is this Frame raw unparsed data?Out

FrameV3

row_offset
long
Row offset to displayIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
row_count
int
Number of rows to displayIn/Out
column_offset
int
Column offset to returnIn/Out
column_count
int
Number of columns to returnIn/Out
full_column_count
int
Number of full columns to return. The columns between full_column_count and column_count will be returned without the dataIn/Out
total_column_count
int
Total number of columns in the FrameIn/Out
frame_id
Key
Frame IDIn/Out
checksum
long
checksumOut
rows
long
Number of rows in the FrameOut
num_columns
long
Number of columns in the FrameOut
default_percentiles
double[]
Default percentiles, from 0 to 1Out
columns
Vec[]
Columns in the FrameOut
compatible_models
string[]
Compatible models, if requestedOut
chunk_summary
TwoDimTable
Chunk summaryOut
distribution_summary
TwoDimTable
Distribution summaryOut
byte_size
long
Total data size in bytesOut
is_text
boolean
Is this Frame raw unparsed data?Out
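
The paging inputs (row_offset, row_count) and _exclude_fields can be combined into a single query string. The sketch below is illustrative only: the /3/Frames/{frame_id} path, the base URL, the frame id and the helper name are assumptions not shown in this section, while the _exclude_fields value is the one quoted in the field description.

```python
from urllib.parse import quote, urlencode


def frame_page_url(base_url, frame_id, row_offset=0, row_count=100, exclude_fields=()):
    """Build a URL that requests one page of rows from a frame."""
    params = {"row_offset": row_offset, "row_count": row_count}
    if exclude_fields:
        # _exclude_fields is a comma-separated list of JSON field paths.
        params["_exclude_fields"] = ",".join(exclude_fields)
    # Keep "/" and "," readable in the query, as in the documented example.
    query = urlencode(params, safe="/,")
    return f"{base_url}/3/Frames/{quote(frame_id, safe='')}?{query}"


page_url = frame_page_url(
    "http://localhost:54321",
    "iris.hex",
    row_offset=100,
    row_count=50,
    exclude_fields=["frames/frame_id/URL", "__meta"],
)
```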

FramesListV3

frame_id
Key
Name of Frame of interestIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
frames
Iced[]
FramesOut

FramesV3

frame_id
Key
Name of Frame of interestIn
column
string
Name of column of interestIn
find_compatible_models
boolean
Find and return compatible models?In
path
string
File output pathIn
force
boolean
Overwrite existing fileIn
num_parts
int
Number of part files to use (1=single file,-1=automatic)In
parallel
boolean
Use parallel export to a single file (doesn’t apply when num_parts != 1, creates temporary files in the destination directory)In
format
enum
Output file format. Defaults to ‘csv’.In
compression
string
Compression method (default none; gzip, bzip2 and snappy available depending on runtime environment)In
separator
byte
Field separator (default ‘,’)In
header
boolean
Use header (default true)In
quote_header
boolean
Quote column names in header line (default true)In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
row_offset
long
Row offset to returnIn/Out
row_count
int
Number of rows to returnIn/Out
column_offset
int
Column offset to returnIn/Out
full_column_count
int
Number of full columns to return. The columns between full_column_count and column_count will be returned without the dataIn/Out
column_count
int
Number of columns to returnIn/Out
job
Job
Job for export fileOut
frames
Frame[]
FramesOut
compatible_models
Model[]
Compatible modelsOut
domain
string[][]
DomainsOut
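
The export-related inputs above interact: parallel only applies to a single-file export (num_parts = 1), and separator is typed as a byte. The sketch below collects them into a parameter dictionary under those two rules. The function name export_params is hypothetical, sending the separator as its numeric byte value is an assumption based on the byte type, and raising an error for an inapplicable parallel flag is a client-side choice rather than documented server behavior.

```python
def export_params(path, force=False, num_parts=1, parallel=False,
                  compression=None, separator=",", header=True,
                  quote_header=True):
    """Collect export inputs into a dict (hypothetical helper)."""
    # "parallel" is documented as not applying when num_parts != 1.
    if parallel and num_parts != 1:
        raise ValueError("parallel only applies to single-file export (num_parts=1)")
    encoded = separator.encode("utf-8")
    if len(encoded) != 1:
        raise ValueError("separator must be a single byte")
    params = {
        "path": path,
        "force": force,
        "num_parts": num_parts,      # 1 = single file, -1 = automatic
        "parallel": parallel,
        "separator": encoded[0],     # assumption: sent as a numeric byte value
        "header": header,
        "quote_header": quote_header,
    }
    if compression is not None:
        params["compression"] = compression
    return params


params = export_params("/tmp/frame.csv", force=True, compression="gzip")
print(params)
```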

FriedmanPopescusHV3

model_id
Key
Model id of interestIn
frame
Frame
Frame the model has been fitted toIn
variables
string[]
Variables of interestIn
h
double
Value of H statisticIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

GAMModelOutputV3

coefficients_table
TwoDimTable
Table of CoefficientsIn
coefficients_table_no_centering
TwoDimTable
Table of Coefficients without centeringIn
glm_scoring_history
TwoDimTable
GLM scoring historyIn
glm_model_summary
TwoDimTable
GLM model summaryIn
standardized_coefficient_magnitudes
TwoDimTable
Table of Standardized Coefficients MagnitudesIn
gam_transformed_center_key
string
key storing gam columns and predictor columns. For debugging purposes onlyIn
glm_zvalues
double[]
GLM Z values. For debugging purposes onlyIn
glm_pvalues
double[]
GLM p values. For debugging purposes onlyIn
glm_std_err
double[]
GLM standard error values. For debugging purposes onlyIn
variable_importances
TwoDimTable
Variable ImportancesOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

GAMModelV3

model_id
Key
Model keyIn/Out
parameters
GAMParameters
The build parameters for the model (e.g. K for KMeans).Out
output
GAMModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

GAMParametersV3

seed
long
Seed for pseudo random number generator (if applicable)In
family
enum
Family. Use binomial for classification with logistic regression, others are for regression problems.In
tweedie_variance_power
double
Tweedie variance powerIn
tweedie_link_power
double
Tweedie link powerIn
theta
double
ThetaIn
solver
enum
AUTO will set the solver based on the given data and the other parameters. IRLSM is fast on problems with a small number of predictors and for lambda search with L1 penalty; L_BFGS scales better for datasets with many columns.In
alpha
double[]
Distribution of regularization between the L1 (Lasso) and L2 (Ridge) penalties. A value of 1 for alpha represents Lasso regression, a value of 0 produces Ridge regression, and anything in between specifies the amount of mixing between the two. Default value of alpha is 0 when SOLVER = ‘L-BFGS’; 0.5 otherwise.In
lambda
double[]
Regularization strengthIn
startval
double[]
double array to initialize coefficients for GAM.In
lambda_search
boolean
Use lambda search starting at lambda max, given lambda is then interpreted as lambda minIn
early_stopping
boolean
Stop early when there is no more relative improvement on train or validation (if provided)In
nlambdas
int
Number of lambdas to be used in a search. Default indicates: If alpha is zero, with lambda search set to True, the value of nlambdas is set to 30 (fewer lambdas are needed for ridge regression); otherwise it is set to 100.In
standardize
boolean
Standardize numeric columns to have zero mean and unit varianceIn
plug_values
Key
Plug Values (a single-row frame containing values that will be used to impute missing values of the training/validation frame; use in conjunction with missing_values_handling = PlugValues)In
non_negative
boolean
Restrict coefficients (not intercept) to be non-negativeIn
max_iterations
int
Maximum number of iterationsIn
beta_epsilon
double
Converge if beta changes less (using L-infinity norm) than beta epsilon. ONLY applies to IRLSM solver.In
objective_epsilon
double
Converge if objective value changes less than this. Default indicates: If lambda_search is set to True the value of objective_epsilon is set to .0001. If the lambda_search is set to False and lambda is equal to zero, the value of objective_epsilon is set to .000001, for any other value of lambda the default value of objective_epsilon is set to .0001.In
gradient_epsilon
double
Converge if objective changes less (using L-infinity norm) than this, ONLY applies to L-BFGS solver. Default indicates: If lambda_search is set to False and lambda is equal to zero, the default value of gradient_epsilon is equal to .000001, otherwise the default value is .0001. If lambda_search is set to True, the conditional values above are 1E-8 and 1E-6 respectively.In
obj_reg
double
Likelihood divider in objective value computation, default is 1/nobsIn
link
enum
Link function.In
intercept
boolean
Include constant term in the modelIn
prior
double
Prior probability for y==1. To be used only for logistic regression iff the data has been sampled and the mean of response does not reflect reality.In
cold_start
boolean
Only applicable to multiple alpha/lambda values when calling GLM from GAM. If false, build the next model for next set of alpha/lambda values starting from the values provided by current model. If true will start GLM model from scratch.In
lambda_min_ratio
double
Minimum lambda used in lambda search, specified as a ratio of lambda_max (the smallest lambda that drives all coefficients to zero). Default indicates: if the number of observations is greater than the number of variables, then lambda_min_ratio is set to 0.0001; if the number of observations is less than the number of variables, then lambda_min_ratio is set to 0.01.In
beta_constraints
Key
Beta constraintsIn
max_active_predictors
int
Maximum number of active predictors during computation. Use as a stopping criterion to prevent expensive model building with many predictors. Default indicates: If the IRLSM solver is used, the value of max_active_predictors is set to 5000 otherwise it is set to 100000000.In
interactions
string[]
A list of predictor column indices to interact. All pairwise combinations will be computed for the list.In
interaction_pairs
StringPair[]
A list of pairwise (first order) column interactions.In
compute_p_values
boolean
Request p-values computation, p-values work only with IRLSM solver and no regularizationIn
remove_collinear_columns
boolean
In case of linearly dependent columns, remove some of the dependent columnsIn
num_knots
int[]
Number of knots for gam predictors. If specified, must specify one for each gam predictor. For monotone I-splines, minimum = 2; for cs spline, minimum = 3. For thin plate, minimum is size of polynomial basis + 2.In
spline_orders
int[]
Order of I-splines or NBSplineTypeI M-splines used for gam predictors. If specified, must be the same size as gam_columns. For I-splines, the spline_orders will be the same as the polynomials used to generate the splines. For M-splines, the polynomials used to generate the splines will be spline_order-1. Values for bs=0 or 1 will be ignored.In
splines_non_negative
boolean[]
Valid for I-spline (bs=2) only. True if the I-splines are monotonically increasing (and monotonically non-decreasing) and False if the I-splines are monotonically decreasing (and monotonically non-increasing). If specified, must be the same size as gam_columns. Values for other spline types will be ignored. Defaults to true.In
gam_columns
string[][]
Arrays of predictor column names for gam for smoothers using single or multiple predictors like {{‘c1’},{‘c2’,‘c3’},{‘c4’},…}In
scale
double[]
Smoothing parameter for gam predictors. If specified, must be of the same length as gam_columnsIn
bs
int[]
Basis function type for each gam predictor: 0 for cr, 1 for thin plate regression with knots, 2 for monotone I-splines, 3 for NBSplineTypeI M-splines (refer to doc here: https://h2oai.atlassian.net/browse/PUBDEV-8835). If specified, must be the same size as gam_columnsIn
keep_gam_cols
boolean
Save keys of model matrixIn
standardize_tp_gam_cols
boolean
standardize tp (thin plate) predictor columnsIn
scale_tp_penalty_mat
boolean
Scale penalty matrix for tp (thin plate) smoothers as in RIn
knot_ids
string[]
Array storing frame keys of knots. One for each gam column set specified in gam_columnsIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
missing_values_handling
enum
Handling of missing values. Either MeanImputation, Skip or PlugValues.In/Out
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
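
Several GAM inputs above must line up with gam_columns: bs, num_knots and scale each need one entry per gam column set, and num_knots has a stated minimum for some basis types. The sketch below checks those constraints before a request is sent. The function name check_gam_spec is hypothetical, and mapping the "cs spline, minimum = 3" rule to bs = 0 (cr) is an assumption; the thin plate minimum depends on the polynomial basis size and is not checked here.

```python
# Minimum knots per basis type, from the num_knots description.
# Assumption: the "cs spline" minimum of 3 applies to bs = 0 (cr).
MIN_KNOTS = {0: 3, 2: 2}


def check_gam_spec(gam_columns, bs=None, num_knots=None, scale=None):
    """Return a list of human-readable problems; empty means consistent."""
    n = len(gam_columns)
    problems = []
    for name, values in (("bs", bs), ("num_knots", num_knots), ("scale", scale)):
        if values is not None and len(values) != n:
            problems.append(f"{name} has {len(values)} entries, expected {n}")
    if bs is not None and num_knots is not None and len(bs) == len(num_knots) == n:
        for cols, basis, knots in zip(gam_columns, bs, num_knots):
            minimum = MIN_KNOTS.get(basis)
            if minimum is not None and knots < minimum:
                problems.append(
                    f"{cols}: bs={basis} needs at least {minimum} knots, got {knots}"
                )
    return problems


gam_columns = [["c1"], ["c2", "c3"], ["c4"]]
print(check_gam_spec(gam_columns, bs=[0, 1, 2], num_knots=[2, 10, 2], scale=[0.1, 0.1]))
```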

GAMV3

parameters
GAMParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out
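
A model-builder response carries both a job and the parameter validation results (messages, error_count), so a client can tell whether a build request was rejected before polling the job. A minimal sketch of that check follows. The function name and the sample payload are hypothetical, and the keys inside each message object are not defined in this section, so the sketch passes messages through untouched.

```python
import json


def summarize_build_response(text):
    """Extract the validation outcome from a builder response body."""
    body = json.loads(text)
    error_count = body.get("error_count", 0)
    return {
        "ok": error_count == 0,
        "error_count": error_count,
        "messages": body.get("messages", []),
        "job": body.get("job"),
    }


# Hypothetical payload; real responses contain many more fields.
sample = json.dumps({
    "algo": "gam",
    "error_count": 1,
    "messages": [{"message": "hypothetical validation failure"}],
    "job": None,
})
summary = summarize_build_response(sample)
print(summary["ok"], summary["error_count"])
```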

GBMModelOutputV3

variable_importances
TwoDimTable
Variable ImportancesOut
init_f
double
The Intercept term, the initial model function value to which trees make adjustmentsOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
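
The timing fields above (start_time, end_time, run_time) are all expressed in milliseconds. The sketch below converts them into readable values, assuming that start_time and end_time are milliseconds since the Unix epoch and that run_time equals their difference; neither assumption is stated in this reference. The function name timing_summary and the sample values are hypothetical.

```python
from datetime import datetime, timezone


def timing_summary(output):
    """Convert millisecond timing fields of a model output into readable values."""
    def to_utc(ms):
        # Assumption: milliseconds since the Unix epoch.
        return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

    return {
        "start": to_utc(output["start_time"]).isoformat(),
        "end": to_utc(output["end_time"]).isoformat(),
        "run_time_secs": output["run_time"] / 1000,
        # Assumption: run_time is end_time minus start_time.
        "consistent": output["end_time"] - output["start_time"] == output["run_time"],
    }


sample_output = {
    "start_time": 1_700_000_000_000,
    "end_time": 1_700_000_012_500,
    "run_time": 12_500,
}
print(timing_summary(sample_output))
```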

GBMModelV3

model_id
Key
Model keyIn/Out
parameters
GBMParameters
The build parameters for the model (e.g. K for KMeans).Out
output
GBMOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

GBMParametersV3

learn_rate
double
Learning rate (from 0.0 to 1.0)In
learn_rate_annealing
double
Scale the learning rate by this factor after each tree (e.g., 0.99 or 0.999)In
sample_rate
double
Row sample rate per tree (from 0.0 to 1.0)In
col_sample_rate
double
Column sample rate (from 0.0 to 1.0)In
monotone_constraints
KeyValue[]
A mapping representing monotonic constraints. Use +1 to enforce an increasing constraint and -1 to specify a decreasing constraint.In
max_abs_leafnode_pred
double
Maximum absolute value of a leaf node predictionIn
pred_noise_bandwidth
double
Bandwidth (sigma) of Gaussian multiplicative noise ~N(1,sigma) for tree node predictionsIn
interaction_constraints
string[][]
A set of allowed column interactions.In
ntrees
int
Number of trees.In
max_depth
int
Maximum tree depth (0 for unlimited).In
min_rows
double
Fewest allowed (weighted) observations in a leaf.In
nbins
int
For numerical columns (real/int), build a histogram of (at least) this many bins, then split at the best pointIn
nbins_top_level
int
For numerical columns (real/int), build a histogram of (at most) this many bins at the root level, then decrease by factor of two per levelIn
nbins_cats
int
For categorical columns (factors), build a histogram of this many bins, then split at the best point. Higher values can lead to more overfitting.In
r2_stopping
double
r2_stopping is no longer supported and will be ignored if set. Please use stopping_rounds, stopping_metric and stopping_tolerance instead. Previous versions of H2O would stop making trees when the R^2 metric equaled or exceeded this value.In
seed
long
Seed for pseudo random number generator (if applicable)In
build_tree_one_node
boolean
Run on one node only; no network overhead but fewer cpus used. Suitable for small datasets.In
sample_rate_per_class
double[]
A list of row sample rates per class (relative fraction for each class, from 0.0 to 1.0), for each treeIn
col_sample_rate_per_tree
double
Column sample rate per tree (from 0.0 to 1.0)In
col_sample_rate_change_per_level
double
Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0)In
score_tree_interval
int
Score the model after every so many trees. Disabled if set to 0.In
min_split_improvement
double
Minimum relative improvement in squared error reduction for a split to happenIn
histogram_type
enum
What type of histogram to use for finding optimal split pointsIn
calibrate_model
boolean
Use Platt Scaling (default) or Isotonic Regression to calculate calibrated class probabilities. Calibration can provide more accurate estimates of class probabilities.In
in_training_checkpoints_dir
string
Create checkpoints into defined directory while training process is still running. In case of cluster shutdown, this checkpoint can be used to restart training.In
in_training_checkpoints_tree_interval
int
Checkpoint the model after every so many trees. Parameter is used only when in_training_checkpoints_dir is definedIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
calibration_frame
Key
Data for model calibrationIn/Out
calibration_method
enum
Calibration method to useIn/Out
check_constant_response
boolean
Check if response column is constant. If enabled, then an exception is thrown if the response column is a constant value. If disabled, then model will train regardless of the response column being a constant value or not.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
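
Two of the GBM inputs above describe simple numeric schedules: learn_rate_annealing scales the learning rate after each tree, and nbins_top_level is halved per tree level. The sketch below computes both. The function names are hypothetical, applying the unscaled learn_rate to the first tree is an assumption, and using nbins as the floor for the halving schedule is an interpretation of the "at least" and "at most" wording.

```python
def annealed_learn_rates(learn_rate, learn_rate_annealing, ntrees):
    """Learning rate used for each tree when it is scaled after every tree."""
    rates = []
    rate = learn_rate
    for _ in range(ntrees):
        rates.append(rate)
        rate *= learn_rate_annealing  # scale after each tree
    return rates


def histogram_bins_per_level(nbins, nbins_top_level, max_depth):
    """Bins for numerical columns per level: halve from the root, never below nbins."""
    return [max(nbins, nbins_top_level >> depth) for depth in range(max_depth + 1)]


rates = annealed_learn_rates(learn_rate=0.1, learn_rate_annealing=0.99, ntrees=5)
bins = histogram_bins_per_level(nbins=20, nbins_top_level=1024, max_depth=7)
print(rates)
print(bins)
```

With an annealing factor below 1, later trees contribute smaller corrections, which is why the description suggests values close to 1 such as 0.99 or 0.999.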

GBMV3

parameters
GBMParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

GLMModelOutputV3

coefficients_table
TwoDimTable
Table of CoefficientsIn
random_coefficients_table
TwoDimTable
Table of Random Coefficients for HGLMIn
coefficients_table_multinomials_with_class_names
TwoDimTable
Table of Coefficients with coefficients denoted with class names, for GLM multinomials only.In
standardized_coefficient_magnitudes
TwoDimTable
Standardized Coefficient MagnitudesIn
lambda_best
double
Lambda minimizing the objective value, only applicable with lambda search or when arrays of alpha and lambdas are providedIn
alpha_best
double
Alpha minimizing the objective value, only applicable when arrays of alphas are givenIn
best_submodel_index
int
Submodel index minimizing the objective value, only applicable for arrays of alphas/lambdaIn
lambda_1se
double
Lambda best + 1 standard error. Only applicable with lambda search and cross-validationIn
lambda_min
double
Minimum lambda value calculated that may be used for lambda search. Early-stop may happen and the minimum lambda value will not be used in this case.In
lambda_max
double
Starting lambda value used when lambda search is enabled.In
dispersion
double
Dispersion parameter, only applicable to Tweedie family (input/output) and fractional Binomial (output only)In
vif_predictor_names
string[]
Predictor names where variable inflation factors are calculated.In
variable_inflation_factors
double[]
predictor variable inflation factors.In
variable_importances
TwoDimTable
Variable ImportancesOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
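
The lambda_max and lambda_min outputs above bound the values visited during lambda search, which starts at lambda_max and works down. The sketch below generates such a decreasing sequence. The log-spaced (geometric) spacing is an assumption, a common choice for regularization paths, and is not stated in this reference; the function name lambda_path and the sample values are hypothetical.

```python
def lambda_path(lambda_max, lambda_min, nlambdas):
    """Decreasing, log-spaced lambdas from lambda_max down to lambda_min."""
    if lambda_max <= 0 or lambda_min <= 0 or lambda_min > lambda_max:
        raise ValueError("need 0 < lambda_min <= lambda_max")
    if nlambdas < 1:
        raise ValueError("nlambdas must be at least 1")
    if nlambdas == 1:
        return [lambda_max]
    # Constant ratio between neighbouring lambdas (assumed spacing).
    ratio = (lambda_min / lambda_max) ** (1.0 / (nlambdas - 1))
    return [lambda_max * ratio ** i for i in range(nlambdas)]


path = lambda_path(lambda_max=1.0, lambda_min=1e-4, nlambdas=5)
print(path)
```

Because early stopping may end the search before the last value, lambda_min is a lower bound on the path and not necessarily a value that was fitted.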

GLMModelV3

model_id
Key
Model keyIn/Out
parameters
GLMParameters
The build parameters for the model (e.g. K for KMeans).Out
output
GLMOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

GLMParametersV3

seed
long
Seed for pseudo random number generator (if applicable)In
family
enum
Family. Use binomial for classification with logistic regression, others are for regression problems.In
rand_family
enum[]
Random component family array, one for each random component. Only gaussian is supported for now.In
tweedie_variance_power
double
Tweedie variance powerIn
dispersion_learning_rate
double
Dispersion learning rate is only valid for tweedie family dispersion parameter estimation using ml. It must be > 0. This controls how much the dispersion parameter estimate is to be changed when the calculated loglikelihood actually decreases with the new dispersion. In this case, instead of setting new dispersion = dispersion - change, we set new dispersion = dispersion + dispersion_learning_rate * change. Defaults to 0.5.In
tweedie_link_power
double
Tweedie link powerIn
theta
double
ThetaIn
solver
enum
AUTO will set the solver based on the given data and the other parameters. IRLSM is fast on problems with a small number of predictors and for lambda search with an L1 penalty; L_BFGS scales better for datasets with many columns.In
alpha
double[]
Distribution of regularization between the L1 (Lasso) and L2 (Ridge) penalties. A value of 1 for alpha represents Lasso regression, a value of 0 produces Ridge regression, and anything in between specifies the amount of mixing between the two. Default value of alpha is 0 when SOLVER = ‘L-BFGS’; 0.5 otherwise.In
lambda
double[]
Regularization strengthIn
lambda_search
boolean
Use lambda search starting at lambda max; the given lambda is then interpreted as lambda min.In
early_stopping
boolean
Stop early when there is no more relative improvement on train or validation (if provided)In
nlambdas
int
Number of lambdas to be used in a search. Default indicates: if alpha is zero and lambda_search is set to True, the value of nlambdas is set to 30 (fewer lambdas are needed for ridge regression); otherwise it is set to 100.In
score_iteration_interval
int
Perform scoring for every score_iteration_interval iterationsIn
standardize
boolean
Standardize numeric columns to have zero mean and unit varianceIn
cold_start
boolean
Only applicable to multiple alpha/lambda values. If false, build the next model for next set of alpha/lambda values starting from the values provided by current model. If true will start GLM model from scratch.In
plug_values
Key
Plug Values (a single-row frame containing values that will be used to impute missing values of the training/validation frame; use in conjunction with missing_values_handling = PlugValues)In
non_negative
boolean
Restrict coefficients (not intercept) to be non-negativeIn
max_iterations
int
Maximum number of iterationsIn
beta_epsilon
double
Converge if beta changes less (using L-infinity norm) than beta_epsilon. ONLY applies to the IRLSM solver.In
objective_epsilon
double
Converge if objective value changes less than this. Default (of -1.0) indicates: If lambda_search is set to True the value of objective_epsilon is set to .0001. If the lambda_search is set to False and lambda is equal to zero, the value of objective_epsilon is set to .000001, for any other value of lambda the default value of objective_epsilon is set to .0001.In
gradient_epsilon
double
Converge if the gradient (using L-infinity norm) is less than this. ONLY applies to the L-BFGS solver. Default (of -1.0) indicates: If lambda_search is set to False and lambda is equal to zero, the default value of gradient_epsilon is equal to .000001, otherwise the default value is .0001. If lambda_search is set to True, the conditional values above are 1E-8 and 1E-6 respectively.In
obj_reg
double
Likelihood divider in objective value computation, default (of -1.0) will set it to 1/nobsIn
link
enum
Link function.In
dispersion_parameter_method
enum
Method used to estimate the dispersion parameter for Tweedie, Gamma and Negative Binomial only.In
rand_link
enum[]
Link function array for random component in HGLM.In
startval
double[]
double array to initialize fixed and random coefficients for HGLM, coefficients for GLM.In
random_columns
int[]
random columns indices for HGLM.In
calc_like
boolean
if true, will return likelihood function value for HGLM.In
generate_variable_inflation_factors
boolean
if true, will generate variable inflation factors for numerical predictors. Default to false.In
intercept
boolean
Include constant term in the modelIn
build_null_model
boolean
If set, will build a model with only the intercept. Default to false.In
fix_dispersion_parameter
boolean
Only used for Tweedie, Gamma and Negative Binomial GLM. If set, will use the dispersion parameter in init_dispersion_parameter as the standard error and use it to calculate the p-values. Defaults to false.In
init_dispersion_parameter
double
Only used for Tweedie, Gamma and Negative Binomial GLM. Stores the initial value of the dispersion parameter. If fix_dispersion_parameter is set, this value will be used in the calculation of p-values. Defaults to 1.0.In
HGLM
boolean
If set to true, will return HGLM model. Otherwise, normal GLM model will be returnedIn
prior
double
Prior probability for y==1. To be used only for logistic regression iff the data has been sampled and the mean of response does not reflect reality.In
lambda_min_ratio
double
Minimum lambda used in lambda search, specified as a ratio of lambda_max (the smallest lambda that drives all coefficients to zero). Default indicates: if the number of observations is greater than the number of variables, then lambda_min_ratio is set to 0.0001; if the number of observations is less than the number of variables, then lambda_min_ratio is set to 0.01.In
beta_constraints
Key
Beta constraintsIn
max_active_predictors
int
Maximum number of active predictors during computation. Use as a stopping criterion to prevent expensive model building with many predictors. Default indicates: If the IRLSM solver is used, the value of max_active_predictors is set to 5000 otherwise it is set to 100000000.In
interactions
string[]
A list of predictor column indices to interact. All pairwise combinations will be computed for the list.In
interaction_pairs
StringPair[]
A list of pairwise (first order) column interactions.In
compute_p_values
boolean
Request p-values computation, p-values work only with IRLSM solver and no regularizationIn
fix_tweedie_variance_power
boolean
If true, will fix tweedie variance power value to the value set in tweedie_variance_power.In
remove_collinear_columns
boolean
In case of linearly dependent columns, remove some of the dependent columnsIn
generate_scoring_history
boolean
If set to true, will generate scoring history for GLM. This may significantly slow down the algo.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
missing_values_handling
enum
Handling of missing values. Either MeanImputation, Skip or PlugValues.In/Out
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
dispersion_epsilon
double
If the change in the dispersion parameter estimate or in the loglikelihood value is smaller than dispersion_epsilon, will break out of the dispersion parameter estimation loop using maximum likelihood.In/Out
tweedie_epsilon
double
In estimating the tweedie dispersion parameter using maximum likelihood, this is used to choose the lower and upper indices in the approximation of the infinite series summation.In/Out
max_iterations_dispersion
int
Control the maximum number of iterations in the dispersion parameter estimation loop using maximum likelihood.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
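
Several GLMParametersV3 descriptions above state conditional defaults (nlambdas, objective_epsilon, max_active_predictors). The following is a minimal Python sketch of those rules exactly as the descriptions word them; it is not H2O's implementation. The function names are hypothetical, and passing a single alpha or lambda value (rather than the double[] the schema declares) is a simplification made for illustration.

```python
def default_nlambdas(alpha, lambda_search):
    """nlambdas rule: 30 when alpha is zero and lambda search is on, else 100."""
    return 30 if (lambda_search and alpha == 0) else 100


def default_objective_epsilon(lambda_search, lambda_value):
    """objective_epsilon rule applied when the parameter is left at -1.0."""
    if lambda_search:
        return 1e-4
    # Without lambda search, only an unregularized fit (lambda == 0)
    # gets the tighter tolerance.
    return 1e-6 if lambda_value == 0 else 1e-4


def default_max_active_predictors(solver):
    """max_active_predictors rule: 5000 for IRLSM, 100000000 otherwise."""
    return 5000 if solver == "IRLSM" else 100_000_000


# Example: ridge regression with lambda search, solved with L_BFGS.
print(default_nlambdas(alpha=0.0, lambda_search=True))
print(default_objective_epsilon(lambda_search=True, lambda_value=0.0))
print(default_max_active_predictors("L_BFGS"))
```

A client could use helpers like these to predict which effective values a build will use when the corresponding parameters are omitted from the request.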

GLMRegularizationPathV3

model
Key
source modelIn
lambdas
double[]
Computed lambda valuesIn
alphas
double[]
alpha values used in building submodelsIn
explained_deviance_train
double[]
explained deviance on the training setIn
explained_deviance_valid
double[]
explained deviance on the validation setIn
coefficients
double[][]
coefficients for all lambdasIn
coefficients_std
double[][]
standardized coefficients for all lambdasIn
coefficient_names
string[]
coefficient namesIn
z_values
double[][]
z-valuesIn
p_values
double[][]
p-valuesIn
std_errs
double[][]
standard errorIn
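
As an illustration of how the GLMRegularizationPathV3 fields relate to each other, the sketch below picks the submodel with the highest validation explained deviance and pairs its coefficients with their names. It assumes, without confirmation from this reference, that all arrays are parallel (index i describes submodel i) and that coefficients[i] is ordered like coefficient_names. The payload values and the function name best_submodel are made up for the example.

```python
def best_submodel(path):
    """Return lambda, alpha and named coefficients of the best submodel.

    'Best' here means highest explained_deviance_valid; this selection
    criterion is an assumption of the sketch, not something the schema states.
    """
    scores = path["explained_deviance_valid"]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return {
        "lambda": path["lambdas"][best],
        "alpha": path["alphas"][best],
        "coefficients": dict(
            zip(path["coefficient_names"], path["coefficients"][best])
        ),
    }


# Hypothetical payload shaped like the schema above (values are invented).
example_path = {
    "lambdas": [0.1, 0.01, 0.001],
    "alphas": [0.5, 0.5, 0.5],
    "explained_deviance_valid": [0.20, 0.35, 0.31],
    "coefficient_names": ["x1", "x2", "Intercept"],
    "coefficients": [
        [0.0, 0.0, 1.0],
        [0.4, 0.0, 0.9],
        [0.5, 0.1, 0.8],
    ],
}

print(best_submodel(example_path))
```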

GLMV3

parameters
GLMParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out
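
The _exclude_fields description above gives a literal example URL. A minimal Python sketch of building such a query string with the standard library follows; the helper name with_exclude_fields is hypothetical, and keeping "/" and "," unescaped is a choice made here to reproduce the example in the description.

```python
from urllib.parse import urlencode


def with_exclude_fields(path, field_paths):
    """Append an _exclude_fields query parameter to an endpoint path."""
    # Join the JSON field paths with commas, as the description specifies,
    # and leave "/" and "," unescaped so the result matches the documented form.
    query = urlencode({"_exclude_fields": ",".join(field_paths)}, safe="/,")
    return f"{path}?{query}"


url = with_exclude_fields("/3/Frames", ["frames/frame_id/URL", "__meta"])
print(url)
```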

GLRMModelOutputV3

iterations
int
Number of iterations executedIn
updates
int
Number of updates executedIn
objective
double
Current value of the objective functionIn
avg_change_obj
double
Average change in objective value on final iterationIn
step_size
double
Final step sizeIn
archetypes
TwoDimTable
Mapping from lower dimensional k-space to training features (Y)In
singular_vals
double[]
Singular values of XY matrixIn
eigenvectors
TwoDimTable
Eigenvectors of XY matrixIn
representation_name
string
Frame key name for X matrixIn
importance
TwoDimTable
Standard deviation and importance of each principal componentIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

GLRMModelV3

model_id
Key
Model keyIn/Out
parameters
GLRMParameters
The build parameters for the model (e.g. K for KMeans).Out
output
GLRMOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

GLRMParametersV3

transform
enum
Transformation of training dataIn
k
int
Rank of matrix approximationIn
loss
enum
Numeric loss functionIn
multi_loss
enum
Categorical loss functionIn
loss_by_col
enum[]
Loss function by column (override)In
loss_by_col_idx
int[]
Loss function by column index (override)In
period
int
Length of period (only used with periodic loss function)In
regularization_x
enum
Regularization function for X matrixIn
regularization_y
enum
Regularization function for Y matrixIn
gamma_x
double
Regularization weight on X matrixIn
gamma_y
double
Regularization weight on Y matrixIn
max_iterations
int
Maximum number of iterationsIn
max_updates
int
Maximum number of updates, defaults to 2*max_iterationsIn
init_step_size
double
Initial step sizeIn
min_step_size
double
Minimum step sizeIn
seed
long
RNG seed for initializationIn
init
enum
Initialization modeIn
svd_method
enum
Method for computing SVD during initialization (Caution: Randomized is currently experimental and unstable)In
user_y
Key
User-specified initial YIn
user_x
Key
User-specified initial XIn
loading_name
string
[Deprecated] Use representation_name instead. Frame key to save resulting X.In
representation_name
string
Frame key to save resulting XIn
expand_user_y
boolean
Expand categorical columns in user-specified initial YIn
impute_original
boolean
Reconstruct original training data by reversing transformIn
recover_svd
boolean
Recover singular values and eigenvectors of XYIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
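
The stopping_rounds and stopping_tolerance descriptions above outline a moving-average early-stopping rule. The sketch below is one possible reading of that rule for a metric where lower is better and values are positive (for example logloss or deviance); it is not taken from H2O's source, and the function name should_stop is hypothetical. It compares the best of the k most recent length-k moving averages against the moving average that immediately precedes them.

```python
def should_stop(history, stopping_rounds, stopping_tolerance=0.0):
    """Decide whether training should stop, given metric values per scoring event.

    Assumes a lower-is-better, positive metric. Returns False until at least
    2 * stopping_rounds scoring events are available.
    """
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # 0 disables early stopping; too little history otherwise

    # Simple moving averages of length k over the whole history.
    averages = [sum(history[i:i + k]) / k for i in range(len(history) - k + 1)]

    reference = averages[-(k + 1)]    # average just before the last k averages
    recent_best = min(averages[-k:])  # best of the k most recent averages

    # Stop when the relative improvement is not at least stopping_tolerance.
    return recent_best >= reference * (1.0 - stopping_tolerance)


print(should_stop([1.0, 0.9, 0.8, 0.8, 0.8, 0.8], stopping_rounds=2))
print(should_stop([1.0, 0.9, 0.8, 0.7, 0.6, 0.5], stopping_rounds=2))
```

For a metric where higher is better (such as AUC), the comparison would need to be mirrored.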

GLRMV3

parameters
GLRMParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

GarbageCollectV3

(No fields)

GenericModelOutputV3

variable_importances
TwoDimTable
Variable ImportancesOut
original_model_identifier
string
Short identifier of the original algorithm nameOut
original_model_full_name
string
Full, potentially long name of the original algorithmOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

GenericModelV3

model_id
Key
Model keyIn/Out
parameters
GenericModelParameters
The build parameters for the model (e.g. K for KMeans).Out
output
GenericModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

GenericParametersV3

path
string
Path to file with self-contained model archive.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_key
Key
Key to the self-contained model archive already uploaded to H2O.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
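
The custom_metric_func and custom_distribution_func fields above share the reference format language:keyName=funcName. A minimal Python sketch of splitting such a reference into its three parts is shown below. The function name parse_custom_func_ref and the sample reference string are hypothetical; only the format itself comes from the field descriptions.

```python
def parse_custom_func_ref(reference):
    """Split 'language:keyName=funcName' into its three components."""
    language, colon, rest = reference.partition(":")
    key_name, equals, func_name = rest.partition("=")
    # Every separator and every component must be present.
    if not (colon and equals and language and key_name and func_name):
        raise ValueError(
            f"expected 'language:keyName=funcName', got {reference!r}"
        )
    return {"language": language, "key_name": key_name, "func_name": func_name}


# Invented reference string, used only to exercise the parser.
print(parse_custom_func_ref("python:my_metric_key=MyMetric"))
```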

GenericV3

parameters
GenericModelParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

GramV3

X
Key
source dataIn
W
VecSpecifier
weight vectorIn
use_all_factor_levels
boolean
use all factor levels when doing 1-hot encodingIn
standardize
boolean
standardize dataIn
skip_missing
boolean
skip missing valuesIn
destination_frame
Key
Destination key for the resulting matrix.Out
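
For orientation, the sketch below computes a weighted Gram matrix in the textbook sense, G = X^T diag(w) X, in plain Python. It is an assumption that this matches what the endpoint returns; the sketch ignores use_all_factor_levels, standardize and skip_missing, and any scaling H2O may apply. The function name weighted_gram is hypothetical.

```python
def weighted_gram(rows, weights=None):
    """Return X^T diag(w) X for a dense, purely numeric matrix given as rows."""
    n_cols = len(rows[0])
    if weights is None:
        weights = [1.0] * len(rows)  # unweighted case: every row counts once

    gram = [[0.0] * n_cols for _ in range(n_cols)]
    for row, w in zip(rows, weights):
        for i in range(n_cols):
            for j in range(n_cols):
                # Each row contributes w * x_i * x_j to entry (i, j).
                gram[i][j] += w * row[i] * row[j]
    return gram


X = [[1.0, 2.0], [3.0, 4.0]]
W = [1.0, 0.5]
print(weighted_gram(X, W))
```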

GrepModelOutputV3

matches
string[]
Matching stringsIn
offsets
long[]
Byte offsets of matchesIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

GrepModelV3

model_id
Key
Model keyIn/Out
parameters
GrepParameters
The build parameters for the model (e.g. K for KMeans).Out
output
GrepOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

GrepParametersV3

regex
string
regexIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
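
The weights_column description above says that a weight of 2 is equivalent to repeating a row twice and a weight of 0 is equivalent to excluding it. The sketch below checks that claim on a weighted mean squared error. The loss function and the data are illustrative assumptions, not H2O's internal loss.

```python
import math


def weighted_mse(actual, predicted, weights):
    """Weighted mean squared error: sum(w * err^2) / sum(w)."""
    total = sum(w * (a - p) ** 2 for a, p, w in zip(actual, predicted, weights))
    return total / sum(weights)


actual = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]

# Row 2 has weight 0 (excluded), row 3 has weight 2 (counted twice).
with_weights = weighted_mse(actual, predicted, [1.0, 0.0, 2.0])

# Same data with row 2 removed and row 3 physically repeated, all weights 1.
expanded = weighted_mse([1.0, 4.0, 4.0], [1.5, 3.0, 3.0], [1.0, 1.0, 1.0])

print(with_weights, expanded)
```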

GrepV3

parameters
GrepParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

GridExportV3

grid_id
string
ID of the Grid to load from the directoryIn
grid_directory
string
Path to the directory with saved Grid searchIn
save_params_references
boolean
True if objects referenced by params should also be saved.In
export_cross_validation_predictions
boolean
Flag indicating whether the exported model artifacts should also include CV Holdout Frame predictionsIn

GridImportV3

grid_path
string
Full path to the file containing saved GridIn
load_params_references
boolean
If true, also load saved objects referenced by params. Fails with an error if the grid was saved without objects referenced by params.In

GridKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

GridSchemaV99

grid_id
Key
Grid idIn
model_ids
Key[]
Model IDs built by a grid searchIn
sort_by
string
Model performance metric to sort by. Examples: logloss, residual_deviance, mse, rmse, mae, rmsle, auc, r2, f1, recall, precision, accuracy, mcc, err, err_count, lift_top_group, max_per_class_errorIn/Out
decreasing
boolean
Specify whether sort order should be decreasing.In/Out
hyper_names
string[]
Used hyper parameters.Out
failed_params
Parameters[]
List of failed parametersOut
warning_details
string[]
List of detailed warning messagesOut
failure_details
string[]
List of detailed failure messagesOut
failure_stack_traces
string[]
List of detailed failure stack tracesOut
failed_raw_params
string[][]
List of raw parameters causing model building failureOut
training_metrics
ModelMetrics[]
Training model metrics for the returned models; only returned if sort_by is setOut
validation_metrics
ModelMetrics[]
Validation model metrics for the returned models; only returned if sort_by is setOut
cross_validation_metrics
ModelMetrics[]
Cross validation model metrics for the returned models; only returned if sort_by is setOut
cross_validation_metrics_summary
TwoDimTable[]
Cross validation model metrics summary for the returned models; only returned if sort_by is setOut
export_checkpoints_dir
string
Directory for Grid automatic checkpointingOut
summary_table
TwoDimTable
SummaryOut
scoring_history
TwoDimTable
Scoring historyOut
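
The sort_by and decreasing fields above describe ordering a grid's models by a performance metric. A client-side sketch of that ordering might look like the following. The model ids and metric values are made up for illustration, and the function is not part of any H2O client.

```python
def sort_models(metrics_by_model, sort_by, decreasing=False):
    """Return model ids ordered by the chosen metric (hypothetical helper)."""
    return sorted(
        metrics_by_model,
        key=lambda model_id: metrics_by_model[model_id][sort_by],
        reverse=decreasing,
    )


# Made-up metrics keyed by made-up model ids.
metrics = {
    "model_a": {"auc": 0.81, "logloss": 0.52},
    "model_b": {"auc": 0.88, "logloss": 0.41},
    "model_c": {"auc": 0.79, "logloss": 0.60},
}

by_auc = sort_models(metrics, "auc", decreasing=True)
by_logloss = sort_models(metrics, "logloss")
print(by_auc, by_logloss)
```

Metrics where larger is better (such as auc) would typically be sorted with decreasing set to true, while loss-like metrics (such as logloss) would be sorted in increasing order.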

GridSearchSchema

parameters
Parameters
Basic model builder parameters.In
parallelism
int
Level of parallelism during grid model building. 1 = sequential building (default). 0 for adaptive parallelism. Any number > 1 sets the exact number of models built in parallel.In
recovery_dir
string
Path to a directory where the grid will save everything necessary to resume training after a cluster crash.In
job_id
Key
Key to use for the Job handling this GridSearch (internal use only).In
hyper_parameters
Map
Grid search parameters.In/Out
grid_id
Key
Destination id for this grid; auto-generated if not specified.In/Out
search_criteria
HyperSpaceSearchCriteria
Hyperparameter search criteria, including strategy and early stopping directives. If it is not given, exhaustive Cartesian is used.In/Out
total_models
int
Number of all models generated by grid search.Out
job
Job
Job Key.Out
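
The search_criteria description above states that an exhaustive Cartesian search is used when no criteria are given. As a rough sketch of what that means for hyper_parameters, the following enumerates every combination of the listed values. The parameter names and values are illustrative, and this is not H2O's grid implementation.

```python
from itertools import product


def cartesian_grid(hyper_parameters):
    """Enumerate every combination of hyper-parameter values (illustrative only)."""
    names = list(hyper_parameters)
    value_lists = [hyper_parameters[name] for name in names]
    # One dict per combination, e.g. {"max_depth": 3, "ntrees": 50}.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


grid = cartesian_grid({"max_depth": [3, 5], "ntrees": [50, 100, 200]})
for params in grid:
    print(params)
```

Under this reading, the number of combinations (2 x 3 = 6 here) is the number of models an exhaustive search would attempt to build.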

GridsV99

grids
Grid[]
GridsOut

H2OErrorV3

timestamp
long
Milliseconds since the epoch for the time that this H2OError instance was created. Generally this is a short time since the underlying error occurred.Out
error_url
string
Error urlOut
msg
string
Message intended for the end user (a data scientist).Out
dev_msg
string
Potentially more detailed message intended for a developer (e.g. a front end engineer or someone designing a language binding).Out
http_status
int
HTTP status code for this error.Out
values
Map
Any values that are relevant to reporting or handling this error. Examples are a key name if the error is on a key, or a field name and object name if it’s on a specific field.Out
exception_type
string
Exception type, if any.Out
exception_msg
string
Raw exception message, if any.Out
stacktrace
string[]
Stacktrace, if any.Out
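
A client might decode an H2OErrorV3 response roughly as sketched below. The dataclass mirrors the field names listed above; the sample payload, including its message text and the extra __meta key, is invented for illustration and is not a captured server response.

```python
import json
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone


@dataclass
class H2OError:
    """Client-side holder for the H2OErrorV3 fields listed above (hypothetical class)."""
    timestamp: int
    msg: str
    dev_msg: str
    http_status: int
    error_url: str = ""
    values: dict = field(default_factory=dict)
    exception_type: str = ""
    exception_msg: str = ""
    stacktrace: list = field(default_factory=list)

    @property
    def created_at(self):
        # timestamp is milliseconds since the epoch, per the field description.
        return datetime.fromtimestamp(self.timestamp / 1000, tz=timezone.utc)


def parse_error(payload):
    """Parse a JSON error body, ignoring keys that are not H2OErrorV3 fields."""
    data = json.loads(payload)
    known = {f.name for f in fields(H2OError)}
    return H2OError(**{k: v for k, v in data.items() if k in known})


# Invented example payload.
sample = json.dumps({
    "__meta": {"schema_name": "H2OErrorV3"},
    "timestamp": 1700000000000,
    "error_url": "/3/Frames/missing_frame",
    "msg": "Object not found",
    "dev_msg": "Object not found for key missing_frame",
    "http_status": 404,
    "values": {"key": "missing_frame"},
    "stacktrace": [],
})

err = parse_error(sample)
print(err.http_status, err.msg, err.created_at.isoformat())
```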

H2OModelBuilderErrorV3

parameters
Parameters
Model builder parameters.Out
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
timestamp
long
Milliseconds since the epoch for the time that this H2OError instance was created. Generally this is a short time since the underlying error occurred.Out
error_url
string
Error urlOut
msg
string
Message intended for the end user (a data scientist).Out
dev_msg
string
Potentially more detailed message intended for a developer (e.g. a front end engineer or someone designing a language binding).Out
http_status
int
HTTP status code for this error.Out
values
Map
Any values that are relevant to reporting or handling this error. Examples are a key name if the error is on a key, or a field name and object name if it’s on a specific field.Out
exception_type
string
Exception type, if any.Out
exception_msg
string
Raw exception message, if any.Out
stacktrace
string[]
Stacktrace, if any.Out

HeartBeatEvent

sends
int
number of sent heartbeatsIn
recvs
int
number of received heartbeatsIn
date
string
Time when the event was recorded. Format is hh:mm:ss:msIn
nanos
long
Time in nanosIn
type
enum
type of recorded eventIn

HyperSpaceSearchCriteriaV99

strategy
enum
Hyperparameter space search strategy.In/Out

IOEvent

io_flavor
string
flavor of the recorded io (ice/hdfs/…)In
node
string
node where this io event happenedIn
data
string
data infoIn
date
string
Time when the event was recorded. Format is hh:mm:ss:msIn
nanos
long
Time in nanosIn
type
enum
type of recorded eventIn

ImportFilesMultiV3

paths
string[]
pathsIn
pattern
string
patternIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
files
string[]
filesOut
destination_frames
string[]
namesOut
fails
string[]
failsOut
dels
string[]
delsOut

ImportFilesV3

path
string
pathIn
pattern
string
patternIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
files
string[]
filesOut
destination_frames
string[]
namesOut
fails
string[]
failsOut
dels
string[]
delsOut

ImportHiveTableV3

database
string
databaseIn
table
string
tableIn
partitions
string[][]
partitionsIn
allow_multi_format
boolean
partitionsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

ImportSQLTableV99

connection_url
string
connection_urlIn
table
string
tableIn
select_query
string
select_queryIn
use_temp_table
string
use_temp_tableIn
temp_table_name
string
temp_table_nameIn
username
string
usernameIn
password
string
passwordIn
columns
string
columnsIn
fetch_mode
string
Mode for data loading. All modes may not be supported by all databases.In
num_chunks_hint
string
Desired number of chunks for the target Frame. Optional.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

InfogramModelOutputV3

admissible_cmi
double[]
Array of conditional mutual information for admissible features normalized to 0.0 and 1.0Out
admissible_cmi_raw
double[]
Array of conditional mutual information for admissible features raw and not normalized to 0.0 and 1.0Out
admissible_relevance
double[]
Array of variable importance for admissible featuresOut
admissible_features
string[]
Array containing names of admissible features for the userOut
admissible_features_valid
string[]
Array containing names of admissible features for the user from the validation dataset.Out
admissible_features_xval
string[]
Array containing names of admissible features for the user from cross-validation.Out
cmi_raw
double[]
Array of raw conditional mutual information for all features excluding sensitive attributes if applicableOut
cmi
double[]
Array of conditional mutual information for all features excluding sensitive attributes if applicable normalized to 0.0 and 1.0Out
all_predictor_names
string[]
Array containing names of all features excluding sensitive attributes if applicable corresponding to CMI and relevanceOut
relevance
double[]
Array of variable importance for all features excluding sensitive attributes if applicableOut
admissible_score_key
Key
Frame key that stores the predictor names, net CMI and relevance.Out
admissible_score_key_valid
Key
Frame key that stores the predictor names, net CMI and relevance calculated from validation dataset.Out
admissible_score_key_xval
Key
Frame key that stores the predictor names, net CMI and relevance from cross-validation.Out
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

InfogramModelV3

model_id
Key
Model keyIn/Out
parameters
InfogramParameters
The build parameters for the model (e.g. K for KMeans).Out
output
InfogramModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

InfogramParametersV3

seed
long
Seed for pseudo random number generator (if applicable).In
standardize
boolean
Standardize numeric columns to have zero mean and unit variance.In
plug_values
Key
Plug Values (a single-row frame containing values that will be used to impute missing values of the training/validation frame; use in conjunction with missing_values_handling = PlugValues).In
max_iterations
int
Maximum number of iterations.In
prior
double
Prior probability for y==1. To be used only for logistic regression iff the data has been sampled and the mean of response does not reflect reality.In
algorithm_params
string
Customized parameters for the machine learning algorithm specified in the algorithm parameter.In
protected_columns
string[]
Columns that contain features that are sensitive and need to be protected (legally, or otherwise), if applicable. These features (e.g. race, gender, etc) should not drive the prediction of the response.In
total_information_threshold
double
A number between 0 and 1 representing a threshold for total information, defaulting to 0.1. For a specific feature, if the total information is higher than this threshold, and the corresponding net information is also higher than the threshold net_information_threshold, that feature will be considered admissible. The total information is the x-axis of the Core Infogram. Default is -1 which gets set to 0.1.In
net_information_threshold
double
A number between 0 and 1 representing a threshold for net information, defaulting to 0.1. For a specific feature, if the net information is higher than this threshold, and the corresponding total information is also higher than the total_information_threshold, that feature will be considered admissible. The net information is the y-axis of the Core Infogram. Default is -1 which gets set to 0.1.In
relevance_index_threshold
double
A number between 0 and 1 representing a threshold for the relevance index, defaulting to 0.1. This is only used when protected_columns is set by the user. For a specific feature, if the relevance index value is higher than this threshold, and the corresponding safety index is also higher than the safety_index_threshold, that feature will be considered admissible. The relevance index is the x-axis of the Fair Infogram. Default is -1 which gets set to 0.1.In
safety_index_threshold
double
A number between 0 and 1 representing a threshold for the safety index, defaulting to 0.1. This is only used when protected_columns is set by the user. For a specific feature, if the safety index value is higher than this threshold, and the corresponding relevance index is also higher than the relevance_index_threshold, that feature will be considered admissible. The safety index is the y-axis of the Fair Infogram. Default is -1 which gets set to 0.1.In
data_fraction
double
The fraction of training frame to use to build the infogram model. Defaults to 1.0, and any value greater than 0 and less than or equal to 1.0 is acceptable.In
top_n_features
int
An integer specifying the number of columns to evaluate in the infogram. The columns are ranked by variable importance, and the top N are evaluated. Defaults to 50.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
algorithm
enum
Type of machine learning algorithm used to build the infogram. Options include ‘AUTO’ (gbm), ‘deeplearning’ (Deep Learning with default parameters), ‘drf’ (Random Forest with default parameters), ‘gbm’ (GBM with default parameters), ‘glm’ (GLM with default parameters), or ‘xgboost’ (if available, XGBoost with default parameters).In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
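
The total_information_threshold and net_information_threshold descriptions above define when a feature counts as admissible in the Core Infogram: both values must be higher than their thresholds, and a threshold of -1 is replaced by 0.1. The following sketch encodes that rule as stated; the function names are hypothetical and the strict comparison is an assumption based on the wording "higher than".

```python
def resolve_threshold(value, default=0.1):
    """Map the sentinel -1 to the documented default of 0.1."""
    return default if value == -1 else value


def is_admissible(total_information, net_information,
                  total_information_threshold=-1,
                  net_information_threshold=-1):
    """Apply the Core Infogram admissibility rule as described (hypothetical helper)."""
    total_thr = resolve_threshold(total_information_threshold)
    net_thr = resolve_threshold(net_information_threshold)
    # Both values must be strictly higher than their thresholds (assumed strict).
    return total_information > total_thr and net_information > net_thr


print(is_admissible(0.4, 0.3))                                   # both above 0.1
print(is_admissible(0.4, 0.05))                                  # net information too low
print(is_admissible(0.4, 0.3, total_information_threshold=0.5))  # custom threshold
```

The Fair Infogram rule for relevance_index_threshold and safety_index_threshold has the same shape, so the same function could be reused with those two values.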

InfogramV3

parameters
InfogramParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

InitIDV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
session_key
string
Session IDIn/Out
session_properties_allowed
boolean
Indicates whether users are allowed to set and modify session propertiesOut

InputSchemaV4

_fields
string
Filter on the set of output fields: if you set _fields=”foo,bar,baz”, then only those fields will be included in the output; or you can specify _fields=”-goo,gee” to include all fields except goo and gee. If the result contains nested data structures, then you can refer to the fields within those structures as well. For example if you specify _fields=”foo(oof),bar(-rab)”, then only fields foo and bar will be included, and within foo there will be only field oof, whereas within bar all fields except rab will be reported.In
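
The two flat forms of the _fields filter described above (an include list, and an exclude list introduced by a leading minus) can be emulated on the client as sketched below. This is only an illustration of the described behaviour, not the server's implementation: the nested forms such as foo(oof) are not handled, and treating an empty filter as "return everything" is an assumption.

```python
def apply_fields(result, spec):
    """Emulate the flat include/exclude forms of _fields on a dict (sketch only)."""
    spec = spec.strip()
    if not spec:
        # Assumption: an empty filter leaves the result unchanged.
        return dict(result)
    # A leading "-" turns the whole list into an exclusion list.
    exclude = spec.startswith("-")
    names = {name.strip() for name in spec.lstrip("-").split(",") if name.strip()}
    if exclude:
        return {k: v for k, v in result.items() if k not in names}
    return {k: v for k, v in result.items() if k in names}


result = {"foo": 1, "bar": 2, "goo": 3, "gee": 4}
included = apply_fields(result, "foo,bar")
excluded = apply_fields(result, "-goo,gee")
print(included, excluded)
```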

InteractionV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
dest
Key
destination keyIn/Out
source_frame
Key
Input data frameIn/Out
factor_columns
string[]
Factor columnsIn/Out
pairwise
boolean
Whether to create pairwise quadratic interactions between factors (otherwise create one higher-order interaction). Only applicable if there are 3 or more factors.In/Out
max_factors
int
Max. number of factor levels in pair-wise interaction terms (if enforced, one extra catch-all factor will be made)In/Out
min_occurrence
int
Min. occurrence threshold for factor levels in pair-wise interaction termsIn/Out
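
The pairwise flag above chooses between pairwise interactions and a single higher-order interaction across all factor columns. The sketch below lists which column combinations each setting would produce. It only enumerates column names; it does not build interaction levels or apply max_factors or min_occurrence, and the column names are illustrative.

```python
from itertools import combinations


def interaction_terms(factor_columns, pairwise):
    """List the column groups that would be interacted (illustrative only)."""
    if pairwise:
        # Every unordered pair of factor columns.
        return list(combinations(factor_columns, 2))
    # Otherwise a single interaction over all factor columns.
    return [tuple(factor_columns)]


columns = ["a", "b", "c"]
pairs = interaction_terms(columns, pairwise=True)
single = interaction_terms(columns, pairwise=False)
print(pairs, single)
```

With only two factor columns both settings give the same single pair, which is consistent with the note that the flag only matters for 3 or more factors.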

IoStatsEntry

backend
string
Back end typeOut
store_count
long
Number of store eventsOut
store_bytes
long
Cumulative stored bytesOut
delete_count
long
Number of delete eventsOut
load_count
long
Number of load eventsOut
load_bytes
long
Cumulative loaded bytesOut

IsolationForestModelOutputV3

variable_splits
TwoDimTable
Variable SplitsOut
variable_importances
TwoDimTable
Variable ImportancesOut
init_f
double
The Intercept term, the initial model function value to which trees make adjustmentsOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

IsolationForestModelV3

model_id
Key
Model keyIn/Out
parameters
IsolationForestParameters
The build parameters for the model (e.g. K for KMeans).Out
output
IsolationForestOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

IsolationForestParametersV3

sample_size
long
Number of randomly sampled observations used to train each Isolation Forest tree. Only one of parameters sample_size and sample_rate should be defined. If sample_rate is defined, sample_size will be ignored.In
sample_rate
double
Rate of randomly sampled observations used to train each Isolation Forest tree. Needs to be in range from 0.0 to 1.0. If set to -1, sample_rate is disabled and sample_size will be used instead.In
mtries
int
Number of variables randomly sampled as candidates at each split. If set to -1, defaults to (number of predictors)/3.In
contamination
double
Contamination ratio - the proportion of anomalies in the input dataset. If undefined (-1) the predict function will not mark observations as anomalies and only anomaly score will be returned. Defaults to -1 (undefined).In
ntrees
int
Number of trees.In
max_depth
int
Maximum tree depth (0 for unlimited).In
min_rows
double
Fewest allowed (weighted) observations in a leaf.In
nbins
int
For numerical columns (real/int), build a histogram of (at least) this many bins, then split at the best pointIn
nbins_top_level
int
For numerical columns (real/int), build a histogram of (at most) this many bins at the root level, then decrease by factor of two per levelIn
nbins_cats
int
For categorical columns (factors), build a histogram of this many bins, then split at the best point. Higher values can lead to more overfitting.In
r2_stopping
double
r2_stopping is no longer supported and will be ignored if set. Use stopping_rounds, stopping_metric and stopping_tolerance instead. Previous versions of H2O would stop building trees when the R^2 metric equaled or exceeded this value.In
seed
long
Seed for pseudo random number generator (if applicable)In
build_tree_one_node
boolean
Run on one node only; no network overhead but fewer cpus used. Suitable for small datasets.In
sample_rate_per_class
double[]
A list of row sample rates per class (relative fraction for each class, from 0.0 to 1.0), for each treeIn
col_sample_rate_per_tree
double
Column sample rate per tree (from 0.0 to 1.0)In
col_sample_rate_change_per_level
double
Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0)In
score_tree_interval
int
Score the model after every so many trees. Disabled if set to 0.In
min_split_improvement
double
Minimum relative improvement in squared error reduction for a split to happenIn
histogram_type
enum
What type of histogram to use for finding optimal split pointsIn
calibrate_model
boolean
Use Platt Scaling (default) or Isotonic Regression to calculate calibrated class probabilities. Calibration can provide more accurate estimates of class probabilities.In
in_training_checkpoints_dir
string
Create checkpoints into defined directory while training process is still running. In case of cluster shutdown, this checkpoint can be used to restart training.In
in_training_checkpoints_tree_interval
int
Checkpoint the model after every so many trees. Parameter is used only when in_training_checkpoints_dir is definedIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
validation_response_column
VecSpecifier
(experimental) Name of the response column in the validation frame. Response column should be binary and indicate not anomaly/anomaly.In/Out
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
calibration_frame
Key
Data for model calibrationIn/Out
calibration_method
enum
Calibration method to useIn/Out
check_constant_response
boolean
Check if response column is constant. If enabled, then an exception is thrown if the response column is a constant value. If disabled, then the model will train regardless of whether the response column is a constant value.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
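
The interplay between sample_size, sample_rate and contamination described above can be sketched as a small helper that assembles request parameters. This is only an illustration: the helper name, the `train.hex` frame id, the illustrative `ntrees=50` value, and the choice of a form-encoded body are assumptions, not something this reference specifies.

```python
from urllib.parse import urlencode


def isolation_forest_params(training_frame, ntrees=50, sample_size=None,
                            sample_rate=None, contamination=-1.0, seed=None):
    """Build a flat dict using field names from IsolationForestParametersV3.

    Hypothetical helper; only the field names come from the schema.
    """
    rate_defined = sample_rate is not None and sample_rate != -1
    if rate_defined and not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be in [0.0, 1.0], or -1 to disable it")

    params = {
        "training_frame": training_frame,
        "ntrees": ntrees,
        # -1 means undefined: only anomaly scores are returned, no anomaly flags.
        "contamination": contamination,
    }
    if rate_defined:
        # A defined sample_rate wins; sample_size would be ignored anyway.
        params["sample_rate"] = sample_rate
    elif sample_size is not None:
        # Disable the rate explicitly so that sample_size is used instead.
        params["sample_rate"] = -1
        params["sample_size"] = sample_size
    if seed is not None:
        params["seed"] = seed
    return params


# Assumed encoding: a form-encoded body for a model-builder request.
body = urlencode(isolation_forest_params("train.hex", sample_size=256, seed=42))
print(body)
```

Sending only one of the two sampling parameters keeps the request consistent with the rule that sample_size and sample_rate should not both be defined.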

IsolationForestV3

parameters
IsolationForestParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

IsotonicRegressionModelOutputV3

thresholds_y
double[]
thresholds yIn
thresholds_x
double[]
thresholds XIn
min_x
double
min XIn
max_x
double
max XIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

IsotonicRegressionModelV3

model_id
Key
Model keyIn/Out
parameters
IsotonicRegressionParameters
The build parameters for the model (e.g. K for KMeans).Out
output
IsotonicRegressionOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

IsotonicRegressionParametersV3

distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
out_of_bounds
enum
Method of handling values of X predictor that are outside of the bounds seen in training.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
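
The stopping_rounds and stopping_tolerance descriptions above amount to a moving-average convergence test. The sketch below is a simplified reading of that rule, not H2O's actual implementation: it assumes a positive, lower-is-better metric (such as logloss or deviance) and a hypothetical `history` list holding one metric value per scoring event.

```python
def should_stop(history, stopping_rounds, stopping_tolerance):
    """Return True if the moving average of the metric has stopped improving.

    Simplified sketch: assumes a positive metric where lower is better.
    """
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # disabled (0), or not enough scoring events yet

    # Simple moving averages of length k over the scoring history.
    averages = [sum(history[i:i + k]) / k for i in range(len(history) - k + 1)]

    reference = averages[-(k + 1)]    # average just before the last k events
    best_recent = min(averages[-k:])  # best average within the last k events

    # Stop when the relative improvement is smaller than the tolerance.
    return best_recent > reference * (1.0 - stopping_tolerance)
```

For example, a flat history such as `[0.5] * 6` with `stopping_rounds=2` triggers a stop, while a steadily decreasing one does not.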

IsotonicRegressionV3

parameters
IsotonicRegressionParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

JStackV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
traces
DStackTrace[]
StacktracesOut
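
As a small illustration of the _exclude_fields parameter, the sketch below builds a query string in the same shape as the example in the field description. The helper name is hypothetical; keeping `/` and `,` unescaped is a choice made here so the output matches the documented example.

```python
from urllib.parse import urlencode


def with_exclude_fields(path, field_paths):
    """Append an _exclude_fields query parameter to a REST path."""
    # Join the JSON field paths with commas, as the description specifies.
    value = ",".join(field_paths)
    # Keep "/" and "," literal so the result mirrors the documented example.
    query = urlencode({"_exclude_fields": value}, safe="/,")
    return f"{path}?{query}"


print(with_exclude_fields("/3/Frames", ["frames/frame_id/URL", "__meta"]))
```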

JobIV4

job_id
string
Id of the job to fetch.In
_fields
string
Filter on the set of output fields: if you set _fields="foo,bar,baz", then only those fields will be included in the output; or you can specify _fields="-goo,gee" to include all fields except goo and gee. If the result contains nested data structures, then you can refer to the fields within those structures as well. For example, if you specify _fields="foo(oof),bar(-rab)", then only fields foo and bar will be included, and within foo there will be only field oof, whereas within bar all fields except rab will be reported.In
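
To make the _fields syntax concrete, here is a client-side sketch that parses a filter string and applies it to a nested dict. It only illustrates the semantics described above; it is not the server's implementation. Both function names are hypothetical, and keeping a field that carries a nested filter inside an exclusion list is an assumption.

```python
def parse_fields(spec):
    """Parse a _fields string into (exclude, {name: nested_filter_or_None})."""
    pos = 0

    def parse_list():
        nonlocal pos
        # A leading "-" turns the whole list into an exclusion list.
        exclude = spec.startswith("-", pos)
        if exclude:
            pos += 1
        names = {}
        while pos < len(spec) and spec[pos] != ")":
            start = pos
            while pos < len(spec) and spec[pos] not in ",()":
                pos += 1
            name = spec[start:pos].strip()
            nested = None
            if pos < len(spec) and spec[pos] == "(":
                pos += 1              # consume "("
                nested = parse_list()
                pos += 1              # consume ")"
            names[name] = nested
            if pos < len(spec) and spec[pos] == ",":
                pos += 1              # consume "," between items
        return exclude, names

    return parse_list()


def apply_fields(obj, parsed):
    """Apply a parsed filter to a dict; non-dict values pass through unchanged."""
    exclude, names = parsed
    if not isinstance(obj, dict):
        return obj
    result = {}
    for key, value in obj.items():
        if key in names:
            nested = names[key]
            if nested is not None:
                # Assumption: a field with a nested filter is always kept.
                result[key] = apply_fields(value, nested)
            elif not exclude:
                result[key] = value
        elif exclude:
            result[key] = value
    return result


data = {"foo": {"oof": 1, "x": 2}, "bar": {"rab": 3, "y": 4}, "baz": 5}
print(apply_fields(data, parse_fields("foo(oof),bar(-rab)")))
```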

JobKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

JobV3

key
Key
Job KeyIn
description
string
Job descriptionIn
dest
Key
destination keyIn/Out
status
string
job statusOut
progress
float
progress, from 0 to 1Out
progress_msg
string
current progress status descriptionOut
start_time
long
Start timeOut
msec
long
Runtime in millisecondsOut
warnings
string[]
warningsOut
exception
string
exceptionOut
stacktrace
string
stacktraceOut
auto_recoverable
boolean
recoverableOut
ready_for_view
boolean
ready for viewOut
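
A client that polls a job typically reduces a JobV3 payload to a short status line. The sketch below works on an already-decoded JSON dict and uses only field names listed above. The status strings in `TERMINAL_STATUSES` are assumed values, since this schema documents status only as a string.

```python
# Assumed status values; the schema above only says "string".
TERMINAL_STATUSES = {"DONE", "FAILED", "CANCELLED"}


def summarize_job(job):
    """Turn a decoded JobV3 dict into a one-line, human-readable summary."""
    status = job.get("status")
    if job.get("exception"):
        # Surface the exception message whenever one is present.
        return f"{status}: {job['exception']}"
    if status in TERMINAL_STATUSES:
        return f"{status} after {job.get('msec', 0)} ms"
    # progress is a float from 0 to 1; report it as a percentage.
    percent = round(100 * float(job.get("progress", 0.0)))
    message = job.get("progress_msg") or ""
    return f"{status} {percent}% {message}".strip()


print(summarize_job({"status": "RUNNING", "progress": 0.25,
                     "progress_msg": "Building trees"}))
```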

JobV4

job_id
string
Job idIn
status
enum
Job statusIn
progress
float
Current progress, a number going from 0 to 1In
progress_msg
string
Current progress status descriptionIn
start_time
long
Start timeIn
duration
long
Runtime in millisecondsIn
target_id
string
Id of the target object (being created by this Job)In
target_type
string
Type of the target: Frame, Model, etc.In
exception
string
Exception message, if an exception occurredIn
stacktrace
string
StacktraceIn
__schema
string
Url describing the schema of the current object.In

JobsV3

job_id
Key
Optional Job identifierIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
jobs
Job[]
jobsOut

KMeansModelOutputV3

centers
TwoDimTable
Cluster Centers[k][features]In
centers_std
TwoDimTable
Cluster Centers[k][features] on Standardized DataIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

KMeansModelV3

model_id
Key
Model keyIn/Out
parameters
KMeansParameters
The build parameters for the model (e.g. K for KMeans).Out
output
KMeansOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

KMeansParametersV3

user_points
Key
This option allows you to specify a dataframe in which each row represents an initial cluster center. The user-specified points must have the same number of columns as the training observations, and the number of rows must equal the number of clusters.In
max_iterations
int
Maximum training iterations (if estimate_k is enabled, then this is for each inner Lloyds iteration)In
standardize
boolean
Standardize columns before computing distancesIn
seed
long
RNG SeedIn
init
enum
Initialization modeIn
estimate_k
boolean
Whether to estimate the number of clusters (<=k) iteratively and deterministically.In
cluster_size_constraints
int[]
An array specifying the minimum number of points that should be in each cluster. The length of the constraints array has to be the same as the number of clusters.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
k
int
The max. number of clusters. If estimate_k is disabled, the model will find k centroids, otherwise it will find up to k centroids.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
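
Two of the constraints stated above (user_points must have one row per cluster, and cluster_size_constraints must have one entry per cluster) are easy to check before submitting a request. The sketch below is a hypothetical pre-flight check. The server performs its own validation and reports it through the messages field of the builder response.

```python
def check_kmeans_params(k, user_points_rows=None, cluster_size_constraints=None):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if k < 1:
        errors.append("k must be at least 1")
    if user_points_rows is not None and user_points_rows != k:
        # Each row of user_points is one initial cluster center.
        errors.append(
            f"user_points has {user_points_rows} rows but k is {k}")
    if cluster_size_constraints is not None:
        if len(cluster_size_constraints) != k:
            errors.append(
                f"cluster_size_constraints has {len(cluster_size_constraints)} "
                f"entries but k is {k}")
        if any(c < 0 for c in cluster_size_constraints):
            errors.append("cluster_size_constraints must be non-negative")
    return errors


print(check_kmeans_params(3, user_points_rows=3,
                          cluster_size_constraints=[10, 10, 5]))
```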

KMeansV3

parameters
KMeansParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

KeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

KeyValueV3

key
string
KeyIn
value
double
ValueIn

KillMinus3V3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

LeaderboardV99

project_name
string
Identifier for models that should be grouped together in the leaderboardIn/Out
sort_metric
string
Metric used to sort this leaderboardIn/Out
sort_decreasing
boolean
Metric direction used in the sortIn/Out
models
Key[]
List of models for this leaderboard, sorted by metric so that the best is firstOut
leaderboard_frame
Key
Frame for this leaderboardOut
leaderboard_frame_checksum
long
Checksum for the Frame for this leaderboardOut
sort_metrics
double[]
Sort metrics for the models in this leaderboard, in the same order as the modelsOut
table
TwoDimTable
A table representation of this leaderboard, for easy renderingOut
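
Because models and sort_metrics are parallel arrays in the same order, a client can pair them directly. The sketch below assumes a decoded LeaderboardV99 dict whose models entries are key objects with a name field, as in KeyV3. The function name and the sample values are hypothetical.

```python
def ranked_models(leaderboard):
    """Pair each model name with its sort metric, best model first."""
    names = [key["name"] for key in leaderboard["models"]]
    metrics = leaderboard["sort_metrics"]
    if len(names) != len(metrics):
        raise ValueError("models and sort_metrics must have the same length")
    # The arrays are already sorted by the server, so zip preserves the ranking.
    return list(zip(names, metrics))


example = {
    "sort_metric": "auc",
    "sort_decreasing": True,
    "models": [{"name": "model_a"}, {"name": "model_b"}],
    "sort_metrics": [0.91, 0.88],
}
print(ranked_models(example))
```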

LeaderboardsV99

project_name
string
Name of project of interestIn
extensions
string[]
List of extension columns to add to leaderboardIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
leaderboards
Leaderboard[]
LeaderboardsOut

ListRequestV4

__schema
string
Url describing the schema of the current object.In

LogAndEchoV3

message
string
Message to be Logged and EchoedIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

LogsV3

nodeidx
string
Identifier of the node to get logs from. It can be either a 0-based node index, where -1 means the current node, or an IP and port.In
name
string
Which specific log file to read from the log file directory. If left unspecified, the system chooses a default for you.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
log
string
Content of log fileOut
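
The two accepted forms of nodeidx can be told apart with a short parser. This is a sketch only: the function name and the returned tuple shapes are hypothetical, and the `ip:port` spelling of "IP and port" is an assumption.

```python
def parse_nodeidx(nodeidx):
    """Classify a nodeidx string as the current node, an index, or an address."""
    text = nodeidx.strip()
    try:
        index = int(text)
    except ValueError:
        # Not an integer: assume the "ip:port" form.
        host, sep, port = text.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"unrecognized nodeidx: {nodeidx!r}")
        return ("address", host, int(port))
    if index < -1:
        raise ValueError("node index must be -1 or a non-negative integer")
    # -1 means the current node; other values are 0-based node indices.
    return ("current",) if index == -1 else ("index", index)


print(parse_nodeidx("10.0.0.5:54321"))
```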

MakeGLMModelV3

model
Key
source modelIn
dest
Key
destination keyIn
names
string[]
coefficient namesIn
beta
double[]
new glm coefficientsIn
threshold
float
decision threshold for label-generationIn

MetadataV3

num
int
Number for specifying an endpointIn
http_method
string
HTTP method (GET, POST, DELETE) if fetching by pathIn
path
string
Path for specifying an endpointIn
classname
string
Class name, for fetching docs for a schema (DEPRECATED)In
schemaname
string
Schema name (e.g., DocsV1), for fetching docs for a schemaIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
routes
Route[]
List of endpoint routesOut
schemas
SchemaMetadata[]
List of schemasOut
markdown
string
Table of Contents MarkdownOut

MissingInserterV3

dataset
Key
datasetIn
fraction
double
Fraction of data to replace with a missing valueIn
seed
long
SeedIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

ModelBuilderSchema

parameters
Parameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

ModelBuilderV3

parameters
Parameters
Model builder parameters.Out
messages
ValidationMessage[]
Info, warning and error messages; NOTE: can be appended to while the Job is runningOut
error_count
int
Count of error messagesOut

ModelBuildersV3

algo
string
Algo of ModelBuilder of interestIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
model_builders
Map
ModelBuildersOut

ModelExportV3

model_id
Key
Name of Model of interestIn
dir
string
Destination file (hdfs, s3, local)In
force
boolean
If true, overwrite the destination file if it exists; if false, throw an exception.In
export_cross_validation_predictions
boolean
Flag indicating whether the exported model artifact should also include CV Holdout Frame predictionsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

ModelIdV3

model_id
string
Model IDOut

ModelImportV3

model_id
Key
Save imported model under given key into DKV.In
dir
string
Source directory (hdfs, s3, local) containing serialized modelIn
force
boolean
If true, overwrite an existing model with the same key; if false, throw an exception.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

ModelInfoV4

algo
string
Algorithm name, such as ‘gbm’, ‘deeplearning’, etc.In
maturity
string
Development status of the algorithm: alpha, beta, or stable.In
have_pojo
boolean
Does the model support generation of POJOs?In
have_mojo
boolean
Does the model support generation of MOJOs?In
mojo_version
string
Mojo version number for this algorithm.In
__schema
string
Url describing the schema of the current object.In

ModelKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

ModelMetricsAnomalyV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
mean_score
double
Mean Anomaly Score.Out
mean_normalized_score
double
Mean Normalized Anomaly Score.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsAutoEncoderV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsBaseV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
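The base metrics fields above are shared by every ModelMetrics schema that follows. The sketch below shows one way a client might interpret two of them: `scoring_time` as milliseconds since the epoch, and `RMSE` as the square root of `MSE`. The function name and the sample dictionary are hypothetical, and the sample values are made up for illustration.

```python
import math
from datetime import datetime, timezone


def summarize_base_metrics(metrics):
    """Derive human-readable values from a ModelMetricsBaseV3-like dict."""
    # scoring_time is documented as milliseconds since the epoch.
    started = datetime.fromtimestamp(metrics["scoring_time"] / 1000, tz=timezone.utc)
    return {
        "scoring_started_utc": started.isoformat(),
        # RMSE is the square root of MSE by definition.
        "rmse_from_mse": math.sqrt(metrics["MSE"]),
    }


# Illustrative values only, not real H2O output.
sample = {"scoring_time": 1_600_000_000_000, "MSE": 0.25}
print(summarize_base_metrics(sample))
```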

ModelMetricsBinomialGLMGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
coefficients_table
TwoDimTable
coefficients_tableOut
r2
double
The R^2 for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
AUC
double
The AUC for this scoring run.Out
pr_auc
double
The precision-recall AUC for this scoring run.Out
Gini
double
The Gini score for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
domain
string[]
The class labels of the response.Out
cm
ConfusionMatrix
The ConfusionMatrix at the threshold for maximum F1.Out
thresholds_and_metric_scores
TwoDimTable
The Metrics for various thresholds.Out
max_criteria_and_metric_scores
TwoDimTable
The Metrics for various criteria.Out
gains_lift_table
TwoDimTable
Gains and Lift table.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsBinomialGLMV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
r2
double
The R^2 for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
AUC
double
The AUC for this scoring run.Out
pr_auc
double
The precision-recall AUC for this scoring run.Out
Gini
double
The Gini score for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
domain
string[]
The class labels of the response.Out
cm
ConfusionMatrix
The ConfusionMatrix at the threshold for maximum F1.Out
thresholds_and_metric_scores
TwoDimTable
The Metrics for various thresholds.Out
max_criteria_and_metric_scores
TwoDimTable
The Metrics for various criteria.Out
gains_lift_table
TwoDimTable
Gains and Lift table.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
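The GLM metrics report `null_deviance` and `residual_deviance` separately. A common derived quantity, which this schema does not return, is the fraction of deviance explained. The sketch below computes it under the usual definition; the function name is hypothetical.

```python
def deviance_explained(null_deviance, residual_deviance):
    """Fraction of the null deviance that the fitted model accounts for.

    Derived on the client side; not a field of the schema above.
    """
    if null_deviance == 0:
        raise ValueError("null_deviance must be non-zero")
    return 1.0 - residual_deviance / null_deviance


# Illustrative values only.
print(deviance_explained(100.0, 25.0))
```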

ModelMetricsBinomialGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
AUC
double
The AUC for this scoring run.Out
pr_auc
double
The precision-recall AUC for this scoring run.Out
Gini
double
The Gini score for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
domain
string[]
The class labels of the response.Out
cm
ConfusionMatrix
The ConfusionMatrix at the threshold for maximum F1.Out
thresholds_and_metric_scores
TwoDimTable
The Metrics for various thresholds.Out
max_criteria_and_metric_scores
TwoDimTable
The Metrics for various criteria.Out
gains_lift_table
TwoDimTable
Gains and Lift table.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsBinomialUpliftV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
AUUC
double
The default AUUC for this scoring run.Out
auuc_normalized
double
The default normalized AUUC for this scoring run.Out
qini
double
The Qini value for this scoring run.Out
domain
string[]
The class labels of the response.Out
thresholds_and_metric_scores
TwoDimTable
The metrics for various thresholds.Out
auuc_table
TwoDimTable
Table of all types of AUUC.Out
aecu_table
TwoDimTable
Table of all types of AECU values.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsBinomialV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
AUC
double
The AUC for this scoring run.Out
pr_auc
double
The precision-recall AUC for this scoring run.Out
Gini
double
The Gini score for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
domain
string[]
The class labels of the response.Out
cm
ConfusionMatrix
The ConfusionMatrix at the threshold for maximum F1.Out
thresholds_and_metric_scores
TwoDimTable
The Metrics for various thresholds.Out
max_criteria_and_metric_scores
TwoDimTable
The Metrics for various criteria.Out
gains_lift_table
TwoDimTable
Gains and Lift table.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
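Two of the binomial fields above are closely related to others: `Gini` is conventionally `2 * AUC - 1`, and `cm` is taken at the threshold that maximizes F1. The sketch below illustrates both ideas with plain Python. The row layout `(threshold, tp, fp, fn)` is an assumption made for this example and is not the layout of `thresholds_and_metric_scores`.

```python
def gini_from_auc(auc):
    """Conventional relationship between the Gini coefficient and AUC."""
    return 2.0 * auc - 1.0


def f1_score(tp, fp, fn):
    """F1 expressed directly in terms of confusion-matrix counts."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def max_f1_threshold(rows):
    """Return the threshold whose counts give the highest F1.

    `rows` is an iterable of (threshold, tp, fp, fn) tuples; this layout
    is hypothetical and chosen only for the example.
    """
    return max(rows, key=lambda r: f1_score(r[1], r[2], r[3]))[0]


# Illustrative counts only.
rows = [(0.3, 9, 6, 1), (0.5, 8, 2, 2), (0.7, 5, 1, 5)]
print(gini_from_auc(0.75), max_f1_threshold(rows))
```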

ModelMetricsClusteringV3

tot_withinss
double
Within Cluster Sum of Square ErrorIn
totss
double
Total Sum of Square Error to Grand MeanIn
betweenss
double
Between Cluster Sum of Square ErrorIn
centroid_stats
TwoDimTable
Centroid StatisticsIn
nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
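The three sum-of-squares fields above are linked by the standard decomposition `totss = tot_withinss + betweenss`, which holds when each centroid is the mean of its cluster. The sketch below checks that identity on a tiny one-dimensional example. The data and variable names are made up for illustration and say nothing about how H2O preprocesses or standardizes columns.

```python
import math


def sum_sq(points, center):
    """Sum of squared distances from each point to a center (1-D)."""
    return sum((p - center) ** 2 for p in points)


def mean(points):
    return sum(points) / len(points)


# Two hypothetical clusters of one-dimensional points.
clusters = [[1.0, 2.0, 3.0], [10.0, 12.0]]
all_points = [p for cluster in clusters for p in cluster]
grand_mean = mean(all_points)

totss = sum_sq(all_points, grand_mean)
tot_withinss = sum(sum_sq(c, mean(c)) for c in clusters)
betweenss = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)

# The decomposition holds up to floating-point rounding.
print(math.isclose(totss, tot_withinss + betweenss))
```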

ModelMetricsGLRMV99

numerr
double
Sum of Squared Error (Numeric Cols)In
caterr
double
Misclassification Error (Categorical Cols)In
numcnt
long
Number of Non-Missing Numeric ValuesIn
catcnt
long
Number of Non-Missing Categorical ValuesIn
nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsHGLMGaussianGaussianGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
sefe
double[]
standard error of fixed predictors/effectsOut
sere
double[]
standard error of random effectsOut
varfix
double
dispersion parameter of the mean model (residual variance for LMM)Out
varranef
double[]
dispersion parameter of the random effects (variance of random effects for GLMM)Out
fixef
double[]
fixed coefficientsOut
ranef
double[]
random coefficientsOut
converge
boolean
true if model has convergedOut
randc
int[]
number of random columnsOut
dfrefe
double
deviance degrees of freedom for mean part of the modelOut
summvc1
double[]
estimates, standard errors of the linear predictor in the dispersion modelOut
summvc2
double[][]
estimates, standard errors of the linear predictor for dispersion parameter of random effectsOut
hlik
double
log h-likelihoodOut
pvh
double
adjusted profile log-likelihood profiled over random effectsOut
pbvh
double
adjusted profile log-likelihood profiled over fixed and random effectsOut
caic
double
conditional AICOut
bad
long
index of the most influential observationOut
sumetadiffsquare
double
sum(etai-eta0)^2 where etai is current eta and eta0 is the previous oneOut
convergence
double
sum(etai-eta0)^2/sum(etai)^2 Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
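The `sumetadiffsquare` and `convergence` fields are described above by formula only. As a minimal sketch of those two formulas, assuming `eta_current` and `eta_previous` are equal-length lists of linear-predictor values (both names are hypothetical):

```python
def eta_convergence(eta_current, eta_previous):
    """Compute the two convergence quantities from successive eta vectors.

    Returns (sumetadiffsquare, convergence), where
      sumetadiffsquare = sum((etai - eta0)^2)
      convergence      = sum((etai - eta0)^2) / sum(etai^2)
    """
    if len(eta_current) != len(eta_previous):
        raise ValueError("eta vectors must have the same length")
    diff_sq = sum((c - p) ** 2 for c, p in zip(eta_current, eta_previous))
    denom = sum(c ** 2 for c in eta_current)
    if denom == 0:
        raise ValueError("sum of squared current eta values is zero")
    return diff_sq, diff_sq / denom


# Illustrative values only.
print(eta_convergence([1.0, 2.0], [1.0, 1.0]))
```

A small ratio indicates that the linear predictor changed little between iterations.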

ModelMetricsHGLMGaussianGaussianV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
sefe
double[]
standard error of fixed predictors/effectsOut
sere
double[]
standard error of random effectsOut
varfix
double
dispersion parameter of the mean model (residual variance for LMM)Out
varranef
double[]
dispersion parameter of the random effects (variance of random effects for GLMM)Out
fixef
double[]
fixed coefficientsOut
ranef
double[]
random coefficientsOut
converge
boolean
true if model has convergedOut
randc
int[]
number of random columnsOut
dfrefe
double
deviance degrees of freedom for mean part of the modelOut
summvc1
double[]
estimates, standard errors of the linear predictor in the dispersion modelOut
summvc2
double[][]
estimates, standard errors of the linear predictor for dispersion parameter of random effectsOut
hlik
double
log h-likelihoodOut
pvh
double
adjusted profile log-likelihood profiled over random effectsOut
pbvh
double
adjusted profile log-likelihood profiled over fixed and random effectsOut
caic
double
conditional AICOut
bad
long
index of the most influential observationOut
sumetadiffsquare
double
sum(etai-eta0)^2 where etai is current eta and eta0 is the previous oneOut
convergence
double
sum(etai-eta0)^2/sum(etai)^2 Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsHGLMGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
sefe
double[]
standard error of fixed predictors/effectsOut
sere
double[]
standard error of random effectsOut
varfix
double
dispersion parameter of the mean model (residual variance for LMM)Out
varranef
double[]
dispersion parameter of the random effects (variance of random effects for GLMM)Out
fixef
double[]
fixed coefficientsOut
ranef
double[]
random coefficientsOut
converge
boolean
true if model has convergedOut
randc
int[]
number of random columnsOut
dfrefe
double
deviance degrees of freedom for mean part of the modelOut
summvc1
double[]
estimates, standard errors of the linear predictor in the dispersion modelOut
summvc2
double[][]
estimates, standard errors of the linear predictor for dispersion parameter of random effectsOut
hlik
double
log h-likelihoodOut
pvh
double
adjusted profile log-likelihood profiled over random effectsOut
pbvh
double
adjusted profile log-likelihood profiled over fixed and random effectsOut
caic
double
conditional AICOut
bad
long
index of the most influential observationOut
sumetadiffsquare
double
sum(etai-eta0)^2 where etai is current eta and eta0 is the previous oneOut
convergence
double
sum(etai-eta0)^2/sum(etai)^2 Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsHGLMV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
sefe
double[]
standard error of fixed predictors/effectsOut
sere
double[]
standard error of random effectsOut
varfix
double
dispersion parameter of the mean model (residual variance for LMM)Out
varranef
double[]
dispersion parameter of the random effects (variance of random effects for GLMM)Out
fixef
double[]
fixed coefficientsOut
ranef
double[]
random coefficientsOut
converge
boolean
true if model has convergedOut
randc
int[]
number of random columnsOut
dfrefe
double
deviance degrees of freedom for mean part of the modelOut
summvc1
double[]
estimates, standard errors of the linear predictor in the dispersion modelOut
summvc2
double[][]
estimates, standard errors of the linear predictor for dispersion parameter of random effectsOut
hlik
double
log h-likelihoodOut
pvh
double
adjusted profile log-likelihood profiled over random effectsOut
pbvh
double
adjusted profile log-likelihood profiled over fixed and random effectsOut
caic
double
conditional AICOut
bad
long
index of the most influential observationOut
sumetadiffsquare
double
sum(etai-eta0)^2 where etai is current eta and eta0 is the previous oneOut
convergence
double
sum(etai-eta0)^2/sum(etai)^2 Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsListSchemaV3

model
Key
Key of Model of interest (optional)In
frame
Key
Key of Frame of interest (optional)In
reconstruction_error
boolean
Compute reconstruction error (optional, only for Deep Learning AutoEncoder models)In
reconstruction_error_per_feature
boolean
Compute reconstruction error per feature (optional, only for Deep Learning AutoEncoder models)In
deep_features_hidden_layer
int
Extract Deep Features for given hidden layer (optional, only for Deep Learning models)In
deep_features_hidden_layer_name
string
Extract Deep Features for given hidden layer by name (optional, only for Deep Water models)In
reconstruct_train
boolean
Reconstruct original training frame (optional, only for GLRM models)In
project_archetypes
boolean
Project GLRM archetypes back into original feature space (optional, only for GLRM models)In
reverse_transform
boolean
Reverse transformation applied during training to model output (optional, only for GLRM models)In
leaf_node_assignment
boolean
Return the leaf node assignment (optional, only for DRF/GBM models)In
leaf_node_assignment_type
enum
Type of the leaf node assignment (optional, only for DRF/GBM models)In
predict_staged_proba
boolean
Predict the class probabilities at each stage (optional, only for GBM models)In
predict_contributions
boolean
Predict the feature contributions - Shapley values (optional, only for DRF, GBM and XGBoost models)In
predict_contributions_output_format
enum
Output format for feature contributions in XGBoost. By default, XGBoost outputs contributions for 1-hot encoded features; specifying the Compact output format produces a per-feature contribution.In
top_n
int
Only for predict_contributions function - sort Shapley values and return top_n highest (optional)In
bottom_n
int
Only for predict_contributions function - sort Shapley values and return bottom_n lowest (optional)In
compare_abs
boolean
Only for predict_contributions function - sort absolute Shapley values (optional)In
feature_frequencies
boolean
Retrieve the feature frequencies on paths in trees in tree-based models (optional, only for GBM, DRF and Isolation Forest)In
exemplar_index
int
Retrieve all members for a given exemplar (optional, only for Aggregator models)In
deviances
boolean
Compute the deviances per row (optional, only for classification or regression models)In
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn
auc_type
string
Set default multinomial AUC type. Must be one of: “AUTO”, “NONE”, “MACRO_OVR”, “WEIGHTED_OVR”, “MACRO_OVO”, “WEIGHTED_OVO”. Default is “NONE” (optional, only for multinomial classification).In
auuc_type
string
Set default AUUC type for uplift binomial classification. Must be one of: “AUTO”, “qini”, “lift”, “gain”. Default is “AUTO” (optional, only for uplift binomial classification).In
auuc_nbins
int
Set number of bins to calculate AUUC. Must be -1 or greater than 0. Default is -1, which means 1000 (optional, only for uplift binomial classification).In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
predictions_frame
Key
Key of predictions frame, if predictions are requested (optional)In/Out
deviances_frame
Key
Key for the frame containing per-observation deviances (optional)In/Out
model_metrics
ModelMetrics[]
ModelMetricsOut
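Several of the input fields above carry constraints that a client may want to check before sending a request: the allowed values of `auc_type` and `auuc_type`, the meaning of `auuc_nbins = -1`, and the `language:keyName=funcName` format of `custom_metric_func`. The sketch below encodes only what the descriptions state. All function names are hypothetical, and the sample reference string is invented for illustration.

```python
AUC_TYPES = {"AUTO", "NONE", "MACRO_OVR", "WEIGHTED_OVR", "MACRO_OVO", "WEIGHTED_OVO"}
AUUC_TYPES = {"AUTO", "qini", "lift", "gain"}


def check_choice(name, value, allowed):
    """Raise if `value` is not one of the documented options."""
    if value not in allowed:
        raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    return value


def effective_auuc_nbins(auuc_nbins=-1):
    """Resolve the documented default: -1 means 1000 bins."""
    if auuc_nbins == -1:
        return 1000
    if auuc_nbins <= 0:
        raise ValueError("auuc_nbins must be -1 or greater than 0")
    return auuc_nbins


def parse_custom_metric_func(ref):
    """Split a 'language:keyName=funcName' reference into its three parts."""
    language, colon, rest = ref.partition(":")
    key_name, equals, func_name = rest.partition("=")
    if not (colon and equals and language and key_name and func_name):
        raise ValueError("expected format language:keyName=funcName")
    return language, key_name, func_name


# The reference string below is a made-up example of the documented format.
print(parse_custom_metric_func("python:my_key=my_func"))
print(effective_auuc_nbins())
```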

ModelMetricsMakerSchemaV3

predictions_frame
string
Predictions Frame.In/Out
actuals_frame
string
Actuals Frame.In/Out
weights_frame
string
Weights Frame.In/Out
treatment_frame
string
Treatment Frame.In/Out
domain
string[]
Domain (for classification).In/Out
distribution
enum
Distribution (for regression).In/Out
auc_type
enum
Default AUC type (for multinomial classification).In/Out
auuc_type
enum
Default AUUC type (for uplift binomial classification).In/Out
auuc_nbins
int
Number of bins to calculate AUUC (for uplift binomial classification).In/Out
model_metrics
ModelMetrics
Model Metrics.Out

ModelMetricsMultinomialGLMGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
coefficients_table
TwoDimTable
coefficients_tableOut
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
AUC
double
The average AUC for this scoring run.Out
pr_auc
double
The average precision-recall AUC for this scoring run.Out
multinomial_auc_table
TwoDimTable
The multinomial AUC values.Out
multinomial_aucpr_table
TwoDimTable
The multinomial PR AUC values.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsMultinomialGLMV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
AUC
double
The average AUC for this scoring run.Out
pr_auc
double
The average precision-recall AUC for this scoring run.Out
multinomial_auc_table
TwoDimTable
The multinomial AUC values.Out
multinomial_aucpr_table
TwoDimTable
The multinomial PR AUC values.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsMultinomialGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
AUC
double
The average AUC for this scoring run.Out
pr_auc
double
The average precision-recall AUC for this scoring run.Out
multinomial_auc_table
TwoDimTable
The multinomial AUC values.Out
multinomial_aucpr_table
TwoDimTable
The multinomial PR AUC values.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
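
The hit_ratio_table field above reports top-k hit ratios. As a rough illustration only, the sketch below computes such ratios from per-row class probabilities. The function name hit_ratios and the input layout are assumptions made for this example, not part of the H2O API, and H2O's own handling of ties and weights is not described in this reference.

```python
def hit_ratios(actual, probs):
    """Return [top-1, top-2, ..., top-K] hit ratios.

    actual: list of true class indices (hypothetical input layout).
    probs:  list of per-class probability lists, one list per row.
    """
    n_classes = len(probs[0])
    hits = [0] * n_classes
    for y, p in zip(actual, probs):
        # Class indices ordered from most to least probable.
        ranked = sorted(range(n_classes), key=lambda c: p[c], reverse=True)
        rank = ranked.index(y)
        # A true class found at position `rank` is a hit for every k > rank.
        for k in range(rank, n_classes):
            hits[k] += 1
    return [h / len(actual) for h in hits]


ratios = hit_ratios(
    [0, 1, 2],
    [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.5, 0.3]],
)
print(ratios)
```

The first entry is the plain classification accuracy; later entries can only stay equal or grow, since a top-k hit is also a top-(k+1) hit.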

ModelMetricsMultinomialV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
AUC
double
The average AUC for this scoring run.Out
pr_auc
double
The average precision-recall AUC for this scoring run.Out
multinomial_auc_table
TwoDimTable
The multinomial AUC values.Out
multinomial_aucpr_table
TwoDimTable
The multinomial PR AUC values.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
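
To make the logloss and mean_per_class_error fields above more concrete, here is a minimal sketch using their textbook definitions. The helper names, the confusion-matrix layout (rows are actual classes, columns are predicted classes), and the clipping constant eps are assumptions for this example; H2O's exact computation (for instance, how observation weights enter) is not specified in this reference.

```python
import math


def mean_per_class_error(cm):
    """cm[i][j] = number of rows with actual class i predicted as class j."""
    errors = []
    for i, row in enumerate(cm):
        total = sum(row)
        if total == 0:
            continue  # skip classes that have no observations
        errors.append(1.0 - row[i] / total)
    return sum(errors) / len(errors)


def logloss(actual, probs, eps=1e-15):
    """Average negative log of the probability assigned to the true class."""
    total = 0.0
    for y, p in zip(actual, probs):
        # Clip to avoid log(0); eps is an assumed value for this sketch.
        p_true = min(max(p[y], eps), 1.0 - eps)
        total -= math.log(p_true)
    return total / len(actual)


print(mean_per_class_error([[8, 2], [1, 9]]))
print(logloss([0, 1], [[0.5, 0.5], [0.5, 0.5]]))
```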

ModelMetricsOrdinalGLMGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
coefficients_table
TwoDimTable
coefficients_tableOut
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsOrdinalGLMV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsOrdinalGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsOrdinalV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
hit_ratio_table
TwoDimTable
The hit ratio table for this scoring run.Out
cm
ConfusionMatrix
The ConfusionMatrix object for this scoring run.Out
logloss
double
The logarithmic loss for this scoring run.Out
mean_per_class_error
double
The mean misclassification error per class.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsPCAV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsRegressionCoxPHGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
concordance
double
Concordance metric (c-index)Out
concordant
long
Number of concordant pairsOut
discordant
long
Number of discordant pairs.Out
tied_y
long
Number of tied pairsOut
r2
double
The R^2 for this scoring run.Out
mean_residual_deviance
double
The mean residual deviance for this scoring run.Out
mae
double
The mean absolute error for this scoring run.Out
rmsle
double
The root mean squared log error for this scoring run.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsRegressionCoxPHV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
concordance
double
concordance indexOut
concordant
long
number of concordant pairsOut
discordant
long
number of discordant pairsOut
tied_y
long
number of pairs tied in Y valueOut
r2
double
The R^2 for this scoring run.Out
mean_residual_deviance
double
The mean residual deviance for this scoring run.Out
mae
double
The mean absolute error for this scoring run.Out
rmsle
double
The root mean squared log error for this scoring run.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
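
The concordant, discordant and tied_y fields above count pairs of observations. The sketch below shows one common way such counts arise and how a c-index can be derived from them. It is a simplification that ignores censoring, and the function names, the separate tied_risk count, and the half-credit rule for risk ties are assumptions for this example; this reference does not state the formula H2O uses.

```python
from itertools import combinations


def concordance_counts(times, risks):
    """Classify every pair of rows (no censoring, for simplicity).

    times: observed event times; risks: predicted risk scores.
    A pair is concordant when the row with the shorter time has the
    higher predicted risk.
    """
    concordant = discordant = tied_y = tied_risk = 0
    for (t1, r1), (t2, r2) in combinations(zip(times, risks), 2):
        if t1 == t2:
            tied_y += 1          # tied in the response value
        elif r1 == r2:
            tied_risk += 1       # tied in the prediction (assumed category)
        elif (t1 < t2) == (r1 > r2):
            concordant += 1
        else:
            discordant += 1
    return concordant, discordant, tied_y, tied_risk


def concordance_index(concordant, discordant, tied_risk=0):
    """One common convention: risk ties earn half credit, y ties are excluded."""
    comparable = concordant + discordant + tied_risk
    return (concordant + 0.5 * tied_risk) / comparable


c, d, ty, tr = concordance_counts([1, 2, 3, 3], [0.9, 0.5, 0.7, 0.1])
print(c, d, ty, tr, concordance_index(c, d, tr))
```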

ModelMetricsRegressionGLMGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
coefficients_table
TwoDimTable
coefficients_tableOut
r2
double
The R^2 for this scoring run.Out
mean_residual_deviance
double
The mean residual deviance for this scoring run.Out
mae
double
The mean absolute error for this scoring run.Out
rmsle
double
The root mean squared log error for this scoring run.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsRegressionGLMV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
residual_deviance
double
residual devianceOut
null_deviance
double
null devianceOut
AIC
double
AICOut
null_degrees_of_freedom
long
null DOFOut
residual_degrees_of_freedom
long
residual DOFOut
r2
double
The R^2 for this scoring run.Out
mean_residual_deviance
double
The mean residual deviance for this scoring run.Out
mae
double
The mean absolute error for this scoring run.Out
rmsle
double
The root mean squared log error for this scoring run.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsRegressionGenericV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
mean_residual_deviance
double
The mean residual deviance for this scoring run.Out
mae
double
The mean absolute error for this scoring run.Out
rmsle
double
The root mean squared log error for this scoring run.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelMetricsRegressionV3

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
r2
double
The R^2 for this scoring run.Out
mean_residual_deviance
double
The mean residual deviance for this scoring run.Out
mae
double
The mean absolute error for this scoring run.Out
rmsle
double
The root mean squared log error for this scoring run.Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut
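
The regression fields above (MSE, RMSE, mae, rmsle, r2) follow standard definitions. As a hedged illustration, the sketch below computes them for unweighted data. The function name regression_metrics and the returned dictionary keys are chosen for this example; the sketch assumes that the actual values are not all identical and that every value is greater than -1 (required by the log transform in rmsle). H2O's own implementation may differ in details such as observation weights.

```python
import math


def regression_metrics(actual, predicted):
    """Textbook, unweighted versions of the regression metrics."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error on the log(1 + x) scale.
    rmsle = math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2
            for a, p in zip(actual, predicted)) / n
    )
    mean_actual = sum(actual) / n
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - (mse * n) / total_ss  # 1 - SSE / SST
    return {"MSE": mse, "RMSE": math.sqrt(mse), "mae": mae,
            "rmsle": rmsle, "r2": r2}


print(regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))
```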

ModelMetricsSVDV99

nobs
long
Number of observations.In
model
Key
The model used for this scoring run.In/Out
model_checksum
long
The checksum for the model used for this scoring run.In/Out
frame
Key
The frame used for this scoring run.In/Out
frame_checksum
long
The checksum for the frame used for this scoring run.In/Out
description
string
Optional description for this scoring run (to note out-of-bag, sampled data, etc.)Out
model_category
enum
The category (e.g., Clustering) for the model used for this scoring run.Out
scoring_time
long
The time in ms since the epoch for the start of this scoring run.Out
predictions
Frame
Predictions Frame.Out
MSE
double
The Mean Squared Error of the prediction for this scoring run.Out
RMSE
double
The Root Mean Squared Error of the prediction for this scoring run.Out
custom_metric_name
string
Name of custom metricOut
custom_metric_value
double
Value of custom metricOut

ModelOutputSchemaV3

names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
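
The start_time, end_time and run_time fields above are expressed in milliseconds. A small sketch of turning them into readable timestamps follows. It assumes the start and end values are milliseconds since the Unix epoch, which this reference does not state explicitly; the helper names and the sample dictionary are hypothetical.

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch (assumed) to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000.0, tz=timezone.utc)


def summarize_timing(output):
    """Build a readable summary from a ModelOutputSchemaV3-like dict."""
    return {
        "start": ms_to_utc(output["start_time"]).isoformat(),
        "end": ms_to_utc(output["end_time"]).isoformat(),
        "elapsed_ms": output["end_time"] - output["start_time"],
        "run_time_ms": output["run_time"],
    }


# Hypothetical values, not taken from a real cluster.
sample = {"start_time": 1600000000000, "end_time": 1600000004500,
          "run_time": 4500}
print(summarize_timing(sample))
```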

ModelParameterSchemaV3

is_member_of_frames
string[]
For Vec-type fields this is the set of Frame-type fields which must contain the named column; for example, for a SupervisedModel the response_column must be in both the training_frame and (if it’s set) the validation_frameIn
is_mutually_exclusive_with
string[]
For Vec-type fields this is the set of other Vec-type fields which must contain mutually exclusive values; for example, for a SupervisedModel the response_column must be mutually exclusive with the weights_columnIn
name
string
name in the JSON, e.g. “lambda”Out
label
string
[DEPRECATED] same as name.Out
help
string
help for the UI, e.g. “regularization multiplier, typically used for foo bar baz etc.”Out
required
boolean
the field is requiredOut
type
string
Java type, e.g. “double”Out
default_value
Polymorphic
default value, e.g. 1Out
actual_value
Polymorphic
actual value as set by the user and / or modified by the ModelBuilder, e.g., 10Out
input_value
Polymorphic
input value as set by the user, e.g., 10Out
level
string
the importance of the parameter, used by the UI, e.g. “critical”, “extended” or “expert”Out
values
string[]
list of valid values for use by the front-endOut
gridable
boolean
Parameter can be used in grid callOut
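
A client might use the level and gridable fields above to decide which parameters to expose in a grid search form. The sketch below filters a list of parameter descriptions shaped like ModelParameterSchemaV3. The sample entries and their values are invented for illustration and are not taken from a real H2O response.

```python
def gridable_parameters(parameters, level=None):
    """Return names of parameters usable in a grid call.

    parameters: list of dicts with ModelParameterSchemaV3-style keys.
    level: optional filter on the "level" field, e.g. "critical".
    """
    names = []
    for p in parameters:
        if not p.get("gridable", False):
            continue
        if level is not None and p.get("level") != level:
            continue
        names.append(p["name"])
    return names


# Hypothetical metadata, for illustration only.
sample = [
    {"name": "lambda", "type": "double[]", "level": "critical", "gridable": True},
    {"name": "seed", "type": "long", "level": "extended", "gridable": True},
    {"name": "model_id", "type": "Key", "level": "critical", "gridable": False},
]

print(gridable_parameters(sample))
print(gridable_parameters(sample, level="critical"))
```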

ModelParametersSchemaV3

distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
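
The stopping_rounds and stopping_tolerance descriptions above outline a moving-average stopping rule. The sketch below is one plausible reading of that rule, not H2O's actual implementation: it compares the best of the most recent k moving averages with the moving average that precedes them, and stops when the relative improvement falls below the tolerance. The function name, the requirement of at least 2k scoring events, and the handling of a zero reference value are all assumptions.

```python
def should_stop(history, stopping_rounds, stopping_tolerance,
                lower_is_better=True):
    """Decide whether to stop, given the metric value at each scoring event."""
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet (assumed)
    n = len(history)
    # Simple moving averages of length k for the last k + 1 windows.
    averages = [sum(history[i - k:i]) / k for i in range(n - k, n + 1)]
    reference, recent = averages[0], averages[1:]
    best = min(recent) if lower_is_better else max(recent)
    gain = (reference - best) if lower_is_better else (best - reference)
    scale = abs(reference) if reference != 0 else 1.0
    return gain / scale < stopping_tolerance


print(should_stop([1.0, 0.5, 0.4, 0.4, 0.4, 0.4], 2, 0.01))
print(should_stop([1.0, 0.8, 0.6, 0.4], 2, 0.01))
```

Averaging over k events rather than comparing single values makes the rule less sensitive to noise in individual scoring events.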

ModelSchemaBaseV3

model_id
Key
Model keyIn/Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

ModelSchemaV3

model_id
Key
Model keyIn/Out
parameters
Parameters
The build parameters for the model (e.g. K for KMeans).Out
output
Output
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

ModelSelectionModelOutputV3

best_predictors_subset
string[][]
Names of predictors in the best predictor subsetIn
best_r2_values
double[]
R2 values of all possible predictor subsets. Only for mode=’allsubsets’ or ‘maxr’.In
predictors_added_per_step
string[][]
at each predictor subset size, the predictor added is collected in this array. Not for mode = ‘backward’.In
predictors_removed_per_step
string[][]
at each predictor subset size, the predictor removed is collected in this array.In
coef_p_values
double[][]
p-values of chosen predictor subsets at each subset size. Only for mode=’backward’.In
z_values
double[][]
z-values of chosen predictor subsets at each subset size. Only for mode=’backward’.In
best_model_ids
Key[]
Key of models containing best 1-predictor model, best 2-predictors model, ….In
coefficient_names
string[][]
arrays of string arrays containing coefficient names of best 1-predictor model, best 2-predictors model, ….In
coefficient_values
double[][]
store coefficient values for each predictor subset. Only for maxrsweep when build_glm_model is false.In
coefficient_values_normalized
double[][]
store standardized coefficient values for each predictor subset. Only for maxrsweep when build_glm_model is false.In
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

ModelSelectionModelV3

model_id
Key
Model keyIn/Out
parameters
ModelSelectionParameters
The build parameters for the model (e.g. K for KMeans).Out
output
ModelSelectionModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

ModelSelectionParametersV3

seed
long
Seed for pseudo random number generator (if applicable)In
family
enum
Family. For maxr/maxrsweep, only gaussian. For backward, ordinal and multinomial families are not supportedIn
tweedie_variance_power
double
Tweedie variance powerIn
tweedie_link_power
double
Tweedie link powerIn
theta
double
ThetaIn
solver
enum
AUTO will set the solver based on the given data and the other parameters. IRLSM is fast on problems with a small number of predictors and for lambda search with L1 penalty; L_BFGS scales better for datasets with many columns.In
alpha
double[]
Distribution of regularization between the L1 (Lasso) and L2 (Ridge) penalties. A value of 1 for alpha represents Lasso regression, a value of 0 produces Ridge regression, and anything in between specifies the amount of mixing between the two. Default value of alpha is 0 when SOLVER = ‘L-BFGS’; 0.5 otherwise.In
lambda
double[]
Regularization strengthIn
lambda_search
boolean
Use lambda search starting at lambda max; the given lambda is then interpreted as lambda min.In
build_glm_model
boolean
For maxrsweep mode only. If true, will return full-blown GLM models with the desired predictor subsets. If false, only the predictor subsets and predictor coefficients are returned. This speeds up the model selection process. Users can then build the GLM models themselves from the predictor subsets. Defaults to true.In
early_stopping
boolean
Stop early when there is no more relative improvement on train or validation (if provided)In
nlambdas
int
Number of lambdas to be used in a search. Default indicates: If alpha is zero, with lambda search set to True, the value of nlambdas is set to 30 (fewer lambdas are needed for ridge regression); otherwise it is set to 100.In
score_iteration_interval
int
Perform scoring for every score_iteration_interval iterationsIn
standardize
boolean
Standardize numeric columns to have zero mean and unit varianceIn
cold_start
boolean
Only applicable to multiple alpha/lambda values. If false, build the next model for next set of alpha/lambda values starting from the values provided by current model. If true will start GLM model from scratch.In
plug_values
Key
Plug Values (a single-row frame containing values that will be used to impute missing values of the training/validation frame; use in conjunction with missing_values_handling = PlugValues)In
non_negative
boolean
Restrict coefficients (not intercept) to be non-negativeIn
max_iterations
int
Maximum number of iterationsIn
beta_epsilon
double
Converge if beta changes less (using L-infinity norm) than beta epsilon. ONLY applies to the IRLSM solver.In
objective_epsilon
double
Converge if objective value changes less than this. Default (of -1.0) indicates: If lambda_search is set to True the value of objective_epsilon is set to .0001. If the lambda_search is set to False and lambda is equal to zero, the value of objective_epsilon is set to .000001, for any other value of lambda the default value of objective_epsilon is set to .0001.In
gradient_epsilon
double
Converge if objective changes less (using L-infinity norm) than this, ONLY applies to L-BFGS solver. Default (of -1.0) indicates: If lambda_search is set to False and lambda is equal to zero, the default value of gradient_epsilon is equal to .000001, otherwise the default value is .0001. If lambda_search is set to True, the conditional values above are 1E-8 and 1E-6 respectively.In
obj_reg
double
Likelihood divider in objective value computation, default (of -1.0) will set it to 1/nobsIn
link
enum
Link function.In
startval
double[]
Double array to initialize fixed and random coefficients for HGLM, coefficients for GLM.In
calc_like
boolean
If true, will return likelihood function value for HGLM.In
intercept
boolean
Include constant term in the modelIn
prior
double
Prior probability for y==1. To be used only for logistic regression iff the data has been sampled and the mean of response does not reflect reality.In
lambda_min_ratio
double
Minimum lambda used in lambda search, specified as a ratio of lambda_max (the smallest lambda that drives all coefficients to zero). Default indicates: if the number of observations is greater than the number of variables, then lambda_min_ratio is set to 0.0001; if the number of observations is less than the number of variables, then lambda_min_ratio is set to 0.01.In
beta_constraints
Key
Beta constraintsIn
max_active_predictors
int
Maximum number of active predictors during computation. Use as a stopping criterion to prevent expensive model building with many predictors. Default indicates: If the IRLSM solver is used, the value of max_active_predictors is set to 5000 otherwise it is set to 100000000.In
compute_p_values
boolean
Request p-values computation; p-values work only with IRLSM solver and no regularizationIn
remove_collinear_columns
boolean
In case of linearly dependent columns, remove some of the dependent columnsIn
max_predictor_number
int
Maximum number of predictors to be considered when building GLM models. Defaults to 1.In
min_predictor_number
int
For mode = ‘backward’ only. Minimum number of predictors to be considered when building GLM models starting with all predictors to be included. Defaults to 1.In
nparallelism
int
Number of models to build in parallel. Defaults to 0, which is adaptive to the system capabilityIn
p_values_threshold
double
For mode = ‘backward’ only. If specified, will stop the model building process when all coefficients’ p-values drop below this thresholdIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
missing_values_handling
enum
Handling of missing values. Either MeanImputation, Skip or PlugValues.In/Out
mode
enum
Mode: selects the model selection algorithm to use. Options: ‘allsubsets’ for all subsets; ‘maxr’, which uses sequential replacement and GLM to build all models (slow, but works with cross-validation and validation frames for more robust results); ‘maxrsweep’, which uses sequential replacement and sweeping action (much faster than ‘maxr’); ‘backward’ for backward selection.In/Out
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
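
Several ModelSelectionParametersV3 descriptions above spell out how an unset value is resolved ("Default indicates: ..."). The following Python sketch restates three of those rules (nlambdas, objective_epsilon, max_active_predictors) as plain decision functions. The function names are hypothetical and are not part of the H2O API; comparing the solver as a case-insensitive string is also an assumption made for illustration.

```python
def default_nlambdas(alpha: float, lambda_search: bool) -> int:
    """nlambdas rule: 30 when alpha is zero with lambda search on, else 100."""
    return 30 if (lambda_search and alpha == 0.0) else 100


def default_objective_epsilon(lambda_search: bool, lambda_: float) -> float:
    """objective_epsilon rule as described for the -1.0 default."""
    if lambda_search:
        return 1e-4
    # Without lambda search, only an exactly-zero lambda gets the tighter value.
    return 1e-6 if lambda_ == 0.0 else 1e-4


def default_max_active_predictors(solver: str) -> int:
    """max_active_predictors rule: 5000 for IRLSM, 100000000 otherwise."""
    # Assumption: the solver is passed as a string such as "IRLSM".
    return 5000 if solver.upper() == "IRLSM" else 100_000_000


if __name__ == "__main__":
    print(default_nlambdas(0.0, True))
    print(default_objective_epsilon(False, 0.0))
    print(default_max_active_predictors("IRLSM"))
```

These helpers only mirror the wording of the field descriptions; the server applies its own logic when a parameter is left at its default.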

ModelSelectionV3

parameters
ModelSelectionParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out
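
A client that receives a ModelSelectionV3 response will typically look at error_count and messages before following the job. The sketch below is a minimal illustration operating on an already-decoded JSON dictionary. The function names and the sample dictionary are hypothetical, and each entry of messages is treated as opaque because the ValidationMessage layout is not described in this section.

```python
def builder_accepted(response: dict) -> bool:
    """True when the builder reported no parameter validation errors."""
    return int(response.get("error_count", 0)) == 0


def summarize(response: dict) -> str:
    """One-line summary built only from fields listed in the schema above."""
    n_messages = len(response.get("messages") or [])
    return "algo={} errors={} messages={} http_status={}".format(
        response.get("algo"),
        response.get("error_count", 0),
        n_messages,
        response.get("__http_status"),
    )


if __name__ == "__main__":
    # Hypothetical decoded response, for illustration only.
    sample = {"algo": "modelselection", "error_count": 0,
              "messages": [], "__http_status": 200}
    print(builder_accepted(sample), summarize(sample))
```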

ModelSynopsisV3

model_id
Key
Model keyIn/Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

ModelsInfoV4

models
ModelBuilder[]
Generic information about each model supported in H2O.In
__schema
string
Url describing the schema of the current object.In

ModelsKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

ModelsV3

model_id
Key
Name of Model of interestIn
preview
boolean
Return potentially abridged model suitable for viewing in a browserIn
find_compatible_frames
boolean
Find and return compatible frames?In
export_cross_validation_predictions
boolean
Flag indicating whether the exported model artifact should also include CV Holdout Frame predictionsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
models
Model[]
ModelsOut
compatible_frames
Frame[]
Compatible framesOut
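
The _exclude_fields description gives its usage as a query string. As a small sketch, the Python standard library can assemble that string from a list of field paths; the helper name with_exclude_fields is hypothetical, and the path and field names are the ones quoted in the description above.

```python
from typing import Sequence
from urllib.parse import urlencode


def with_exclude_fields(path: str, field_paths: Sequence[str]) -> str:
    """Append a comma-separated _exclude_fields query parameter to a path."""
    # Keep "/" and "," unescaped so the result matches the documented form.
    query = urlencode({"_exclude_fields": ",".join(field_paths)}, safe="/,")
    return "{}?{}".format(path, query)


if __name__ == "__main__":
    print(with_exclude_fields("/3/Frames", ["frames/frame_id/URL", "__meta"]))
    # prints: /3/Frames?_exclude_fields=frames/frame_id/URL,__meta
```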

NaiveBayesModelOutputV3

levels
string[]
Categorical levels of the responseIn
apriori
TwoDimTable
A-priori probabilities of the responseIn
pcond
TwoDimTable[]
Conditional probabilities of the predictorsIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

NaiveBayesModelV3

model_id
Key
Model keyIn/Out
parameters
NaiveBayesParameters
The build parameters for the model (e.g. K for KMeans).Out
output
NaiveBayesOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

NaiveBayesParametersV3

laplace
double
Laplace smoothing parameterIn
min_sdev
double
Min. standard deviation to use for observations with not enough dataIn
eps_sdev
double
Cutoff below which standard deviation is replaced with min_sdevIn
min_prob
double
Min. probability to use for observations with not enough dataIn
eps_prob
double
Cutoff below which probability is replaced with min_probIn
compute_metrics
boolean
Compute metrics on training dataIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
seed
long
Seed for pseudo random number generator (only used for cross-validation and fold_assignment=“Random” or “AUTO”)In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
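
The laplace, min_sdev/eps_sdev and min_prob/eps_prob parameters above describe smoothing and flooring steps. The sketch below shows what such steps look like in their textbook form. It is an illustration under stated assumptions, not H2O's implementation: the additive-smoothing formula is the standard one and is not given in this reference, and both function names are hypothetical.

```python
def smoothed_prob(count: float, total: float, n_levels: int, laplace: float) -> float:
    """Textbook additive (Laplace) smoothing of a conditional probability.

    count    -- rows with this predictor level within one response class
    total    -- all rows in that response class
    n_levels -- number of levels of the categorical predictor
    """
    return (count + laplace) / (total + laplace * n_levels)


def apply_cutoff(value: float, eps: float, minimum: float) -> float:
    """Replace a value below the eps cutoff with the configured minimum.

    Mirrors the wording of eps_sdev/min_sdev and eps_prob/min_prob.
    """
    return minimum if value < eps else value


if __name__ == "__main__":
    # A level never seen in a class of 10 rows, predictor with 3 levels.
    print(smoothed_prob(0, 10, 3, laplace=1.0))
    # A zero standard deviation is replaced by min_sdev.
    print(apply_cutoff(0.0, eps=1e-3, minimum=1e-3))
```

With laplace set to 0 the first function reduces to the raw frequency, which is why unseen levels need either smoothing or the min_prob floor.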

NaiveBayesV3

parameters
NaiveBayesParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

NetworkBenchV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
results
TwoDimTable[]
NetworkBenchResultsOut

NetworkEvent

is_send
boolean
Boolean flag distinguishing between sends (true) and receives (false)In
protocol
string
Network protocol (UDP/TCP)In
msg_type
string
UDP type (exec, ack, ackack, …)In
from
string
Sending nodeIn
to
string
Receiving nodeIn
data
string
Pretty print of the first few bytes of the msg payload. Contains class name for tasks.In
date
string
Time when the event was recorded. Format is hh:mm:ss:msIn
nanos
long
Time in nanosIn
type
enum
Type of recorded eventIn
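
The date field above is documented as hh:mm:ss:ms. A minimal sketch for turning such a string into milliseconds since midnight follows; the function name is hypothetical, and it assumes all four parts are plain decimal integers with the last part being milliseconds.

```python
def event_time_to_millis(date: str) -> int:
    """Convert an 'hh:mm:ss:ms' string to milliseconds since midnight."""
    hours, minutes, seconds, millis = (int(part) for part in date.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


if __name__ == "__main__":
    print(event_time_to_millis("01:02:03:004"))  # prints 3723004
```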

NetworkTestV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
microseconds_collective
double[]
Collective broadcast/reduce times in microseconds (for each message size)Out
bandwidths_collective
double[]
Collective bandwidths in Bytes/sec (for each message size, for each node)Out
microseconds
double[][]
Round-trip times in microseconds (for each message size, for each node)Out
bandwidths
double[][]
Bi-directional bandwidths in Bytes/sec (for each message size, for each node)Out
nodes
string[]
NodesOut
table
TwoDimTable
NetworkTestResultsOut
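
The microseconds and bandwidths fields are two-dimensional: the outer index runs over message sizes and the inner index over nodes. The sketch below picks the node with the largest round-trip time for each message size. It assumes the inner index lines up with the order of the nodes array, which this reference does not state explicitly; the function name and the sample values are hypothetical.

```python
def slowest_node_per_size(microseconds, nodes):
    """For each message size (outer index), name the node with the largest round-trip time."""
    slowest = []
    for per_node_times in microseconds:
        # Index of the maximum time within this message size's row.
        worst = max(range(len(per_node_times)), key=per_node_times.__getitem__)
        slowest.append(nodes[worst])
    return slowest


if __name__ == "__main__":
    # Hypothetical values: two message sizes, two nodes.
    times = [[10.0, 12.5], [100.0, 90.0]]
    print(slowest_node_per_size(times, ["node-a", "node-b"]))
```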

NodeMemoryInfoV3

ip_port
string
IP address and port in the form a.b.c.d:eOut
free_mem
long
Free heapOut

NodePersistentStorageEntryV3

category
string
Category nameOut
name
string
Key nameOut
size
long
Size in bytes of valueOut
timestamp_millis
long
Epoch time in milliseconds of when the value was writtenOut

NodePersistentStorageV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
category
string
Category nameIn/Out
name
string
Key nameIn/Out
value
string
ValueIn/Out
configured
boolean
ConfiguredOut
exists
boolean
ExistsOut
entries
Iced[]
List of entriesOut

NodeV3

h2o
string
IPOut
ip_port
string
IP address and port in the form a.b.c.d:eOut
healthy
boolean
(now-last_ping)<HeartbeatThread.TIMEOUTOut
last_ping
long
Time (in msec) of last pingOut
pid
int
PIDOut
num_cpus
int
num_cpusOut
cpus_allowed
int
cpus_allowedOut
nthreads
int
nthreadsOut
sys_load
float
System load; average #runnables/#coresOut
my_cpu_pct
int
System CPU percentage used by this H2O process in last intervalOut
sys_cpu_pct
int
System CPU percentage used by everything in last intervalOut
mem_value_size
long
Data on Node memoryOut
pojo_mem
long
Temp (non Data) memoryOut
free_mem
long
Free heapOut
max_mem
long
Maximum memory size for nodeOut
swap_mem
long
Size of data on node’s diskOut
num_keys
int
#local keysOut
free_disk
long
Free diskOut
max_disk
long
Max diskOut
rpcs_active
int
Active Remote Procedure CallsOut
fjthrds
short[]
F/J Thread count, by priorityOut
fjqueue
short[]
F/J Task count, by priorityOut
tcps_active
int
Open TCP connectionsOut
open_fds
int
Open file descriptorsOut
gflops
double
Linpack GFlopsOut
mem_bw
double
Memory BandwidthOut
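
The healthy field is described by the expression (now-last_ping)<HeartbeatThread.TIMEOUT, and free_mem and max_mem can be combined into a rough heap usage figure. The sketch below restates both on the client side. The timeout is passed in as a parameter because its value is not given in this reference, and treating max_mem minus free_mem as used heap is an assumption; both function names are hypothetical.

```python
def is_healthy(now_ms: int, last_ping_ms: int, timeout_ms: int) -> bool:
    """Client-side restatement of (now - last_ping) < timeout."""
    return (now_ms - last_ping_ms) < timeout_ms


def heap_used_fraction(free_mem: int, max_mem: int) -> float:
    """Fraction of the node's maximum memory that is not reported as free heap."""
    if max_mem <= 0:
        raise ValueError("max_mem must be positive")
    return 1.0 - free_mem / max_mem


if __name__ == "__main__":
    print(is_healthy(now_ms=10_000, last_ping_ms=9_000, timeout_ms=5_000))
    print(heap_used_fraction(free_mem=256, max_mem=1024))
```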

OutputSchemaV4

__schema
string
Url describing the schema of the current object.In

PCAModelOutputV3

importance
TwoDimTable
Standard deviation and importance of each principal componentIn
eigenvectors
TwoDimTable
Principal components matrixIn
objective
double
Final value of GLRM squared loss functionIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
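
The importance table reports the standard deviation and importance of each principal component. In standard PCA terms, a component's share of variance follows from its standard deviation, as sketched below. The function takes a plain list of standard deviations because the TwoDimTable layout is not described in this section; the function name is hypothetical, and the shares are relative to the components passed in, which equals the total variance only when all components are included.

```python
def proportion_of_variance(std_devs):
    """Share of variance per component, from component standard deviations."""
    variances = [s * s for s in std_devs]
    total = sum(variances)
    return [v / total for v in variances]


def cumulative(proportions):
    """Running total of the per-component shares."""
    running, out = 0.0, []
    for p in proportions:
        running += p
        out.append(running)
    return out


if __name__ == "__main__":
    shares = proportion_of_variance([2.0, 1.0, 1.0])
    print(shares, cumulative(shares))
```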

PCAModelV3

model_id
Key
Model keyIn/Out
parameters
PCAParameters
The build parameters for the model (e.g. K for KMeans).Out
output
PCAOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

PCAParametersV3

transform
enum
Transformation of training dataIn
pca_method
enum
Specify the algorithm to use for computing the principal components: GramSVD - uses a distributed computation of the Gram matrix, followed by a local SVD; Power - computes the SVD using the power iteration method (experimental); Randomized - uses randomized subspace iteration method; GLRM - fits a generalized low-rank model with L2 loss function and no regularization and solves for the SVD using local matrix algebra (experimental)In
pca_impl
enum
Specify the implementation to use for computing PCA (via SVD or EVD): MTJ_EVD_DENSEMATRIX - eigenvalue decompositions for dense matrix using MTJ; MTJ_EVD_SYMMMATRIX - eigenvalue decompositions for symmetric matrix using MTJ; MTJ_SVD_DENSEMATRIX - singular-value decompositions for dense matrix using MTJ; JAMA - eigenvalue decompositions for dense matrix using JAMA. References: JAMA - http://math.nist.gov/javanumerics/jama/; MTJ - https://github.com/fommil/matrix-toolkits-java/In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
k
int
Rank of matrix approximationIn/Out
max_iterations
int
Maximum training iterationsIn/Out
seed
long
RNG seed for initializationIn/Out
use_all_factor_levels
boolean
Whether first factor level is included in each categorical expansionIn/Out
compute_metrics
boolean
Whether to compute metrics on the training dataIn/Out
impute_missing
boolean
Whether to impute missing entries with the column meanIn/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
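
The pca_method description mentions forming a Gram matrix and using power iteration. The single-machine Python sketch below shows the idea on a tiny, already-centered data set: build the Gram matrix X^T X, then run power iteration to approximate its leading eigenvector (the first principal direction). This is an illustration of the general technique only, not H2O's distributed implementation; the function names, the uniform start vector and the fixed iteration count are assumptions.

```python
import math


def gram(rows):
    """Gram matrix X^T X of a list of equal-length rows."""
    p = len(rows[0])
    g = [[0.0] * p for _ in range(p)]
    for r in rows:
        for i in range(p):
            for j in range(p):
                g[i][j] += r[i] * r[j]
    return g


def matvec(matrix, v):
    """Matrix-vector product for a square list-of-lists matrix."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in matrix]


def power_iteration(matrix, iterations=200):
    """Approximate the leading eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    # Uniform start vector; it must not be orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    eigenvalue = sum(a * b for a, b in zip(v, matvec(matrix, v)))
    return eigenvalue, v


if __name__ == "__main__":
    # Centered toy data: most variance lies along the first column.
    rows = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
    eigenvalue, direction = power_iteration(gram(rows))
    print(eigenvalue, direction)
```

For this toy data the Gram matrix is diagonal with entries 8 and 2, so the iteration settles on the first axis.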

PCAV3

parameters
PCAParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

PSVMModelOutputV3

svs_count
long
Total number of support vectorsIn
bsv_count
long
Number of bounded support vectorsIn
rho
double
rhoIn
alpha_key
Key
Weights of support vectorsIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

PSVMModelV3

model_id
Key
Model keyIn/Out
parameters
PSVMParameters
The build parameters for the model (e.g. K for KMeans).Out
output
PSVMModelOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

PSVMParametersV3

hyper_param
double
Penalty parameter C of the error termIn
kernel_type
enum
Type of used kernelIn
gamma
double
Coefficient of the kernel (currently RBF gamma for the Gaussian kernel; -1 means 1/#features)In
rank_ratio
double
Desired rank of the ICF matrix, expressed as a ratio of the number of input rows (-1 means use sqrt(#rows)).In
positive_weight
double
Weight of positive (+1) class of observationsIn
negative_weight
double
Weight of negative (-1) class of observationsIn
disable_training_metrics
boolean
Disable calculating training metrics (expensive on large datasets)In
sv_threshold
double
Threshold for accepting a candidate observation into the set of support vectorsIn
max_iterations
int
Maximum number of iterations of the algorithmIn
fact_threshold
double
Convergence threshold of the Incomplete Cholesky Factorization (ICF)In
feasible_threshold
double
Convergence threshold for primal-dual residuals in the IPM iterationIn
surrogate_gap_threshold
double
Feasibility criterion of the surrogate duality gap (eta)In
mu_factor
double
Increasing factor muIn
seed
long
Seed for pseudo random number generator (if applicable)In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 means disabled. The default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
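
The `gamma` and `rank_ratio` fields of PSVMParametersV3 use -1 as a sentinel for a derived default. A minimal Python sketch of how those sentinels might be resolved on the client side follows. The function names are hypothetical, and rounding the rank up to an integer is an assumption; the schema does not say how H2O rounds.

```python
import math


def resolve_gamma(gamma: float, n_features: int) -> float:
    """Return the effective kernel coefficient; -1 means 1/#features."""
    if gamma == -1:
        return 1.0 / n_features
    return gamma


def resolve_icf_rank(rank_ratio: float, n_rows: int) -> int:
    """Return the desired ICF rank; -1 means sqrt(#rows)."""
    if rank_ratio == -1:
        return math.ceil(math.sqrt(n_rows))  # assumption: round up
    return math.ceil(rank_ratio * n_rows)    # assumption: round up
```

For example, under these assumptions a frame with 10,000 rows and `rank_ratio = -1` resolves to a rank of 100.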

PSVMV3

parameters
PSVMParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

ParseSVMLightV3

destination_frame
Key
Final frame nameIn
source_frames
Key[]
Source framesIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

ParseSetupV3

decrypt_tool
Key
Key-reference to an initialized instance of a Decryption ToolIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
source_frames
Key[]
Source framesIn/Out
parse_type
enum
Parser typeIn/Out
separator
byte
Field separatorIn/Out
single_quotes
boolean
Single quotesIn/Out
check_header
int
Check header: 0 means guess, +1 means 1st line is header not data, -1 means 1st line is data not headerIn/Out
column_names
string[]
Column namesIn/Out
skipped_columns
int[]
Skipped columns indicesIn/Out
column_types
string[]
Value types for columnsIn/Out
na_strings
string[][]
NA strings for columnsIn/Out
column_name_filter
string
Regex for names of columns to returnIn/Out
column_offset
int
Column offset to returnIn/Out
column_count
int
Number of columns to returnIn/Out
total_filtered_column_count
int
Total number of columns we would return with no column paginationIn/Out
custom_non_data_line_markers
string
Custom characters to be treated as non-data line markersIn/Out
partition_by
string[]
Names of the columns the persisted dataset has been partitioned by.In/Out
escapechar
byte
One ASCII character used to escape other characters.In/Out
destination_frame
string
Suggested nameOut
header_lines
long
Number of header lines foundOut
number_columns
int
Number of columnsOut
data
string[][]
Sample dataOut
warnings
string[]
WarningsOut
chunk_size
int
Size of individual parse tasksOut
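
The three `check_header` values in ParseSetupV3 can be given names on the client side. The sketch below is illustrative only: the enum and helper are hypothetical, and it assumes a header occupies exactly one line.

```python
from enum import IntEnum


class CheckHeader(IntEnum):
    """Named values for the check_header field."""
    FIRST_LINE_IS_DATA = -1
    GUESS = 0
    FIRST_LINE_IS_HEADER = 1


def first_data_line(check_header: int, guessed_has_header: bool = False) -> int:
    """Return the 0-based index of the first data line.

    guessed_has_header is only consulted when check_header is 0 (guess);
    it stands in for whatever the parser would detect.
    """
    mode = CheckHeader(check_header)
    if mode is CheckHeader.GUESS:
        return 1 if guessed_has_header else 0
    return 1 if mode is CheckHeader.FIRST_LINE_IS_HEADER else 0
```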

ParseV3

destination_frame
Key
Final frame nameIn
source_frames
Key[]
Source framesIn
parse_type
enum
Parser typeIn
separator
byte
Field separatorIn
single_quotes
boolean
Single QuotesIn
check_header
int
Check header: 0 means guess, +1 means 1st line is header not data, -1 means 1st line is data not headerIn
number_columns
int
Number of columnsIn
column_names
string[]
Column namesIn
column_types
string[]
Value types for columnsIn
domains
string[][]
Domains for categorical columnsIn
na_strings
string[][]
NA strings for columnsIn
chunk_size
int
Size of individual parse tasksIn
delete_on_done
boolean
Delete input key after parseIn
blocking
boolean
Block until the parse completes (as opposed to returning early and requiring polling).In
decrypt_tool
Key
Key-reference to an initialized instance of a Decryption ToolIn
custom_non_data_line_markers
string
Custom characters to be treated as non-data line markersIn
partition_by
string[]
Names of the columns the persisted dataset has been partitioned by.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
skipped_columns
int[]
Skipped columns indicesIn/Out
escapechar
byte
One ASCII character used to escape other characters.In/Out
job
Job
Parse jobOut
rows
long
RowsOut
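
Many ParseV3 inputs share their names with fields returned in ParseSetupV3, so a client can carry a setup result forward into a parse request. The following Python sketch shows one way to do that with plain dictionaries. The helper name is hypothetical, and a real client may additionally need to convert Key objects to names and serialize arrays in whatever form the server expects; neither is covered here.

```python
# Field names that appear in both ParseSetupV3 and ParseV3.
SHARED_FIELDS = (
    "source_frames", "parse_type", "separator", "single_quotes",
    "check_header", "number_columns", "column_names", "column_types",
    "na_strings", "chunk_size", "skipped_columns", "escapechar",
    "custom_non_data_line_markers", "partition_by",
)


def parse_params_from_setup(setup: dict, blocking: bool,
                            delete_on_done: bool,
                            destination_frame: str = None) -> dict:
    """Build ParseV3-style inputs from a ParseSetupV3-style result."""
    # Copy only the shared fields that the setup result actually populated.
    params = {name: setup[name] for name in SHARED_FIELDS
              if setup.get(name) is not None}
    # Fall back to the name suggested by the setup step.
    params["destination_frame"] = destination_frame or setup["destination_frame"]
    params["blocking"] = blocking
    params["delete_on_done"] = delete_on_done
    return params
```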

PartialDependenceKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

PartialDependenceV3

destination_key
Key
Key to store the destinationIn
targets
string[]
Target classes for multinomial classificationIn
model_id
Key
ModelIn/Out
frame_id
Key
FrameIn/Out
row_index
long
Row IndexIn/Out
cols
string[]
Column(s)In/Out
weight_column_index
int
weight_column_indexIn/Out
add_missing_na
boolean
add_missing_naIn/Out
nbins
int
Number of binsIn/Out
user_splits
double[]
User-defined split pointsIn/Out
user_cols
string[]
Column(s) of user-defined splitsIn/Out
num_user_splits
int[]
Number of user-defined splits per columnIn/Out
col_pairs_2dpdp
string[][]
Lists of column name pairs to plot 2D PDPs forIn/Out
partial_dependence_data
TwoDimTable[]
Partial Dependence DataOut
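
The types of `user_cols`, `user_splits`, and `num_user_splits` suggest that per-column split points are passed as one concatenated array plus a count per column. The Python sketch below flattens a column-to-splits mapping into that shape. The layout is an inference from the field types, not something the schema states, and the helper name is hypothetical.

```python
def flatten_user_splits(splits_by_column: dict) -> dict:
    """Flatten {column: [split points]} into three parallel arrays."""
    user_cols, user_splits, num_user_splits = [], [], []
    for column, points in splits_by_column.items():
        user_cols.append(column)
        # Assumption: split points for all columns are concatenated in order.
        user_splits.extend(float(p) for p in points)
        num_user_splits.append(len(points))
    return {
        "user_cols": user_cols,
        "user_splits": user_splits,
        "num_user_splits": num_user_splits,
    }
```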

PersistS3CredentialsV3

secret_key_id
string
S3 Secret Key IDIn
secret_access_key
string
S3 Secret KeyIn
session_token
string
S3 Session tokenIn

PingV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
cloud_uptime_millis
long
cloud_uptime_millisOut
cloud_healthy
boolean
cloud_healthyOut
nodes
Iced[]
nodesOut

PreprocessingStepDefinitionV99

type
enum
A type representing the preprocessing step to be executed.In/Out

ProfilerNodeEntryV3

stacktrace
string
Stack traceOut
count
int
Profile CountOut

ProfilerNodeV3

node_name
string
Node nameOut
timestamp
long
Timestamp (millis since epoch)Out
entries
Iced[]
Profile entry listOut
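
The `timestamp` field above is expressed in milliseconds since the epoch. A short Python sketch for turning such a value into a timezone-aware datetime is given below; it assumes the Unix epoch in UTC, and the helper name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "epoch" means the Unix epoch, 1970-01-01T00:00:00 UTC.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_datetime(millis: int) -> datetime:
    """Convert milliseconds since the epoch to an aware UTC datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=millis)
```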

ProfilerV3

depth
int
Stack trace depthIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
nodes
Iced[]
(No description available)Out

QuantileParametersV3

probs
double[]
Probabilities for quantilesIn
combine_method
enum
How to combine quantiles for even sample sizesIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 means disabled. The default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
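
Both `custom_metric_func` and `custom_distribution_func` take a reference in the form `language:keyName=funcName`. As an illustration, the Python sketch below splits such a reference into its three parts. The type and function names are hypothetical, and the sketch assumes that none of the parts may be empty.

```python
from typing import NamedTuple


class FuncRef(NamedTuple):
    language: str
    key_name: str
    func_name: str


def parse_func_ref(reference: str) -> FuncRef:
    """Split 'language:keyName=funcName' into its components."""
    language, colon, rest = reference.partition(":")
    key_name, equals, func_name = rest.partition("=")
    # Assumption: all three parts and both separators are required.
    if not (colon and equals and language and key_name and func_name):
        raise ValueError(
            f"expected language:keyName=funcName, got {reference!r}")
    return FuncRef(language, key_name, func_name)
```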

QuantileV3

parameters
QuantileParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

RandomDiscreteValueSearchCriteriaV99

seed
long
Seed for random number generator; set to a value other than -1 for reproducibility.In/Out
max_models
int
Maximum number of models to build (optional).In/Out
max_runtime_secs
double
Maximum time to spend building models (optional).In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression)In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
strategy
enum
Hyperparameter space search strategy.In/Out
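
The `stopping_rounds` and `stopping_tolerance` descriptions above can be read as a moving-average test over recent scoring events. The Python sketch below implements one plausible reading for a metric where lower is better (such as logloss or deviance). It is not taken from the H2O implementation: comparing the best of the last k moving averages against the one just before them is an assumption, as is the function name.

```python
def should_stop(history, stopping_rounds, stopping_tolerance):
    """Return True if a lower-is-better metric has stopped improving.

    history holds one metric value per scoring event, oldest first.
    """
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet
    recent = history[-2 * k:]
    # k + 1 simple moving averages of length k over the last 2k events.
    averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]
    reference = averages[0]          # the oldest moving average
    best_recent = min(averages[1:])  # best of the k most recent ones
    if reference == 0:
        return True  # assumption: a zero reference cannot be improved on
    improvement = (reference - best_recent) / abs(reference)
    # Stop if the relative improvement is not at least the tolerance.
    return improvement < stopping_tolerance
```

With `stopping_rounds=2` and a tolerance of 0.001, a history ending in four identical values stops, while a steadily decreasing history does not.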

RapidsExpressionV3

name
string
(Class) name of the language constructIn
pattern
string
Code fragment pattern.In
description
string
Description of the functionality provided by this language construct.In

RapidsFrameV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
key
Key
Frame resultOut
num_rows
long
Rows in Frame resultOut
num_cols
int
Columns in Frame resultOut

RapidsFunctionV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
funstr
string
Function resultOut

RapidsHelpV3

expressions
Iced[]
Description of the rapids language.Out

RapidsMapFrameV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
frames
Iced[]
FramesOut
map_keys
Iced
Keys of the mapOut

RapidsNumberV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
scalar
double
Number resultOut

RapidsNumbersV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
scalar
double[]
Number array resultOut

RapidsSchemaV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

RapidsStringV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
string
string
String resultOut

RapidsStringsV3

ast
string
A Rapids AstRoot expressionIn
session_id
string
Session keyIn
id
string
[DEPRECATED] Key name to assign Frame resultsIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
string
string[]
String array resultOut

RapidsV99

ast
string
An Abstract Syntax Tree.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
error
string
Parsing error, if anyOut
scalar
double
Scalar resultOut
funstr
string
Function resultOut
string
string
String resultOut
key
Key
Result keyOut
num_rows
long
Rows in Frame resultOut
num_cols
int
Columns in Frame resultOut
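
RapidsV99 carries several alternative result fields (`scalar`, `funstr`, `string`, `key`) plus an `error` field. A client might dispatch on whichever one is populated, as in the Python sketch below. The sketch assumes that unused fields are absent or null in the decoded JSON and that checking them in this order is safe; neither is stated by the schema, and the function name is hypothetical.

```python
def rapids_result(response: dict):
    """Classify a decoded RapidsV99-style response by its populated field."""
    error = response.get("error")
    if error:
        raise RuntimeError(f"Rapids parsing error: {error}")
    if response.get("key") is not None:
        # Frame results also report their dimensions.
        return ("frame", response["key"],
                response.get("num_rows"), response.get("num_cols"))
    if response.get("funstr") is not None:
        return ("function", response["funstr"])
    if response.get("string") is not None:
        return ("string", response["string"])
    # Assumption: anything else is a scalar result.
    return ("scalar", response.get("scalar"))
```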

RemoveAllV3

retained_keys
Key[]
Keys of the models to retainIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

RemoveV3

key
Key
Object to be removed.In
cascade
boolean
If true, removal operation will cascade down the object tree.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
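
RemoveV3 is the input schema for `DELETE /3/DKV/{key}`. The Python sketch below builds the method and URL for such a request without sending it. Passing `cascade` as a query parameter is an assumption, and the helper name is hypothetical.

```python
from urllib.parse import quote, urlencode


def remove_request(key: str, cascade: bool):
    """Return (method, url) for removing one key from the K/V store."""
    # Percent-encode the key so it is safe as a single path segment.
    path = "/3/DKV/" + quote(key, safe="")
    # Assumption: cascade travels as a lowercase boolean query parameter.
    query = urlencode({"cascade": "true" if cascade else "false"})
    return "DELETE", f"{path}?{query}"
```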

RequestSchemaV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
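
The `_exclude_fields` parameter is a comma-separated list of JSON field paths appended to the request URL. A small Python sketch that assembles such a URL from a list of paths is shown below; the helper name is hypothetical, and leaving `/` and `,` unescaped simply mirrors the example in the description.

```python
from urllib.parse import urlencode


def with_exclude_fields(path: str, field_paths) -> str:
    """Append an _exclude_fields query string to a request path."""
    if not field_paths:
        return path
    # Keep '/' and ',' literal so the result matches the documented form.
    query = urlencode({"_exclude_fields": ",".join(field_paths)}, safe="/,")
    return f"{path}?{query}"
```

Calling `with_exclude_fields("/3/Frames", ["frames/frame_id/URL", "__meta"])` reproduces the example URL from the field description.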

ResumeV3

recovery_dir
string
Full path to the directory with recovery dataIn

RouteV3

http_method
string
(No description available)Out
url_pattern
string
(No description available)Out
summary
string
(No description available)Out
api_name
string
(No description available)Out
handler_class
string
(No description available)Out
handler_method
string
(No description available)Out
input_schema
string
(No description available)Out
output_schema
string
(No description available)Out
path_params
string[]
(No description available)Out
markdown
string
(No description available)Out

RuleFitModelOutputV3

rule_importance
TwoDimTable
The estimated coefficients without language representations for each of the significant baselearners.In
intercept
double[]
Intercept.In
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

RuleFitModelV3

model_id
Key
Model keyIn/Out
parameters
RuleFitParameters
The build parameters for the model (e.g. K for KMeans).Out
output
RuleFitOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

RuleFitParametersV3

seed
long
Seed for pseudo random number generator (if applicable).In
algorithm
enum
The algorithm to use to generate rules.In
min_rule_length
int
Minimum length of rules. Defaults to 3.In
max_rule_length
int
Maximum length of rules. Defaults to 3.In
max_num_rules
int
The maximum number of rules to return. Defaults to -1, which means the number of rules is selected by diminishing returns in model deviance.In
model_type
enum
Specifies type of base learners in the ensemble.In
rule_generation_ntrees
int
Specifies the number of trees to build in the tree model. Defaults to 50.In
remove_duplicates
boolean
Whether to remove rules which are identical to an earlier rule. Defaults to true.In
lambda
double[]
Lambda for LASSO regressor.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 means disabled. The default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
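
Several of the parameter descriptions above state numeric constraints (`nfolds`, `tweedie_power`, `quantile_alpha`, `huber_alpha`). A client could check them before submitting a build request, roughly as in this Python sketch. The function name is hypothetical, treating the bounds as inclusive is an assumption, and the rule-length check (minimum not greater than maximum) is also an assumption rather than a documented rule.

```python
def validate_params(params: dict) -> list:
    """Return a list of human-readable constraint violations."""
    errors = []
    nfolds = params.get("nfolds", 0)
    if nfolds != 0 and nfolds < 2:
        errors.append("nfolds must be 0 (disabled) or >= 2")
    # Assumption: "between" is read as an inclusive range.
    ranges = {
        "tweedie_power": (1.0, 2.0),
        "quantile_alpha": (0.0, 1.0),
        "huber_alpha": (0.0, 1.0),
    }
    for name, (low, high) in ranges.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            errors.append(f"{name} must be between {low} and {high}")
    # Assumption: the minimum rule length may not exceed the maximum.
    shortest = params.get("min_rule_length")
    longest = params.get("max_rule_length")
    if shortest is not None and longest is not None and shortest > longest:
        errors.append("min_rule_length must not exceed max_rule_length")
    return errors
```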

RuleFitV3

parameters
RuleFitParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

SVDModelOutputV99

v_key
Key
Frame key of right singular vectorsIn
d
double[]
Singular valuesIn
u_key
Key
Frame key of left singular vectorsIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

SVDModelV99

model_id
Key
Model keyIn/Out
parameters
SVDParameters
The build parameters for the model (e.g. K for KMeans).Out
output
SVDOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

SVDParametersV99

transform
enum
Transformation of training dataIn
svd_method
enum
Method for computing SVD (Caution: Randomized is currently experimental and unstable)In
nv
int
Number of right singular vectorsIn
max_iterations
int
Maximum iterationsIn
seed
long
RNG seed for k-means++ initializationIn
keep_u
boolean
Save left singular vectors?In
u_name
string
Frame key to save left singular vectorsIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
use_all_factor_levels
boolean
Whether first factor level is included in each categorical expansionIn/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
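
The weights_column description above states that a weight of 2 is equivalent to repeating a row twice and a weight of 0 is equivalent to excluding it. The sketch below checks that equivalence on a weighted mean, which is only a stand-in for a training loss; the helper names are hypothetical and nothing here calls H2O.

```python
import math


def weighted_mean(values, weights):
    """Mean of values where each row counts in proportion to its weight."""
    if any(w < 0 for w in weights):
        raise ValueError("negative weights are not allowed")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def repeated_mean(values, counts):
    """Plain mean after physically repeating each row `counts` times."""
    expanded = [v for v, c in zip(values, counts) for _ in range(c)]
    return sum(expanded) / len(expanded)


values = [1.0, 2.0, 4.0]
weights = [1, 2, 0]  # second row counted twice, third row excluded

# Both approaches give the same statistic without growing the data.
print(weighted_mean(values, weights), repeated_mean(values, weights))
```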

SVDV99

parameters
SVDParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

SaveToHiveTableV3

frame_id
Key
H2O Frame IDIn
jdbc_url
string
HIVE JDBC URLIn
table_name
string
Name of table to save data to.In
table_path
string
HDFS Path to where the table should be stored.In
format
enum
Storage format of the created table.In
tmp_path
string
HDFS Path where to store temporary data.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

SchemaMetadataV3

version
int
Version number of the Schema.In
name
string
Simple name of the Schema. NOTE: the schema_names form a single namespace.In
superclass
string
Simple name of the superclass of the Schema. NOTE: the schema_names form a single namespace.In
type
string
Simple name of H2O type that this Schema represents. Must not be changed after creation (treat as final).In
label
string
[DEPRECATED] This field is always the same as name.Out
fields
FieldMetadata[]
All the public fields of the schemaOut
markdown
string
Documentation for the schema in Markdown format with GitHub extensionsOut

SchemaV3

(No fields)

SegmentModelsKeyV3

name
string
Name (string representation) for this Key.In/Out
type
string
Name (string representation) for the type of Keyed this Key points to.In/Out
URL
string
URL for the resource that this Key points to, if one exists.In/Out

SegmentModelsParametersV3

segment_models_id
Key
Uniquely identifies the collection of the segment modelsIn
segments
Key
Enumeration of all segments for which to build modelsIn
segment_columns
string[]
List of columns to segment by; models will be built for all segments in the dataIn
parallelism
int
Level of parallelism of bulk model building; this is the maximum number of models each H2O node will build in parallelIn
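
To make the segment_columns field concrete, the sketch below enumerates the distinct segments in a small in-memory dataset, one per unique combination of values in the chosen columns. It is an illustration of the idea only: the row representation, the column names and the function name are hypothetical, and H2O's own enumeration may differ.

```python
def enumerate_segments(rows, segment_columns):
    """Return the distinct value combinations of segment_columns, in first-seen order."""
    segments = []
    for row in rows:
        key = tuple(row[col] for col in segment_columns)
        if key not in segments:
            segments.append(key)
    return segments


# Hypothetical data: one model would be built per (country, product) pair.
rows = [
    {"country": "CZ", "product": "A", "sales": 10},
    {"country": "CZ", "product": "B", "sales": 7},
    {"country": "US", "product": "A", "sales": 12},
    {"country": "CZ", "product": "A", "sales": 9},
]
print(enumerate_segments(rows, ["country", "product"]))
```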

SegmentModelsV3

segment_models_id
Key
Segment Models idIn

SequentialSearchCriteriaV99

max_models
int
Maximum number of models to build (optional).In/Out
max_runtime_secs
double
Maximum time to spend building models (optional).In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression)In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
early_stopping
boolean
Use early stoppingIn/Out
strategy
enum
Hyperparameter space search strategy.In/Out

SessionIdV4

session_key
string
Session IDIn
__schema
string
Url describing the schema of the current object.In

SessionPropertyV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
session_key
string
Session IDIn/Out
key
string
Property KeyIn/Out
value
string
Property ValueIn/Out

SharedTreeModelOutputV3

variable_importances
TwoDimTable
Variable ImportancesOut
init_f
double
The Intercept term, the initial model function value to which trees make adjustmentsOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

SharedTreeModelV3

model_id
Key
Model keyIn/Out
parameters
Parameters
The build parameters for the model (e.g. K for KMeans).Out
output
Output
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

SharedTreeParametersV3

ntrees
int
Number of trees.In
max_depth
int
Maximum tree depth (0 for unlimited).In
min_rows
double
Fewest allowed (weighted) observations in a leaf.In
nbins
int
For numerical columns (real/int), build a histogram of (at least) this many bins, then split at the best pointIn
nbins_top_level
int
For numerical columns (real/int), build a histogram of (at most) this many bins at the root level, then decrease by factor of two per levelIn
nbins_cats
int
For categorical columns (factors), build a histogram of this many bins, then split at the best point. Higher values can lead to more overfitting.In
r2_stopping
double
r2_stopping is no longer supported and will be ignored if set; use stopping_rounds, stopping_metric and stopping_tolerance instead. Previous versions of H2O would stop making trees when the R^2 metric equaled or exceeded this value.In
seed
long
Seed for pseudo random number generator (if applicable)In
build_tree_one_node
boolean
Run on one node only; no network overhead but fewer cpus used. Suitable for small datasets.In
sample_rate_per_class
double[]
A list of row sample rates per class (relative fraction for each class, from 0.0 to 1.0), for each treeIn
col_sample_rate_per_tree
double
Column sample rate per tree (from 0.0 to 1.0)In
col_sample_rate_change_per_level
double
Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0)In
score_tree_interval
int
Score the model after every so many trees. Disabled if set to 0.In
min_split_improvement
double
Minimum relative improvement in squared error reduction for a split to happenIn
histogram_type
enum
What type of histogram to use for finding optimal split pointsIn
calibrate_model
boolean
Use Platt Scaling (default) or Isotonic Regression to calculate calibrated class probabilities. Calibration can provide more accurate estimates of class probabilities.In
in_training_checkpoints_dir
string
Create checkpoints into defined directory while training process is still running. In case of cluster shutdown, this checkpoint can be used to restart training.In
in_training_checkpoints_tree_interval
int
Checkpoint the model after every so many trees. Parameter is used only when in_training_checkpoints_dir is definedIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
calibration_frame
Key
Data for model calibrationIn/Out
calibration_method
enum
Calibration method to useIn/Out
check_constant_response
boolean
Check if response column is constant. If enabled, then an exception is thrown if the response column is a constant value. If disabled, then the model will train regardless of the response column being a constant value or not.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
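
The nbins and nbins_top_level descriptions above suggest a per-level histogram size that starts at nbins_top_level at the root, halves at each deeper level, and never drops below nbins. The sketch below is one plausible reading of that wording, not a statement of H2O's internal formula; the function name and the example values 20 and 1024 are assumptions.

```python
def bins_at_depth(nbins, nbins_top_level, depth):
    """Histogram bins for numerical columns at a given tree depth (root is depth 0)."""
    # Halve the top-level count once per level, with nbins as the floor.
    return max(nbins, nbins_top_level >> depth)


# Example values only: nbins=20, nbins_top_level=1024.
schedule = [bins_at_depth(20, 1024, depth) for depth in range(8)]
print(schedule)
```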

SharedTreeV3

parameters
Parameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

ShutdownV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

SignificantRulesV3

model_id
Key
Model id of interestIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
significant_rules_table
TwoDimTable
The estimated coefficients and language representations (in case it is a rule) for each of the significant baselearners.Out

SplitFrameV3

key
Key
Job KeyIn
dataset
Key
DatasetIn
ratios
double[]
Split ratios; the resulting number of splits is ratios.length+1In
destination_frames
Key[]
Destination keys for each output frame split.In/Out
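
The relationship between ratios and the number of output frames can be sketched as follows: each ratio yields one split, and one more split receives the remainder. The helper name and the validation rule (positive ratios summing to less than 1) are assumptions for illustration, not behavior confirmed by this reference.

```python
def split_fractions(ratios):
    """Return the fraction of rows per split; the last split takes the remainder."""
    if any(r <= 0 for r in ratios) or sum(ratios) >= 1.0:
        # Assumed constraint: ratios are positive and leave room for a remainder.
        raise ValueError("ratios must be positive and sum to less than 1")
    return list(ratios) + [1.0 - sum(ratios)]


fractions = split_fractions([0.6, 0.2])
# Two ratios give three splits, so three destination_frames would be needed.
print(len(fractions), fractions)
```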

StackedEnsembleModelOutputV99

metalearner
Key
Model which combines the base_models into a stacked ensemble.Out
levelone_frame_id
Key
Level one frame used for metalearner training.Out
stacking_strategy
enum
The stacking strategy used for training.Out
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

StackedEnsembleModelV99

model_id
Key
Model keyIn/Out
parameters
StackedEnsembleParameters
The build parameters for the model (e.g. K for KMeans).Out
output
StackedEnsembleOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

StackedEnsembleParametersV99

keep_levelone_frame
boolean
Keep level one frame used for metalearner training.In
seed
long
Seed for random numbers; passed through to the metalearner algorithm. Defaults to -1 (time-based random number)In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
base_models
Key[]
List of models or grids (or their ids) to ensemble/stack together. Grids are expanded to individual models. If not using blending frame, then models must have been cross-validated using nfolds > 1, and folds must be identical across models.In/Out
metalearner_algorithm
enum
Type of algorithm to use as the metalearner. Options include ‘AUTO’ (GLM with non-negative weights; if validation_frame is present, a lambda search is performed), ‘deeplearning’ (Deep Learning with default parameters), ‘drf’ (Random Forest with default parameters), ‘gbm’ (GBM with default parameters), ‘glm’ (GLM with default parameters), ‘naivebayes’ (NaiveBayes with default parameters), or ‘xgboost’ (if available, XGBoost with default parameters).In/Out
metalearner_nfolds
int
Number of folds for K-fold cross-validation of the metalearner algorithm (0 to disable or >= 2).In/Out
metalearner_fold_assignment
enum
Cross-validation fold assignment scheme for metalearner cross-validation. Defaults to AUTO (which is currently set to Random). The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
metalearner_fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation for cross-validation of the metalearner.In/Out
metalearner_transform
enum
Transformation used for the level one frame.In/Out
metalearner_params
string
Parameters for metalearner algorithmIn/Out
blending_frame
Key
Frame used to compute the predictions that serve as the training frame for the metalearner (triggers blending mode if provided)In/Out
score_training_samples
long
Specify the number of training set samples for scoring. The value must be >= 0. To use all training samples, enter 0.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Gains/Lift table number of bins. 0 means disabled. Default value -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
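
The custom_metric_func and custom_distribution_func fields above use the reference format language:keyName=funcName. A minimal sketch of splitting such a reference into its three parts follows; the parser, its type name and the example value are hypothetical and are not part of the H2O API.

```python
from typing import NamedTuple


class FuncRef(NamedTuple):
    language: str
    key_name: str
    func_name: str


def parse_func_ref(ref):
    """Split a 'language:keyName=funcName' reference into its components."""
    language, colon, rest = ref.partition(":")
    key_name, equals, func_name = rest.partition("=")
    if not (colon and equals and language and key_name and func_name):
        raise ValueError(f"expected language:keyName=funcName, got {ref!r}")
    return FuncRef(language, key_name, func_name)


# Hypothetical reference value, for illustration only.
print(parse_func_ref("python:my_metric_key=metrics.CustomRmse"))
```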

StackedEnsembleV99

parameters
StackedEnsembleParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

SteamMetricsV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
version
long
Steam metrics API versionOut
idle_millis
long
Number of milliseconds that the cluster has been idleOut
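
The _exclude_fields parameter that recurs in these schemas is passed as an ordinary query parameter. The sketch below only builds the URL from the example in the field description; the helper name build_url and the host and port are assumptions for illustration, and no request is sent.

```python
from urllib.parse import urlencode

def build_url(base, path, exclude_fields=()):
    """Build a request URL, optionally adding the _exclude_fields query parameter."""
    url = base.rstrip("/") + path
    if exclude_fields:
        # Field paths are joined with commas, as in the schema description.
        # "/" and "," are kept unescaped so the URL matches the documented form.
        query = urlencode({"_exclude_fields": ",".join(exclude_fields)}, safe="/,")
        url += "?" + query
    return url

# Host and port are placeholders for a locally running cluster.
url = build_url("http://localhost:54321", "/3/Frames", ["frames/frame_id/URL", "__meta"])
print(url)
```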

StepDefinitionV99

name
string
Name of the step provider (usually, this is also the name of an algorithm).In/Out
alias
enum
An alias representing a predefined list of steps to be executed.In/Out
steps
Step[]
The list of steps to be executed (Mutually exclusive with alias).In/Out

StepV99

id
string
The id of the step (must be unique per step provider).In/Out
group
int
The execution group of the given step (groups are executed in ascending order of priority). Steps with group=0 are skipped. Defaults to -1 to use the default group assigned to the step id.In/Out
weight
int
The relative weight for the given step (can impact time and/or number of models allocated for this step). Steps with weight=0 are skipped. Defaults to -1 to use the default weight assigned to the step id.In/Out
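
The constraints described for StepDefinitionV99 and StepV99 can be sketched as a small payload builder. The helper name make_step_definition and the example values are hypothetical; only the field names, the -1 defaults, and the mutual exclusivity of alias and steps come from the descriptions above.

```python
import json

def make_step_definition(name, alias=None, steps=None):
    """Build a StepDefinitionV99-shaped dict; alias and steps are mutually exclusive."""
    if alias is not None and steps is not None:
        raise ValueError("alias and steps are mutually exclusive")
    definition = {"name": name}
    if alias is not None:
        definition["alias"] = alias
    if steps is not None:
        definition["steps"] = [
            # -1 keeps the default group/weight assigned to the step id;
            # a value of 0 would cause the step to be skipped.
            {"id": s["id"], "group": s.get("group", -1), "weight": s.get("weight", -1)}
            for s in steps
        ]
    return definition

# "GBM" and "example_step" are placeholder values for illustration only.
payload = make_step_definition("GBM", steps=[{"id": "example_step", "weight": 20}])
print(json.dumps(payload))
```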

StreamingSchema

(No fields)

StringPairV3

a
string
Value AIn
b
string
Value BIn

TabulateV3

dataset
Key
DatasetIn
nbins_predictor
int
Number of bins for predictor columnIn
nbins_response
int
Number of bins for response columnIn
predictor
VecSpecifier
PredictorIn/Out
response
VecSpecifier
ResponseIn/Out
weight
VecSpecifier
Observation weights (optional)In/Out
count_table
TwoDimTable
Counts tableOut
response_table
TwoDimTable
Response tableOut

TargetEncoderModelOutputV3

input_to_output_columns
ColumnsMapping[]
Mapping between input column(s) and their corresponding target encoded output column(s). Please note that there can be multiple columns on the input/from side if columns grouping was used, and there can also be multiple columns on the output/to side if the target was multiclass.Out
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

TargetEncoderModelV3

model_id
Key
Model keyIn/Out
parameters
TargetEncoderParameters
The build parameters for the model (e.g. K for KMeans).Out
output
TargetEncoderOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

TargetEncoderParametersV3

columns_to_encode
string[][]
List of categorical columns or groups of categorical columns to encode. When groups of columns are specified, each group is encoded as a single column (interactions are created internally).In
keep_original_categorical_columns
boolean
If true, the original non-encoded categorical features will remain in the result frame.In
blending
boolean
If true, enables blending of posterior probabilities (computed for a given categorical value) with prior probabilities (computed on the entire set). This mitigates the effect of categorical values with small cardinality. The blending effect can be tuned using the inflection_point and smoothing parameters.In
inflection_point
double
Inflection point of the sigmoid used to blend probabilities (see blending parameter). For a given categorical value, if it appears fewer than inflection_point times in a data sample, then the influence of the posterior probability will be smaller than that of the prior.In
smoothing
double
Smoothing factor corresponds to the inverse of the slope at the inflection point on the sigmoid used to blend probabilities (see blending parameter). If smoothing tends towards 0, then the sigmoid used for blending turns into a Heaviside step function.In
data_leakage_handling
enum
Data leakage handling strategy used to generate the encoding. Supported options are: 1) “none” (default) - no holdout, using the entire training frame. 2) “leave_one_out” - current row’s response value is subtracted from the per-level frequencies pre-calculated on the entire training frame. 3) “k_fold” - encodings for a fold are generated based on out-of-fold data.In
noise
double
The amount of noise to add to the encoded column. Use 0 to disable noise, and -1 (=AUTO) to let the algorithm determine a reasonable amount of noise.In
seed
long
Seed used to generate the noise. By default, the seed is chosen randomly.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 disables the table; the default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
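
The interaction of blending, inflection_point and smoothing described above can be illustrated with the sigmoid blending formula commonly used for target encoding. This is a sketch assuming that formula applies here; the function name blended_encoding and its arguments are hypothetical, and smoothing must be greater than 0.

```python
import math

def blended_encoding(n, posterior, prior, inflection_point, smoothing):
    """Blend a per-level posterior with the global prior using a sigmoid weight.

    n is the number of rows in which the categorical value appears.
    """
    x = (n - inflection_point) / smoothing
    # Numerically stable sigmoid: avoids overflow for large negative x.
    if x >= 0:
        lam = 1.0 / (1.0 + math.exp(-x))
    else:
        lam = math.exp(x) / (1.0 + math.exp(x))
    # lam < 0.5 when n < inflection_point, so the prior dominates for rare values.
    return lam * posterior + (1.0 - lam) * prior
```

With n equal to inflection_point the two probabilities are weighted equally. As smoothing approaches 0 the weight switches abruptly from prior to posterior at the inflection point, which matches the step-function behavior noted in the smoothing description.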

TargetEncoderTransformParametersV3

model
Key
Target Encoder model to use.In
frame
Key
Frame to transform.In
as_training
boolean
Force encoding mode for training data: when using a leakage handling strategy different from None, training data should be transformed with this flag set to true (Defaults to false).In
blending
boolean
Enables or disables blending. Defaults to the value assigned at model creation.In
inflection_point
double
Inflection point. Defaults to the value assigned at model creation.In
smoothing
double
Smoothing. Defaults to the value assigned at model creation.In
noise
double
Noise. Defaults to the value assigned at model creation.In
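
The as_training flag matters because leakage-aware strategies encode training rows differently from new rows. The sketch below illustrates the leave_one_out idea from the data_leakage_handling description with plain Python lists; it is not H2O's implementation, the function name is hypothetical, and the fallback to the global mean for values seen only once is an assumption.

```python
from collections import defaultdict

def leave_one_out_encode(categories, targets):
    """Encode each training row with its level's mean target, excluding the row itself."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    prior = sum(targets) / len(targets)
    encoded = []
    for c, y in zip(categories, targets):
        if counts[c] > 1:
            # Subtract the current row's response from the per-level statistics.
            encoded.append((sums[c] - y) / (counts[c] - 1))
        else:
            # Assumption: fall back to the global mean for singleton levels.
            encoded.append(prior)
    return encoded
```

Rows that were not part of training have no response to subtract, so they would use the full per-level statistics instead.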

TargetEncoderV3

parameters
TargetEncoderParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

TimelineV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
now
long
Current time in millis.Out
self
string
This nodeOut
events
Iced[]
Recorded timeline eventsOut

TreeStatsV3

min_depth
int
Minimum tree depthIn
max_depth
int
Maximum tree depthIn
mean_depth
float
Mean tree depthIn
min_leaves
int
Minimum number of leavesIn
max_leaves
int
Maximum number of leavesIn
mean_leaves
float
Mean number of leavesIn

TreeV3

model
Key
Key of the model the desired tree belongs toIn
tree_number
int
Index of the tree in the model.In
plain_language_rules
enum
Whether to generate plain language rules.In
tree_class
string
Name of the class of the tree. Ignored for regression and binomial.In/Out
left_children
int[]
Left child nodes in the treeOut
right_children
int[]
Right child nodes in the treeOut
root_node_id
int
Number of the root nodeOut
thresholds
float[]
Split thresholds (numeric and possibly categorical columns)Out
features
string[]
Names of the columns used for the splitsOut
nas
string[]
Direction in which NA values are sent at each split (LEFT, RIGHT, NA)Out
descriptions
string[]
Description of the tree’s nodesOut
levels
int[][]
Categorical levels on the edge from the parent nodeOut
predictions
float[]
Prediction values on terminal nodesOut
tree_decision_path
string
Plain language rules representation of a trained decision treeOut
decision_paths
string[]
Plain language rules that were used in a particular predictionOut
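
The parallel arrays in TreeV3 can be walked to score a single row. The sketch below rests on several assumptions that the schema does not state: node ids index directly into the arrays, -1 in left_children marks a node without children, only numeric splits occur, and values below the threshold go left. The function name predict_row is hypothetical.

```python
def predict_row(tree, row):
    """Follow numeric splits from the root to a terminal node and return its prediction."""
    node = tree["root_node_id"]
    while tree["left_children"][node] != -1:   # assumption: -1 means no child
        feature = tree["features"][node]
        value = row.get(feature)
        if value is None:
            # Missing value: follow the direction recorded in nas.
            go_left = tree["nas"][node] == "LEFT"
        else:
            # Assumption: values below the threshold go to the left child.
            go_left = value < tree["thresholds"][node]
        children = tree["left_children"] if go_left else tree["right_children"]
        node = children[node]
    return tree["predictions"][node]

# Hand-built toy tree with one split on a made-up column "x".
toy_tree = {
    "root_node_id": 0,
    "left_children": [1, -1, -1],
    "right_children": [2, -1, -1],
    "thresholds": [5.0, float("nan"), float("nan")],
    "features": ["x", None, None],
    "nas": ["LEFT", None, None],
    "predictions": [0.0, 1.5, 2.5],
}
print(predict_row(toy_tree, {"x": 3.0}))
```

Categorical splits would need the levels array as well and are left out of this sketch.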

TwoDimTableV3

name
string
Table NameOut
description
string
Table DescriptionOut
columns
Iced[]
Column SpecificationOut
rowcount
int
Number of RowsOut
data
Polymorphic[][]
Table Data (col-major)Out
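
Because data is stored column-major, reading a table row by row requires a transpose. A minimal sketch, assuming each entry in columns carries a "name" key (the column specification type is not detailed in this section) and using a hypothetical helper name:

```python
def table_rows(table):
    """Convert a column-major TwoDimTable-shaped dict into a list of row dicts."""
    names = [col["name"] for col in table["columns"]]   # assumed "name" key
    # zip(*data) turns a list of columns into an iterator of rows.
    return [dict(zip(names, row)) for row in zip(*table["data"])]

# Made-up table contents for illustration.
example = {
    "name": "Example",
    "columns": [{"name": "metric"}, {"name": "value"}],
    "rowcount": 2,
    "data": [["auc", "logloss"], [0.91, 0.32]],
}
print(table_rows(example))
```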

TypeaheadV3

src
string
training_frameIn
limit
int
limitIn
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
matches
string[]
matchesOut

UnlockKeysV3

_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In

UpliftDRFModelOutputV3

variable_importances
TwoDimTable
Variable ImportancesOut
init_f
double
The Intercept term, the initial model function value to which trees make adjustmentsOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

UpliftDRFModelV3

model_id
Key
Model keyIn/Out
parameters
UpliftDRFParameters
The build parameters for the model (e.g. K for KMeans).Out
output
UpliftDRFOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

UpliftDRFParametersV3

mtries
int
Number of variables randomly sampled as candidates at each split. If set to -1, defaults to sqrt(p) for classification and p/3 for regression (where p is the number of predictors).In
sample_rate
double
Row sample rate per tree (from 0.0 to 1.0)In
treatment_column
string
Define the column which will be used for computing uplift gain to select best split for a tree. The column has to divide the dataset into treatment (value 1) and control (value 0) groups.In
uplift_metric
enum
Divergence metric used to find best split when building an uplift tree.In
auuc_type
enum
Metric used to calculate Area Under Uplift Curve.In
auuc_nbins
int
Number of bins to calculate Area Under Uplift Curve.In
ntrees
int
Number of trees.In
max_depth
int
Maximum tree depth (0 for unlimited).In
min_rows
double
Fewest allowed (weighted) observations in a leaf.In
nbins
int
For numerical columns (real/int), build a histogram of (at least) this many bins, then split at the best pointIn
nbins_top_level
int
For numerical columns (real/int), build a histogram of (at most) this many bins at the root level, then decrease by factor of two per levelIn
nbins_cats
int
For categorical columns (factors), build a histogram of this many bins, then split at the best point. Higher values can lead to more overfitting.In
r2_stopping
double
r2_stopping is no longer supported and will be ignored if set. Use stopping_rounds, stopping_metric and stopping_tolerance instead. Previous versions of H2O would stop making trees when the R^2 metric equaled or exceeded this value.In
seed
long
Seed for pseudo random number generator (if applicable)In
build_tree_one_node
boolean
Run on one node only; no network overhead but fewer cpus used. Suitable for small datasets.In
sample_rate_per_class
double[]
A list of row sample rates per class (relative fraction for each class, from 0.0 to 1.0), for each treeIn
col_sample_rate_per_tree
double
Column sample rate per tree (from 0.0 to 1.0)In
col_sample_rate_change_per_level
double
Relative change of the column sampling rate for every level (must be > 0.0 and <= 2.0)In
score_tree_interval
int
Score the model after every so many trees. Disabled if set to 0.In
min_split_improvement
double
Minimum relative improvement in squared error reduction for a split to happenIn
histogram_type
enum
What type of histogram to use for finding optimal split pointsIn
calibrate_model
boolean
Use Platt Scaling (default) or Isotonic Regression to calculate calibrated class probabilities. Calibration can provide more accurate estimates of class probabilities.In
in_training_checkpoints_dir
string
Create checkpoints in the specified directory while training is still running. If the cluster shuts down, the checkpoint can be used to restart training.In
in_training_checkpoints_tree_interval
int
Checkpoint the model after every so many trees. Parameter is used only when in_training_checkpoints_dir is definedIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
balance_classes
boolean
Balance training data class counts via over/under-sampling (for imbalanced data).In/Out
class_sampling_factors
float[]
Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes.In/Out
max_after_balance_size
float
Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes.In/Out
max_confusion_matrix_size
int
[Deprecated] Maximum size (# classes) for confusion matrices to be printed in the LogsIn/Out
calibration_frame
Key
Data for model calibrationIn/Out
calibration_method
enum
Calibration method to useIn/Out
check_constant_response
boolean
Check if the response column is constant. If enabled, an exception is thrown when the response column is a constant value. If disabled, the model will train regardless of whether the response column is constant.In/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 disables the table; the default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
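
Two of the defaults described above involve simple arithmetic, sketched here for concreteness. The function names are hypothetical, and the rounding (floor) and the use of nbins as a lower bound on the per-level bin count are assumptions drawn from the "(at least)" and "(at most)" wording, not confirmed behavior.

```python
import math

def default_mtries(p, classification):
    """Default number of candidate columns per split when mtries is -1."""
    if classification:
        return max(1, int(math.sqrt(p)))   # sqrt(p), floored (assumed rounding)
    return max(1, p // 3)                  # p/3, floored (assumed rounding)

def bins_at_depth(nbins, nbins_top_level, depth):
    """Histogram bins for numeric columns at a given tree depth (root is depth 0)."""
    # Halve the top-level bin count per level, but never go below nbins.
    return max(nbins, nbins_top_level >> depth)

print(default_mtries(16, classification=True))
print([bins_at_depth(20, 1024, d) for d in range(8)])
```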

UpliftDRFV3

parameters
UpliftDRFParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out

ValidationMessageV3

message_type
string
Type of validation message (ERROR, WARN, INFO, HIDE)Out
field_name
string
Field to which the message appliesOut
message
string
Message textOut
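
A client will typically group the returned messages by message_type before deciding whether a build can proceed. A minimal sketch, with a hypothetical helper name and made-up message contents:

```python
from collections import defaultdict

def group_messages(messages):
    """Group ValidationMessageV3-shaped dicts by their message_type."""
    grouped = defaultdict(list)
    for m in messages:
        grouped[m["message_type"]].append(m)
    return dict(grouped)

# Illustrative messages only; the texts are not taken from H2O.
sample = [
    {"message_type": "ERROR", "field_name": "training_frame", "message": "example error"},
    {"message_type": "WARN", "field_name": "nfolds", "message": "example warning"},
    {"message_type": "ERROR", "field_name": "response_column", "message": "example error"},
]
grouped = group_messages(sample)
print(len(grouped.get("ERROR", [])), "errors")
```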

VarImpV3

varimp
float[]
Variable importance of individual variablesOut
names
string[]
Names of variablesOut
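
The varimp and names arrays are parallel, so they are usually zipped and sorted before display. A small sketch with a hypothetical helper name; the share-of-total column is an added convenience, not a field of this schema.

```python
def ranked_importances(varimp):
    """Return (name, importance, share_of_total) tuples, most important first."""
    pairs = zip(varimp["names"], varimp["varimp"])
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    total = sum(varimp["varimp"])
    # Guard against an all-zero importance vector.
    return [(name, value, value / total if total else 0.0) for name, value in ranked]

# Made-up values for illustration.
ranking = ranked_importances({"names": ["a", "b", "c"], "varimp": [1.0, 3.0, 0.0]})
print(ranking)
```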

WaterMeterCpuTicksV3

nodeidx
int
Index of node to query ticks for (0-based)In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
cpu_ticks
long[][]
Array of tick counts per coreOut

WaterMeterIoV3

nodeidx
int
Index of node to query ticks for (0-based)In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
persist_stats
Iced[]
Array of IO infoOut

Word2VecModelOutputV3

epochs
int
Number of epochs executedIn
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut

Word2VecModelV3

model_id
Key
Model keyIn/Out
parameters
Word2VecParameters
The build parameters for the model (e.g. K for KMeans).Out
output
Word2VecOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

Word2VecParametersV3

vec_size
int
Set size of word vectorsIn
window_size
int
Set max skip length between wordsIn
sent_sample_rate
float
Set threshold for occurrence of words. Those that appear with higher frequency in the training data will be randomly down-sampled; useful range is (0, 1e-5)In
norm_model
enum
Use Hierarchical SoftmaxIn
epochs
int
Number of training iterations to runIn
min_word_freq
int
Discard words that appear fewer than this many times.In
init_learning_rate
float
Set the starting learning rateIn
word_model
enum
The word model to use (SkipGram or CBOW)In
pre_trained
Key
Id of a data frame that contains a pre-trained (external) word2vec modelIn
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. If you set weight = 0 for a row, the returned prediction frame at that row is zero and this is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 means disabled. The default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
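
The stopping_rounds and stopping_tolerance descriptions above can be read as a moving-average test. The following is a simplified sketch of that reading for a metric where lower is better, not H2O's actual implementation. The function name should_stop and the choice to compare the best recent average against the average from k scoring events earlier are assumptions made for illustration.

```python
from typing import Sequence


def should_stop(history: Sequence[float], stopping_rounds: int,
                stopping_tolerance: float) -> bool:
    """Hypothetical early-stopping check for a lower-is-better metric."""
    k = stopping_rounds
    if k <= 0:
        return False  # 0 disables early stopping
    if len(history) < 2 * k:
        return False  # too few scoring events to compare moving averages

    # Build k + 1 simple moving averages of length k over the last 2k events.
    recent = list(history[-2 * k:])
    averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]

    # The oldest average is the reference; the others are the last k averages.
    reference = averages[0]
    best_later = min(averages[1:])

    # Stop when the relative improvement is smaller than the tolerance.
    improvement = reference - best_later
    return improvement < stopping_tolerance * abs(reference)
```

With stopping_rounds=2 and stopping_tolerance=0.01, a history of [1.0, 0.9, 0.8, 0.7] keeps training, while a flat history of [0.5, 0.5, 0.5, 0.5] stops.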

Word2VecSynonymsV3

model
Key
Source word2vec ModelIn
word
string
Target word to find synonyms forIn
count
int
Number of synonymsIn
synonyms
string[]
Synonymous wordsIn
scores
double[]
Similarity scoresIn
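
As an illustration of how the fields above fit together, the sketch below builds a query string from model, word, and count, and pairs the returned synonyms with their scores. The endpoint path /3/Word2VecSynonyms, the helper names, and the assumption that synonyms and scores are parallel arrays of equal length are not stated in this schema and should be checked against the endpoint listing.

```python
from urllib.parse import urlencode


def synonyms_query(model: str, word: str, count: int) -> str:
    """Build a request path; the endpoint path is an assumption."""
    query = urlencode({"model": model, "word": word, "count": count})
    return "/3/Word2VecSynonyms?" + query


def pair_synonyms(response: dict) -> list:
    """Pair each synonym with its score, highest score first.

    Assumes 'synonyms' and 'scores' are parallel arrays.
    """
    pairs = zip(response["synonyms"], response["scores"])
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

For example, synonyms_query("w2v_model", "data", 3) yields a path ending in ?model=w2v_model&word=data&count=3, where w2v_model is a made-up model key.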

Word2VecTransformV3

model
Key
Source word2vec ModelIn
words_frame
Key
Words FrameIn
aggregate_method
enum
Method of aggregating word-vector sequences into a single vectorIn
vectors_frame
Key
Word Vectors FrameOut

Word2VecV3

parameters
Word2VecParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out
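
The _exclude_fields description above gives "/3/Frames?_exclude_fields=frames/frame_id/URL,__meta" as its usage example. A minimal sketch of assembling such a URL from a list of field paths follows; the helper name with_excluded_fields is hypothetical, and leaving "/" and "," unescaped is a choice made here to reproduce the example string exactly.

```python
from typing import Iterable
from urllib.parse import urlencode


def with_excluded_fields(path: str, field_paths: Iterable[str]) -> str:
    """Append an _exclude_fields query parameter to a request path."""
    value = ",".join(field_paths)
    # Keep "/" and "," literal so the result matches the documented form.
    return path + "?" + urlencode({"_exclude_fields": value}, safe="/,")


url = with_excluded_fields("/3/Frames", ["frames/frame_id/URL", "__meta"])
print(url)
```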

XGBoostExecReqV3

key
Key
IdentifierIn
data
string
Arbitrary request data stored as Base64 encoded binaryIn

XGBoostExecRespV3

key
Key
IdentifierIn
data
string
Arbitrary response data stored as Base64 encoded binaryIn
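
Both data fields above carry binary content as a Base64 string. The sketch below shows the round trip with the Python standard library. It assumes the standard Base64 alphabet (not the URL-safe variant), which this schema does not specify, and the helper names are hypothetical.

```python
import base64


def encode_payload(raw: bytes) -> str:
    """Encode binary data as a Base64 string for the 'data' field."""
    return base64.b64encode(raw).decode("ascii")


def decode_payload(data: str) -> bytes:
    """Decode the 'data' field back into the original bytes."""
    return base64.b64decode(data)
```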

XGBoostModelOutputV3

variable_importances
TwoDimTable
Variable ImportancesOut
variable_importances_cover
TwoDimTable
Variable Importances - CoverOut
variable_importances_frequency
TwoDimTable
Variable Importances - FrequencyOut
native_parameters
TwoDimTable
XGBoost Native ParametersOut
sparse
boolean
SparseOut
names
string[]
Column namesOut
original_names
string[]
Original column namesOut
column_types
string[]
Column typesOut
domains
string[][]
Domains for categorical columnsOut
cross_validation_models
Key[]
Cross-validation models (model ids)Out
cross_validation_predictions
Key[]
Cross-validation predictions, one per cv model (deprecated, use cross_validation_holdout_predictions_frame_id instead)Out
cross_validation_holdout_predictions_frame_id
Key
Cross-validation holdout predictions (full out-of-sample predictions on training data)Out
cross_validation_fold_assignment_frame_id
Key
Cross-validation fold assignment (each row is assigned to one holdout fold)Out
model_category
enum
Category of the model (e.g., Binomial)Out
model_summary
TwoDimTable
Model summaryOut
scoring_history
TwoDimTable
Scoring historyOut
cv_scoring_history
TwoDimTable[]
Cross-Validation scoring historyOut
reproducibility_information_table
TwoDimTable[]
Model reproducibility informationOut
training_metrics
ModelMetrics
Training data model metricsOut
validation_metrics
ModelMetrics
Validation data model metricsOut
cross_validation_metrics
ModelMetrics
Cross-validation model metricsOut
cross_validation_metrics_summary
TwoDimTable
Cross-validation model metrics summaryOut
status
string
Job statusOut
start_time
long
Start time in millisecondsOut
end_time
long
End time in millisecondsOut
run_time
long
Runtime in millisecondsOut
default_threshold
double
Default threshold used for predictionsOut
help
Map
Help information for output fieldsOut
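
The start_time, end_time, and run_time fields above are reported in milliseconds. The sketch below converts them to readable values. It assumes start_time and end_time are milliseconds since the Unix epoch in UTC, which the schema does not state explicitly; the function name describe_run is hypothetical.

```python
from datetime import datetime, timezone


def describe_run(output: dict) -> dict:
    """Convert millisecond timing fields of a model output to readable form."""
    # Assumption: start_time and end_time are epoch milliseconds (UTC).
    start = datetime.fromtimestamp(output["start_time"] / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(output["end_time"] / 1000, tz=timezone.utc)
    return {
        "started": start.isoformat(),
        "finished": end.isoformat(),
        "run_seconds": output["run_time"] / 1000,
    }
```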

XGBoostModelV3

model_id
Key
Model keyIn/Out
parameters
XGBoostParameters
The build parameters for the model (e.g. K for KMeans).Out
output
XGBoostOutput
The build output for the model (e.g. the cluster centers for KMeans).Out
compatible_frames
string[]
Compatible frames, if requestedOut
checksum
long
Checksum for all the things that go into building the Model.Out
algo
string
The algo name for this Model.Out
algo_full_name
string
The pretty algo name for this Model (e.g., Generalized Linear Model, rather than GLM).Out
response_column_name
string
The response column name for this Model (if applicable). Is null otherwise.Out
treatment_column_name
string
The treatment column name for this Model (if applicable). Is null otherwise.Out
data_frame
Key
The Model’s training frame keyOut
timestamp
long
Timestamp for when this model was completedOut
have_pojo
boolean
Indicator, whether export to POJO is availableOut
have_mojo
boolean
Indicator, whether export to MOJO is availableOut

XGBoostParametersV3

ntrees
int
(same as n_estimators) Number of trees.In
max_depth
int
Maximum tree depth (0 for unlimited).In
min_rows
double
(same as min_child_weight) Fewest allowed (weighted) observations in a leaf.In
min_child_weight
double
(same as min_rows) Fewest allowed (weighted) observations in a leaf.In
learn_rate
double
(same as eta) Learning rate (from 0.0 to 1.0)In
eta
double
(same as learn_rate) Learning rate (from 0.0 to 1.0)In
sample_rate
double
(same as subsample) Row sample rate per tree (from 0.0 to 1.0)In
subsample
double
(same as sample_rate) Row sample rate per tree (from 0.0 to 1.0)In
col_sample_rate
double
(same as colsample_bylevel) Column sample rate (from 0.0 to 1.0)In
colsample_bylevel
double
(same as col_sample_rate) Column sample rate (from 0.0 to 1.0)In
col_sample_rate_per_tree
double
(same as colsample_bytree) Column sample rate per tree (from 0.0 to 1.0)In
colsample_bytree
double
(same as col_sample_rate_per_tree) Column sample rate per tree (from 0.0 to 1.0)In
colsample_bynode
double
Column sample rate per tree node (from 0.0 to 1.0)In
monotone_constraints
KeyValue[]
A mapping representing monotonic constraints. Use +1 to enforce an increasing constraint and -1 to specify a decreasing constraint.In
max_abs_leafnode_pred
float
(same as max_delta_step) Maximum absolute value of a leaf node predictionIn
max_delta_step
float
(same as max_abs_leafnode_pred) Maximum absolute value of a leaf node predictionIn
score_tree_interval
int
Score the model after every so many trees. Disabled if set to 0.In
seed
long
Seed for pseudo random number generator (if applicable)In
min_split_improvement
float
(same as gamma) Minimum relative improvement in squared error reduction for a split to happenIn
gamma
float
(same as min_split_improvement) Minimum relative improvement in squared error reduction for a split to happenIn
nthread
int
Number of parallel threads that can be used to run XGBoost. Cannot exceed H2O cluster limits (-nthreads parameter). Defaults to maximum availableIn
build_tree_one_node
boolean
Run on one node only; no network overhead, but fewer CPUs are used. Suitable for small datasets.In
save_matrix_directory
string
Directory where to save matrices passed to XGBoost library. Useful for debugging.In
calibrate_model
boolean
Use Platt Scaling (default) or Isotonic Regression to calculate calibrated class probabilities. Calibration can provide more accurate estimates of class probabilities.In
max_bins
int
For tree_method=hist only: maximum number of binsIn
max_leaves
int
For tree_method=hist only: maximum number of leavesIn
tree_method
enum
Tree methodIn
grow_policy
enum
Grow policy: depthwise grows trees as in standard GBM, lossguide grows trees as in LightGBM.In
booster
enum
Booster typeIn
reg_lambda
float
L2 regularizationIn
reg_alpha
float
L1 regularizationIn
quiet_mode
boolean
Enable quiet modeIn
sample_type
enum
For booster=dart only: sample_typeIn
normalize_type
enum
For booster=dart only: normalize_typeIn
rate_drop
float
For booster=dart only: rate_drop (0..1)In
one_drop
boolean
For booster=dart only: one_dropIn
skip_drop
float
For booster=dart only: skip_drop (0..1)In
dmatrix_type
enum
Type of DMatrix. For sparse, NAs and 0 are treated equally.In
backend
enum
Backend. By default (auto), a GPU is used if available.In
gpu_id
int[]
Which GPU(s) to use.In
interaction_constraints
string[][]
A set of allowed column interactions.In
scale_pos_weight
float
Controls the effect of observations with positive labels in relation to the observations with negative labels on gradient calculation. Useful for imbalanced problems.In
distribution
enum
Distribution functionIn
tweedie_power
double
Tweedie power for Tweedie regression, must be between 1 and 2.In
quantile_alpha
double
Desired quantile for Quantile regression, must be between 0 and 1.In
huber_alpha
double
Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1).In
max_categorical_levels
int
For every categorical feature, only use this many most frequent categorical levels for model training. Only used for categorical_encoding == EnumLimited.In
calibration_frame
Key
Data for model calibrationIn/Out
calibration_method
enum
Calibration method to useIn/Out
model_id
Key
Destination id for this model; auto-generated if not specified.In/Out
training_frame
Key
Id of the training data frame.In/Out
validation_frame
Key
Id of the validation data frame.In/Out
nfolds
int
Number of folds for K-fold cross-validation (0 to disable or >= 2).In/Out
keep_cross_validation_models
boolean
Whether to keep the cross-validation models.In/Out
keep_cross_validation_predictions
boolean
Whether to keep the predictions of the cross-validation models.In/Out
keep_cross_validation_fold_assignment
boolean
Whether to keep the cross-validation fold assignment.In/Out
parallelize_cross_validation
boolean
Allow parallel training of cross-validation modelsIn/Out
response_column
VecSpecifier
Response variable column.In/Out
weights_column
VecSpecifier
Column with observation weights. Giving an observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. A weight is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more because of the larger loss function pre-factor. If you set weight = 0 for a row, the prediction returned for that row is zero, which is incorrect. To get an accurate prediction, remove all rows with weight == 0.In/Out
offset_column
VecSpecifier
Offset column. This will be added to the combination of columns before applying the link function.In/Out
fold_column
VecSpecifier
Column with cross-validation fold index assignment per observation.In/Out
fold_assignment
enum
Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems.In/Out
categorical_encoding
enum
Encoding scheme for categorical featuresIn/Out
ignored_columns
string[]
Names of columns to ignore for training.In/Out
ignore_const_cols
boolean
Ignore constant columns.In/Out
score_each_iteration
boolean
Whether to score during each iteration of model training.In/Out
checkpoint
Key
Model checkpoint to resume training with.In/Out
stopping_rounds
int
Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable)In/Out
max_runtime_secs
double
Maximum allowed runtime in seconds for model training. Use 0 to disable.In/Out
stopping_metric
enum
Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client.In/Out
stopping_tolerance
double
Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much)In/Out
gainslift_bins
int
Number of bins for the Gains/Lift table. 0 means disabled. The default value of -1 means automatic binning.In/Out
custom_metric_func
string
Reference to custom evaluation function, format: language:keyName=funcNameIn/Out
custom_distribution_func
string
Reference to custom distribution, format: language:keyName=funcNameIn/Out
export_checkpoints_dir
string
Automatically export generated models to this directory.In/Out
auc_type
enum
Set default multinomial AUC type.In/Out
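
Several XGBoostParametersV3 fields are documented as aliases of one another ("same as ..."), and several rates are documented as ranging from 0.0 to 1.0. The sketch below is one way a client might fold aliases into a single name and check those ranges before sending a request. Treating the H2O-style name as canonical, rejecting conflicting alias values, and the helper name normalize_params are choices made for illustration; how the server itself resolves aliases is not described here.

```python
# Alias pairs taken from the "(same as ...)" notes in the field descriptions.
ALIASES = {
    "n_estimators": "ntrees",
    "min_child_weight": "min_rows",
    "eta": "learn_rate",
    "subsample": "sample_rate",
    "colsample_bylevel": "col_sample_rate",
    "colsample_bytree": "col_sample_rate_per_tree",
    "max_delta_step": "max_abs_leafnode_pred",
    "gamma": "min_split_improvement",
}

# Fields documented as "from 0.0 to 1.0".
UNIT_INTERVAL = (
    "learn_rate",
    "sample_rate",
    "col_sample_rate",
    "col_sample_rate_per_tree",
    "colsample_bynode",
)


def normalize_params(params: dict) -> dict:
    """Fold aliases into one name and check documented 0.0..1.0 ranges."""
    normalized = {}
    for name, value in params.items():
        canonical = ALIASES.get(name, name)
        if canonical in normalized and normalized[canonical] != value:
            raise ValueError(f"conflicting values for {canonical}")
        normalized[canonical] = value
    for name in UNIT_INTERVAL:
        if name in normalized and not 0.0 <= normalized[name] <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return normalized
```

For example, normalize_params({"eta": 0.1, "n_estimators": 50}) returns {"learn_rate": 0.1, "ntrees": 50}, while passing both eta=0.1 and learn_rate=0.3 raises a ValueError.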

XGBoostV3

parameters
XGBoostParameters
Model builder parameters.In
_exclude_fields
string
Comma-separated list of JSON field paths to exclude from the result, used like: “/3/Frames?_exclude_fields=frames/frame_id/URL,__meta”In
algo
string
The algo name for this ModelBuilder.Out
algo_full_name
string
The pretty algo name for this ModelBuilder (e.g., Generalized Linear Model, rather than GLM).Out
can_build
enum[]
Model categories this ModelBuilder can build.Out
supervised
boolean
Indicator whether the model is supervised or not.Out
visibility
enum
Should the builder always be visible, be marked as beta, or only visible if the user starts up with the experimental flag?Out
job
Job
Job KeyOut
messages
ValidationMessage[]
Parameter validation messagesOut
error_count
int
Count of parameter validation errorsOut
__http_status
int
HTTP status to return for this build.Out
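
The error_count and messages fields above report parameter validation results for a build request. The sketch below shows one way a client might inspect a parsed JSON response. The keys field_name and message inside each ValidationMessage entry are assumptions, since that schema is not shown in this section, and the helper names are hypothetical.

```python
def build_rejected(response: dict) -> bool:
    """Return True when the builder reported parameter validation errors."""
    return response.get("error_count", 0) > 0


def format_messages(response: dict) -> list:
    """Render validation messages as 'field: text' strings.

    Assumes each entry has 'field_name' and 'message' keys.
    """
    return [
        f"{entry.get('field_name')}: {entry.get('message')}"
        for entry in response.get("messages", [])
    ]
```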