public class DRF extends SharedTreeModelBuilder<DRF.DRFModel>
Modifier and Type | Class and Description |
---|---|
static class | DRF.DRFModel: DRF model holding serialized tree and implementing logic for scoring a row |
SharedTreeModelBuilder.Score, SharedTreeModelBuilder.ScoreBuildHistogram
Job.ValidatedJob.Response2CMAdaptor
Job.ChunkProgress, Job.ChunkProgressJob, Job.ColumnsJob, Job.ColumnsResJob, Job.Fail, Job.FrameJob, Job.HexJob, Job.JobCancelledException, Job.JobState, Job.List, Job.ModelJob, Job.ModelJobWithoutClassificationField, Job.Progress, Job.ProgressMonitor, Job.ValidatedJob
Request2.ColumnSelect, Request2.Dependent, Request2.DoClassBoolean, Request2.DRFCopyDataBoolean, Request2.MultiVecSelect, Request2.MultiVecSelectType, Request2.TypeaheadKey, Request2.VecClassSelect, Request2.VecSelect
Request.API, Request.Default, Request.Filter, Request.Validator<V>
RequestBuilders.ArrayBuilder, RequestBuilders.ArrayHeaderRowBuilder, RequestBuilders.ArrayRowBuilder, RequestBuilders.ArrayRowElementBuilder, RequestBuilders.ArrayRowSingleColBuilder, RequestBuilders.BooleanStringBuilder, RequestBuilders.Builder, RequestBuilders.ElementBuilder, RequestBuilders.HideBuilder, RequestBuilders.KeyCellBuilder, RequestBuilders.KeyElementBuilder, RequestBuilders.KeyLinkElementBuilder, RequestBuilders.KeyMinAvgMaxBuilder, RequestBuilders.NoCaptionObjectBuilder, RequestBuilders.ObjectBuilder, RequestBuilders.PaginatedTable, RequestBuilders.PreFormattedBuilder, RequestBuilders.Response, RequestBuilders.ResponseInfo, RequestBuilders.WarningCellBuilder
RequestArguments.Argument<T>, RequestArguments.Bool, RequestArguments.CaseModeSelect, RequestArguments.CaseSelect, RequestArguments.ClassifyBool, RequestArguments.DRFCopyDataBool, RequestArguments.EnumArgument<T extends java.lang.Enum<T>>, RequestArguments.ExistingFile, RequestArguments.FrameClassVec, RequestArguments.FrameKeyMultiVec, RequestArguments.FrameKeyVec, RequestArguments.GeneralFile, RequestArguments.H2OCategoryStrata, RequestArguments.H2OCategoryWeights, RequestArguments.H2OExistingKey, RequestArguments.H2OGLMModelKey, RequestArguments.H2OHexKey, RequestArguments.H2OHexKeyCol, RequestArguments.H2OIllegalArgumentException, RequestArguments.H2OKey, RequestArguments.H2OKey2, RequestArguments.H2OKMeansModelKey, RequestArguments.H2OModelKey<TM extends OldModel,TK extends TypeaheadKeysRequest>, RequestArguments.HexAllColumnSelect, RequestArguments.HexColumnSelect, RequestArguments.HexKeyClassCol, RequestArguments.HexNonClassColumnSelect, RequestArguments.HexNonConstantColumnSelect, RequestArguments.HexPCAColumnSelect, RequestArguments.InputCheckBox, RequestArguments.InputSelect<T>, RequestArguments.InputText<T>, RequestArguments.Int, RequestArguments.LongInt, RequestArguments.MultipleSelect<T>, RequestArguments.MultipleText<T>, RequestArguments.NTree, RequestArguments.NumberSequence, RequestArguments.NumberSequenceFloat, RequestArguments.Real, RequestArguments.Record<T>, RequestArguments.RFModelKey, RequestArguments.RSeq, RequestArguments.RSeqFloat, RequestArguments.Str, RequestArguments.StringList, RequestArguments.TypeaheadInputText<T>
RequestStatics.RequestType
Constants.Extensions, Constants.Schemes, Constants.Suffixes
Modifier and Type | Field and Description |
---|---|
protected int | _mtry |
protected long | _seed |
boolean | build_tree_one_node |
static DocGen.FieldDoc[] | DOC_FIELDS |
_distribution, _nclass, _ncols, _nrows, balance_classes, DECIDED_ROW, importance, max_after_balance_size, max_depth, MAX_SUPPORTED_LEVELS, min_rows, nbins, ntrees, OUT_OF_BAG, score_each_iteration
_cmDomain, _names, _responseName, _sourceResponseDomain, _train, _valid, _validResponse, _validResponseDomain, validation
classification
response
cols, ignored_cols, ignored_cols_by_name
source
_fjtask, description, destination_key, end_time, exception, job_key, LIST, start_time, state
_parms, response_info
_requestHelp, SUPPORTS_ONLY_V1, SUPPORTS_ONLY_V2, SUPPORTS_V1_V2
ARRAY_BUILDER, ARRAY_HEADER_ROW_BUILDER, ARRAY_ROW_BUILDER, ARRAY_ROW_ELEMENT_BUILDER, ARRAY_ROW_SINGLECOL_BUILDER, ELEMENT_BUILDER, GSON_BUILDER, OBJECT_BUILDER, ROOT_OBJECT
_queryHtml
_arguments
ALPHA, ARGUMENTS, AUC, BASE, BEST_THRESHOLD, BETA_EPS, BIN_LIMIT, BROWSE, BUCKET, BUILT_IN_KEY_JOBS, CANCELLED, CARDINALITY, CASE, CASE_MODE, CHUNK, CLASS, CLOUD_HEALTH, CLOUD_NAME, CLOUD_SIZE, CLOUD_UPTIME_MILLIS, CLUSTERS, COEFFICIENTS, COL_INDEX, COLS, COLUMN_NAME, COLUMNS_DISPLAY, CONSENSUS, CONTENTS, COUNT, DATA_KEY, DEPTH, DESCRIPTION, DEST_KEY, DTHRESHOLDS, ELAPSED, END_TIME, ENUM_DOMAIN_SIZE, ERROR, ESCAPE_NAN, EXCLUSIVE_SPLIT_LIMIT, EXPRESSION, FAILED, FAMILY, FEATURES, FILE, FILES, FILTER, FIRST_CHUNK, FJ_QUEUE_HI, FJ_QUEUE_LO, FJ_THREADS_HI, FJ_THREADS_LO, FREE_DISK, FREE_MEM, HEADER, HEIGHT, HELP, IGNORE, ITEMS, ITERATIVE_CM, JOB, JOB_KEY, JOBS, JSON_H2O, KEY, KEYS, LAMBDA, LAST_CONTACT, LIMIT, LINK, LOCKED, MAX, MAX_DISK, MAX_ITER, MAX_MEM, MAX_ROWS, MEAN, MIN, MODEL_KEY, MODELS, MORE, MTRY, MTRY_NODES, NAME, NEG_X, NO_CM, NODE, NODE_HEALTH, NODE_NAME, NODES, NORMALIZE, NUM_COLS, NUM_CPUS, NUM_FAILED, NUM_KEYS, NUM_MISSING_VALUES, NUM_ROWS, NUM_SUCCEEDED, NUM_TREES, OBJECT, OFFSET, OOBEE, PARALLEL, PARSER_TYPE, PATH, PREVIEW, PREVIOUS_MODEL_KEY, PRIOR, PROGRESS, PROGRESS_KEY, PROGRESS_TOTAL, REDIRECT, REDIRECT_ARGS, REPLICATION_FACTOR, REQUEST_TIME, RESPONSE, RHO, ROW, ROW_SIZE, ROWS, RPCS, SAMPLE, SAMPLING_STRATEGY, SCALE, SEED, SENT_ROWS, SEPARATOR, SIZE, SOURCE_KEY, STACK_TRACES, START_TIME, STAT_TYPE, STATUS, STEP, STRATA_SAMPLES, SUCCEEDED, SYSTEM_LOAD, TASK_KEY, TCPS_ACTIVE, TCPS_DUTY, TIME, TO_ENUM, TOT_MEM, TREE_COUNT, TREE_DEPTH, TREE_LEAVES, TREE_NUM, TREES, TWEEDIE_POWER, TYPE, URL, USE_NON_LOCAL_DATA, VALUE, VALUE_SIZE, VALUE_TYPE, VARIANCE, VERSION, VIEW, WARNINGS, WEIGHT, WEIGHTS, WIDTH, X, XVAL, Y
Constructor and Description |
---|
DRF() |
Modifier and Type | Method and Description |
---|---|
protected DRF.DRFModel | buildModel(DRF.DRFModel model, Frame fr, java.lang.String[] names, java.lang.String[][] domains, Timer t_build): Builds model |
protected Chunk | chk_oobt(Chunk[] chks) |
protected VarImp | doVarImpCalc(DRF.DRFModel model, DTree[] ktrees, int tid, Frame fTrain, boolean scale): On-the-fly version for varimp. |
protected void | execImpl(): Compute a DRF tree. |
protected boolean | inBagRow(Chunk[] chks, int row) |
protected void | init(): Invoked before job runs. |
protected void | initAlgo(DRF.DRFModel initialModel): Initialize algorithm, e.g., allocate algorithm-specific data structures. |
protected void | initWorkFrame(DRF.DRFModel initialModel, Frame fr): Initialize working frame. |
static java.lang.String | link(Key k, java.lang.String content): Return the query link to this page |
protected Log.Tag.Sys | logTag(): Returns a log tag for a particular model builder (e.g., DRF, GBM) |
protected DTree.DecidedNode | makeDecided(DTree.UndecidedNode udn, DHistogram[] hs) |
protected DRF.DRFModel | makeModel(DRF.DRFModel model, double err, ConfusionMatrix cm, VarImp varimp, AUC validAUC) |
protected DRF.DRFModel | makeModel(DRF.DRFModel model, DTree[] ktrees, DTree.TreeModel.TreeStats tstats) |
protected DRF.DRFModel | makeModel(Key outputKey, Key dataKey, Key testKey, java.lang.String[] names, java.lang.String[][] domains, java.lang.String[] cmDomain) |
protected RequestBuilders.Response | redirect() |
Frame | score(Frame fr) |
protected float | score1(Chunk[] chks, float[] fs, int row) |
buildLayer, buildModel, chk_nids, chk_resp, chk_tree, chk_work, cleanUp, createRNG, data_row, doScoring, isClassification, isDecidedRow, isOOBRow, makeAUC, nid2Oob, oob2Nid, printGenerateTrees, progress, speedDescription, speedValue, vec_nids, vec_resp
getCMDomain, getOrigValidation, getValidAdaptor, getValidation, getVectorDomain, hasValidation, prepareValidationWithModel, toJSON
registered
selectFrame, selectVecs
all, cancel, cancel, cancel, checkIdx, defaultDestKey, defaultJobKey, dest, findJob, findJobByDest, fork, get, getState, gridParallelism, invoke, isCancelledOrCrashed, isCancelledXX, isCrashed, isDone, isEnded, isRunning, isRunning, onCancelled, remove, runTimeMs, self, serve, start, waitUntilJobEnded, waitUntilJobEnded
create, fillResponseInfo, filterNaCols, input, logStart, makeJsonBox, serveGrid, set, split, superServeGrid, supportedVersions, toString
addToNavbar, addToNavbar, addToNavbar, DocExampleFail, DocExampleSucc, href, href, hrefType, HTMLHelp, htmlTemplate, initializeNavBar, log, mapTypeahead, ReSTHelp, serve, serveJava, serveResponse, toDocGET, toHTML, toJava, wrap, wrap, wrap, writeJSONFields
build, buildJSONResponseBox, buildResponseHeader, name
buildQuery, checkArguments, queryArgumentValueSet
arguments, argumentsToJson, frameColumnNameToIndex, vaCategoryNames, vaCategoryNames, vaColumnNameToIndex
checkJsonName, encodeRedirectArgs, JSON2HTML, jsonError, requestName, Str2JSON
clone, frozenType, init, newInstance, read, toDocField, write, writeJSON
public static DocGen.FieldDoc[] DOC_FIELDS
@Request.API(help="Run on one node only; no network overhead but fewer cpus used. Suitable for small datasets.", filter=hex.drf.DRF.myClassFilter.class, importance=SECONDARY) public boolean build_tree_one_node
@Request.API(help="Computed number of split features", importance=EXPERT) protected int _mtry
@Request.API(help="Autogenerated seed", importance=EXPERT) protected long _seed
protected Log.Tag.Sys logTag()
SharedTreeModelBuilder
logTag
in class SharedTreeModelBuilder<DRF.DRFModel>
protected DRF.DRFModel makeModel(Key outputKey, Key dataKey, Key testKey, java.lang.String[] names, java.lang.String[][] domains, java.lang.String[] cmDomain)
makeModel
in class SharedTreeModelBuilder<DRF.DRFModel>
protected DRF.DRFModel makeModel(DRF.DRFModel model, double err, ConfusionMatrix cm, VarImp varimp, AUC validAUC)
makeModel
in class SharedTreeModelBuilder<DRF.DRFModel>
protected DRF.DRFModel makeModel(DRF.DRFModel model, DTree[] ktrees, DTree.TreeModel.TreeStats tstats)
makeModel
in class SharedTreeModelBuilder<DRF.DRFModel>
public static java.lang.String link(Key k, java.lang.String content)
protected void execImpl()
protected RequestBuilders.Response redirect()
protected void init()
Job
init
in class SharedTreeModelBuilder<DRF.DRFModel>
protected void initAlgo(DRF.DRFModel initialModel)
SharedTreeModelBuilder
initAlgo
in class SharedTreeModelBuilder<DRF.DRFModel>
protected void initWorkFrame(DRF.DRFModel initialModel, Frame fr)
SharedTreeModelBuilder
initWorkFrame
in class SharedTreeModelBuilder<DRF.DRFModel>
initialModel
- initial model
fr
- working frame which contains train data and additional columns prepared by this builder.
protected DRF.DRFModel buildModel(DRF.DRFModel model, Frame fr, java.lang.String[] names, java.lang.String[][] domains, Timer t_build)
SharedTreeModelBuilder
buildModel
in class SharedTreeModelBuilder<DRF.DRFModel>
model
- initial model created by SharedTreeModelBuilder.makeModel(Key, Key, Key, String[], String[][], String[]) method.
fr
- training dataset which can contain additional temporary vectors prepared by buildModel() method.
names
- names of columns in trainFr used for model training
domains
- domains of columns in trainFr used for model training
t_build
- timer to measure model building process
protected VarImp doVarImpCalc(DRF.DRFModel model, DTree[] ktrees, int tid, Frame fTrain, boolean scale)
The page says: "In every tree grown in the forest, put down the oob cases and count the number of votes cast for the correct class. Now randomly permute the values of variable m in the oob cases and put these cases down the tree. Subtract the number of votes for the correct class in the variable-m-permuted oob data from the number of votes for the correct class in the untouched oob data. The average of this number over all trees in the forest is the raw importance score for variable m."
doVarImpCalc
in class SharedTreeModelBuilder<DRF.DRFModel>
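The raw importance score quoted above can be sketched in plain Java. This is only an illustration of the permutation idea, not the code behind doVarImpCalc: the class PermutationImportanceSketch, its methods, and the modeling of a tree as a `ToIntFunction<double[]>` that returns a predicted class are all hypothetical and do not use H2O's Chunk, DTree, or VarImp types.

```java
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.ToIntFunction;

// Hypothetical illustration; none of these names exist in H2O.
class PermutationImportanceSketch {

    /** Counts the rows for which the tree votes for the correct class. */
    static int correctVotes(ToIntFunction<double[]> tree, double[][] rows, int[] labels) {
        int correct = 0;
        for (int i = 0; i < rows.length; i++) {
            if (tree.applyAsInt(rows[i]) == labels[i]) correct++;
        }
        return correct;
    }

    /** Returns a deep copy of rows with the values of column m shuffled (Fisher-Yates). */
    static double[][] permuteColumn(double[][] rows, int m, Random rng) {
        double[][] copy = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) copy[i] = rows[i].clone();
        for (int i = copy.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            double tmp = copy[i][m];
            copy[i][m] = copy[j][m];
            copy[j][m] = tmp;
        }
        return copy;
    }

    /**
     * Raw importance of variable m: the average over all trees of
     * (correct votes on untouched OOB rows) minus (correct votes with column m permuted).
     * oobRows.get(t) and oobLabels.get(t) hold the out-of-bag cases of tree t.
     */
    static double rawImportance(List<ToIntFunction<double[]>> trees,
                                List<double[][]> oobRows,
                                List<int[]> oobLabels,
                                int m, Random rng) {
        double sum = 0;
        for (int t = 0; t < trees.size(); t++) {
            ToIntFunction<double[]> tree = trees.get(t);
            double[][] rows = oobRows.get(t);
            int[] labels = oobLabels.get(t);
            int untouched = correctVotes(tree, rows, labels);
            int permuted = correctVotes(tree, permuteColumn(rows, m, rng), labels);
            sum += untouched - permuted;
        }
        return sum / trees.size();
    }

    public static void main(String[] args) {
        // A one-tree "forest" whose single tree splits only on column 0.
        ToIntFunction<double[]> tree = r -> r[0] > 0.5 ? 1 : 0;
        double[][] rows = {{0.1, 5}, {0.9, 6}, {0.2, 7}, {0.8, 8}};
        int[] labels = {0, 1, 0, 1};
        List<ToIntFunction<double[]>> trees = Collections.singletonList(tree);
        List<double[][]> oobRows = Collections.singletonList(rows);
        List<int[]> oobLabels = Collections.singletonList(labels);
        Random rng = new Random(42);
        System.out.println("column 0: " + rawImportance(trees, oobRows, oobLabels, 0, rng));
        System.out.println("column 1: " + rawImportance(trees, oobRows, oobLabels, 1, rng));
    }
}
```

In this toy setup the tree never reads column 1, so permuting it cannot change any vote and its raw importance is zero, while permuting column 0 can only lose correct votes because the tree classifies the untouched rows perfectly.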
protected float score1(Chunk[] chks, float[] fs, int row)
score1
in class SharedTreeModelBuilder<DRF.DRFModel>
protected boolean inBagRow(Chunk[] chks, int row)
inBagRow
in class SharedTreeModelBuilder<DRF.DRFModel>
protected DTree.DecidedNode makeDecided(DTree.UndecidedNode udn, DHistogram[] hs)
makeDecided
in class SharedTreeModelBuilder<DRF.DRFModel>