public abstract class ModelBuilder<M extends Model<M,P,O>,P extends Model.Parameters,O extends Model.Output> extends Job<M>
Modifier and Type | Class | Description |
---|---|---|
static class | ModelBuilder.BuilderVisibility | Visibility for this algo: is it always visible, is it beta (always visible but with a note in the UI) or is it experimental (hidden by default, visible in the UI if the user gives an "experimental" flag at startup). |
static class | ModelBuilder.ValidationMessage | The result of an abnormal Model.Parameters check. |
Job.JobCancelledException, Job.JobState, Job.Progress, Job.ProgressUpdate
Keyed.BinarySerializer<X extends Keyed>
Modifier and Type | Field | Description |
---|---|---|
ModelBuilder.ValidationMessage[] | _messages | A list of field validation issues. |
protected int | _nclass | |
protected Vec | _offset | |
P | _parms | All the parameters required to build the model. |
protected Vec | _response | |
protected Frame | _train | |
protected Frame | _valid | |
protected Vec | _vresponse | |
protected Vec | _weights | |
_description, _dest, _end_time, _exception, _progressKey, _start_time, _state, LIST
_key, EMPTY_KEY_LIST
Constructor | Description |
---|---|
ModelBuilder(Key dest, java.lang.String desc, P parms) | Default constructor, given all arguments. |
ModelBuilder(P ignore) | Constructor called from an HTTP request; MUST override in subclasses. |
ModelBuilder(java.lang.String desc, P parms) | Constructor making a default destination key. |
Modifier and Type | Method | Description |
---|---|---|
abstract ModelBuilder.BuilderVisibility | builderVisibility() | Visibility for this algo: is it always visible, is it beta (always visible but with a note in the UI) or is it experimental (hidden by default, visible in the UI if the user gives an "experimental" flag at startup). |
abstract hex.ModelCategory[] | can_build() | List containing the categories of models that this builder can build. |
protected void | checkMemoryFootPrint() | Override this method to call error() if the model is not expected to fit in memory, and say why. |
void | clearInitState() | Clear whatever was done by init() so it can be run again. |
protected boolean | computePriorClassDistribution() | |
static ModelBuilder | createModelBuilder(java.lang.String algo) | Factory method to create a ModelBuilder instance of the correct class given the algo name. |
int | error_count() | |
void | error(java.lang.String field_name, java.lang.String message) | |
java.lang.String | getAlgo() | |
static java.lang.String | getAlgo(java.lang.Class<? extends ModelBuilder> clz) | |
static java.lang.String | getAlgo(Model model) | Get the algo name for the given Model. |
static java.lang.String | getAlgoFullName(java.lang.String algo) | Get the algo full name for the given algo. |
static java.lang.Class<? extends ModelBuilder> | getModelBuilder(java.lang.String name) | Get the ModelBuilder class for the given algo name. |
static java.util.Map<java.lang.String,java.lang.Class<? extends ModelBuilder>> | getModelBuilders() | Get a Map of all algo names to their ModelBuilder classes. |
static java.lang.Class<? extends Model> | getModelClass(java.lang.String name) | Get the Model class for the given algo name. |
boolean | hasOffset() | |
boolean | hasWeights() | |
void | hide(java.lang.String field_name, java.lang.String message) | |
protected void | ignoreBadColumns(int npredictors, boolean expensive) | Ignore constant columns, columns with all NAs, and string columns. |
protected boolean | ignoreStringColumns() | |
void | info(java.lang.String field_name, java.lang.String message) | |
void | init(boolean expensive) | Initialize the ModelBuilder, validating all arguments and preparing the training frame. |
boolean | isClassifier() | |
boolean | isSupervised() | |
int | nclasses() | |
static void | registerModelBuilder(java.lang.String name, java.lang.String full_name, java.lang.Class<? extends ModelBuilder> clz) | Register a ModelBuilder, assigning it an algo name. |
Vec | response() | Train response vector. |
abstract ModelBuilderSchema | schema() | Externally visible default schema. TODO: this is in the wrong layer: the internals should not know anything about the schemas!!! This puts a reverse edge into the dependency graph. |
protected int | separateFeatureVecs() | Find and set response/weights/offset and put them all at the end. |
Frame | train() | Training frame: derived from the parameter's training frame, excluding all ignored columns, all constant and bad columns, perhaps flipping the response column to a Categorical, etc. |
abstract Job<M> | trainModel() | Method to launch training of a Model, based on its parameters. |
void | updateValidationMessages() | init(expensive) is called inside a DTask, not from the HTTP request thread. |
Frame | valid() | Validation frame: derived from the parameter's validation frame, excluding all ignored columns, all constant and bad columns, perhaps flipping the response column to a Categorical, etc. |
java.lang.String | validationErrors() | Get a string representation of only the ERROR ValidationMessages (e.g., to use in an exception throw). |
Vec | vresponse() | Validation response vector. |
void | warn(java.lang.String field_name, java.lang.String message) | |
cancel, cancel, checksum_impl, createProgressKey, deleteProgressKey, dest, done, failed, get, isCancelledOrCrashed, isDone, isRunning, isRunning, isStopped, jobs, msec, onCancelled, progress_msg, progress, remove_impl, start, update, update, update, update
checksum, getBinarySerializer, getPublishedKeys, remove, remove, remove, remove
clone, frozenType, read_impl, read, readExternal, readJSON_impl, readJSON, toJsonString, write_impl, write, writeExternal, writeHTML_impl, writeHTML, writeJSON_impl, writeJSON
public final P extends Model.Parameters _parms
protected transient Frame _train
protected transient Frame _valid
protected transient Vec _response
protected transient Vec _vresponse
protected transient Vec _offset
protected transient Vec _weights
protected int _nclass
public ModelBuilder.ValidationMessage[] _messages
public ModelBuilder(P ignore)
public ModelBuilder(java.lang.String desc, P parms)
public final Frame train()
public final Frame valid()
public Vec response()
public Vec vresponse()
public static void registerModelBuilder(java.lang.String name, java.lang.String full_name, java.lang.Class<? extends ModelBuilder> clz)
public static java.util.Map<java.lang.String,java.lang.Class<? extends ModelBuilder>> getModelBuilders()
public static java.lang.Class<? extends ModelBuilder> getModelBuilder(java.lang.String name)
public static java.lang.Class<? extends Model> getModelClass(java.lang.String name)
public static java.lang.String getAlgo(Model model)
public static java.lang.String getAlgoFullName(java.lang.String algo)
public java.lang.String getAlgo()
public static java.lang.String getAlgo(java.lang.Class<? extends ModelBuilder> clz)
public abstract ModelBuilderSchema schema()
public static ModelBuilder createModelBuilder(java.lang.String algo)
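The registration and lookup methods above (registerModelBuilder, getModelBuilders, getModelBuilder, getAlgoFullName, getAlgo(Class), createModelBuilder) suggest a name-to-class registry with a reflective factory. The following is only a sketch of that pattern under stated assumptions, not the ModelBuilder implementation: Builder and ExampleBuilder are hypothetical stand-ins, and the factory here assumes a public no-argument constructor, which the real builders may not have.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone sketch of an algo-name registry; no class here is part of the documented API.
class BuilderRegistrySketch {
    /** Hypothetical stand-in for ModelBuilder. */
    static abstract class Builder {
        abstract String describe();
    }

    /** Hypothetical builder; the no-arg constructor is an assumption of this sketch. */
    static class ExampleBuilder extends Builder {
        public ExampleBuilder() {}
        @Override String describe() { return "example builder"; }
    }

    private static final Map<String, Class<? extends Builder>> BUILDERS = new LinkedHashMap<>();
    private static final Map<String, String> FULL_NAMES = new LinkedHashMap<>();

    /** Register a builder class under a short algo name and a full display name. */
    static void registerModelBuilder(String name, String fullName, Class<? extends Builder> clz) {
        BUILDERS.put(name, clz);
        FULL_NAMES.put(name, fullName);
    }

    /** Read-only view of all algo names and their builder classes. */
    static Map<String, Class<? extends Builder>> getModelBuilders() {
        return Collections.unmodifiableMap(BUILDERS);
    }

    static Class<? extends Builder> getModelBuilder(String name) { return BUILDERS.get(name); }

    static String getAlgoFullName(String algo) { return FULL_NAMES.get(algo); }

    /** Reverse lookup: the algo name a class was registered under, or null if unregistered. */
    static String getAlgo(Class<? extends Builder> clz) {
        for (Map.Entry<String, Class<? extends Builder>> e : BUILDERS.entrySet())
            if (e.getValue() == clz) return e.getKey();
        return null;
    }

    /** Factory: instantiate the builder registered for the given algo name via reflection. */
    static Builder createModelBuilder(String algo) {
        Class<? extends Builder> clz = BUILDERS.get(algo);
        if (clz == null) throw new IllegalArgumentException("Unknown algo: " + algo);
        try {
            return clz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate builder for " + algo, e);
        }
    }

    public static void main(String[] args) {
        registerModelBuilder("example", "Example Algorithm", ExampleBuilder.class);
        Builder b = createModelBuilder("example");
        System.out.println(getAlgo(b.getClass()) + " -> " + getAlgoFullName("example") + ": " + b.describe());
    }
}
```

Keeping the registry keyed by the short algo name lets the same string drive class lookup, display-name lookup, and instance creation.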
public abstract Job<M> trainModel()
public abstract hex.ModelCategory[] can_build()
public abstract ModelBuilder.BuilderVisibility builderVisibility()
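The three visibility levels described for builderVisibility() (always visible, beta, experimental) can be illustrated with a small filter. This is a sketch only: the enum constant names, the isVisible and label helpers, and the "(beta)" suffix are assumptions made for illustration, not the actual BuilderVisibility API or UI behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-alone sketch; names are hypothetical and not taken from the documented class.
class VisibilitySketch {
    /** Assumed names for the three levels described in the documentation. */
    enum Visibility { STABLE, BETA, EXPERIMENTAL }

    /** Experimental algos are hidden unless the experimental flag was given at startup. */
    static boolean isVisible(Visibility v, boolean experimentalFlag) {
        return v != Visibility.EXPERIMENTAL || experimentalFlag;
    }

    /** Beta algos stay visible but carry a note; the exact note text is made up here. */
    static String label(String algo, Visibility v) {
        return v == Visibility.BETA ? algo + " (beta)" : algo;
    }

    /** Labels of all algos a UI would list for the given flag setting. */
    static List<String> visibleAlgos(Map<String, Visibility> algos, boolean experimentalFlag) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Visibility> e : algos.entrySet())
            if (isVisible(e.getValue(), experimentalFlag))
                out.add(label(e.getKey(), e.getValue()));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Visibility> algos = new LinkedHashMap<>();
        algos.put("algoA", Visibility.STABLE);
        algos.put("algoB", Visibility.BETA);
        algos.put("algoC", Visibility.EXPERIMENTAL);
        System.out.println(visibleAlgos(algos, false));
        System.out.println(visibleAlgos(algos, true));
    }
}
```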
public void clearInitState()
public boolean isSupervised()
public boolean hasOffset()
public boolean hasWeights()
public int nclasses()
public final boolean isClassifier()
protected int separateFeatureVecs()
protected boolean ignoreStringColumns()
protected void ignoreBadColumns(int npredictors, boolean expensive)
Parameters:
npredictors -
expensive -
protected void checkMemoryFootPrint()
protected boolean computePriorClassDistribution()
public void init(boolean expensive)
This method may be called while expensive is false; it will be called once again at the start of model building, in trainModel(), with expensive set to true.
The incoming training frame (and validation frame) will have ignored columns dropped out, plus whatever work the parent init did.
NOTE: The front end initially calls this through the parameters validation endpoint with no training_frame, so each subclass's init() method has to work correctly with the training_frame missing.
See also: updateValidationMessages()
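The init contract described above (cheap calls with expensive set to false, one final call with expensive set to true, and tolerance for a missing training_frame) might look roughly like this in a subclass. Everything in the sketch is a hypothetical stand-in: Params, BaseBuilder, TreeBuilder, the ntrees field, and the message strings are invented for illustration and are not the real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of a two-phase init(expensive) override; not the real ModelBuilder.
class InitSketch {
    /** Hypothetical parameters; trainingFrame is only a name here and is null when missing. */
    static class Params {
        String trainingFrame;
        int ntrees = 50;
    }

    /** Hypothetical parent builder. */
    static class BaseBuilder {
        final Params parms;
        final List<String> messages = new ArrayList<>();
        boolean preparedForTraining;

        BaseBuilder(Params parms) { this.parms = parms; }

        void init(boolean expensive) {
            // Reset earlier state so init can be run again.
            messages.clear();
            preparedForTraining = false;
            if (parms.trainingFrame == null) messages.add("training_frame: missing");
        }
    }

    /** Hypothetical subclass: starts with super.init and survives a missing frame. */
    static class TreeBuilder extends BaseBuilder {
        TreeBuilder(Params parms) { super(parms); }

        @Override void init(boolean expensive) {
            super.init(expensive);
            // Cheap checks that need no data run on every call.
            if (parms.ntrees < 1) messages.add("ntrees: must be at least 1");
            // The validation endpoint may call with no frame; stop before touching data.
            if (parms.trainingFrame == null) return;
            // Costly preparation happens only on the final call before training.
            if (expensive && messages.isEmpty()) preparedForTraining = true;
        }
    }

    public static void main(String[] args) {
        Params p = new Params();
        TreeBuilder b = new TreeBuilder(p);
        b.init(false);                       // validation call, no frame yet
        System.out.println(b.messages);
        p.trainingFrame = "train.hex";
        b.init(true);                        // call made at the start of training
        System.out.println(b.preparedForTraining);
    }
}
```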
public void updateValidationMessages()
NOTE: this should only be called when no other threads are updating the job, for example from init() or after the DTask is stopped and is getting cleaned up.
See also: init(boolean)
public int error_count()
public void hide(java.lang.String field_name, java.lang.String message)
public void info(java.lang.String field_name, java.lang.String message)
public void warn(java.lang.String field_name, java.lang.String message)
public void error(java.lang.String field_name, java.lang.String message)
public java.lang.String validationErrors()
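The hide, info, warn, and error methods, together with error_count() and validationErrors(), describe a collector of per-field validation messages in which only ERROR entries are reported as errors. A minimal sketch of that idea follows; the Level enum, the Message class, and the string format are assumptions for illustration and do not reproduce ModelBuilder.ValidationMessage.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone sketch of a validation-message collector; not the real implementation.
class ValidationSketch {
    /** Assumed severity levels, one per logging method. */
    enum Level { HIDE, INFO, WARN, ERROR }

    /** Hypothetical message record: severity, field name, and text. */
    static final class Message {
        final Level level;
        final String fieldName;
        final String text;

        Message(Level level, String fieldName, String text) {
            this.level = level;
            this.fieldName = fieldName;
            this.text = text;
        }

        @Override public String toString() {
            return level + " on field " + fieldName + ": " + text;
        }
    }

    private final List<Message> messages = new ArrayList<>();
    private int errorCount;

    void hide(String fieldName, String message) { add(Level.HIDE, fieldName, message); }
    void info(String fieldName, String message) { add(Level.INFO, fieldName, message); }
    void warn(String fieldName, String message) { add(Level.WARN, fieldName, message); }
    void error(String fieldName, String message) { add(Level.ERROR, fieldName, message); }

    private void add(Level level, String fieldName, String message) {
        messages.add(new Message(level, fieldName, message));
        if (level == Level.ERROR) errorCount++;
    }

    /** Number of ERROR-level messages recorded so far. */
    int error_count() { return errorCount; }

    /** Concatenate only the ERROR messages, one per line, e.g. for an exception message. */
    String validationErrors() {
        StringBuilder sb = new StringBuilder();
        for (Message m : messages)
            if (m.level == Level.ERROR) sb.append(m).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        ValidationSketch v = new ValidationSketch();
        v.warn("max_depth", "very deep trees may be slow");
        v.error("ntrees", "must be at least 1");
        if (v.error_count() > 0)
            System.out.print(v.validationErrors());
    }
}
```

A caller could check error_count() after init and, if it is non-zero, throw an exception whose message is validationErrors().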