public abstract class Model extends Lockable<Model>
Modifier and Type | Class and Description |
---|---|
static class | Model.ModelCategory |
Modifier and Type | Field | Description |
---|---|---|
Key | _dataKey | Dataset key used to *build* the model, for models for which this makes sense, or null otherwise. |
java.lang.String[][] | _domains | Categorical/factor/enum mappings, per column. |
protected float[] | _modelClassDist | |
java.lang.String[] | _names | Columns used in the model; they are matched against the columns of scoring data. |
protected float[] | _priorClassDist | |
static DocGen.FieldDoc[] | DOC_FIELDS | |
long | training_duration_in_ms | The duration in ms for model training. |
long | training_start_time | The start time in ms since the epoch for model training. |
Constructor | Description |
---|---|
Model(Key selfKey, Key dataKey, Frame fr) | Constructor from frame (without prior class dist): Strips out the Vecs to just the names needed to match columns later for future datasets. |
Model(Key selfKey, Key dataKey, Frame fr, float[] priorClassDist) | Full constructor from frame: Strips out the Vecs to just the names needed to match columns later for future datasets. |
Model(Key selfKey, Key dataKey, java.lang.String[] names, java.lang.String[][] domains) | Constructor without prior class distribution |
Model(Key selfKey, Key dataKey, java.lang.String[] names, java.lang.String[][] domains, float[] priorClassDist) | Full constructor |
Model(Key selfKey, Model m) | Simple shallow copy constructor to a new Key |
Modifier and Type | Method | Description |
---|---|---|
Frame[] | adapt(Frame fr, boolean exact) | Build an adapted Frame from the given Frame. |
double | calcError(Frame ftest, Vec vactual, Frame fpreds, Frame hitratio_fpreds, java.lang.String label, boolean printMe, int max_conf_mat_size, ConfusionMatrix cm, AUC auc, HitRatio hr) | Compute the model error for a given test data set. For multi-class classification, this is the classification error based on assigning labels for the highest predicted per-class probability. |
java.lang.String[] | classNames() | |
ConfusionMatrix | cm() | For classifiers, confusion matrix on validation set. |
Futures | delete_impl(Futures fs) | Remove any Model internal Keys |
java.lang.String | errStr() | |
Request2 | get_params() | |
static int[][] | getDomainMapping(java.lang.String[] modelDom, java.lang.String[] colDom, boolean exact) | Returns a mapping between values of model domains (modelDom) and given column domain. |
static int[][] | getDomainMapping(java.lang.String colName, java.lang.String[] modelDom, java.lang.String[] colDom, boolean logNonExactMapping) | Returns a mapping for given column according to given modelDom. |
Model.ModelCategory | getModelCategory() | |
UniqueId | getUniqueId() | |
boolean | isClassifier() | |
Request2 | job() | |
protected double | missingColumnsType() | Type of missing columns during adaptation between train/test datasets. Override this method for models that have sparse data handling. |
double | mse() | Returns mse for validation set. |
int | nclasses() | |
int | nfeatures() | Returns number of input features |
java.lang.String | responseName() | |
double | score(double[] data) | |
Frame | score(Frame fr) | Bulk score for given fr frame. |
Frame | score(Frame fr, boolean adapt) | Bulk score the frame fr, producing a Frame result; the 1st Vec is the predicted class, the remaining Vecs are the probability distributions. |
float[] | score(Frame fr, boolean exact, int row) | Single row scoring, on a compatible Frame. |
float[] | score(int[][][] map, double[] row, float[] preds) | Single row scoring, on a compatible set of data, given an adaptation vector. |
float[] | score(java.lang.String[] names, java.lang.String[][] domains, boolean exact, double[] row) | Single row scoring, on a compatible set of data. |
protected float[] | score0(Chunk[] chks, int row_in_chunk, double[] tmp, float[] preds) | Bulk scoring API for one row. |
protected abstract float[] | score0(double[] data, float[] preds) | Subclasses implement the scoring logic. |
void | setModelClassDistribution(float[] classdist) | |
void | start_training(long training_start_time) | |
void | start_training(Model previous) | |
void | stop_training() | |
void | testJavaScoring(Frame fr) | |
java.lang.String | toJava() | Return a String which is a valid Java program representing a class that implements the Model. |
SB | toJava(SB sb) | |
protected java.lang.String | toJavaDefaultMaxIters() | |
protected void | toJavaInit(javassist.CtClass ct) | |
protected SB | toJavaInit(SB sb, SB fileContextSB) | |
protected void | toJavaPredictBody(SB bodySb, SB classCtxSb, SB fileCtxSb) | |
protected SB | toJavaSuper(SB sb) | Generate implementation for super class. |
VarImp | varimp() | Variable importance of individual input features measured by this model. |
delete_and_lock, delete, delete, delete, delete, is_unlocked, is_wlocked, read_lock, read_lock, unlock_all, unlock, update, write_lock
clone, frozenType, init, newInstance, read, toDocField, write, writeJSON, writeJSONFields
public static DocGen.FieldDoc[] DOC_FIELDS
@Request.API(help="Datakey used to *build* the model") public final Key _dataKey
@Request.API(help="Column names used to build the model") public final java.lang.String[] _names
@Request.API(help="Column names used to build the model") public final java.lang.String[][] _domains
@Request.API(help="Relative class distribution factors in original data") protected final float[] _priorClassDist
@Request.API(help="Relative class distribution factors used for model building") protected float[] _modelClassDist
public long training_start_time
public long training_duration_in_ms
public Model(Key selfKey, Key dataKey, Frame fr, float[] priorClassDist)
public Model(Key selfKey, Key dataKey, Frame fr)
public Model(Key selfKey, Key dataKey, java.lang.String[] names, java.lang.String[][] domains)
public Model(Key selfKey, Key dataKey, java.lang.String[] names, java.lang.String[][] domains, float[] priorClassDist)
public void setModelClassDistribution(float[] classdist)
public Request2 get_params()
public Request2 job()
public Model.ModelCategory getModelCategory()
public Futures delete_impl(Futures fs)
delete_impl in class Lockable<Model>
public UniqueId getUniqueId()
public void start_training(long training_start_time)
public void start_training(Model previous)
public void stop_training()
public java.lang.String responseName()
public java.lang.String[] classNames()
public boolean isClassifier()
public int nclasses()
public int nfeatures()
public ConfusionMatrix cm()
public double mse()
public VarImp varimp()
public final Frame score(Frame fr)
Bulk score for given fr frame. The frame is always adapted to this model.
Parameters:
fr - frame to be scored
See Also:
score(Frame, boolean)
public final Frame score(Frame fr, boolean adapt)
Bulk score the frame fr, producing a Frame result; the 1st Vec is the
predicted class, the remaining Vecs are the probability distributions.
For Regression (single-class) models, the 1st and only Vec is the
prediction value.
Parameters:
fr - frame which should be scored
adapt - a flag enforcing an adaptation of fr to this model. If the flag is false, the scoring code expects that fr is already adapted.
public final float[] score(Frame fr, boolean exact, int row)
public final float[] score(java.lang.String[] names, java.lang.String[][] domains, boolean exact, double[] row)
public final float[] score(int[][][] map, double[] row, float[] preds)
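The prediction layout described for score(Frame, boolean), with the predicted class first and the class probabilities after it, can be illustrated without H2O. The sketch below assumes a float[] whose slot 0 holds the predicted class index and whose slots 1..n hold the per-class probabilities. The class PredsLayout and the method fillPreds are hypothetical names chosen for this illustration; they are not part of Model.

```java
import java.util.Arrays;

// Hypothetical illustration of the prediction layout; not part of the H2O API.
final class PredsLayout {
    // Copies the class probabilities into preds[1..n] and stores the index of
    // the most probable class in preds[0]. preds must have length n + 1.
    static float[] fillPreds(float[] classProbs, float[] preds) {
        int best = 0;
        for (int c = 0; c < classProbs.length; c++) {
            preds[c + 1] = classProbs[c];
            if (classProbs[c] > classProbs[best]) best = c;
        }
        preds[0] = best;
        return preds;
    }

    public static void main(String[] args) {
        float[] probs = {0.2f, 0.7f, 0.1f};
        float[] preds = fillPreds(probs, new float[probs.length + 1]);
        System.out.println(Arrays.toString(preds));
    }
}
```

For a regression model the same array would hold only the predicted value in slot 0.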
protected double missingColumnsType()
public Frame[] adapt(Frame fr, boolean exact)
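Adapting a frame to a model involves lining up the categorical levels of each incoming column with the levels the model was built on, which is the job getDomainMapping is documented to do. The sketch below shows that idea in its simplest form. It is an assumption-laden illustration: it returns a flat int[] rather than the int[][] that getDomainMapping returns, and the class DomainMappingSketch and method mapLevels are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; the real getDomainMapping returns an int[][]
// whose layout is not reproduced here.
final class DomainMappingSketch {
    // For each level of the column domain, returns its index in the model
    // domain, or -1 when the model never saw that level.
    static int[] mapLevels(String[] modelDom, String[] colDom) {
        Map<String, Integer> modelIndex = new HashMap<>();
        for (int i = 0; i < modelDom.length; i++) {
            modelIndex.put(modelDom[i], i);
        }
        int[] mapping = new int[colDom.length];
        for (int j = 0; j < colDom.length; j++) {
            mapping[j] = modelIndex.getOrDefault(colDom[j], -1);
        }
        return mapping;
    }

    public static void main(String[] args) {
        String[] modelDom = {"cat", "dog", "fish"};
        String[] colDom = {"dog", "bird", "cat"};
        int[] mapping = mapLevels(modelDom, colDom);
        for (int j = 0; j < colDom.length; j++) {
            System.out.println(colDom[j] + " -> " + mapping[j]);
        }
    }
}
```

How unseen levels are treated is a design choice; this sketch marks them with -1 and leaves the decision to the caller.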
public static int[][] getDomainMapping(java.lang.String[] modelDom, java.lang.String[] colDom, boolean exact)
Returns a mapping between values of model domains (modelDom) and given column domain.
public static int[][] getDomainMapping(java.lang.String colName, java.lang.String[] modelDom, java.lang.String[] colDom, boolean logNonExactMapping)
Returns a mapping for given column according to given modelDom.
In this case, modelDom is
Parameters:
colName - name of column which is mapped, can be null.
modelDom -
logNonExactMapping -
protected float[] score0(Chunk[] chks, int row_in_chunk, double[] tmp, float[] preds)
public double calcError(Frame ftest, Vec vactual, Frame fpreds, Frame hitratio_fpreds, java.lang.String label, boolean printMe, int max_conf_mat_size, ConfusionMatrix cm, AUC auc, HitRatio hr)
Compute the model error for a given test data set. For multi-class classification, this is the classification error based on assigning labels for the highest predicted per-class probability.
Parameters:
ftest - Frame containing test data
vactual - The response column Vec
fpreds - Frame containing ADAPTED (domain labels from train+test data) predicted data (classification: label + per-class probabilities, regression: target)
hitratio_fpreds - Frame containing predicted data (domain labels from test data) (classification: label + per-class probabilities, regression: target)
label - Name for the scored data set to be printed
printMe - Whether to print the scoring results to Log.info
max_conf_mat_size - Largest size of Confusion Matrix (#classes) for it to be printed to Log.info
cm - Confusion Matrix object to populate for multi-class classification (also used for regression)
auc - AUC object to populate for binary classification
hr - HitRatio object to populate for classification
protected abstract float[] score0(double[] data, float[] preds)
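For multi-class classification, calcError is described as reporting the classification error obtained by assigning each row the label with the highest predicted per-class probability. A minimal sketch of that computation follows. It works on plain arrays instead of Frame and Vec, and the class ArgmaxError with its methods argmax and classificationError are hypothetical names, not H2O code.

```java
// Hypothetical illustration of argmax-based classification error; not H2O code.
final class ArgmaxError {
    // Index of the largest probability; ties resolve to the lowest index.
    static int argmax(float[] probs) {
        int best = 0;
        for (int c = 1; c < probs.length; c++) {
            if (probs[c] > probs[best]) best = c;
        }
        return best;
    }

    // Fraction of rows whose most probable class differs from the actual label.
    static double classificationError(float[][] probs, int[] actual) {
        int wrong = 0;
        for (int r = 0; r < actual.length; r++) {
            if (argmax(probs[r]) != actual[r]) wrong++;
        }
        return (double) wrong / actual.length;
    }

    public static void main(String[] args) {
        float[][] probs = {
            {0.9f, 0.1f},
            {0.4f, 0.6f},
            {0.3f, 0.7f},
            {0.8f, 0.2f}
        };
        int[] actual = {0, 1, 0, 0};
        System.out.println("error = " + classificationError(probs, actual));
    }
}
```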
public double score(double[] data)
public java.lang.String toJava()
Return a String which is a valid Java program representing a class that implements the Model. The generated class has the form:

    class UUIDxxxxModel {
      public static final String NAMES[] = { ....column names... }
      public static final String DOMAINS[][] = { ....domain names... }
      // Pass in data in a double[], pre-aligned to the Model's requirements.
      // Jam predictions into the preds[] array; preds[0] is reserved for the
      // main prediction (class for classifiers or value for regression),
      // and remaining columns hold a probability distribution for classifiers.
      float[] predict( double data[], float preds[] );
      double[] map( HashMap<String,Double> row, double data[] );
      // Does the mapping lookup for every row, no allocation
      float[] predict( HashMap<String,Double> row, double data[], float preds[] );
      // Allocates a double[] for every row
      float[] predict( HashMap<String,Double> row, float preds[] );
      // Allocates a double[] and a float[] for every row
      float[] predict( HashMap<String,Double> row );
    }
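To make the map and predict methods of that form more concrete, here is a small hand-written stand-in. It is only a sketch: the class name ToyGeneratedModel, the column names, the linear scoring rule, and the choice of NaN for missing columns are all assumptions made for illustration and do not come from toJava output.

```java
import java.util.HashMap;

// Hypothetical stand-in for a generated model class; the scoring rule is invented.
final class ToyGeneratedModel {
    static final String[] NAMES = {"age", "income"};

    // Copies values from a name-keyed row into data[], aligned with NAMES.
    // Columns absent from the row become NaN (an assumption of this sketch).
    static double[] map(HashMap<String, Double> row, double[] data) {
        for (int i = 0; i < NAMES.length; i++) {
            Double v = row.get(NAMES[i]);
            data[i] = (v == null) ? Double.NaN : v;
        }
        return data;
    }

    // Toy regression: preds[0] receives the single predicted value.
    static float[] predict(double[] data, float[] preds) {
        preds[0] = (float) (0.5 * data[0] + 0.001 * data[1]);
        return preds;
    }

    // Reuses caller-supplied buffers, so no arrays are allocated per row.
    static float[] predict(HashMap<String, Double> row, double[] data, float[] preds) {
        return predict(map(row, data), preds);
    }

    public static void main(String[] args) {
        HashMap<String, Double> row = new HashMap<>();
        row.put("age", 40.0);
        row.put("income", 1000.0);
        double[] data = new double[NAMES.length];
        float[] preds = new float[1];
        predict(row, data, preds);
        System.out.println("prediction = " + preds[0]);
    }
}
```

Passing the data and preds buffers in from the caller mirrors the allocation notes in the form above: the three-argument predict allocates nothing per row, while the shorter overloads trade convenience for allocations.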
protected void toJavaInit(javassist.CtClass ct)
protected java.lang.String toJavaDefaultMaxIters()
public void testJavaScoring(Frame fr)