hex.gbm
Class DHistogram<T extends DHistogram>
java.lang.Object
water.Iced
hex.gbm.DHistogram<T>
All Implemented Interfaces:
    java.lang.Cloneable, Freezable
Direct Known Subclasses:
    DBinHistogram
public class DHistogram<T extends DHistogram>
extends Iced
A DHistogram, computed in parallel over a Vec.
A DHistogram bins every value added to it (into a default number of bins)
and computes the min, max, and either the class distribution or the
mean and variance for each bin. DHistograms are initialized with a min, a max,
and the number of elements to be added (all of which are generally available
from a Vec).
Bins normally run from min to max in uniform sizes, but if the
DHistogram can determine that fewer bins are needed
(e.g. boolean columns run from 0 to 1 but only ever take on 2
values, so only 2 bins are needed), then fewer bins are used.
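The uniform binning described above can be illustrated with a minimal sketch. This is not the hex.gbm.DHistogram source: the class name UniformBinSketch, the methods chooseBins and binOf, and the maxBins parameter are all hypothetical, and the sketch assumes that "fewer bins" for an integer column means one bin per possible value.

```java
// Hypothetical illustration only; not taken from hex.gbm.DHistogram.
final class UniformBinSketch {
    /**
     * Number of bins to allocate. Assumption: an integer column whose
     * range covers fewer values than maxBins gets one bin per value.
     */
    static int chooseBins(float min, float max, boolean isInt, int maxBins) {
        if (isInt) {
            long distinct = (long) max - (long) min + 1; // a 0/1 column gives 2
            if (distinct < maxBins) return (int) distinct;
        }
        return maxBins;
    }

    /** Index of the uniform-width bin holding v, for bins spanning [min, max]. */
    static int binOf(float v, float min, float max, int nbins) {
        if (max <= min) return 0; // degenerate range: everything in one bin
        int idx = (int) ((v - min) / (max - min) * nbins);
        // Clamp so that v == max lands in the last bin rather than past it.
        return Math.max(0, Math.min(nbins - 1, idx));
    }
}
```

With these assumptions, a boolean column (min 0, max 1, integer-valued) is given 2 bins regardless of maxBins, and the values 0 and 1 fall into bins 0 and 1 respectively.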
If we are successively splitting rows (e.g. in a decision tree), then a
fresh DHistogram for each split will dynamically re-bin the data.
Each successive split thus divides the data logarithmically. At the
first split, outliers will end up in their own bins, but some
central bins may be very full. At the next split(s), the full bins are
split in turn, and so on until (after a logarithmic number of splits) each
bin holds roughly the same amount of data.
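The effect of re-binning after a split can be seen in a small sketch. The class RebinSketch and its histogram method are hypothetical and use plain arrays rather than a Vec; the only assumption is that a fresh histogram is built over the min and max of the rows that reached the split.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from hex.gbm.DHistogram.
final class RebinSketch {
    /** Counts values into nbins uniform bins over [min, max]; values outside the range are skipped. */
    static long[] histogram(float[] vals, float min, float max, int nbins) {
        long[] counts = new long[nbins];
        for (float v : vals) {
            if (v < min || v > max) continue; // row did not reach this split
            int idx = (max > min) ? (int) ((v - min) / (max - min) * nbins) : 0;
            counts[Math.max(0, Math.min(nbins - 1, idx))]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        float[] vals = {1f, 2f, 3f, 4f, 1000f};
        // First split: the outlier 1000 gets its own bin, the rest crowd into bin 0.
        System.out.println(Arrays.toString(histogram(vals, 1f, 1000f, 4))); // [4, 0, 0, 1]
        // Next split on the full bin: a fresh histogram over [1, 4] spreads the rows out.
        System.out.println(Arrays.toString(histogram(vals, 1f, 4f, 4)));    // [1, 1, 1, 1]
    }
}
```

Here one round of re-binning is enough to even out the counts; with more skewed data the same step would be repeated on whichever bins remain full.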
Constructor Summary
DHistogram(java.lang.String name, byte isInt)
DHistogram(java.lang.String name, byte isInt, float min, float max)
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DHistogram
public DHistogram(java.lang.String name,
byte isInt,
float min,
float max)
DHistogram
public DHistogram(java.lang.String name,
byte isInt)
smallCopy
public DHistogram smallCopy()
bigCopy
public DHistogram bigCopy()
nbins
public int nbins()
bins
public long bins(int i)
mins
public float mins(int i)
maxs
public float maxs(int i)
scoreMSE
public DTree.Split scoreMSE(int col)
mean
public double mean(int bin)
var
public double var(int bin)
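As a rough illustration of how a per-bin mean and variance can be kept while rows are added in parallel, the sketch below accumulates a count, a sum, and a sum of squares for each bin. The class BinStatsSketch and its add and merge methods are hypothetical, and the use of the population variance (sum of squares over n, minus the squared mean) is an assumption; the actual DHistogram fields and formula may differ.

```java
// Hypothetical illustration only; not taken from hex.gbm.DHistogram.
final class BinStatsSketch {
    final long[] counts;   // rows seen per bin
    final double[] sums;   // sum of responses per bin
    final double[] ssqs;   // sum of squared responses per bin

    BinStatsSketch(int nbins) {
        counts = new long[nbins];
        sums = new double[nbins];
        ssqs = new double[nbins];
    }

    /** Adds one response value y to the given bin. */
    void add(int bin, double y) {
        counts[bin]++;
        sums[bin] += y;
        ssqs[bin] += y * y;
    }

    /** Folds in a histogram built over another chunk of rows (same bin layout). */
    void merge(BinStatsSketch other) {
        for (int b = 0; b < counts.length; b++) {
            counts[b] += other.counts[b];
            sums[b] += other.sums[b];
            ssqs[b] += other.ssqs[b];
        }
    }

    double mean(int bin) {
        return counts[bin] == 0 ? 0.0 : sums[bin] / counts[bin];
    }

    /** Population variance; clamped at 0 to absorb floating-point round-off. */
    double var(int bin) {
        long n = counts[bin];
        if (n == 0) return 0.0;
        double m = mean(bin);
        return Math.max(0.0, ssqs[bin] / n - m * m);
    }
}
```

Because the three accumulators are plain sums, partial histograms built on separate chunks can be combined with merge in any order, which is what makes this layout convenient for parallel computation.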
tightenMinMax
public void tightenMinMax()
fini
public void fini()
min
public final double min()
max
public final double max()
name
public final java.lang.String name()
byteSize
protected static int byteSize(byte[] bs)
byteSize
protected static int byteSize(short[] ss)
byteSize
protected static int byteSize(float[] fs)
byteSize
protected static int byteSize(int[] is)
byteSize
protected static int byteSize(long[] ls)
byteSize
protected static int byteSize(double[] fs)
byteSize
protected static int byteSize(java.lang.Object[] ls)