Tree Class in H2O

H2O Tree Class

class h2o.tree.H2OTree(model, tree_number, tree_class=None)[source]

Bases: object

Represents a model of a Tree built by one of H2O’s algorithms (GBM, Random Forest).

descriptions

Descriptions for each node to be found in the tree. Contains split threshold if the split is based on numerical column. For categorical splits, it contains a list of categorical levels for transition from the parent node.

features

Names of the features (columns) used for the splits.

left_children

An array with the left child nodes of the tree’s nodes.

levels

Categorical levels on the split from the parent node that belong to this node. None for the root node or for non-categorical splits.

model_id

Name (identification) of the model this tree is related to.

nas

An array representing whether NA values go to the left node or the right node. The value may be None if the node is a leaf or if an NA value cannot appear on the node.

node_ids

Array with identification numbers of nodes. Node IDs are generated by H2O.

predictions

Values predicted on tree’s nodes.

right_children

An array with the right child nodes of the tree’s nodes.

root_node

An instance of H2ONode representing the beginning of the tree behind the model. Allows further tree traversal.

show()[source]

thresholds

Node split thresholds. Split thresholds are not only related to numerical splits, but might be present in case of categorical split as well.

tree_class

The name of a tree’s class.

The number of tree classes equals the number of levels in the categorical response column. As there is exactly one class per categorical level, the name of tree’s class is equal to the corresponding categorical level of the response column.

In the case of regression and binomial models, the name of the categorical level is ignored and can be omitted, as there is exactly one tree built in both cases.
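As a rough illustration of the rule above, the following sketch lists the tree_class values one might request for a given response column. The helper tree_classes_to_request and its response_levels argument are hypothetical and are not part of H2O; the sketch only encodes the rule described here.

```python
def tree_classes_to_request(response_levels):
    """Return the tree_class values to ask for, given the response levels.

    response_levels: list of categorical levels of the response column,
    or None for a regression problem. Hypothetical helper, not an H2O API.
    """
    # Regression (no levels) and binomial (two levels): exactly one tree is
    # built, so tree_class can be left as None.
    if response_levels is None or len(response_levels) <= 2:
        return [None]
    # Multinomial: one tree class per categorical level of the response.
    return list(response_levels)


print(tree_classes_to_request(None))
print(tree_classes_to_request(["no", "yes"]))
print(tree_classes_to_request(["setosa", "versicolor", "virginica"]))
```

Each returned value could then be passed as the tree_class argument of H2OTree together with the model and a tree_number.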

tree_number

The order in which the tree has been built in the model.
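The array-valued attributes above (node_ids, left_children, right_children, features, thresholds, predictions) appear to be parallel, with one entry per node. The sketch below shows how they might be read together. It uses a stand-in object instead of a real H2OTree so that it runs without an H2O cluster, and it assumes that the child arrays hold positions into the other arrays and use -1 for "no child"; that encoding is an assumption and should be checked against the installed H2O version.

```python
from types import SimpleNamespace

# Stand-in for an H2OTree: only the array attributes documented above.
# Assumption: child arrays hold positions into the other arrays, -1 = no child.
tree = SimpleNamespace(
    node_ids=[0, 1, 2],
    left_children=[1, -1, -1],
    right_children=[2, -1, -1],
    features=["age", None, None],
    thresholds=[30.5, None, None],
    predictions=[0.5, 0.12, 0.87],
)


def describe_nodes(tree):
    """Return one human-readable line per node of the tree."""
    lines = []
    for i, node_id in enumerate(tree.node_ids):
        # A node with no left and no right child is treated as a leaf.
        is_leaf = tree.left_children[i] == -1 and tree.right_children[i] == -1
        if is_leaf:
            lines.append(f"node {node_id}: leaf, prediction {tree.predictions[i]}")
        else:
            lines.append(
                f"node {node_id}: split on {tree.features[i]} at {tree.thresholds[i]}"
            )
    return lines


for line in describe_nodes(tree):
    print(line)
```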

H2O Node Class

class h2o.tree.H2ONode(node_id)[source]

Bases: object

Represents a single abstract node in an H2OTree.

id

Node’s unique identifier (integer). Generated by H2O.

H2O Leaf Node Class

class h2o.tree.H2OLeafNode(node_id, prediction)[source]

Bases: h2o.tree.tree.H2ONode

Represents a single terminal node in an H2OTree with final prediction.

id

Node’s unique identifier (integer). Generated by H2O.

prediction

Prediction value in the terminal node (numeric floating point).

show()[source]

H2O Split Node Class

class h2o.tree.H2OSplitNode(node_id, threshold, left_child, right_child, split_feature, na_direction, left_levels, right_levels)[source]

Bases: h2o.tree.tree.H2ONode

Represents a single node with either numerical or categorical split in an H2OTree with all its attributes.

id

Node’s unique identifier (integer). Generated by H2O.

left_child

Integer identifier of the left child node, if there is any. Otherwise None.

left_levels

Categorical levels on the edge from this node to the left child node. None for non-categorical splits.

na_direction

The direction of NA values. LEFT means NA values go to the left child node, RIGHT means NA values go to the right child node.

A value of None means that an NA cannot occur for the given split column on this node, due to an earlier split on the very same feature.

right_child

Integer identifier of the right child node, if there is any. Otherwise None.

right_levels

Categorical levels on the edge from this node to the right child node. None for non-categorical splits.

show()[source]

split_feature

The name of the column this node splits on.

threshold

Split threshold, typically when the split column is numerical.
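To show how the split-node attributes fit together, here is a minimal sketch that routes a single row from the root to a leaf. The Leaf and Split dataclasses are stand-ins that mirror the documented attributes, not the real H2O classes, and the nodes dictionary and route function are hypothetical. The sketch assumes that numeric values below the threshold go to the left child and that children are stored as integer identifiers, as described above; if the installed version stores node objects in left_child and right_child, the dictionary lookup can be dropped.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:  # stand-in for H2OLeafNode
    id: int
    prediction: float


@dataclass
class Split:  # stand-in for H2OSplitNode
    id: int
    threshold: Optional[float]
    left_child: Optional[int]
    right_child: Optional[int]
    split_feature: str
    na_direction: Optional[str]
    left_levels: Optional[List[str]] = None
    right_levels: Optional[List[str]] = None


def route(row, nodes, root_id=0):
    """Follow one row (a dict of column -> value) down to a leaf prediction."""
    node = nodes[root_id]
    while isinstance(node, Split):
        value = row.get(node.split_feature)
        is_na = value is None or (isinstance(value, float) and math.isnan(value))
        if is_na:
            if node.na_direction is None:
                # Per the description above, NA should not reach such a node.
                raise ValueError(f"unexpected NA at node {node.id}")
            go_left = node.na_direction == "LEFT"
        elif node.left_levels is not None or node.right_levels is not None:
            # Categorical split: follow the edge that lists this level.
            go_left = value in (node.left_levels or [])
        else:
            # Assumption: numeric values below the threshold go left.
            go_left = value < node.threshold
        node = nodes[node.left_child if go_left else node.right_child]
    return node.prediction


# Small hand-built tree keyed by node id (illustrative values only).
nodes = {
    0: Split(0, 30.5, 1, 2, "age", "LEFT"),
    1: Leaf(1, 0.12),
    2: Split(2, None, 3, 4, "city", "RIGHT", ["Prague"], ["Brno"]),
    3: Leaf(3, 0.4),
    4: Leaf(4, 0.9),
}

print(route({"age": 25}, nodes))
print(route({"age": 40, "city": "Prague"}, nodes))
```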