Other
Connection
An H2OConnection represents the latest active handle to a cloud. No more than a single H2OConnection object will be active at any one time.
- class h2o.connection.H2OConnection(ip='localhost', port=54321, size=1, start_h2o=False, enable_assertions=False, license=None, max_mem_size_GB=None, min_mem_size_GB=None, ice_root=None, strict_version_check=True)
Bases: object
H2OConnection is a class that represents a connection to the H2O cluster. It is specified by an IP address and a port number.
Objects of type H2OConnection are not instantiated directly!
This class contains static methods for performing the common REST methods GET, POST, and DELETE.
Expr
- class h2o.expr.ExprNode(op='', *args)
Composable Expressions: This module contains code for the lazy expression DAG.
The job of ExprNode is to provide a layer of indirection to H2OFrame instances that are built from arbitrarily large, lazy expression DAGs. To do this job well, ExprNode must also track top-level entry points to such DAGs, maintain enough state to know which H2OFrame instances are temporary (or not), and maintain a cache of H2OFrame properties (nrows, ncols, types, names, a few rows of data).
An expression is declared top-level if it meets one of these conditions:
- It computes and returns an H2OFrame in response to some on-demand call.
- Its H2OFrame instance has more referrers than the 4 needed for the usual flow of Python execution (see MAGIC_REF_COUNT below for more details).
Instances of H2OFrame live and die by the state contained in the _ex field. The three pieces of state – _op, _children, _cache – are the fewest pieces of state (and no fewer) needed to unambiguously track temporary H2OFrame instances and prune them according to the usual scoping laws of python.
If _cache._id is None, then this DAG has never been sent over to H2O, and there’s nothing more to do when the object goes out of scope.
If _cache._id is not None, then there has been some work done by H2O to compute the big data object sitting in H2O to which _id points. At the time that __del__ is called on this object, a determination to throw out the corresponding data in H2O or to keep that data is made by the None’ness of _children.
tl;dr:
- If _cache._id is not None and _children is None, then do not delete in the H2O cluster.
- If _cache._id is not None and _children is not None, then do delete in the H2O cluster.
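The deletion rule summarized in the tl;dr above can be sketched as a small decision function. This is an illustrative sketch only, not the h2o implementation: `ToyExpr`, its `_cache_id` field (standing in for `_cache._id`), and `should_delete_remote` are hypothetical names introduced here.

```python
class ToyExpr:
    """Hypothetical stand-in for the state described above; not h2o's ExprNode."""

    def __init__(self, op, children=None, cache_id=None):
        self._op = op
        self._children = children   # None, or a tuple of child expressions
        self._cache_id = cache_id   # stands in for _cache._id


def should_delete_remote(expr):
    """Return True if the data behind expr should be removed from the cluster."""
    if expr._cache_id is None:
        # The DAG was never sent to H2O, so there is nothing to clean up.
        return False
    # An id exists: the None-ness of _children decides whether to delete.
    return expr._children is not None


never_sent = ToyExpr("+", children=("a", "b"))
keep = ToyExpr("frame", children=None, cache_id="frame_1")
temp = ToyExpr("+", children=("a", "b"), cache_id="tmp_1")
print(should_delete_remote(never_sent), should_delete_remote(keep), should_delete_remote(temp))
```

In this sketch the check would run when the object goes out of scope, for example from `__del__`, which is where the text says the real determination is made.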
To prevent several unnecessary REST calls and unnecessary execution, a few of the oft-needed attributes of the H2OFrame are cached for convenience. The primary consumers of these cached values are __getitem__, __setitem__, and a few other H2OFrame ops that do argument validation or exchange (e.g. colnames for indices). There are more details available under the H2OCache class declaration.
- MAGIC_REF_COUNT = 5
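The caching of frame properties described above can be illustrated with a minimal sketch. It assumes a `fetch` callable that stands in for a REST round trip. `ToyCache`, `fetch_count`, and `index_of` are hypothetical names and do not reflect the actual H2OCache interface.

```python
class ToyCache:
    """Hypothetical cache of frame properties; not h2o's H2OCache."""

    def __init__(self, fetch):
        self._fetch = fetch     # callable standing in for a REST call
        self._names = None      # filled in on first access
        self.fetch_count = 0    # counts simulated round trips

    @property
    def names(self):
        # Fetch column names once, then serve them from the cache.
        if self._names is None:
            self.fetch_count += 1
            self._names = list(self._fetch())
        return self._names

    def index_of(self, col):
        """Exchange a column name for its index using the cached names."""
        if isinstance(col, int):
            return col
        return self.names.index(col)


cache = ToyCache(lambda: ["sepal_len", "sepal_wid", "species"])
first = cache.index_of("species")
second = cache.index_of("sepal_wid")
print(first, second, cache.fetch_count)
```

Both lookups are served after a single simulated fetch, which is the saving the paragraph describes for operations such as __getitem__ and __setitem__.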