H2OAssembly

class h2o.assembly.H2OAssembly(steps)[source]

Bases: object

The H2OAssembly class can be used to specify multiple frame operations in one place.

Sample usage:

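>>> import h2o
>>> from h2o.assembly import H2OAssembly
>>> from h2o.frame import H2OFrame
>>> from h2o.transforms.preprocessing import H2OColOp, H2OColSelect
>>> h2o.init()  # connect to (or start) an H2O cluster before loading data
>>> iris = h2o.load_dataset("iris")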
>>> assembly = H2OAssembly(steps=[
... ("col_select",       H2OColSelect(["Sepal.Length", "Petal.Length", "Species"])),
... ("cos_Sepal.Length", H2OColOp(op=H2OFrame.cos, col="Sepal.Length", inplace=True)),
... ("str_cnt_Species",  H2OColOp(op=H2OFrame.countmatches, col="Species",
...                               inplace=False, pattern="s"))
... ])
>>> result = assembly.fit(iris)  # fit the assembly and perform the munging operations

In this example, we first load the iris frame. The following data munging operations are then performed on it:

  1. select three of the five columns;
  2. take the cosine of the Sepal.Length column and replace the original column with the result;
  3. count the occurrences of the letter s in each value of the Species column. Since inplace is set to False, a new column is generated to hold the result.
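To make the effect of the three steps concrete, the following is a minimal plain-Python sketch that mimics them on a toy table (a dict mapping column names to lists). It does not use H2O at all: the helpers col_select and col_op and the output column name Species_s_count are hypothetical stand-ins chosen for illustration, and the name H2O gives to the new column may differ.

```python
import math

# Toy stand-in for a frame: column name -> list of values (not an H2OFrame).
frame = {
    "Sepal.Length": [5.1, 7.0, 6.3],
    "Sepal.Width": [3.5, 3.2, 3.3],
    "Petal.Length": [1.4, 4.7, 6.0],
    "Petal.Width": [0.2, 1.4, 2.5],
    "Species": ["setosa", "versicolor", "virginica"],
}


def col_select(table, cols):
    """Hypothetical helper for step 1: keep only the named columns."""
    return {name: table[name] for name in cols}


def col_op(table, func, col, inplace, new_name=None):
    """Hypothetical helper for steps 2 and 3: apply func to each value of col.

    With inplace=True the column is overwritten; otherwise the result is
    stored under new_name and the original column is left untouched.
    """
    out = dict(table)
    out[col if inplace else new_name] = [func(v) for v in table[col]]
    return out


selected = col_select(frame, ["Sepal.Length", "Petal.Length", "Species"])
with_cos = col_op(selected, math.cos, "Sepal.Length", inplace=True)
result = col_op(
    with_cos,
    lambda s: s.count("s"),  # occurrences of "s" in each string
    "Species",
    inplace=False,
    new_name="Species_s_count",  # hypothetical column name
)
```

The in-place step overwrites Sepal.Length, while the counting step leaves Species intact and appends a fourth column, which is the distinction the inplace flag controls in the assembly above.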

Extension class of Pipeline implementing additional methods:

  • to_pojo: Exports the assembly to a self-contained Java POJO used in a per-row, high-throughput environment.

In addition, H2OAssembly provides a few static methods that perform element-wise operations between two frames. They are all called as

>>> H2OAssembly.op(frame1, frame2)

where frame1 and frame2 are H2OFrames of the same size and with the same column types. The call returns an H2OFrame containing the element-wise result of the operation op. The following operations are currently supported:

  • divide
  • plus
  • multiply
  • minus
  • less_than
  • less_than_equal
  • equal_equal
  • not_equal
  • greater_than
  • greater_than_equal
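As a rough illustration of what element-wise means for the operations listed above, here is a plain-Python sketch that pairs each operation name with the corresponding function from the standard operator module and applies it to two toy tables. This is an analogy under stated assumptions, not the H2O implementation: the elementwise helper is hypothetical, and the comparison results here are Python booleans, whereas the representation in an H2OFrame may differ.

```python
import operator

# Assumed correspondence between the operation names and Python operators.
OPS = {
    "divide": operator.truediv,
    "plus": operator.add,
    "multiply": operator.mul,
    "minus": operator.sub,
    "less_than": operator.lt,
    "less_than_equal": operator.le,
    "equal_equal": operator.eq,
    "not_equal": operator.ne,
    "greater_than": operator.gt,
    "greater_than_equal": operator.ge,
}


def elementwise(op_name, frame1, frame2):
    """Hypothetical helper: apply OPS[op_name] to matching cells of two tables."""
    if list(frame1) != list(frame2):
        raise ValueError("tables must have the same columns")
    func = OPS[op_name]
    out = {}
    for name in frame1:
        left, right = frame1[name], frame2[name]
        if len(left) != len(right):
            raise ValueError("columns must have the same length")
        out[name] = [func(a, b) for a, b in zip(left, right)]
    return out


# Toy tables of the same shape (column name -> list of values).
frame1 = {"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]}
frame2 = {"x": [3, 2, 1], "y": [2.0, 5.0, 8.0]}

total = elementwise("plus", frame1, frame2)
smaller = elementwise("less_than", frame1, frame2)
ratio = elementwise("divide", frame1, frame2)
```

Each result has the same shape as the inputs, with every cell computed from the two cells at the same position.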
divide(rhs)
equal_equal(rhs)
fit(fr)[source]

Perform the munging operations specified in steps on the frame fr.

Parameters: fr – H2OFrame on which the munging operations are to be performed.
Returns: H2OFrame after the munging operations are completed.
greater_than(rhs)
greater_than_equal(rhs)
less_than(rhs)
less_than_equal(rhs)
minus(rhs)
multiply(rhs)
names
not_equal(rhs)
plus(rhs)
to_pojo(pojo_name='', path='', get_jar=True)[source]

Convert the munging operations performed on H2OFrame into a POJO.

Parameters:
  • pojo_name – (str) Name of POJO.
  • path – (str) Path of POJO.
  • get_jar – (bool) Whether to also download the h2o-genmodel.jar file needed to compile the POJO.
Returns:

None