Spark Frame <--> H2O Frame Conversions
--------------------------------------

Quick links:

- `Converting an H2OFrame into an RDD[T]`_
- `Converting an H2OFrame into a DataFrame`_
- `Converting an RDD[T] into an H2OFrame`_
- `Converting a DataFrame into an H2OFrame`_

Converting an H2OFrame into an RDD[T]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. content-tabs::

    .. tab-container:: Scala
        :title: Scala

        The ``H2OContext`` class provides the method ``asRDD``, which creates an RDD-like wrapper around the provided H2O's H2OFrame:

        .. code:: scala

            def asRDD[A <: Product: TypeTag: ClassTag](fr: H2OFrame): RDD[A]

        The call expects the type ``A`` to create a correctly typed RDD. The conversion requires type ``A`` to be bound by the ``Product`` interface.
        The relationship between the columns of the H2OFrame and the attributes of class ``A`` is based on name matching.

        **Example**

        .. code:: scala

            case class Person(name: String, age: Int)
            val rdd = asRDD[Person](h2oFrame)

Converting an H2OFrame into a DataFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. content-tabs::

    .. tab-container:: Scala
        :title: Scala

        The ``H2OContext`` class provides the method ``asSparkFrame``, which creates a DataFrame-like wrapper around the provided H2OFrame:

        .. code:: scala

            def asSparkFrame(fr: H2OFrame): DataFrame

        The schema of the created instance of the ``DataFrame`` is derived from the column names and types of the specified ``H2OFrame``.

        **Example**

        .. code:: scala

            val dataFrame = h2oContext.asSparkFrame(h2oFrame)

    .. tab-container:: Python
        :title: Python

        The ``H2OContext`` class provides the method ``asSparkFrame``, which creates a DataFrame-like wrapper around the provided H2OFrame:

        .. code:: python

            def asSparkFrame(self, h2oFrame)

        The schema of the created instance of the ``DataFrame`` is derived from the column names and types of the specified ``H2OFrame``.

        **Example**

        .. code:: python

            dataFrame = h2oContext.asSparkFrame(h2oFrame)

    .. tab-container:: R
        :title: R

        The ``H2OContext`` class provides the method ``asSparkFrame``, which creates a DataFrame-like wrapper around the provided H2OFrame:

        .. code:: R

            asSparkFrame = function(h2oFrame)

        The schema of the created instance of the ``DataFrame`` is derived from the column names and types of the specified ``H2OFrame``.

        **Example**

        .. code:: R

            dataFrame <- h2oContext$asSparkFrame(h2oFrame)

Converting an RDD[T] into an H2OFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. content-tabs::

    .. tab-container:: Scala
        :title: Scala

        The ``H2OContext`` provides a conversion method from the specified ``RDD[A]`` to ``H2OFrame``. As with conversion
        in the opposite direction, the type ``A`` has to satisfy the upper bound expressed by the type ``Product``. The conversion
        creates a new ``H2OFrame``, transfers data from the specified RDD, and saves it to the DKV store on the H2O backend.

        .. code:: scala

            def asH2OFrame[A <: Product : TypeTag](rdd : RDD[A]): H2OFrame

        The API also provides a version, which allows for specifying the name for the resulting H2OFrame.

        .. code:: scala

            def asH2OFrame[A <: Product : TypeTag](rdd : RDD[A], frameName: String): H2OFrame

        **Example**

        .. code:: scala

            val h2oFrame = h2oContext.asH2OFrame(rdd)

    .. tab-container:: Python
        :title: Python

        The ``H2OContext`` provides a conversion method from the specified PySpark ``RDD`` to ``H2OFrame``. The conversion
        creates a new ``H2OFrame``, transfers data from the specified RDD, and saves it to the DKV store on the H2O backend.

        .. code:: python

            def asH2OFrame(self, rdd, h2oFrameName=None, fullCols=-1)

        **Parameters**

        - ``rdd`` : PySpark RDD
        - ``h2oFrameName`` : Optional name for resulting H2OFrame
        - ``fullCols`` : A number of first n columns which are considered for conversion. -1 represents 'no limit'.

        **Example**

        .. code:: python

            h2oFrame = h2oContext.asH2OFrame(df)


Converting a DataFrame into an H2OFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. content-tabs::

    .. tab-container:: Scala
        :title: Scala

        The ``H2OContext`` provides conversion method from the specified ``DataFrame`` to ``H2OFrame``.
        The conversion creates a new ``H2OFrame``, transfers data from the specified ``DataFrame``, and saves it
        to the DKV store on the H2O backend.

        .. code:: scala

            def asH2OFrame(df: DataFrame): H2OFrame

        The API also provides a version, which allows for specifying the name for the resulting H2OFrame.

        .. code:: scala

            def asH2OFrame(rdd : DataFrame, frameName: String): H2OFrame

        **Example**

        .. code:: scala

            val h2oFrame = h2oContext.asH2OFrame(df)

    .. tab-container:: Python
        :title: Python

        The ``H2OContext`` provides conversion method from the specified ``DataFrame`` to ``H2OFrame``.
        The conversion creates a new ``H2OFrame``, transfers data from the specified ``DataFrame``, and saves it
        to the DKV store on the H2O backend.

        .. code:: python

            def asH2OFrame(self, sparkFrame, h2oFrameName=None, fullCols=-1)

        **Parameters**

        - ``sparkFrame`` : PySpark data frame
        - ``h2oFrameName`` : Optional name for resulting H2OFrame
        - ``fullCols`` : A number of first n columns which are considered for conversion. -1 represents 'no limit'.

        **Example**

        .. code:: python

            h2oFrame = h2oContext.asH2OFrame(df)

    .. tab-container:: R
        :title: R

        The ``H2OContext`` provides conversion method from the specified ``DataFrame`` to ``H2OFrame``.
        The conversion creates a new ``H2OFrame``, transfers data from the specified ``DataFrame``, and saves it
        to the DKV store on the H2O backend.

        .. code:: R

            asH2OFrame = function(sparkFrame, h2oFrameName = NULL)

        **Parameters**

        - ``sparkFrame`` : Spark data frame
        - ``h2oFrameName`` : Optional name for resulting H2OFrame

        **Example**

        .. code:: R

            h2oFrame <- h2oContext$asH2OFrame(df)