Spark - H2O Frame Mapping
Type Mapping Between H2O H2OFrame Types and Spark DataFrame Types
For all primitive Scala types or Spark SQL types (see org.apache.spark.sql.types) that can be part of a Spark RDD/DataFrame, we provide a mapping into H2O vector types (numeric, categorical, string, time, UUID; see water.fvec.Vec):
Scala type | SQL type | H2O type |
---|---|---|
NA | BinaryType | Numeric |
Byte | ByteType | Numeric |
Short | ShortType | Numeric |
Integer | IntegerType | Numeric |
Long | LongType | Numeric |
Float | FloatType | Numeric |
Double | DoubleType | Numeric |
String | StringType | String |
Boolean | BooleanType | Numeric |
java.sql.Timestamp | TimestampType | Time |
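
To make the table above easier to apply, the following is a minimal sketch in plain Scala that classifies a value by the H2O type the table assigns to its Scala type. The names TypeMappingSketch, H2OKind, and h2oKindOf are hypothetical and are not part of the Sparkling Water API; the sketch only restates the table and does not perform any conversion. BinaryType is left out because the table lists no Scala type for it.

```scala
// Illustrative sketch only: every name here is hypothetical, not Sparkling Water API.
object TypeMappingSketch {
  // The H2O vector kinds that appear in the mapping table.
  sealed trait H2OKind
  case object Numeric extends H2OKind
  case object Str extends H2OKind
  case object Time extends H2OKind

  // Return the H2O kind that the table assigns to the value's Scala type,
  // or None when the type is not listed in the table.
  def h2oKindOf(value: Any): Option[H2OKind] = value match {
    case _: Byte | _: Short | _: Int | _: Long => Some(Numeric)
    case _: Float | _: Double                  => Some(Numeric)
    case _: Boolean                            => Some(Numeric) // booleans map to a numeric vector
    case _: String                             => Some(Str)
    case _: java.sql.Timestamp                 => Some(Time)
    case _                                     => None
  }
}
```

For example, TypeMappingSketch.h2oKindOf(42) and TypeMappingSketch.h2oKindOf(true) both yield Some(Numeric), while an unlisted type such as a List yields None.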
Type Mapping Between H2O H2OFrame Types and RDD[T] Types
As type T, we support the following types:
T |
---|
NA |
Byte |
Short |
Integer |
Long |
Float |
Double |
String |
Boolean |
java.sql.Timestamp |
Any scala class extending scala Product |
org.apache.spark.mllib.regression.LabeledPoint |
As specified in the table, Sparkling Water supports transforming any Scala class that extends Product, which includes, for example, all case classes.
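
The reason Product covers case classes can be seen in a short plain-Scala sketch: the compiler makes every case class (and every tuple) extend scala.Product, which exposes the field count and field values generically. The Flight class and the fieldValues helper are hypothetical names used only for illustration; this shows the Scala mechanism, not how Sparkling Water performs the conversion internally.

```scala
// Illustrative sketch only: Flight and fieldValues are hypothetical names.
object ProductSketch {
  // Any case class extends scala.Product automatically.
  case class Flight(carrier: String, distance: Int, delayed: Boolean)

  // Read the fields of any Product generically, in declaration order.
  def fieldValues(p: Product): List[Any] = p.productIterator.toList
}
```

With this in place, ProductSketch.Flight("AA", 733, false).productArity reports three fields, and ProductSketch.fieldValues returns them as a list. An RDD of such a case class is therefore an RDD[T] with T extending Product, which is the case the table describes.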