Download Sparkling Water 2.1.1


Get started with Sparkling Water in a few easy steps

1. Download Spark (if not already installed) from the Spark Downloads Page.

Choose Spark release: 2.1.0
Choose a package type: Pre-built for Hadoop 2.4 and later

2. Point SPARK_HOME to an existing Spark installation and export the MASTER variable:

export SPARK_HOME="/path/to/spark/installation"
# Run Spark locally with as many worker threads as there are logical cores.
export MASTER="local[*]"
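Before launching the shell, it can help to confirm that SPARK_HOME really points at a Spark installation. The following is a minimal sketch, not part of Sparkling Water: `check_spark_home` is a hypothetical helper name, and the check assumes only that a Spark distribution ships an executable `bin/spark-submit`.

```shell
# Hypothetical helper (not shipped with Spark or Sparkling Water):
# succeeds if the given directory looks like a Spark installation.
check_spark_home() {
  if [ -z "$1" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  # Assumption: a Spark distribution contains an executable bin/spark-submit.
  if [ ! -x "$1/bin/spark-submit" ]; then
    echo "No executable bin/spark-submit under $1" >&2
    return 1
  fi
  echo "Spark installation found at $1"
}

# Usage: check_spark_home "$SPARK_HOME"
```

If the function reports a problem, correct the exported path before running bin/sparkling-shell.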

3. From your terminal, run:

cd ~/Downloads
unzip sparkling-water-2.1.1.zip
cd sparkling-water-2.1.1
bin/sparkling-shell --conf "spark.executor.memory=1g"

4. Create an H₂O cloud inside the Spark cluster:

import org.apache.spark.h2o._
val h2oContext = H2OContext.getOrCreate(sc)
import h2oContext._

5. Follow this demo, which imports airlines and weather data and runs predictions on delays.


Launch Sparkling Water on Hadoop using YARN

1. Download Spark (if not already installed) from the Spark Downloads Page.

Choose Spark release: 2.1.0
Choose a package type: Pre-built for Hadoop 2.4 and later

2. Point SPARK_HOME to an existing installation of Spark:

export SPARK_HOME='/path/to/spark/installation'

3. Set the HADOOP_CONF_DIR and Spark MASTER environment variables:

export HADOOP_CONF_DIR=/etc/hadoop/conf
export MASTER="yarn-client"
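A quick sanity check of the Hadoop configuration directory can catch a wrong path before the job is submitted. This is a sketch under stated assumptions: `check_hadoop_conf` is a hypothetical helper, and it assumes the client-side configuration directory contains `core-site.xml` and `yarn-site.xml`, which is typical but may differ between Hadoop distributions.

```shell
# Hypothetical helper (not part of Hadoop, Spark, or Sparkling Water):
# succeeds if the directory holds the usual client-side Hadoop config files.
check_hadoop_conf() {
  if [ -z "$1" ] || [ ! -d "$1" ]; then
    echo "HADOOP_CONF_DIR is unset or not a directory: $1" >&2
    return 1
  fi
  # Assumption: these two files are present in a typical YARN client config.
  for f in core-site.xml yarn-site.xml; do
    if [ ! -f "$1/$f" ]; then
      echo "Missing $f in $1" >&2
      return 1
    fi
  done
  echo "Hadoop client configuration found in $1"
}

# Usage: check_hadoop_conf "$HADOOP_CONF_DIR"
```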

4. Download Sparkling Water and launch Sparkling Shell on YARN:

wget /sparkling-water-2.1.1.zip
unzip sparkling-water-2.1.1.zip
cd sparkling-water-2.1.1/
bin/sparkling-shell --num-executors 3 --executor-memory 2g --master yarn-client

5. Create an H₂O cloud inside the Spark cluster:

import org.apache.spark.h2o._
val h2oContext = H2OContext.getOrCreate(sc)
import h2oContext._


Launch H2O on a Standalone Spark Cluster

1. Download Spark (if not already installed) from the Spark Downloads Page.

Choose Spark release: 2.1.0
Choose a package type: Pre-built for Hadoop 2.4 and later

2. Point SPARK_HOME to an existing installation of Spark:

export SPARK_HOME='/path/to/spark/installation'

3. From your terminal, run:

cd ~/Downloads
unzip sparkling-water-2.1.1.zip
cd sparkling-water-2.1.1
bin/launch-spark-cloud.sh
export MASTER="spark://localhost:7077"
bin/sparkling-shell
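The MASTER value decides where the shell runs, and this page uses three different forms of it. The sketch below classifies just those three forms as a reminder of what each one means. `describe_master` is a hypothetical helper written for illustration, and it deliberately does not cover every master URL that Spark accepts.

```shell
# Hypothetical helper: describe the MASTER values used on this page.
# It is not exhaustive; Spark accepts other master URL forms as well.
describe_master() {
  case "$1" in
    "local[*]")    echo "local mode, one worker thread per logical core" ;;
    yarn-client)   echo "YARN, client deploy mode" ;;
    spark://*:*)   echo "standalone cluster master" ;;
    *)             echo "not a form used on this page: $1" >&2; return 1 ;;
  esac
}

# Usage: describe_master "$MASTER"
```

For the standalone setup above, `describe_master "spark://localhost:7077"` reports a standalone cluster master, which is the value to export after running bin/launch-spark-cloud.sh.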

4. Create an H₂O cloud inside the Spark cluster:

import org.apache.spark.h2o._
val h2oContext = H2OContext.getOrCreate(sc)
import h2oContext._

Documentation

Integration info

H2O version: 3.10.4.2 ueno (documentation)
Spark version: 2.1.0 (documentation)
