Running Sparkling Water
To run Sparkling Water, the SPARK_HOME environment variable must be set
and must point to the Spark distribution.
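As a quick pre-flight check, the requirement above can be verified before launching anything. The following is a minimal sketch, not part of Sparkling Water: the object name CheckSparkHome is hypothetical, and the test for bin/spark-submit is only a heuristic for "looks like a Spark distribution".

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper, not part of Sparkling Water.
object CheckSparkHome {
  // Returns Right(path) when SPARK_HOME is set and looks like a Spark
  // distribution, otherwise Left(description of the problem).
  def check(env: Map[String, String]): Either[String, String] =
    env.get("SPARK_HOME") match {
      case None =>
        Left("SPARK_HOME is not set")
      case Some(home) if !Files.isDirectory(Paths.get(home)) =>
        Left(s"SPARK_HOME does not point to a directory: $home")
      // Heuristic: a Spark distribution ships a bin/spark-submit launcher.
      case Some(home) if !Files.exists(Paths.get(home, "bin", "spark-submit")) =>
        Left(s"No bin/spark-submit found under: $home")
      case Some(home) =>
        Right(home)
    }

  def main(args: Array[String]): Unit =
    check(sys.env) match {
      case Right(home)   => println(s"SPARK_HOME looks valid: $home")
      case Left(problem) => println(s"Problem: $problem")
    }
}
```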
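Sparkling Water (H2O on Spark) can be started from the Sparkling Shell or from within a Spark application. To start the shell, run the following from the Sparkling Water distribution directory: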
./bin/sparkling-shell
In either the shell or an application, Sparkling Water is initialized with the following call, where spark is the active SparkSession:
val hc = H2OContext.getOrCreate(spark)
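In the shell, spark is already defined. In a standalone Spark application the session has to be created first. The sketch below shows one way this might look. It assumes that H2OContext lives in the org.apache.spark.h2o package, as in older Sparkling Water releases (newer releases may place it elsewhere), and that the Sparkling Water assembly is on the application classpath. The object and application names are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
// Assumed package; check the import against the Sparkling Water version in use.
import org.apache.spark.h2o.H2OContext

object SparklingWaterApp {
  def main(args: Array[String]): Unit = {
    // Create (or reuse) the Spark session; master and resources are expected
    // to come from spark-submit rather than being hard-coded here.
    val spark = SparkSession.builder()
      .appName("sparkling-water-example")
      .getOrCreate()

    // Same call as in the shell: starts H2O, or returns the existing context.
    val hc = H2OContext.getOrCreate(spark)

    // Print the context's string representation for inspection.
    println(hc)

    spark.stop()
  }
}
```

An application like this would typically be packaged and launched with spark-submit from the distribution that SPARK_HOME points to.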
The semantics of the call depend on the configured Sparkling Water backend. For more information about the backends, see Sparkling Water Backends.
In internal backend mode, the call will:
- Collect the number and host names of the executors (worker nodes) in the Spark cluster
- Launch H2O services on each detected executor
- Create a cloud for H2O services based on the list of executors
- Verify the H2O cloud status
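The backend can be selected through Spark configuration. The sketch below sets the spark.ext.h2o.backend.cluster.mode property to internal explicitly. To my understanding internal is also the default, so the setting mainly documents intent. The org.apache.spark.h2o import is an assumption that matches older releases, and the application name is illustrative.

```scala
import org.apache.spark.sql.SparkSession
// Assumed package; check the import against the Sparkling Water version in use.
import org.apache.spark.h2o.H2OContext

object InternalBackendExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sparkling-water-internal")
      // Run H2O inside the Spark executors.
      .config("spark.ext.h2o.backend.cluster.mode", "internal")
      .getOrCreate()

    // Triggers the steps listed above: executor discovery, H2O start-up on
    // each executor, cloud formation, and the cloud status check.
    val hc = H2OContext.getOrCreate(spark)
    println(hc)

    spark.stop()
  }
}
```

Because H2O nodes are tied to the executors detected at start-up in this mode, it is usually sensible to request a fixed number of executors before calling getOrCreate.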
In external backend mode, the call will:
- Start H2O in client mode on the Spark driver
- Start a separate H2O cluster on the configured YARN queue
- Connect to the external cluster from the H2O client
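A configuration for the external backend might look like the sketch below. The property names spark.ext.h2o.backend.cluster.mode, spark.ext.h2o.external.start.mode, and spark.ext.h2o.external.yarn.queue are taken from the Sparkling Water configuration properties as I understand them and should be checked against the version in use. The queue name my-queue is a placeholder. Other settings, such as the external cluster size and the location of the H2O driver JAR, are likely required as well and are deliberately left out here; the Sparkling Water Backends page describes them.

```scala
import org.apache.spark.sql.SparkSession
// Assumed package; check the import against the Sparkling Water version in use.
import org.apache.spark.h2o.H2OContext

object ExternalBackendExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sparkling-water-external")
      // Use an H2O cluster that runs outside the Spark executors.
      .config("spark.ext.h2o.backend.cluster.mode", "external")
      // Let Sparkling Water start the external cluster itself on YARN.
      .config("spark.ext.h2o.external.start.mode", "auto")
      // Placeholder queue name; replace with a real YARN queue.
      .config("spark.ext.h2o.external.yarn.queue", "my-queue")
      .getOrCreate()

    // Starts the H2O client on the driver, launches the external cluster,
    // and connects the two, as described in the list above.
    val hc = H2OContext.getOrCreate(spark)
    println(hc)

    spark.stop()
  }
}
```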
To see how to run Sparkling Water on Windows, please visit Use Sparkling Water in Windows Environments.