Use Sparkling Water in Windows Environments

Windows environments require several additional steps to run Spark and Sparkling Water. A great summary of the configuration steps is available here.

To use Sparkling Water in Windows environments:

  1. Download the appropriate Spark distribution from the Spark Downloads page.

  2. Point the SPARK_HOME variable to the location of your Spark distribution:

    SET SPARK_HOME=<location of your downloaded Spark distribution>

  3. Download winutils.exe for the Hadoop version referenced by your Spark distribution. For example, for spark-2.1.2-bin-hadoop2.7.tgz, you need winutils.exe for Hadoop 2.7.

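  4. Move winutils.exe into a new directory %SPARK_HOME%\hadoop\bin and set the HADOOP_HOME variable to the parent directory, %SPARK_HOME%\hadoop, so that Hadoop can find winutils.exe under %HADOOP_HOME%\bin.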

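  5. Create a new file %SPARK_HOME%\hadoop\conf\hive-site.xml that sets the default Hive scratch directory through the hive.exec.scratchdir property. The best location is a writable temporary directory, for example %TEMP%\hive. Inside the usual <configuration> and <property> elements, the property takes the following value and description: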

        <value>PUT HERE LOCATION OF TEMP FOLDER</value>
        <description>Scratch space for Hive jobs</description>

    Note: You can also use the Hive default scratch directory, which is c:\tmp\hive. In this case, you need to create the directory manually and run winutils.exe chmod -R 777 c:\tmp\hive to set the correct permissions.

  6. Set the HADOOP_CONF_DIR variable to the directory that contains hive-site.xml, that is, %SPARK_HOME%\hadoop\conf.

  7. Go to the Sparkling Water directory and run the Sparkling Water shell.
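
Steps 4 through 6 can also be scripted. The following Python sketch is one possible way to do it and is not part of the Sparkling Water distribution: the helper names write_hive_site and windows_env are hypothetical. It assumes that the scratch directory is configured through the Hive property hive.exec.scratchdir, and that HADOOP_HOME and HADOOP_CONF_DIR follow the directory layout described above (%SPARK_HOME%\hadoop and %SPARK_HOME%\hadoop\conf).

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_hive_site(spark_home, scratch_dir):
    """Create <spark_home>/hadoop/conf/hive-site.xml and return its path.

    Assumption: the scratch directory is set via hive.exec.scratchdir.
    """
    conf_dir = Path(spark_home) / "hadoop" / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)
    # The scratch directory must exist and be writable.
    Path(scratch_dir).mkdir(parents=True, exist_ok=True)

    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "hive.exec.scratchdir"
    ET.SubElement(prop, "value").text = str(scratch_dir)
    ET.SubElement(prop, "description").text = "Scratch space for Hive jobs"

    hive_site = conf_dir / "hive-site.xml"
    ET.ElementTree(configuration).write(
        str(hive_site), encoding="utf-8", xml_declaration=True
    )
    return hive_site


def windows_env(spark_home):
    """Return the variables from the steps above as a dict (assumed layout)."""
    hadoop_home = Path(spark_home) / "hadoop"
    return {
        "SPARK_HOME": str(spark_home),
        "HADOOP_HOME": str(hadoop_home),
        "HADOOP_CONF_DIR": str(hadoop_home / "conf"),
    }


if __name__ == "__main__":
    # Demo only: a temporary directory stands in for a real Spark distribution.
    with tempfile.TemporaryDirectory() as demo_home:
        scratch = Path(demo_home) / "tmp" / "hive"
        print("Wrote", write_hive_site(demo_home, scratch))
        for name, value in windows_env(demo_home).items():
            print(f"SET {name}={value}")
```

In real use, pass the actual Spark location and a directory such as %TEMP%\hive, then run the printed SET commands in the same console session before starting the Sparkling Water shell. The script does not download or move winutils.exe; that step remains manual.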