Use Sparkling Water in Windows Environments

Windows environments require several additional steps to run Spark and Sparkling Water. A great summary of the configuration steps is available here.

To use Sparkling Water in Windows environments:

Point the SPARK_HOME variable to the location of your Spark distribution:

SET SPARK_HOME=<location of your downloaded Spark distribution>

From https://github.com/steveloughran/winutils, download winutils.exe for the Hadoop version that is referenced by your Spark distribution. For example, for spark-2.4.0-bin-hadoop2.7.tgz, you need winutils.exe for Hadoop 2.7.
Move winutils.exe into a new directory %SPARK_HOME%\hadoop\bin and set:
```
SET HADOOP_HOME=%SPARK_HOME%\hadoop
```
Create a new file %SPARK_HOME%\hadoop\conf\hive-site.xml, which sets up a default Hive scratch directory. The best location is a writable temporary directory, for example %TEMP%\hive:
```
<configuration>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>PUT HERE LOCATION OF TEMP FOLDER</value>
    <description>Scratch space for Hive jobs</description>
  </property>
</configuration>
```
Note: You can also use the Hive default scratch directory, which is c:\tmp\hive. In this case, you need to create the directory manually and call winutils.exe chmod -R 777 c:\tmp\hive to set up the correct permissions.
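
The hive-site.xml file described above can also be generated rather than written by hand. The following is a minimal sketch in Python, not part of Sparkling Water: the function name `write_hive_site` is hypothetical, and creating the scratch directory up front is an assumption made for convenience. It writes the same single-property configuration shown above.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_hive_site(conf_dir, scratch_dir):
    """Write a minimal hive-site.xml that sets hive.exec.scratchdir.

    Hypothetical helper: both directories are created if they are missing,
    which is an assumption of this sketch, not a documented requirement.
    """
    os.makedirs(conf_dir, exist_ok=True)
    os.makedirs(scratch_dir, exist_ok=True)

    # Build <configuration><property>...</property></configuration>
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "hive.exec.scratchdir"
    ET.SubElement(prop, "value").text = scratch_dir
    ET.SubElement(prop, "description").text = "Scratch space for Hive jobs"

    path = os.path.join(conf_dir, "hive-site.xml")
    ET.ElementTree(configuration).write(path, encoding="utf-8", xml_declaration=True)
    return path


if __name__ == "__main__":
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home:
        # Mirrors %SPARK_HOME%\hadoop\conf and %TEMP%\hive from the steps above.
        conf = os.path.join(spark_home, "hadoop", "conf")
        scratch = os.path.join(tempfile.gettempdir(), "hive")
        print(write_hive_site(conf, scratch))
    else:
        print("SPARK_HOME is not set")
```

Run from a prompt where SPARK_HOME is already set, the script prints the path of the file it wrote.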

Set the HADOOP_CONF_DIR environment variable:

SET HADOOP_CONF_DIR=%SPARK_HOME%\hadoop\conf

Go to the Sparkling Water directory and run the Sparkling Water shell:
```
bin\sparkling-shell.cmd
```
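
If the shell fails to start, it may help to confirm that the steps above were all applied in the current prompt. The sketch below is one possible check in Python. The function name `check_setup` is hypothetical, and the check only covers what this page describes: the three environment variables, winutils.exe under %HADOOP_HOME%\bin, and hive-site.xml under %HADOOP_CONF_DIR%.

```python
import os


def check_setup(env):
    """Return a list of problems found in the given environment mapping.

    Hypothetical helper: it only verifies the variables and files that the
    steps on this page create, nothing else about the Spark installation.
    """
    problems = []
    for name in ("SPARK_HOME", "HADOOP_HOME", "HADOOP_CONF_DIR"):
        if not env.get(name):
            problems.append(name + " is not set")

    hadoop_home = env.get("HADOOP_HOME")
    if hadoop_home:
        winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
        if not os.path.isfile(winutils):
            problems.append("missing " + winutils)

    conf_dir = env.get("HADOOP_CONF_DIR")
    if conf_dir:
        hive_site = os.path.join(conf_dir, "hive-site.xml")
        if not os.path.isfile(hive_site):
            problems.append("missing " + hive_site)
    return problems


if __name__ == "__main__":
    # Print each problem, or a single confirmation line if none were found.
    for problem in check_setup(os.environ) or ["setup looks complete"]:
        print(problem)
```

Because SET only affects the current command prompt session, the check should be run from the same prompt that will launch the shell.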