.. _run_on_windows:

Use Sparkling Water in Windows Environments
-------------------------------------------

Windows environments require several additional steps to run Spark and Sparkling Water. A summary of the configuration steps is available `here <https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-tips-and-tricks-running-spark-windows.html>`__.

To use Sparkling Water in Windows environments:

1. Download the appropriate Spark distribution from the `Spark Downloads page <https://spark.apache.org/downloads.html>`__.

2. Point the ``SPARK_HOME`` variable to the location of your Spark distribution:

   .. code:: bat

      SET SPARK_HOME=<location of your downloaded Spark distribution>

3. From https://github.com/steveloughran/winutils, download ``winutils.exe`` for the Hadoop version that is referenced by your Spark distribution. (For example, for ``spark-SUBST_SPARK_VERSION-bin-hadoop2.7.tgz``, you need ``winutils.exe`` for Hadoop 2.7.)

4. Move ``winutils.exe`` into a new directory ``%SPARK_HOME%\hadoop\bin`` and set:

   .. code:: bat

      SET HADOOP_HOME=%SPARK_HOME%\hadoop

5. Create a new file ``%SPARK_HOME%\hadoop\conf\hive-site.xml``, which sets up a default Hive scratch directory. A good choice is a writable temporary directory, for example ``%TEMP%\hive``:

   .. code:: xml

      <configuration>
        <property>
          <name>hive.exec.scratchdir</name>
          <value>PUT HERE LOCATION OF TEMP FOLDER</value>
          <description>Scratch space for Hive jobs</description>
        </property>
      </configuration>

   **Note**: You can also use the Hive default scratch directory, which is ``c:\tmp\hive``. In this case, you need to create the directory manually and call ``winutils.exe chmod -R 777 c:\tmp\hive`` to set up the correct permissions.

6. Set the ``HADOOP_CONF_DIR`` variable:

   .. code:: bat

      SET HADOOP_CONF_DIR=%SPARK_HOME%\hadoop\conf

7. Go to the Sparkling Water directory and run the Sparkling Water shell:

   .. code:: bat

      bin\sparkling-shell.cmd
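
If the shell fails to start, it can help to confirm that the layout described in the steps above is actually in place. The following is a minimal sketch of such a check, assuming Python is available on the machine. The helper names ``check_setup`` and ``scratch_dir_from_hive_site`` are hypothetical and are not part of Spark or Sparkling Water; the script only inspects the environment variables and files named in the steps.

.. code:: python

   import os
   import xml.etree.ElementTree as ET
   from pathlib import Path


   def scratch_dir_from_hive_site(hive_site):
       """Return the hive.exec.scratchdir value from hive-site.xml, or None."""
       root = ET.parse(str(hive_site)).getroot()
       for prop in root.iter("property"):
           if prop.findtext("name", "").strip() == "hive.exec.scratchdir":
               value = prop.findtext("value", "").strip()
               return value or None
       return None


   def check_setup(env=None):
       """Return a list of problems; an empty list means the layout looks right."""
       env = os.environ if env is None else env
       problems = []

       # Steps 2, 4 and 6: the three variables must be set and point to directories.
       for name in ("SPARK_HOME", "HADOOP_HOME", "HADOOP_CONF_DIR"):
           value = env.get(name)
           if not value:
               problems.append(f"{name} is not set")
           elif not Path(value).is_dir():
               problems.append(f"{name} does not point to a directory: {value}")
       if problems:
           return problems

       # Step 4: winutils.exe is expected under %HADOOP_HOME%\bin.
       winutils = Path(env["HADOOP_HOME"]) / "bin" / "winutils.exe"
       if not winutils.is_file():
           problems.append(f"missing {winutils}")

       # Step 5: hive-site.xml must exist and define a scratch directory.
       hive_site = Path(env["HADOOP_CONF_DIR"]) / "hive-site.xml"
       if not hive_site.is_file():
           problems.append(f"missing {hive_site}")
       else:
           try:
               if scratch_dir_from_hive_site(hive_site) is None:
                   problems.append("hive.exec.scratchdir is not set in hive-site.xml")
           except ET.ParseError as err:
               problems.append(f"hive-site.xml is not valid XML: {err}")

       return problems


   if __name__ == "__main__":
       found = check_setup()
       print("\n".join(found) if found else "Environment looks ready.")

Run the script from the same command prompt in which the ``SET`` commands were issued, because variables defined with ``SET`` are visible only to that session and the processes it starts. The check does not verify that the ``winutils.exe`` build matches the Hadoop version of the Spark distribution, nor that the scratch directory is writable.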