Downloading & Installing H2O¶
This section describes how to download and install the latest stable version of H2O. These instructions are also available on the H2O Download page. Please first make sure you meet the requirements listed here. Java is a prerequisite for H2O, even if using it from the R or Python packages.
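Because Java is a prerequisite for every installation method, it can help to confirm that a Java runtime is visible before continuing. The following is a minimal sketch, not part of H2O itself: the helper names find_java and java_version_banner are hypothetical, and the check only looks for a java executable on the PATH without verifying that its version meets H2O's requirements.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on PATH, or None if absent."""
    return shutil.which("java")


def java_version_banner(java_path):
    """Return the text printed by `java -version` (the JVM writes it to stderr)."""
    result = subprocess.run(
        [java_path, "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return (result.stderr or result.stdout).strip()


java_path = find_java()
if java_path is None:
    print("No java executable found on PATH; install Java before installing H2O.")
else:
    print("Found java at", java_path)
    print(java_version_banner(java_path))
```

Compare the printed version against the requirements page before proceeding.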
Note: To download the nightly bleeding edge release, go to h2o-release.s3.amazonaws.com/h2o/master/latest.html. Choose the type of installation you want to perform (for example, “Install in Python”) by clicking on the tab.
Choose your desired method of use below. Most users will want to use H2O from either R or Python; however, we also include instructions for using H2O’s web GUI, Flow, and for running H2O on Hadoop.
Download and Run from the Command Line¶
If you plan to exclusively use H2O’s web GUI, Flow, this is the method you should use. If you plan to use H2O from R or Python, skip to the appropriate sections below.
Click the Download H2O button on the http://h2o-release.s3.amazonaws.com/h2o/latest_stable.html page. This downloads a zip file that contains everything you need to get started.
Note: By default, this setup is open. Follow security guidelines if you want to secure your installation.
From your terminal, unzip and start H2O as in the example below.
cd ~/Downloads
unzip h2o-3.42.0.2.zip
cd h2o-3.42.0.2
java -jar h2o.jar
Point your browser to http://localhost:54321 to open up the H2O Flow web GUI.
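Before opening the browser, you may want to confirm that the server started by java -jar h2o.jar is answering on the default port. The sketch below uses only the Python standard library; the helper names flow_url and flow_is_reachable are hypothetical, and the check assumes the default localhost:54321 address shown above.

```python
import http.client
import urllib.error
import urllib.request


def flow_url(host="localhost", port=54321):
    """Build the base URL of the H2O Flow web GUI."""
    return f"http://{host}:{port}"


def flow_is_reachable(url, timeout=2.0):
    """Return True if an HTTP GET on `url` succeeds with status 200, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False


url = flow_url()
if flow_is_reachable(url):
    print("H2O Flow is answering at", url)
else:
    print("Nothing is answering at", url, "(is H2O still starting up?)")
```

A False result right after launch does not necessarily indicate a failure, since the server can take a few seconds to come up.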
Install in R¶
Perform the following steps in R to install H2O. Copy and paste these commands one line at a time.
Note: By default, this setup is open. Follow security guidelines if you want to secure your installation.
The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
Next, download packages that H2O depends on.
pkgs <- c("RCurl","jsonlite")
for (pkg in pkgs) {
  if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
}
Download and install the H2O package for R.
install.packages("h2o", type="source", repos=(c("http://h2o-release.s3.amazonaws.com/h2o/latest_stable_R")))
Optionally initialize H2O and run a demo to see H2O at work.
library(h2o)
localH2O = h2o.init()
demo(h2o.kmeans)
Installing H2O’s R Package from CRAN¶
Alternatively, you can install H2O’s R package from CRAN by typing install.packages("h2o") in R. Sometimes there can be a delay in publishing the latest stable release to CRAN, so to guarantee you have the latest stable version, use the instructions above to install directly from the H2O website.
Install in Python¶
Note: By default, this setup is open. Follow security guidelines if you want to secure your installation.
Run the following commands in a Terminal window to install H2O for Python.
Install dependencies (prepending with sudo if needed):
pip install requests
pip install tabulate
pip install future
# Required for plotting:
pip install matplotlib
Note: These are the dependencies required to run H2O. matplotlib is optional and only required to plot in H2O. A complete list of dependencies is maintained in the following file: https://github.com/h2oai/h2o-3/blob/master/h2o-py/conda/h2o-main/meta.yaml.
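To see which of the dependencies named above are already importable in the current Python environment, a small standard-library check can be used. This is only an illustrative sketch: the helper names importable and dependency_report are hypothetical, and the check confirms that each module can be found, not that its version satisfies the constraints in meta.yaml.

```python
from importlib import util

# Module names taken from the pip commands in this section.
REQUIRED = ("requests", "tabulate", "future")
OPTIONAL = ("matplotlib",)  # only needed for plotting


def importable(module_name):
    """Return True if the top-level module can be located without importing it."""
    return util.find_spec(module_name) is not None


def dependency_report():
    """Map each dependency name to whether it is importable in this environment."""
    return {name: importable(name) for name in REQUIRED + OPTIONAL}


for name, present in dependency_report().items():
    kind = "optional" if name in OPTIONAL else "required"
    status = "found" if present else "missing"
    print(f"{name} ({kind}): {status}")
```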
Run the following command to remove any existing H2O module for Python.
pip uninstall h2o
Use pip to install this version of the H2O Python module.
pip install -f http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Py.html h2o
Note: When installing H2O from pip in OS X El Capitan, users must include the --user flag. For example:
pip install -f http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Py.html h2o --user
Optionally initialize H2O in Python and run a demo to see H2O at work.
import h2o
h2o.init()
h2o.demo("glm")
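If you want to confirm which version of the H2O module pip installed without starting a cluster, the installed package metadata can be queried directly. The sketch below assumes Python 3.8 or later (for importlib.metadata) and that the distribution is registered under the name h2o, as in the pip commands above; the helper name installed_h2o_version is hypothetical.

```python
from importlib import metadata


def installed_h2o_version():
    """Return the installed version string of the h2o package, or None if absent."""
    try:
        return metadata.version("h2o")
    except metadata.PackageNotFoundError:
        return None


version = installed_h2o_version()
if version is None:
    print("The h2o package is not installed in this environment.")
else:
    print("Installed h2o version:", version)
```

This reads metadata only, so it does not launch Java or open any ports.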
Install on Anaconda Cloud¶
This section describes how to set up and run H2O in an Anaconda Cloud environment. Conda 2.7, 3.5, and 3.6 repos are supported as are a number of H2O versions. Refer to https://anaconda.org/h2oai/h2o/files to view a list of available H2O versions.
Open a terminal window and run the following command to install H2O on the Anaconda Cloud. The H2O version in this command should match the version that you want to download. If you leave the H2O version blank and specify just h2o, then the latest version will be installed. For example:
user$ conda install -c h2oai h2o=3.42.0.2
or:
user$ conda install -c h2oai h2o
Note: For Python 3.6 users, H2O has tabulate>=0.75 as a dependency; however, there is no tabulate available in the default channels for Python 3.6. This is available in the conda-forge channel. As a result, Python 3.6 users must add the conda-forge channel in order to load the latest version of H2O. This can be done by performing the following steps:
conda create -n py36 python=3.6 anaconda
source activate py36
conda config --append channels conda-forge
conda install -c h2oai h2o
After H2O is installed, refer to the Starting H2O from Anaconda section for information on how to start H2O and to view a GBM example run in Jupyter Notebook.
Install on Hadoop¶
Go to http://h2o-release.s3.amazonaws.com/h2o/latest_stable.html. Click on the Install on Hadoop tab, and download H2O for your version of Hadoop. This is a zip file that contains everything you need to get started.
Unpack the zip file and launch a 6g instance of H2O. For example:
unzip h2o-3.42.0.2-*.zip
cd h2o-3.42.0.2-*
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 6g
Point your browser to H2O. (See “Open H2O Flow in your web browser” in the output below.)
Determining driver host interface for mapper->driver callback...
    [Possible callback IP address: 172.16.2.181]
    [Possible callback IP address: 127.0.0.1]
...
Waiting for H2O cluster to come up...
H2O node 172.16.2.188:54321 requested flatfile
Sending flatfiles to nodes...
    [Sending flatfile to node 172.16.2.188:54321]
H2O node 172.16.2.188:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
Open H2O Flow in your web browser: http://172.16.2.188:54321
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...
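When the driver is launched from a script, the Flow address can be pulled out of the captured output instead of being read by eye. The following is a rough sketch under the assumption that the driver prints the "Open H2O Flow in your web browser:" line exactly as in the sample output above; the helper name extract_flow_url is hypothetical, and the sample text is abbreviated from that output.

```python
import re

# Matches the line the Hadoop driver prints once the cluster is up.
FLOW_LINE = re.compile(r"Open H2O Flow in your web browser:\s*(https?://\S+)")


def extract_flow_url(driver_output):
    """Return the Flow URL found in the driver output, or None if it is absent."""
    match = FLOW_LINE.search(driver_output)
    return match.group(1) if match else None


sample_output = """\
Waiting for H2O cluster to come up...
H2O node 172.16.2.188:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
Open H2O Flow in your web browser: http://172.16.2.188:54321
(Press Ctrl-C to kill the cluster)
"""

print(extract_flow_url(sample_output))
```

The extracted address is the one to open in a browser. It could presumably also be passed to the Python client, for example through the url argument of h2o.connect, though that step is not covered in this section.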