At the command line, copy and paste these commands one line at a time:
# The following command removes any previously installed H2O module for Python.
pip uninstall h2o
# Next, use pip to install this version of the H2O Python module.
pip install /Python/h2o-3.10.5.1-py2.py3-none-any.whl
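To confirm that the pip commands above took effect, a short check like the following can help. It is only a sketch: the helper name `installed_h2o_version` is made up for illustration, and it assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library.

```python
from importlib import metadata


def installed_h2o_version():
    """Return the installed version of the h2o distribution, or None.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import or start H2O.
    """
    try:
        return metadata.version("h2o")
    except metadata.PackageNotFoundError:
        return None


version = installed_h2o_version()
if version is None:
    print("h2o is not installed")
else:
    print("h2o version", version, "is installed")
```

If a version is reported, running `import h2o` followed by `h2o.init()` in Python starts a local cluster, mirroring the R steps.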
Use H2O directly from R
Copy and paste these commands into R one line at a time:
# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
# Next, we download and install any missing packages that H2O depends on.
pkgs <- c("statmod","RCurl","jsonlite")
for (pkg in pkgs) {
  if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
}
# Now we download and install the H2O package for R.
install.packages("h2o", type="source", repos="/R")
# Finally, let's load H2O and start up an H2O cluster.
library(h2o)
h2o.init()
Run H2O on Hadoop in just 3 steps
1. Download H2O for your version of Hadoop. This is a zip file that contains everything you need to get started.
2. Unpack the zip file and launch a single-node H2O instance with a 6 GB mapper heap (-mapperXmx 6g):
unzip h2o-3.10.5.1-*.zip
cd h2o-3.10.5.1-*
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 6g -output hdfsOutputDirName
3. Point your browser to H2O (see "Open H2O Flow in your web browser" in the output below):
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 172.16.2.181]
[Possible callback IP address: 127.0.0.1]
...
Waiting for H2O cluster to come up...
H2O node 172.16.2.188:54321 requested flatfile
Sending flatfiles to nodes...
[Sending flatfile to node 172.16.2.188:54321]
H2O node 172.16.2.188:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
Open H2O Flow in your web browser: http://172.16.2.188:54321
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...
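When the launch is scripted rather than typed by hand, the steps above can be sketched in Python as follows. This is an illustrative sketch, not part of H2O: the function names `build_driver_command` and `find_flow_url` are hypothetical, the flags are the ones shown in step 2, and the parsing assumes the driver prints the "Open H2O Flow in your web browser" line exactly as in the sample output.

```python
import re


def build_driver_command(nodes=1, mapper_xmx="6g", output_dir="hdfsOutputDirName"):
    """Return the hadoop command from step 2 as an argument list.

    Hypothetical helper; the defaults reproduce the command shown above.
    """
    return [
        "hadoop", "jar", "h2odriver.jar",
        "-nodes", str(nodes),
        "-mapperXmx", mapper_xmx,
        "-output", output_dir,
    ]


# Matches the line in which the driver announces the Flow address.
FLOW_URL_PATTERN = re.compile(r"Open H2O Flow in your web browser:\s*(\S+)")


def find_flow_url(driver_output):
    """Return the Flow URL found in the driver output, or None if absent."""
    match = FLOW_URL_PATTERN.search(driver_output)
    return match.group(1) if match else None


# A few lines copied from the sample driver output above.
sample_output = """\
H2O node 172.16.2.188:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
Open H2O Flow in your web browser: http://172.16.2.188:54321
(Press Ctrl-C to kill the cluster)
"""

print(" ".join(build_driver_command()))
print(find_flow_url(sample_output))
```

The argument list can be passed to `subprocess.Popen` on a machine where `hadoop` is on the path. Because the driver blocks until the cluster shuts down, its output would need to be read line by line rather than waited on.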