2. Install dependencies (prepending with `sudo` if needed):
pip install h2o
# This line is needed only if there are import errors when running h2o.
pip install colorama requests tabulate future --upgrade
# If you have Anaconda use:
pip install tabulate
At the command line, copy and paste these commands one line at a time:
# The following command removes the H2O module for Python.
pip uninstall h2o
# Next, use pip to install this version of the H2O Python module.
pip install /Python/h2o-3.10.0.10-py2.py3-none-any.whl
Use H2O directly from R
Copy and paste these commands into R one line at a time:
# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
# Next, we download packages that H2O depends on.
pkgs <- c("methods","statmod","stats","graphics","RCurl","jsonlite","tools","utils")
for (pkg in pkgs) {
if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg, repos = "http://cran.rstudio.com/") }
}
# Now we download, install and initialize the H2O package for R.
install.packages("h2o", type="source", repos=(c("/R")))
library(h2o)
localH2O = h2o.init(nthreads=-1)
# Finally, let's run a demo to see H2O at work.
demo(h2o.kmeans)
Run H2O on Hadoop in just 3 steps
1. Download H2O for your version of Hadoop. This is a zip file that contains everything you need to get started.
2. Unpack the zip file and launch a 6g instance of H2O:
unzip h2o-3.10.0.10-*.zip
cd h2o-3.10.0.10-*
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 6g -output hdfsOutputDirName
3. Point your browser to H2O (see "Open H2O Flow in your web browser" in the output below):
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 172.16.2.181]
[Possible callback IP address: 127.0.0.1]
...
Waiting for H2O cluster to come up...
H2O node 172.16.2.188:54321 requested flatfile
Sending flatfiles to nodes...
[Sending flatfile to node 172.16.2.188:54321]
H2O node 172.16.2.188:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
(Note: Use the -disown option to exit the driver after cluster formation)
Open H2O Flow in your web browser: http://172.16.2.188:54321
(Press Ctrl-C to kill the cluster)
Blocking until the H2O cluster shuts down...