Get started with H2O in 3 easy steps
1. Download H2O. This is a zip file that contains everything you need to get started.
2. From your terminal, run:
cd ~/Downloads
unzip h2o-3.0.1.1.zip
cd h2o-3.0.1.1
java -jar h2o.jar
3. Point your browser to http://localhost:54321
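Before opening the browser, it can help to confirm that something is answering on the port from step 3. The following is a minimal sketch, not part of H2O: the helper name is_h2o_up and the 2-second timeout are assumptions chosen for illustration, and the check only shows that an HTTP server responds at the URL, not that the server is H2O.

```python
from contextlib import closing

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7


def is_h2o_up(url="http://localhost:54321", timeout=2.0):
    """Return True if an HTTP server answers at `url` with a 2xx/3xx status.

    Hypothetical helper: it cannot tell H2O apart from any other web server.
    """
    try:
        with closing(urlopen(url, timeout=timeout)) as response:
            return 200 <= response.getcode() < 400
    except IOError:
        # Covers refused connections, timeouts and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("H2O reachable: %s" % is_h2o_up())
```

If this reports False right after running java -jar h2o.jar, the server may still be starting; waiting a few seconds and retrying is usually enough.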
Use H2O directly from Python
1. Prerequisite: Python 2.7
2. Install dependencies:
pip install requests
pip install tabulate
3. Install the H2O Python module. At the command line, copy and paste these commands one line at a time:
# The following command removes any previously installed H2O module for Python.
pip uninstall h2o
# Next, use pip to install this version of the H2O Python module.
pip install /Python/h2o-3.0.1.1-py2.py3-none-any.whl
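Once the pip commands finish, a short script can confirm that the module is importable and then connect to H2O, mirroring the library(h2o) and h2o.init() steps used for R. This is a sketch under assumptions: find_h2o is a hypothetical helper introduced here, and the exact behavior of h2o.init() (for example, whether it starts a local instance when none is running) may vary between H2O versions.

```python
import importlib


def find_h2o():
    """Return the imported h2o module, or None if it is not installed."""
    try:
        return importlib.import_module("h2o")
    except ImportError:
        return None


if __name__ == "__main__":
    h2o = find_h2o()
    if h2o is None:
        print("The h2o module is not installed; rerun the pip install step.")
    else:
        # Assumption: init() connects to a local H2O instance on the default port.
        h2o.init()
```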
Use H2O directly from R
Copy and paste these commands into R one line at a time:
# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
# Next, we download packages that H2O depends on.
if (! ("methods" %in% rownames(installed.packages()))) { install.packages("methods") }
if (! ("statmod" %in% rownames(installed.packages()))) { install.packages("statmod") }
if (! ("stats" %in% rownames(installed.packages()))) { install.packages("stats") }
if (! ("graphics" %in% rownames(installed.packages()))) { install.packages("graphics") }
if (! ("RCurl" %in% rownames(installed.packages()))) { install.packages("RCurl") }
if (! ("rjson" %in% rownames(installed.packages()))) { install.packages("rjson") }
if (! ("tools" %in% rownames(installed.packages()))) { install.packages("tools") }
if (! ("utils" %in% rownames(installed.packages()))) { install.packages("utils") }
# Now we download, install and initialize the H2O package for R.
install.packages("h2o", type="source", repos=(c("/R")))
library(h2o)
localH2O = h2o.init()
# Finally, let's run a demo to see H2O at work.
demo(h2o.kmeans)
Run H2O on Hadoop in just 3 steps
1. Download the H2O zip file for your version of Hadoop; it contains everything you need to get started. Run only the command that matches your distribution:
wget /h2o-3.0.1.1-cdh5.2.zip
wget /h2o-3.0.1.1-cdh5.3.zip
wget /h2o-3.0.1.1-cdh5.4.2.zip
wget /h2o-3.0.1.1-hdp2.1.zip
wget /h2o-3.0.1.1-hdp2.2.zip
wget /h2o-3.0.1.1-mapr3.1.1.zip
wget /h2o-3.0.1.1-mapr4.0.1.zip
2. Unpack the zip file and launch a single-node H2O instance with 6 GB of mapper memory:
unzip h2o-3.0.1.1-*.zip
cd h2o-3.0.1.1-*
hadoop jar h2odriver.jar -nodes 1 -mapperXmx 6g -output hdfsOutputDirName
3. Point your browser to one of the H2O nodes. To find a node's address, look for a line of the form "H2O node ... reports H2O cluster size" in the driver output, as in this example:
Determining driver host interface for mapper->driver callback...
[Possible callback IP address: 172.16.2.181]
[Possible callback IP address: 127.0.0.1]
...
Waiting for H2O cluster to come up...
H2O node 172.16.2.184:54321 requested flatfile
Sending flatfiles to nodes...
[Sending flatfile to node 172.16.2.184:54321]
H2O node 172.16.2.184:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
Blocking until the H2O cluster shuts down...
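Finding the node address in the driver output can be automated. The sketch below scans the output for the "reports H2O cluster size" lines shown above and turns each node address into a browser URL. The function name node_urls and the regular expression are assumptions written against the sample output on this page; other H2O versions may format these lines differently.

```python
import re

# Matches lines such as:
#   H2O node 172.16.2.184:54321 reports H2O cluster size 1
NODE_READY = re.compile(
    r"H2O node (\d{1,3}(?:\.\d{1,3}){3}):(\d+) reports H2O cluster size (\d+)"
)


def node_urls(driver_output):
    """Return the distinct http:// URLs of nodes that reported a cluster size."""
    urls = []
    for line in driver_output.splitlines():
        match = NODE_READY.search(line)
        if match:
            host, port, _size = match.groups()
            url = "http://%s:%s" % (host, port)
            if url not in urls:  # a node may report more than once
                urls.append(url)
    return urls


SAMPLE = """\
Waiting for H2O cluster to come up...
H2O node 172.16.2.184:54321 requested flatfile
Sending flatfiles to nodes...
[Sending flatfile to node 172.16.2.184:54321]
H2O node 172.16.2.184:54321 reports H2O cluster size 1
H2O cluster (1 nodes) is up
"""

print(node_urls(SAMPLE))  # prints ['http://172.16.2.184:54321']
```

The "requested flatfile" lines are ignored on purpose: a node that has only requested the flatfile has not yet joined the cluster.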
Gradle-style specification for Maven artifacts
See the h2o-droplets GitHub repository for a working example.
def h2oProjectVersion = "3.0.1.1"
repositories {
maven {
url "/maven/repo/"
}
}
dependencies {
compile "ai.h2o:h2o-core:${h2oProjectVersion}"
compile "ai.h2o:h2o-algos:${h2oProjectVersion}"
compile "ai.h2o:h2o-web:${h2oProjectVersion}"
compile "ai.h2o:h2o-app:${h2oProjectVersion}"
}