Launch H2O from Command Line

Launch scripts are available in our GitHub repository; see From Source Code (GitHub).

Step 1

Prerequisite: install the boto Python library. The scripts require boto to be installed; references are available on the boto and Amazon websites.
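One way to confirm the prerequisite before running any launch script is to try importing boto. The following is a minimal sketch; the helper name installed_boto_version is hypothetical and is not part of the H2O scripts.

```python
import importlib


def installed_boto_version():
    """Return boto's version string, or None if boto cannot be imported."""
    try:
        boto = importlib.import_module("boto")
    except ImportError:
        return None
    # boto 2.x exposes its version as boto.__version__.
    return getattr(boto, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_boto_version()
    if version is None:
        print("boto is not installed; try: pip install boto")
    else:
        print("Using boto version " + version)
```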

Step 2

First, edit the launch script to change any parameters; refer to the EC2 Glossary for help.

# Environment variables you MUST set (either here or by passing them in).
# -----------------------------------------------------------------------
os.environ['AWS_ACCESS_KEY_ID'] = '...'
os.environ['AWS_SECRET_ACCESS_KEY'] = '...'
os.environ['AWS_SSH_PRIVATE_KEY_FILE'] = '/path/to/private_key.pem'

# Launch EC2 instances with an IAM role
# --------------------------------------
# Set either iam_profile_resource_name or iam_profile_name, but not both.
iam_profile_resource_name = None
iam_profile_name = 'testing_role'
# SSH key pair name.
keyName = 'testing_key'
securityGroupName = 'SecurityDisabled'
numInstancesToLaunch = 2
instanceType = 't1.micro'
instanceNameRoot = 'amy_is_testing'
regionName = 'us-east-1'
amiId = 'ami-ed550784'
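The settings above carry two constraints: the three environment variables must be set, and at most one of the two IAM profile settings may be given. A small pre-flight check could look like the sketch below. The function names missing_env_vars and check_iam_settings are hypothetical and do not come from the launch script.

```python
import os

REQUIRED_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SSH_PRIVATE_KEY_FILE",
)


def missing_env_vars(environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not environ.get(name)]


def check_iam_settings(iam_profile_resource_name, iam_profile_name):
    """Raise ValueError if both IAM profile settings are given at once."""
    if iam_profile_resource_name is not None and iam_profile_name is not None:
        raise ValueError(
            "Set iam_profile_resource_name or iam_profile_name, not both."
        )


if __name__ == "__main__":
    check_iam_settings(None, "testing_role")
    missing = missing_env_vars(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```

Leaving both IAM settings as None is accepted here, since a cluster can also be launched without an IAM profile.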

Step 3

Launch the EC2 instances using the H2O AMI by running the launch script:

$ python
Using boto version 2.27.0
Launching 2 instances.
Waiting for instance 1 of 2 ...
  instance 1 of 2 is up.
Waiting for instance 2 of 2 ...
  instance 2 of 2 is up.

Creating output files:  nodes-public nodes-private

Instance 1 of 2
  Name:    amy_is_testing0

Instance 2 of 2
  Name:    amy_is_testing1

Sleeping for 60 seconds for ssh to be available...
Testing ssh access ...

Distributing flatfile...
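The transcript shows the launch script sleeping for a fixed 60 seconds before testing ssh access. As an alternative illustration (not what the script itself does), the sketch below polls the ssh port until a TCP connection succeeds. The function name wait_for_ssh and its parameters are hypothetical.

```python
import socket
import time


def wait_for_ssh(host, port=22, attempts=10, delay=6.0,
                 connect=socket.create_connection, sleep=time.sleep):
    """Return True once a TCP connection to host:port succeeds, else False."""
    for attempt in range(attempts):
        try:
            conn = connect((host, port), 5)  # 5-second connect timeout
        except OSError:
            # Port not reachable yet; pause before the next attempt.
            if attempt < attempts - 1:
                sleep(delay)
            continue
        conn.close()
        return True
    return False
```

The connect and sleep arguments exist only so the function can be exercised without a network; in normal use they are left at their defaults. An open port 22 does not guarantee that key-based login works, so a real ssh test is still needed afterwards.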

Step 4

Download the latest build of H2O onto each of the instances using ./ or ./ (download or distribute). Download is typically faster than distribute, since the file is downloaded from S3.

$ ./
Fetching latest build number for branch master...
Fetching full version number for build 1480...
Downloading H2O version to cluster...
Downloading h2o.jar to node 1:
Downloading h2o.jar to node 2:
Warning: Permanently added ','
(RSA) to the list of known hosts.
Warning: Permanently added ','
(RSA) to the list of known hosts.
Unzipping h2o.jar within node 1:
Unzipping h2o.jar within node 2:
Copying h2o.jar within node 1:
Copying h2o.jar within node 2:

Step 5

Distribute a flatfile.txt listing the private IP addresses of all nodes.

$ ./
Copying flatfile to node 1:
flatfile.txt                             100%   40     0.0KB/s   00:00
Copying flatfile to node 2:
flatfile.txt                             100%   40     0.0KB/s   00:00
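To illustrate what such a flatfile contains, the sketch below builds one from a list of private IP addresses. It assumes that the nodes-private file holds one address per line and that each flatfile entry has the form ip:port with H2O's default port 54321. The function names and the sample addresses are hypothetical.

```python
def read_node_ips(text):
    """Return the non-empty, stripped lines of a nodes-private style file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_flatfile(private_ips, port=54321):
    """Return flatfile text with one 'ip:port' entry per line."""
    return "".join("{0}:{1}\n".format(ip, port) for ip in private_ips)


# Hypothetical private addresses for a two-node cluster.
sample_nodes_private = "10.0.0.1\n10.0.0.2\n"
flatfile_text = build_flatfile(read_node_ips(sample_nodes_private))
print(flatfile_text, end="")
```

Every node receives the same file, so each H2O instance knows the addresses of all other cluster members.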

Step 6

[Optional] To import data from a private S3 bucket, each launched node must be given permission. If the cluster was launched with an IAM profile, H2O will detect the temporary credentials on the cluster. If it was launched without an IAM profile and policy, AWS credentials must be distributed to each node as a file using ./

$ ./
Copying aws credential files to node 1:
core-site.xml                              100%  500     0.5KB/s   00:00                 100%   82     0.1KB/s   00:00
Copying aws credential files to node 2:
core-site.xml                              100%  500     0.0KB/s   00:17                 100%   82     0.1KB/s   00:00
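For orientation, the sketch below generates a Hadoop-style core-site.xml that carries S3 credentials. It assumes the standard Hadoop property names fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey; the exact contents of the file shipped by the H2O script may differ, and the function name build_core_site is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_core_site(access_key, secret_key):
    """Return Hadoop-style core-site.xml text holding S3 credentials."""
    root = ET.Element("configuration")
    settings = (
        # Assumed property names (standard Hadoop s3n settings).
        ("fs.s3n.awsAccessKeyId", access_key),
        ("fs.s3n.awsSecretAccessKey", secret_key),
    )
    for name, value in settings:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

Because this file contains long-lived secrets in plain text, launching with an IAM profile avoids copying credentials to the nodes at all.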

Step 7

Start H2O by executing ./

Starting on node 1:
JAVA_HOME is ./jdk1.7.0_40
java version "1.7.0_40"
Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
01:55:18.438 main      INFO WATER: ----- H2O started -----
01:55:18.632 main      INFO WATER: Build git branch: master
01:55:18.633 main      INFO WATER: Build git hash: 1fbeb98671c73d4e2a61fc3defecb6bd1646c4d5
01:55:18.633 main      INFO WATER: Build git describe: nn-2-9356-g1fbeb98
01:55:18.634 main      INFO WATER: Build project version:
01:55:18.634 main      INFO WATER: Built by: 'jenkins'
01:55:18.635 main      INFO WATER: Built on: 'Thu Aug 21 23:51:30 PDT 2014'
01:55:18.635 main      INFO WATER: Java availableProcessors: 1
01:55:18.649 main      INFO WATER: Java heap totalMemory: 0.01 gb
01:55:18.649 main      INFO WATER: Java heap maxMemory: 0.14 gb
01:55:18.650 main      INFO WATER: Java version: Java 1.7.0_40 (from Oracle Corporation)
01:55:18.651 main      INFO WATER: OS   version: Linux 2.6.32-358.14.1.el6.x86_64 (amd64)
01:55:18.959 main      INFO WATER: Machine physical memory: 0.58 gb
Starting on node 2:
JAVA_HOME is ./jdk1.7.0_40
java version "1.7.0_40"
Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b56, mixed mode)
01:55:21.983 main      INFO WATER: ----- H2O started -----
01:55:22.067 main      INFO WATER: Build git branch: master
01:55:22.068 main      INFO WATER: Build git hash: 1fbeb98671c73d4e2a61fc3defecb6bd1646c4d5
01:55:22.068 main      INFO WATER: Build git describe: nn-2-9356-g1fbeb98
01:55:22.069 main      INFO WATER: Build project version:
01:55:22.069 main      INFO WATER: Built by: 'jenkins'
01:55:22.069 main      INFO WATER: Built on: 'Thu Aug 21 23:51:30 PDT 2014'
01:55:22.070 main      INFO WATER: Java availableProcessors: 1
01:55:22.082 main      INFO WATER: Java heap totalMemory: 0.01 gb
01:55:22.082 main      INFO WATER: Java heap maxMemory: 0.14 gb
01:55:22.083 main      INFO WATER: Java version: Java 1.7.0_40 (from Oracle Corporation)
01:55:22.084 main      INFO WATER: OS   version: Linux 2.6.32-358.14.1.el6.x86_64 (amd64)
01:55:22.695 main      INFO WATER: Machine physical memory: 0.58 gb
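When the startup output is captured to a file or a string, a quick check that every node came up is to count the "H2O started" lines. The sketch below does this for output in the format shown above; the function name count_h2o_starts is hypothetical.

```python
def count_h2o_starts(log_text):
    """Count log lines that announce an H2O node start."""
    marker = "INFO WATER: ----- H2O started -----"
    return sum(1 for line in log_text.splitlines() if marker in line)


def all_nodes_started(log_text, expected_nodes):
    """Return True if the log shows a start line for every expected node."""
    return count_h2o_starts(log_text) == expected_nodes
```

For the two-node transcript above, all_nodes_started(output, 2) would be the check to run, with expected_nodes matching numInstancesToLaunch.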