Note
The following instructions should work on any reasonably modern OS X (10.6 and later), but have been tested only on OS X 10.9.
Warning
Homebrew might cause conflicts if you already have MacPorts enabled. Proceed with the next step at your own risk or use MacPorts instead of Homebrew.
$ ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go/install)"
$ brew doctor
$ brew update
$ sudo easy_install sphinx
$ sudo easy_install sphinxcontrib-fulltoc
$ brew install poppler
$ git clone https://github.com/0xdata/h2o.git
$ cd h2o
$ make
$ brew install hadoop
$ sudo chmod -R a+w /usr/local/{include,lib,etc}
Note: In Hadoop 1.x these files are found in, e.g., /usr/local/Cellar/hadoop/1.2.1/libexec/conf/. In Hadoop 2.x these files are found in, e.g., /usr/local/Cellar/hadoop/2.2.0/libexec/etc/hadoop/.
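The two layouts in the note above can be selected from the version string. The following is a minimal sketch, assuming the Homebrew Cellar paths quoted in the note; the function name hadoop_conf_dir and its arguments are hypothetical and not part of Hadoop or Homebrew.

```shell
# Hypothetical helper: print the Hadoop configuration directory for a
# given Cellar prefix and version, following the layouts in the note.
hadoop_conf_dir() {
    cellar="$1"     # e.g. /usr/local/Cellar/hadoop
    version="$2"    # e.g. 1.2.1 or 2.2.0
    case "$version" in
        1.*) echo "$cellar/$version/libexec/conf" ;;
        2.*) echo "$cellar/$version/libexec/etc/hadoop" ;;
        *)   echo "unrecognized Hadoop version: $version" >&2
             return 1 ;;
    esac
}

# Example: prints /usr/local/Cellar/hadoop/2.2.0/libexec/etc/hadoop
hadoop_conf_dir /usr/local/Cellar/hadoop 2.2.0
```

Versions other than 1.x and 2.x are rejected rather than guessed, since the note only documents those two layouts.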
Modify core-site.xml to contain the following:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
Modify mapred-site.xml to contain the following (NOTE: you may need to create the file from mapred-site.xml.template):
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>5</value>
  </property>
</configuration>
Modify hdfs-site.xml to contain the following:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
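If mapred-site.xml is missing, it can be created from the template before editing. The following is a rough sketch, assuming the template sits in the same configuration directory; the function name ensure_mapred_site is hypothetical.

```shell
# Hypothetical helper: copy mapred-site.xml.template to mapred-site.xml
# only when the target does not exist, so existing edits are never
# overwritten.
ensure_mapred_site() {
    conf_dir="$1"   # the Hadoop configuration directory
    if [ ! -f "$conf_dir/mapred-site.xml" ] &&
       [ -f "$conf_dir/mapred-site.xml.template" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

# Usage (path is an example taken from the Hadoop 2.x layout):
#   ensure_mapred_site /usr/local/Cellar/hadoop/2.2.0/libexec/etc/hadoop
```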
First enable remote login in the system sharing control panel, and then:
$ brew install ssh-copy-id
$ ssh-keygen
$ ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
$ /usr/local/Cellar/hadoop/1.2.1/bin/start-all.sh
or
$ /usr/local/Cellar/hadoop/2.2.0/sbin/start-dfs.sh
$ /usr/local/Cellar/hadoop/2.2.0/sbin/start-yarn.sh
$ jps
81829 JobTracker
81556 NameNode
81756 SecondaryNameNode
9382 Jps
81655 DataNode
81928 TaskTracker
$ hadoop namenode -format
$ hadoop dfsadmin -safemode leave
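The jps listing can also be checked mechanically. The following is a minimal sketch, assuming the five Hadoop 1.x daemons shown in the listing above are the ones expected (a Hadoop 2.x/YARN setup runs different daemons); the function name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read `jps` output on stdin and print the name of
# each expected Hadoop 1.x daemon that is NOT listed, one per line.
missing_daemons() {
    jps_output="$(cat)"
    for daemon in NameNode SecondaryNameNode DataNode JobTracker TaskTracker; do
        # jps prints "<pid> <name>"; compare the whole second field so
        # that NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
             awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Usage: empty output means every expected daemon is running.
#   jps | missing_daemons
```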
$ hadoop jar target/hadoop/h2odriver_cdh4.jar water.hadoop.h2odriver \
    -libjars target/h2o.jar -mapperXmx 1g -nodes 5 -output out
$ hadoop fs -rmr out