Welcome to H2O-3¶
H2O-3 is an open source, in-memory, distributed, fast, and scalable machine learning and predictive analytics platform. It lets you build machine learning models on big data and provides easy productionalization of those models in an enterprise environment.
Basic framework¶
H2O-3’s core code is written in Java. A distributed key-value store is used to access and reference data, models, objects, etc. across all nodes and machines. The algorithms are implemented on top of H2O-3’s distributed map-reduce framework and use the Java fork/join framework for multi-threading. The data is read in parallel, distributed across the cluster, and stored in memory in a compressed columnar format. H2O-3’s data parser has built-in intelligence to guess the schema of the incoming dataset and supports data ingest from multiple sources in various formats.
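As a rough illustration of the ideas above (columnar storage, data split across workers, and a map step followed by a reduce step), here is a minimal pure-Python sketch. It is not H2O-3's implementation: the function names, the chunk size, and the use of a thread pool in place of Java fork/join are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(column, chunk_size):
    """Split one column into fixed-size chunks (stand-ins for per-node data)."""
    return [column[i:i + chunk_size] for i in range(0, len(column), chunk_size)]


def map_chunk(chunk):
    """Map step: compute a partial (sum, count) for a single chunk."""
    return (sum(chunk), len(chunk))


def reduce_partials(a, b):
    """Reduce step: merge two partial (sum, count) results."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(column, chunk_size=3, workers=2):
    """Compute a column mean by mapping over chunks in parallel, then reducing."""
    chunks = split_into_chunks(column, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total, count = reduce(reduce_partials, partials, (0, 0))
    if count == 0:
        raise ValueError("cannot compute the mean of an empty column")
    return total / count


# Columnar layout: each column is stored contiguously and looked up by key.
frame = {
    "age": [31, 45, 27, 52, 38, 41, 29],
    "income": [40, 85, 32, 97, 60, 72, 35],
}
mean_age = distributed_mean(frame["age"])
```

Because each chunk only produces a small partial result, the reduce step moves very little data between workers, which is the property that makes this pattern attractive for large, distributed datasets.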
REST API¶
H2O-3’s REST API allows access to all the capabilities of H2O-3 from an external program or script through JSON over HTTP. The REST API is used by H2O-3’s web interface (Flow UI), R binding (H2O-R), and Python binding (H2O-Python).
The speed, quality, ease of use, and model deployment of our various supervised and unsupervised algorithms (such as Deep Learning, GLRM, or our tree ensembles) make H2O-3 a highly sought-after API for data science on big data.
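To make "JSON over HTTP" concrete, the following sketch queries a cluster-status endpoint using only the Python standard library. The endpoint path (`/3/Cloud`), the default port (54321), and the response field names are assumptions based on common H2O-3 defaults and should be checked against the REST API reference for your version. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen


def cloud_status_url(host="localhost", port=54321):
    """Build the URL of the (assumed) cluster-status endpoint."""
    return f"http://{host}:{port}/3/Cloud"


def summarize_cloud(payload):
    """Pick a few fields out of a decoded JSON payload (field names assumed)."""
    return {
        "name": payload.get("cloud_name"),
        "size": payload.get("cloud_size"),
        "healthy": payload.get("cloud_healthy"),
        "version": payload.get("version"),
    }


def fetch_cloud_status(host="localhost", port=54321, timeout=5):
    """GET the status endpoint from a running cluster and summarize the reply."""
    with urlopen(cloud_status_url(host, port), timeout=timeout) as response:
        return summarize_cloud(json.load(response))


# Offline usage with a made-up payload; no running cluster is needed here.
sample = json.loads('{"cloud_name": "demo", "cloud_size": 1, "cloud_healthy": true}')
summary = summarize_cloud(sample)
```

With a cluster running locally, `fetch_cloud_status()` would perform the actual HTTP request. Any language that can send HTTP requests and parse JSON can use the API in the same way.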
H2O is licensed under the Apache License, Version 2.0.
Requirements¶
At a minimum, we recommend the following for compatibility with H2O-3:
Operating systems:
Windows 7 or later
OS X 10.9 or later
Ubuntu 12.04
RHEL/CentOS 6 or later
Languages: R and Python are not required to use H2O-3 (unless you want to use H2O-3 in those environments), but Java is always required (see Java requirements).
R version 3 or later
Python 3.6.x, 3.7.x, 3.8.x, 3.9.x, 3.10.x, 3.11.x
Browser: An internet browser is required to use H2O-3’s web UI, Flow. Supported browsers include the latest versions of:
Google Chrome
Firefox
Safari
Microsoft Edge
Internet Explorer
numpy: H2O-3 only supports numpy<2. To work around having numpy 2 installed, run the following command:
pip install --force-reinstall 'numpy<2'
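The Python and numpy constraints listed above can be checked before installing the H2O-3 Python package. The sketch below is one possible way to do that with the standard library. The helper names are hypothetical, and the version bounds are copied from the requirements above.

```python
import sys

# Python 3.6.x through 3.11.x, as listed in the requirements.
SUPPORTED_PYTHON_MINORS = range(6, 12)


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter version is in the supported range."""
    major, minor = version_info[0], version_info[1]
    return major == 3 and minor in SUPPORTED_PYTHON_MINORS


def numpy_supported(version_string):
    """Return True if a numpy version string satisfies numpy<2."""
    major = int(version_string.split(".")[0])
    return major < 2


def installed_numpy_version():
    """Return the installed numpy version string, or None if unavailable."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # importlib.metadata requires Python 3.8+
        return None
    try:
        return version("numpy")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    found = installed_numpy_version()
    if found is not None:
        print("numpy", found, "supported:", numpy_supported(found))
```

If `numpy_supported` returns False, the `pip install --force-reinstall 'numpy<2'` command from the requirements is the documented workaround.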
Java requirements¶
H2O-3 runs on Java. The 64-bit JDK is required to build H2O-3 or run H2O-3 tests. Only the 64-bit JRE is required to run the H2O-3 binary using either the command line, R, or Python packages.
Java support¶
H2O-3 supports the following versions of Java:
Java SE 17
Java SE 16
Java SE 15
Java SE 14
Java SE 13
Java SE 12
Java SE 11
Java SE 10
Java SE 9
Java SE 8
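A script that launches H2O-3 can compare the local Java version against the list above before starting. The sketch below parses a version string such as the one reported by `java -version`. It assumes the legacy `1.x` numbering for Java 8 and earlier and the plain major-first numbering for Java 9 and later. The function names are hypothetical.

```python
import re

# Java SE 8 through 17, as listed above.
SUPPORTED_JAVA = set(range(8, 18))


def java_major_version(version_string):
    """Extract the major version, e.g. '1.8.0_292' -> 8 and '17.0.2' -> 17."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized Java version: {version_string!r}")
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.<major>.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_supported_java(version_string):
    """Return True if the version falls in the supported list."""
    return java_major_version(version_string) in SUPPORTED_JAVA
```

For example, `is_supported_java("1.8.0_292")` and `is_supported_java("17.0.2")` both return True, while `is_supported_java("19")` returns False.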
Unsupported Java versions¶
We recommend that only power users force an unsupported Java version. Unsupported Java versions should be used only for experiments. For production use, we only guarantee the Java versions in the supported list.
How to force an unsupported Java version¶
The following command forces an unsupported Java version (Java 19 in this example):
java -jar -Dsys.ai.h2o.debug.allowJavaVersions=19 h2o.jar
Java support with H2O-3 and Hadoop¶
Java support differs between H2O-3 and Hadoop. Hadoop only supports Java 8 and Java 11. Therefore, when running H2O-3 on Hadoop, we recommend using only Java 8 or Java 11.
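The Java rules above can be combined in a small launcher helper. The sketch below builds the `java` command line, adds the `sys.ai.h2o.debug.allowJavaVersions` override only when the version is outside the supported list and the caller opts in, and separately reports whether a version is recommended for Hadoop. The function names and the opt-in parameter are assumptions made for illustration.

```python
SUPPORTED_JAVA = set(range(8, 18))  # Java SE 8 through 17
HADOOP_JAVA = {8, 11}               # versions recommended when running on Hadoop


def h2o_launch_command(java_major, jar_path="h2o.jar", allow_unsupported=False):
    """Build the argument list for starting H2O-3 from its jar file."""
    command = ["java", "-jar"]
    if java_major not in SUPPORTED_JAVA:
        if not allow_unsupported:
            raise ValueError(f"Java {java_major} is not a supported version")
        # Experimental use only: force the unsupported version.
        command.append(f"-Dsys.ai.h2o.debug.allowJavaVersions={java_major}")
    command.append(jar_path)
    return command


def recommended_for_hadoop(java_major):
    """Return True if the Java version is recommended for H2O-3 on Hadoop."""
    return java_major in HADOOP_JAVA
```

The returned list can be passed to `subprocess.run`. Requiring an explicit `allow_unsupported=True` keeps the override from being applied by accident in production scripts.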
Optional requirements¶
This section outlines requirements for optional ways you can run H2O-3.
Optional Hadoop requirements¶
Hadoop is only required if you want to deploy H2O-3 on a Hadoop cluster. Supported versions are listed on the Downloads page (when you select the Install on Hadoop tab) and include:
Cloudera CDH 5.4+
Hortonworks HDP 2.2+
MapR 4.0+
IBM Open Platform 4.2
See the Hadoop users section for more details.
Optional Conda requirements¶
Conda is only required if you want to run H2O-3 on the Anaconda cloud:
Conda 3.6+ repository
Optional Spark requirements¶
Spark is only required if you want to run Sparkling Water. Supported Spark versions:
Spark 3.4
Spark 3.3
Spark 3.2
Spark 3.1
Spark 3.0
Spark 2.4
Spark 2.3
User support¶
H2O-3 supports many different types of users.