Generic parameters |
|
|
spark.ext.h2o.backend.cluster.mode |
internal |
This option can be set either to
internal or external . When set
to external , H2O Context is
created by connecting to existing H2O
cluster, otherwise H2O cluster located
inside Spark is created. That means
that each Spark executor will have one
H2O instance running in it. The
internal mode is not recommended
for big clusters and clusters where
Spark executors are not stable. |
spark.ext.h2o.cloud.name |
Generated
unique name |
Name of H2O cluster. |
spark.ext.h2o.nthreads |
-1 |
Limit for number of threads used by
H2O, default -1 means:
Use value of spark.executor.cores
in case this property is set.
Otherwise use H2O’s default value
Runtime.getRuntime().availableProcessors() . |
spark.ext.h2o.repl.enabled |
true |
Decides whether H2O REPL is initiated
or not. |
spark.ext.scala.int.default.num |
1 |
Number of parallel REPL sessions
started at the start of Sparkling
Water |
spark.ext.h2o.topology.change.listener.enabled |
true |
Decides whether listener which kills
H2O cluster on the change of the
underlying cluster’s topology is
enabled or not. This configuration
has effect only in non-local mode. |
spark.ext.h2o.spark.version.check.enabled |
true |
Enables check if run-time Spark
version matches build time Spark
version. |
spark.ext.h2o.fail.on.unsupported.spark.param |
true |
If unsupported Spark parameter is
detected, then application is forced
to shutdown. |
spark.ext.h2o.jks |
None |
Path to Java KeyStore file. |
spark.ext.h2o.jks.pass |
None |
Password for Java KeyStore file. |
spark.ext.h2o.jks.alias |
None |
Alias to certificate in keystore to
secure H2O Flow. |
spark.ext.h2o.hash.login |
false |
Enable hash login. |
spark.ext.h2o.ldap.login |
false |
Enable LDAP login. |
spark.ext.h2o.kerberos.login |
false |
Enable Kerberos login. |
spark.ext.h2o.login.conf |
None |
Login configuration file. |
spark.ext.h2o.user.name |
None |
Username used for the backend H2O
cluster and to authenticate the
client against the backend. |
spark.ext.h2o.password |
None |
Password used to authenticate the
client against the backend. |
spark.ext.h2o.internal_security_conf |
None |
Path to a file containing H2O or
Sparkling Water internal security
configuration. |
spark.ext.h2o.auto.flow.ssl |
false |
Automatically generate the required
key store and password to secure H2O
flow by SSL. |
spark.ext.h2o.node.log.level |
INFO |
H2O internal log level used for H2O
nodes except the client. |
spark.ext.h2o.node.log.dir |
{user.dir}/h
2ologs/{SparkA
ppId}
or YARN
container dir |
Location of H2O logs on H2O nodes
except on the client. |
spark.ext.h2o.backend.heartbeat.interval |
10000ms |
Interval for getting heartbeat from
the H2O backend. |
spark.ext.h2o.cloud.timeout |
60*1000 |
Timeout (in msec) for cluster
formation. |
spark.ext.h2o.node.network.mask |
None |
Subnet selector for H2O running inside
Spark executors. This disables using
IP reported by Spark but tries to find
IP based on the specified mask. |
spark.ext.h2o.stacktrace.collector.interval |
-1 |
Interval specifying how often stack
traces are taken on each H2O node.
-1 means that no stack traces will be
taken. |
spark.ext.h2o.context.path |
None |
Context path to expose H2O web server. |
spark.ext.h2o.flow.scala.cell.async |
false |
Decide whether the Scala cells in
H2O Flow will run synchronously or
Asynchronously. Default is
synchronously. |
spark.ext.h2o.flow.scala.cell.max.parallel |
-1 |
Number of max parallel Scala cell
jobs The value -1 means
not limited. |
spark.ext.h2o.internal.port.offset |
1 |
Offset between the API(=web) port and
the internal communication port on the
client node;
api_port + port_offset = h2o_port |
spark.ext.h2o.node.port.base |
54321 |
Base port used for individual H2O
nodes. |
spark.ext.h2o.mojo.destroy.timeout |
600000 |
If a scoring MOJO instance is not used
within a Spark executor JVM for
a given timeout in milliseconds, it’s
evicted from executor’s cache. Default
timeout value is 10 minutes. |
spark.ext.h2o.cluster.client.retry.timeout |
60000 |
Timeout in milliseconds specifying
how often we check whether the
the client is still connected. |
spark.ext.h2o.node.extra |
None |
A string containing extra parameters
passed to H2O nodes during startup.
This parameter should be configured
only if H2O parameters do not have any
corresponding parameters in Sparkling
Water. |
spark.ext.h2o.flow.extra.http.headers |
None |
Extra HTTP headers that will be used
in communication between the front-end
and back-end part of Flow UI.
The headers should be delimited by
a new line. Don’t forget to escape
special characters when passing
the parameter from a command line. |
spark.ext.h2o.internal_secure_connections |
false |
Enables secure communications among
H2O nodes. The security is based on
automatically generated keystore
and truststore. This is equivalent for
-internal_secure_conections option
in H2O Hadoop deployments. |
H2O client parameters |
|
|
spark.ext.h2o.client.flow.dir |
None |
Directory where flows from H2O Flow
are saved. |
spark.ext.h2o.client.ip |
None |
IP of H2O client node. |
spark.ext.h2o.client.iced.dir |
None |
Location of iced directory for the
driver instance. |
spark.ext.h2o.client.log.level |
INFO |
H2O internal log level used for H2O
client running inside Spark driver. |
spark.ext.h2o.client.log.dir |
{user.dir}/h
2ologs/{SparkA
ppId} |
Location of H2O logs on the driver
machine. |
spark.ext.h2o.client.port.base |
54321 |
Port on which H2O client publishes
its API. If already occupied, the next
odd port is tried on so on. |
spark.ext.h2o.client.web.port |
-1 |
Exact client port to access web UI.
The value -1 means automatic
search for a free port starting at
spark.ext.h2o.port.base . |
spark.ext.h2o.client.verbose |
false |
The client outputs verbose log output
directly into console. Enabling the
flag increases the client log level to
INFO . |
spark.ext.h2o.client.network.mask |
None |
Subnet selector for H2O client, this
disables using IP reported by Spark
but tries to find IP based on the
specified mask. |
spark.ext.h2o.client.ignore.SPARK_PUBLIC_DNS |
false |
Ignore SPARK_PUBLIC_DNS setting on
the H2O client. The option still
applies to the Spark application. |
spark.ext.h2o.client.enable.web |
true |
Enable or disable web on h2o client
node. It is enabled by default.
Disabling the web just on the client
node just restricts everybody from
accessing flow, the internal ports
between client and rest of the cluster
remain open. |
spark.ext.h2o.client.flow.baseurl.override |
None |
Allows to override the base URL
address of Flow UI, including the
scheme, which is showed to the user. |
spark.ext.h2o.client.extra |
None |
A string containing extra parameters
passed to H2O client during startup.
This parameter should be configured
only if H2O parameters do not have any
corresponding parameters in Sparkling
Water. |