.. _sw_config_properties:

Sparkling Water Configuration Properties
----------------------------------------

The following configuration properties can be passed to Spark to configure Sparkling Water.
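
As a minimal sketch of passing these properties on the command line, the snippet below turns a dictionary of property
names and values into ``--conf key=value`` arguments for ``spark-submit``. The helper ``to_spark_submit_args``, the
chosen values, and the ``app.jar`` placeholder are illustrative assumptions and are not part of Sparkling Water.

.. code:: python

    import shlex


    def to_spark_submit_args(properties):
        """Turn a dict of Spark properties into ``--conf key=value`` arguments."""
        args = []
        for name, value in properties.items():
            # Write booleans in lowercase to match the values used in the table.
            if isinstance(value, bool):
                value = str(value).lower()
            args.extend(["--conf", f"{name}={value}"])
        return args


    # Hypothetical values chosen only for illustration.
    properties = {
        "spark.ext.h2o.backend.cluster.mode": "internal",
        "spark.ext.h2o.nthreads": 4,
        "spark.ext.h2o.progressbar.enabled": False,
    }

    # "app.jar" is a placeholder for your own application.
    command = ["spark-submit"] + to_spark_submit_args(properties) + ["app.jar"]
    print(shlex.join(command))

The same key and value pairs can also be set through the ``H2OConf`` setters listed in the tables.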

Configuration properties independent of selected backend
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property name                                           | Default value | H2OConf setter (* getter_)                                           | Description                                                                                                                                            |
+=========================================================+===============+======================================================================+========================================================================================================================================================+
| ``spark.ext.h2o.backend.cluster.mode``                  | internal      | ``setInternalClusterMode()``                                         | This option can be set either to ``internal`` or ``external``. When set to ``external``, ``H2O Context`` is                                            |
|                                                         |               |                                                                      | created by connecting to an existing H2O cluster; otherwise, an H2O cluster located inside Spark is created. That                                      |
|                                                         |               | ``setExternalClusterMode()``                                         | means that each Spark executor will have one H2O instance running in it. The ``internal`` mode is not                                                  |
|                                                         |               |                                                                      | recommended for big clusters and clusters where Spark executors are not stable.                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.cloud.name``                            | None          | ``setCloudName(String)``                                             | Name of H2O cluster. If this option is not set, the name is automatically generated                                                                    |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.nthreads``                              | -1            | ``setNthreads(Integer)``                                             | Limit for number of threads used by H2O.                                                                                                               |
|                                                         |               |                                                                      | Default ``-1`` using internal backend means: Use the value of ``spark.executor.cores`` if the property is set,                                         |
|                                                         |               |                                                                      | otherwise use H2O's default value Runtime.getRuntime().availableProcessors().                                                                          |
|                                                         |               |                                                                      | Default ``-1`` using automatically started external backend on Hadoop means:                                                                           |
|                                                         |               |                                                                      | Use H2O's default value Runtime.getRuntime().availableProcessors()                                                                                     |
|                                                         |               |                                                                      | Default ``-1`` using automatically started external backend on Kubernetes means: Use just one cpu.                                                     |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.progressbar.enabled``                   | true          | ``setProgressBarEnabled()``                                          | Decides whether to display progress bar related to H2O jobs on stdout or not.                                                                          |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setProgressBarDisabled()``                                         |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.model.print.after.training.enabled``    | true          | ``setModelPrintAfterTrainingEnabled()``                              | Decides whether to display model info on stdout after training or not.                                                                                 |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setModelPrintAfterTrainingDisabled()``                             |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.repl.enabled``                          | true          | ``setReplEnabled()``                                                 | Decides whether H2O REPL is initiated or not.                                                                                                          |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setReplDisabled()``                                                |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.scala.int.default.num``                     | 1             | ``setDefaultNumReplSessions(Integer)``                               | Number of parallel REPL sessions started at the start of Sparkling Water.                                                                              |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.topology.change.listener.enabled``      | true          | ``setClusterTopologyListenerEnabled()``                              | Decides whether listener which kills H2O cluster on the change of the underlying cluster's topology is                                                 |
|                                                         |               |                                                                      | enabled or not. This configuration has effect only in non-local mode.                                                                                  |
|                                                         |               | ``setClusterTopologyListenerDisabled()``                             |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.spark.version.check.enabled``           | true          | ``setSparkVersionCheckEnabled()``                                    | Enables check if run-time Spark version matches build time Spark version.                                                                              |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setSparkVersionCheckDisabled()``                                   |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.fail.on.unsupported.spark.param``       | true          | ``setFailOnUnsupportedSparkParamEnabled()``                          | If unsupported Spark parameter is detected, then application is forced to shutdown.                                                                    |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setFailOnUnsupportedSparkParamDisabled()``                         |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.jks``                                   | None          | ``setJks(String)``                                                   | Path to a Java keystore file with certificates securing H2O Flow UI and internal REST connections between                                              |
|                                                         |               |                                                                      | Spark instances (driver + executors) and H2O nodes. When configuring this property, you must consider that a Spark executor                            |
|                                                         |               |                                                                      | can communicate with any of the H2O nodes and verifies each H2O node against the hostname specified in the keystore                                    |
|                                                         |               |                                                                      | certificate. Consider using a wildcard certificate, or disable hostname verification                                                                   |
|                                                         |               |                                                                      | completely with the ``spark.ext.h2o.verify_ssl_hostnames`` property.                                                                                   |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.jks.pass``                              | None          | ``setJksPass(String)``                                               | Password for the Java keystore file.                                                                                                                   |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.jks.alias``                             | None          | ``setJksAlias(String)``                                              | Alias of the certificate in the Java keystore file used to secure H2O Flow UI and internal REST connections                                            |
|                                                         |               |                                                                      | between Spark instances (driver + executors) and H2O nodes.                                                                                            |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.ssl.ca.cert``                           | None          | ``setSslCACert(String)``                                             | A path to a CA bundle file or a directory with certificates of trusted CAs. This path is used by RSparkling or                                         |
|                                                         |               |                                                                      | PySparkling for connecting to a Sparkling Water backend.                                                                                               |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.hash.login``                            | false         | ``setHashLoginEnabled()``                                            | Enable hash login.                                                                                                                                     |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setHashLoginDisabled()``                                           |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.ldap.login``                            | false         | ``setLdapLoginEnabled()``                                            | Enable LDAP login.                                                                                                                                     |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setLdapLoginDisabled()``                                           |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.proxy.login.only``                      | false         | ``setProxyLoginOnlyEnabled()``                                       | Enable proxy only login for the chosen login method.                                                                                                   |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setProxyLoginOnlyDisabled()``                                      |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.kerberos.login``                        | false         | ``setKerberosLoginEnabled()``                                        | Enable Kerberos login.                                                                                                                                 |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setKerberosLoginDisabled()``                                       |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.pam.login``                             | false         | ``setPamLoginEnabled()``                                             | Enable PAM login. PAM has to be configured on the system where Spark driver is running.                                                                |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setPamLoginDisabled()``                                            |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.login.conf``                            | None          | ``setLoginConf(String)``                                             | Login configuration file.                                                                                                                              |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.user.name``                             | None          | ``setUserName(String)``                                              | Username used for the backend H2O cluster and to authenticate the client against the backend.                                                          |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.password``                              | None          | ``setPassword(String)``                                              | Password used to authenticate the client against the backend.                                                                                          |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.internal_security_conf``                | None          | ``setSslConf(String)``                                               | Path to a file containing H2O or Sparkling Water internal security configuration.                                                                      |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.auto.flow.ssl``                         | false         | ``setAutoFlowSslEnabled()``                                          | Automatically generate the required keystore and password to secure H2O Flow UI and internal REST                                                      |
|                                                         |               |                                                                      | connections between Spark executors and H2O nodes. Hostname verification is disabled when creating SSL                                                 |
|                                                         |               | ``setAutoFlowSslDisabled()``                                         | connections to H2O nodes.                                                                                                                              |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.log.level``                             | INFO          | ``setLogLevel(String)``                                              | H2O log level.                                                                                                                                         |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.log.dir``                               | None          | ``setLogDir(String)``                                                | Location of H2O logs. When not specified, it uses {user.dir}/h2ologs/{AppId} or YARN container dir                                                     |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.backend.heartbeat.interval``            | 10000         | ``setBackendHeartbeatInterval(Integer)``                             | Interval (in msec) for getting heartbeat from the H2O backend.                                                                                         |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.cloud.timeout``                         | 60000         | ``setCloudTimeout(Integer)``                                         | Timeout (in msec) for cluster formation.                                                                                                               |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.node.network.mask``                     | None          | ``setNodeNetworkMask(String)``                                       | Subnet selector for H2O running inside Spark executors. This disables using the IP reported by Spark and instead tries to                              |
|                                                         |               |                                                                      | find an IP based on the specified mask.                                                                                                                |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.stacktrace.collector.interval``         | -1            | ``setStacktraceCollectorInterval(Integer)``                          | Interval specifying how often stack traces are taken on each H2O node. -1 means                                                                        |
|                                                         |               |                                                                      | that no stack traces will be taken                                                                                                                     |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.context.path``                          | None          | ``setContextPath(String)``                                           | Context path to expose H2O web server.                                                                                                                 |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.flow.scala.cell.async``                 | false         | ``setFlowScalaCellAsyncEnabled()``                                   | Decides whether the Scala cells in H2O Flow run synchronously or asynchronously. The default is synchronous.                                           |
|                                                         |               |                                                                      |                                                                                                                                                        |
|                                                         |               | ``setFlowScalaCellAsyncDisabled()``                                  |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.flow.scala.cell.max.parallel``          | -1            | ``setMaxParallelScalaCellJobs(Integer)``                             | Maximum number of parallel Scala cell jobs. The value -1 sets no limit.                                                                                |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.internal.port.offset``                  | 1             | ``setInternalPortOffset(Integer)``                                   | Offset between the API(=web) port and the internal communication port on the client                                                                    |
|                                                         |               |                                                                      | node; ``api_port + port_offset = h2o_port``                                                                                                            |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.base.port``                             | 54321         | ``setBasePort(Integer)``                                             | Base port used for individual H2O nodes                                                                                                                |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.mojo.destroy.timeout``                  | 600000        | ``setMojoDestroyTimeout(Integer)``                                   | If a scoring MOJO instance is not used within a Spark executor JVM for a given timeout in milliseconds, it's                                           |
|                                                         |               |                                                                      | evicted from executor's cache. Default timeout value is 10 minutes.                                                                                    |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.extra.properties``                      | None          | ``setExtraProperties(String)``                                       | A string containing extra parameters passed to H2O nodes during startup. This parameter should be                                                      |
|                                                         |               |                                                                      | configured only if H2O parameters do not have any corresponding parameters in Sparkling Water.                                                         |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.flow.dir``                              | None          | ``setFlowDir(String)``                                               | Directory where flows from H2O Flow are saved.                                                                                                         |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.flow.extra.http.headers``               | None          | ``setFlowExtraHttpHeaders(Map[String,String])``                      | Extra HTTP headers that will be used in communication between the front-end and back-end part of Flow UI. The                                          |
|                                                         |               |                                                                      | headers should be delimited by a new line. Don't forget to escape special characters when passing                                                      |
|                                                         |               | ``setFlowExtraHttpHeaders(String)``                                  | the parameter from a command line. Example: ``"spark.ext.h2o.flow.extra.http.headers=Strict-Transport-Security:max-age=31536000"``                     |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.flow.proxy.request.maxSize``            | 32768         | ``setFlowProxyRequestMaxSize(Integer)``                              | The maximum size of a request coming to Flow UI proxy running on the Spark driver. The request is forwarded to Flow UI on H2O leader node.             |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.flow.proxy.response.maxSize``           | 32768         | ``setFlowProxyResponseMaxSize(Integer)``                             | The maximum size of a response coming from Flow UI proxy running on the Spark driver. The content for the response comes from Flow UI H2O leader node. |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.internal_secure_connections``           | false         | ``setInternalSecureConnectionsEnabled()``                            | Enables secure communications among H2O nodes. The security is based on                                                                                |
|                                                         |               |                                                                      | automatically generated keystore and truststore. It's equivalent to the                                                                                |
|                                                         |               | ``setInternalSecureConnectionsDisabled()``                           | ``-internal_secure_connections`` flag for H2O Hadoop. More information                                                                                 |
|                                                         |               |                                                                      | is available in the H2O documentation.                                                                                                                 |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.allow_insecure_xgboost``                | false         | ``setInsecureXGBoostAllowed()``                                      | If this property is enabled, insecure communication among H2O nodes is                                                                                 |
|                                                         |               |                                                                      | allowed for the XGBoost algorithm even if the other security options are enabled                                                                       |
|                                                         |               | ``setInsecureXGBoostDenied()``                                       |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.client.ip``                             | None          | ``setClientIp(String)``                                              | IP of H2O client node.                                                                                                                                 |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.client.web.port``                       | -1            | ``setClientWebPort(Integer)``                                        | Exact client port to access web UI. The value ``-1`` means automatic                                                                                   |
|                                                         |               |                                                                      | search for a free port starting at ``spark.ext.h2o.base.port``.                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.client.verbose``                        | false         | ``setClientVerboseEnabled()``                                        | The client outputs verbose log output directly into console. Enabling the                                                                              |
|                                                         |               |                                                                      | flag increases the client log level to ``INFO``.                                                                                                       |
|                                                         |               | ``setClientVerboseDisabled()``                                       |                                                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.client.network.mask``                   | None          | ``setClientNetworkMask(String)``                                     | Subnet selector for H2O client; this disables using IP reported by Spark                                                                               |
|                                                         |               |                                                                      | and tries to find IP based on the specified mask.                                                                                                      |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.client.flow.baseurl.override``          | None          | ``setClientFlowBaseurlOverride(String)``                             | Enables overriding the base URL address of Flow UI, including the                                                                                      |
|                                                         |               |                                                                      | scheme, which is shown to the users.                                                                                                                   |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.cluster.client.retry.timeout``          | 60000         | ``setClientCheckRetryTimeout(Integer)``                              | Timeout in milliseconds specifying how often we check whether the                                                                                      |
|                                                         |               |                                                                      | H2O client is still connected.                                                                                                                         |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.verify_ssl_certificates``               | true          | ``setVerifySslCertificates(Boolean)``                                | If the property is enabled, PySparkling or RSparkling client verifies certificates when connecting to                                                  |
|                                                         |               |                                                                      | Sparkling Water Flow UI.                                                                                                                               |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.internal.rest.verify_ssl_certificates`` | true          | ``setSslCertificateVerificationInInternalRestConnectionsEnabled()``  | If the property is enabled, Sparkling Water will verify SSL certificates when it establishes secured HTTP connections                                  |
|                                                         |               |                                                                      | to one of H2O nodes. Such connections are utilized for delegation of Flow UI calls to H2O leader node or                                               |
|                                                         |               | ``setSslCertificateVerificationInInternalRestConnectionsDisabled()`` | during data exchange between Spark executors and H2O nodes. If the property is disabled, hostname verification is                                      |
|                                                         |               |                                                                      | disabled as well.                                                                                                                                      |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.internal.rest.verify_ssl_hostnames``    | true          | ``setSslHostnameVerificationInInternalRestConnectionsEnabled()``     | If the property is enabled, Sparkling Water will verify the hostnames when it establishes secured HTTP connections                                     |
|                                                         |               |                                                                      | to one of H2O nodes. Such connections are utilized for delegation of Flow UI calls to H2O leader node or                                               |
|                                                         |               | ``setSslHostnameVerificationInInternalRestConnectionsDisabled()``    | during data exchange between Spark executors and H2O nodes.                                                                                            |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.kerberized.hive.enabled``               | false         | ``setKerberizedHiveEnabled()``                                       | If enabled, H2O instances will create  JDBC connections to a Kerberized Hive                                                                           |
|                                                         |               |                                                                      | so that all clients can read data from HiveServer2. Don't forget to put                                                                                |
|                                                         |               | ``setKerberizedHiveDisabled()``                                      | a jar with Hive driver on Spark classpath if the internal backend is used.                                                                             |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.hive.host``                             | None          | ``setHiveHost(String)``                                              | The full address of HiveServer2, for example hostname:10000.                                                                                           |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.hive.principal``                        | None          | ``setHivePrincipal(String)``                                         | HiveServer2 Kerberos principal, for example hive/hostname@DOMAIN.COM                                                                                   |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.hive.jdbc_url_pattern``                 | None          | ``setHiveJdbcUrlPattern(String)``                                    | A pattern of JDBC URL used for connecting to HiveServer2. Example: ``jdbc:hive2://{{host}}/;{{auth}}``                                                 |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.hive.token``                            | None          | ``setHiveToken(String)``                                             | An authorization token to Hive.                                                                                                                        |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.iced.dir``                              | None          | ``setIcedDir(String)``                                               | Location of iced directory for H2O nodes.                                                                                                              |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.rest.api.timeout``                      | 300000        | ``setSessionTimeout(Integer)``                                       | Timeout in milliseconds for REST API requests.                                                                                                         |
+---------------------------------------------------------+---------------+----------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------+

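The properties above are ordinary Spark configuration entries, so one way to supply them is as ``--conf`` arguments to ``spark-submit``. The following is a minimal sketch in plain Python and is not part of Sparkling Water: the helper name ``to_spark_submit_args`` and the selected values are assumptions made for illustration, while the property names and defaults are taken from the table above.

.. code-block:: python

   def to_spark_submit_args(properties):
       """Render a mapping of property names to values as spark-submit --conf arguments."""
       args = []
       for name, value in properties.items():
           # Python booleans are rendered in lowercase to match the values used in the table.
           if isinstance(value, bool):
               value = str(value).lower()
           args += ["--conf", f"{name}={value}"]
       return args


   # Hypothetical selection of backend-independent properties.
   sw_properties = {
       "spark.ext.h2o.base.port": 54321,
       "spark.ext.h2o.client.verbose": True,
       "spark.ext.h2o.rest.api.timeout": 300000,
   }

   # Prints the arguments as a single string that can be appended to a spark-submit command.
   print(" ".join(to_spark_submit_args(sw_properties)))
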
--------------

Internal backend configuration properties
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| Property name                                | Default value | H2OConf setter (* getter_)                                     | Description                                                                                                                            |
+==============================================+===============+================================================================+========================================================================================================================================+
| ``spark.ext.h2o.cluster.size``               | None          | ``setNumH2OWorkers(Integer)``                                  | Expected number of workers of H2O cluster. Value None means automatic                                                                  |
|                                              |               |                                                                | detection of cluster size. This number must be equal to number of Spark executors. If Spark property                                   |
|                                              |               |                                                                | ``spark.executor.instances`` is specified, this Sparkling Water property is set to its value.                                          |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.extra.cluster.nodes``        | false         | ``setExtraClusterNodesEnabled()``                              | If this property is enabled and the Sparkling Water internal backend identifies more executors than specified in                       |
|                                              |               |                                                                | the Spark property  ``spark.executor.instances`` or in  the Sparkling Water property                                                   |
|                                              |               | ``setExtraClusterNodesDisabled()``                             | ``spark.ext.h2o.cluster.size``, Sparkling Water deploys H2O nodes to all discovered Spark executors. Otherwise,                        |
|                                              |               |                                                                | Sparkling Water deploys to only the number of executors given in ``spark.ext.h2o.cluster.size``                                        |
|                                              |               |                                                                | (or ``spark.executor.instances``).                                                                                                     |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.dummy.rdd.mul.factor``       | 10            | ``setDrddMulFactor(Integer)``                                  | Multiplication factor for dummy RDD  generation. Size of dummy RDD is                                                                  |
|                                              |               |                                                                | ``spark.ext.h2o.cluster.size`` multiplied by this option.                                                                              |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.spreadrdd.retries``          | 10            | ``setNumRddRetries(Integer)``                                  | Number of retries for creation of an RDD spread across all existing Spark executors                                                    |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.default.cluster.size``       | 20            | ``setDefaultCloudSize(Integer)``                               | Starting size of the cluster when the size is not explicitly configured.                                                               |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.subseq.tries``               | 5             | ``setSubseqTries(Integer)``                                    | Number of consecutive tries to figure out the Spark cluster size that must                                                             |
|                                              |               |                                                                | result in the same number of nodes.                                                                                                    |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.hdfs_conf``                  | None          | ``setHdfsConf(String)``                                        | Either a string with the path to a file with Hadoop HDFS configuration or the                                                          |
|                                              |               |                                                                | hadoop.conf.Configuration object in the org.apache package. Useful for HDFS credentials                                                |
|                                              |               |                                                                | settings and other HDFS-related configurations. Default value None means                                                               |
|                                              |               |                                                                | use `sc.hadoopConfig`.                                                                                                                 |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.spreadrdd.retries.timeout``  | 0             | ``setSpreadRddRetriesTimeout(Int)``                            | Specifies how long the discovering of Spark executors should last. This                                                                |
|                                              |               |                                                                | option has precedence over other options influencing the discovery                                                                     |
|                                              |               |                                                                | mechanism. That means that as long as the timeout hasn't expired, we keep                                                              |
|                                              |               |                                                                | trying to discover new executors. This option might be useful in environments                                                          |
|                                              |               |                                                                | where Spark executors might join the cloud with some delays.                                                                           |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.direct.configuration.ip``    | true          | ``setDirectIpConfigurationEnabled()``                          | If the property is disabled, Spark executor doesn't assign its IP address to H2O node directly. The IP address is                      |
|                                              |               |                                                                | suggested to H2O node and its bootstrap logic performs additional network interface availability checks before                         |
|                                              |               | ``setDirectIpConfigurationDisabled()``                         | the IP is assigned to the node.                                                                                                        |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.jetty.aes.login.module.key`` | None          | ``setJettyLdapAesEncryptedBindPasswordLoginModuleKey(String)`` | Specific to water.webserver.jetty9.LdapAesEncryptedBindPasswordLoginModule. AES CBC Key                                                |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.jetty.aes.login.module.iv``  | None          | ``setJettyLdapAesEncryptedBindPasswordLoginModuleIV(String)``  | Specific to water.webserver.jetty9.LdapAesEncryptedBindPasswordLoginModule. AES CBC IV. When no IV is provided an all zero IV is used. |
+----------------------------------------------+---------------+----------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------+

--------------

External backend configuration properties
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Property name                                        | Default value   | H2OConf setter (* getter_)                      | Description                                                                                                                          |
+======================================================+=================+=================================================+======================================================================================================================================+
| ``spark.ext.h2o.external.driver.if``                 | None            | ``setExternalH2ODriverIf(String)``              | IP address or network of mapper->driver callback interface. Default value means automatic detection.                                 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.driver.port``               | None            | ``setExternalH2ODriverPort(Integer)``           | Port of mapper->driver callback interface. Default value means automatic detection.                                                  |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.driver.port.range``         | None            | ``setExternalH2ODriverPortRange(String)``       | Range portX-portY of mapper->driver callback interface, e.g. 50000-55000.                                                            |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.extra.memory.percent``      | 10              | ``setExternalExtraMemoryPercent(Integer)``      | Memory for internal JVM use outside of the Java heap, specified as a percentage of the external memory option                        |
|                                                      |                 |                                                 | (``spark.ext.h2o.external.memory``).                                                                                                 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.cloud.representative``               | None            | ``setH2OCluster(String)``                       | ip:port of an H2O cluster leader node to identify external H2O cluster.                                                              |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.cluster.size``              | None            | ``setClusterSize(Integer)``                     | Number of H2O nodes to start when ``auto`` mode of the external backend is set.                                                      |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.cluster.start.timeout``              | 120             | ``setClusterStartTimeout(Integer)``             | Timeout in seconds for starting H2O external cluster                                                                                 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.cluster.info.name``                  | None            | ``setClusterInfoFile(String)``                  | Full path to a file which is used as the notification file for the startup of external H2O cluster.                                  |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.memory``                    | 6G              | ``setExternalMemory(String)``                   | Amount of memory assigned to each external H2O node                                                                                  |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.hdfs.dir``                  | None            | ``setHDFSOutputDir(String)``                    | Path to the directory on HDFS used for storing temporary files.                                                                      |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.start.mode``                | manual          | ``useAutoClusterStart()``                       | If this option is set to ``auto`` then H2O external cluster is automatically started using the                                       |
|                                                      |                 |                                                 | provided H2O driver JAR on YARN, otherwise it is expected that the cluster is started by the user                                    |
|                                                      |                 | ``useManualClusterStart()``                     | manually                                                                                                                             |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.h2o.driver``                | None            | ``setH2ODriverPath(String)``                    | Path to H2O driver used during ``auto`` start mode.                                                                                  |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.yarn.queue``                | None            | ``setYARNQueue(String)``                        | Yarn queue on which external H2O cluster is started.                                                                                 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.kill.on.unhealthy``         | true            | ``setKillOnUnhealthyClusterEnabled()``          | If true, the client will try to kill the cluster and then itself in                                                                  |
|                                                      |                 |                                                 | case some nodes in the cluster report unhealthy status.                                                                              |
|                                                      |                 | ``setKillOnUnhealthyClusterDisabled()``         |                                                                                                                                      |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.kerberos.principal``        | None            | ``setKerberosPrincipal(String)``                | Kerberos Principal                                                                                                                   |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.kerberos.keytab``           | None            | ``setKerberosKeytab(String)``                   | Kerberos Keytab                                                                                                                      |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.run.as.user``               | None            | ``setRunAsUser(String)``                        | Impersonated Hadoop user                                                                                                             |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.backend.stop.timeout``      | 10000           | ``setExternalBackendStopTimeout(Integer)``      | Timeout in milliseconds for confirmation from worker nodes when stopping the external backend.                                       |
|                                                      |                 |                                                 | Pass ``-1`` for an indefinite timeout.                                                                                               |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.hadoop.executable``         | hadoop          | ``setExternalHadoopExecutable(String)``         | Name of or path to a hadoop executable binary which is used                                                                          |
|                                                      |                 |                                                 | to start external H2O backend on YARN.                                                                                               |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.extra.jars``                | None            | ``setExternalExtraJars(String)``                | Comma-separated paths to jars that will be placed onto classpath of each H2O node.                                                   |
|                                                      |                 |                                                 |                                                                                                                                      |
|                                                      |                 | ``setExternalExtraJars(String[])``              |                                                                                                                                      |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.communication.compression`` | SNAPPY          | ``setExternalCommunicationCompression(String)`` | The type of compression used for data transfer between Spark and H2O node.                                                           |
|                                                      |                 |                                                 | Possible values are ``NONE``, ``DEFLATE``, ``GZIP``, ``SNAPPY``.                                                                     |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.auto.start.backend``        | yarn            | ``setExternalAutoStartBackend(String)``         | The backend on which the external H2O backend will be started in auto start mode.                                                    |
|                                                      |                 |                                                 | Possible values are ``YARN`` and ``KUBERNETES``.                                                                                     |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.h2o.service.name``      | h2o-service     | ``setExternalK8sH2OServiceName(String)``        | Name of H2O service required to start H2O on K8s.                                                                                    |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.h2o.statefulset.name``  | h2o-statefulset | ``setExternalK8sH2OStatefulsetName(String)``    | Name of H2O stateful set required to start H2O on K8s.                                                                               |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.h2o.label``             | app=h2o         | ``setExternalK8sH2OLabel(String)``              | Label used to select node for H2O cluster formation.                                                                                 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.h2o.api.port``          | 8081            | ``setExternalK8sH2OApiPort(String)``            | Kubernetes API port.                                                                                                                 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.namespace``             | default         | ``setExternalK8sNamespace(String)``             | Kubernetes namespace where external H2O is started.                                                                                  |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.docker.image``          | See doc         | ``setExternalK8sDockerImage(String)``           | Docker image containing Sparkling Water external H2O backend. Default value is h2oai/sparkling-water-external-backend:3.46.0.6-1-3.2 |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.domain``                | cluster.local   | ``setExternalK8sDomain(String)``                | Domain of the Kubernetes cluster.                                                                                                    |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| ``spark.ext.h2o.external.k8s.svc.timeout``           | 300             | ``setExternalK8sServiceTimeout(Int)``           | [Deprecated] Timeout in seconds used as a limit for K8s service creation.                                                            |
+------------------------------------------------------+-----------------+-------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
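
As a rough illustration of how these properties might be passed to Spark, the following sketch renders a handful of external backend properties as ``--conf`` arguments for ``spark-submit``. The property names come from the tables above; the values (cluster size, YARN queue, driver path) and the helper ``to_spark_submit_args`` are hypothetical and serve only as an example.

.. code:: python

    import shlex

    # Property names are taken from the tables above; all values are illustrative.
    external_backend_props = {
        "spark.ext.h2o.backend.cluster.mode": "external",
        "spark.ext.h2o.external.start.mode": "auto",
        "spark.ext.h2o.external.auto.start.backend": "yarn",
        "spark.ext.h2o.external.h2o.driver": "/path/to/h2odriver.jar",  # hypothetical path
        "spark.ext.h2o.external.cluster.size": "3",
        "spark.ext.h2o.external.memory": "6G",
        "spark.ext.h2o.external.yarn.queue": "default",  # hypothetical queue name
    }


    def to_spark_submit_args(props):
        """Turn a dict of Spark properties into a flat list of --conf arguments."""
        args = []
        for name, value in props.items():
            args.extend(["--conf", f"{name}={value}"])
        return args


    if __name__ == "__main__":
        # Print one shell-quoted --conf pair per line.
        args = to_spark_submit_args(external_backend_props)
        for flag, setting in zip(args[0::2], args[1::2]):
            print(flag, shlex.quote(setting))

Each printed pair can be appended to a ``spark-submit`` command line. The same settings could presumably also be applied through the ``H2OConf`` setters listed in the table.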

--------------

.. _getter:

The H2OConf getter can be derived from the corresponding setter. All getters are parameterless. If the type of the property is Boolean, the getter is prefixed with
``is`` (e.g. ``setReplEnabled()`` -> ``isReplEnabled()``). Getters for properties of other types have no prefix and start with a lowercase letter
(e.g. ``setUserName(String)`` -> ``userName`` for Scala, ``userName()`` for Python).
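
The naming rule above can be sketched in a few lines of Python. The helper ``derive_getter`` is hypothetical and not part of Sparkling Water; it only reproduces the two documented cases and takes the property type as an explicit flag, because a setter name alone does not say whether the property is Boolean.

.. code:: python

    def derive_getter(setter, is_boolean):
        """Derive a getter name from a setter such as 'setUserName(String)'."""
        name = setter.split("(", 1)[0]  # drop the parameter list
        if not name.startswith("set") or len(name) == len("set"):
            raise ValueError(f"not a setter: {setter!r}")
        stem = name[len("set"):]
        if is_boolean:
            # Boolean properties: prefix the remaining name with 'is'.
            return "is" + stem
        # Other types: no prefix, first letter in lowercase.
        return stem[0].lower() + stem[1:]


    print(derive_getter("setReplEnabled()", True))      # isReplEnabled
    print(derive_getter("setUserName(String)", False))  # userName

The returned name is the bare getter; in Python it would be called with parentheses, as noted above.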