R/import.R
h2o.import_sql_select.Rd
Creates a temporary SQL table from the specified select_query, runs multiple SELECT SQL queries on the temporary table concurrently for parallel ingestion, and then drops the table. Be sure to start h2o.jar in the terminal with your downloaded JDBC driver on the classpath: `java -cp <path_to_h2o_jar>:<path_to_jdbc_driver_jar> water.H2OApp` See also h2o.import_sql_table. Currently supported SQL databases are MySQL, PostgreSQL, MariaDB, Hive, Oracle, and Microsoft SQL Server.
h2o.import_sql_select(connection_url, select_query, username, password, use_temp_table = NULL, temp_table_name = NULL, optimize = NULL, fetch_mode = NULL)
Argument | Description |
---|---|
connection_url | URL of the SQL database connection as specified by the Java Database Connectivity (JDBC) Driver. For example, "jdbc:mysql://localhost:3306/menagerie?&useSSL=false" |
select_query | SQL query starting with `SELECT` that returns rows from one or more database tables. |
username | Username for SQL server |
password | Password for SQL server |
use_temp_table | Whether a temporary table should be created from select_query |
temp_table_name | Name of temporary table to be created from select_query |
optimize | (Optional) Optimize import of SQL table for faster imports. Experimental. Default is true. |
fetch_mode | (Optional) Set to DISTRIBUTED to enable distributed import. Set to SINGLE to force a sequential read from the database. SINGLE can be used for databases that do not support OFFSET-like clauses in SQL statements. |
For example,
my_sql_conn_url <- "jdbc:mysql://172.16.2.178:3306/ingestSQL?&useSSL=false"
select_query <- "SELECT bikeid from citibike20k"
username <- "root"
password <- "abc123"
my_citibike_data <- h2o.import_sql_select(my_sql_conn_url, select_query, username, password)
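The optional arguments can be combined with the same call. The following is a minimal sketch, not a tested recipe: it reuses the placeholder connection details from the example above and assumes an H2O instance was already started with the JDBC driver on its classpath. It skips the temporary table and forces a sequential read, which the argument descriptions suggest for databases without OFFSET-like clauses.

```r
library(h2o)

# Assumption: an H2O instance is already running locally and was started
# with the JDBC driver on its classpath. h2o.init() attaches to it; if none
# is running, it starts a new one that will not have the driver loaded.
h2o.init()

# Placeholder connection details, reused from the example above.
my_sql_conn_url <- "jdbc:mysql://172.16.2.178:3306/ingestSQL?&useSSL=false"
select_query <- "SELECT bikeid from citibike20k"
username <- "root"
password <- "abc123"

# use_temp_table = FALSE: do not create a temporary table from select_query.
# fetch_mode = "SINGLE": force a sequential read from the database.
my_citibike_data <- h2o.import_sql_select(
  my_sql_conn_url, select_query, username, password,
  use_temp_table = FALSE,
  fetch_mode = "SINGLE"
)

# Inspect the dimensions of the imported frame.
dim(my_citibike_data)
```

Whether this combination is faster or slower than the default depends on the database and the query; the defaults are usually the better starting point when the database supports them.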