Change Log

v2.4.13 (2019-06-24)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/13/index.html

  • Bug
    • SW-1140 - Add more logging to discover intermittent RSparkling Issue in jenkins tests
    • SW-1318 - Add back the asDataFrame(.., SQLContext) method to JavaH2OContext, marked as deprecated
    • SW-1321 - Remove mention of H2O UDP from user documentation
    • SW-1322 - Fix wrong doc in ssl.rst -> val conf: H2OConf = // generate H2OConf file
    • SW-1323 - Model ID not available on our algo pipeline wrappers
    • SW-1338 - Follow up fixes after RSparkling change
    • SW-1339 - Use s3-cli instead of s3cmd because of performance reasons on nightlies
    • SW-1340 - Fix Sphinx warning
    • SW-1342 - Fix dist
    • SW-1343 - Fix dist structure
    • SW-1345 - Fix missing rsparkling in dist package
    • SW-1347 - Scaladoc not uploaded to S3 after porting make-dist to gradle
    • SW-1359 - Fix wrong links on nightly build page
    • SW-1360 - Explicitly send heartbeat after we have the complete flatfile
    • SW-1361 - Sparkling Water package on Maven should be an assembly jar
    • SW-1362 - gradle.properties in distribution contains wrong version
    • SW-1364 - Rename SVM to SparkSVM
    • SW-1374 - Minor documentation fixes
  • New Feature
    • SW-1021 - Upload RSparkling to S3 in a form of R repository
    • SW-1353 - Introduce logic for flattening data frames with arbitrarily nested structures
  • Improvement
    • SW-554 - Include all used dependency licenses in the uber jar.
    • SW-1308 - Bundle Sparkling Water jar into rsparkling -> making rsparkling version dependent on specific sparkling water
    • SW-1317 - Unify repl across different rel branches
    • SW-1325 - Expose jks_alias in Sparkling Water
    • SW-1326 - Include SW version in more log statements
    • SW-1327 - Support Spark 2.4.1 and 2.4.3
    • SW-1330 - Add additional log to H2O cloudup in internal backend mode
    • SW-1331 - Create local repo with RSparkling
    • SW-1332 - [RSparkling] Make installation from S3 the default recommended option
    • SW-1333 - Move the conversion logic from Spark Row to H2O RowData to a separate entity
    • SW-1334 - Store H2O models in transient lazy variables of SW Mojo models
    • SW-1335 - Make automl tests more deterministic by using max_models instead of max_runtime_secs
    • SW-1341 - Use readme as main dispatch for documentation
    • SW-1346 - Remove cache and unpersist call in SpreadRDDBuilder
    • SW-1348 - Switch to s3 cli on release pipelines
    • SW-1349 - Use withColumn instead of select in MOJO models
    • SW-1350 - Fix links to doc & scaladoc on nightly builds
    • SW-1352 - Upgrade H2O to 3.24.0.5
    • SW-1365 - Run only last build in jenkins
    • SW-1369 - Download page is missing one step on RSparkling tab -> library(rsparkling)
    • SW-1371 - Create maven repo on our s3 for each release and nightly
    • SW-1373 - Update DBC documentation with respect to latest RSparkling development

v2.4.12 (2019-06-03)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/12/index.html

  • Bug
    • SW-1259 - Unify ratio param across pipeline api
    • SW-1287 - Use RPC endpoints to orchestrate cloud in internal mode
    • SW-1290 - Fix doc
    • SW-1301 - Fix class-loading for Sparkling Water assembly JAR in PySparkling
    • SW-1311 - Add numpy as PySparkling dependency (it is required because of Spark but missing from the list of dependencies)
    • SW-1312 - Warn that default value of convertUnknownCategoricalLevelsToNa will be changed to false on GridSearch & AutoML
    • SW-1316 - Fix wrong fat jar name
  • Task
    • SW-1292 - Benchmarks: Subproject Skeleton
  • Improvement
    • SW-1212 - Make sure python zip/wheel is downloadable from our release s3
    • SW-1274 - On download page -> list all supported minor versions
    • SW-1286 - Remove Param propagation of MOJOModels from Python to Java
    • SW-1288 - H2OCommonParams in pysparkling
    • SW-1289 - Move shared params to H2OCommonParams
    • SW-1298 - Don't use deprecated methods
    • SW-1299 - Warn user that default value of predictionCol on H2OMOJOModel will change in the next major release to 'prediction'
    • SW-1300 - Upgrade to H2O 3.24.0.4
    • SW-1304 - Definition of assembly jar via transitive exclusions
    • SW-1305 - Move ability to change behavior of MOJO models to MOJOLoader
    • SW-1306 - Move make-dist logic to gradle
    • SW-1307 - Expose binary model in spark pipeline stage
    • SW-1309 - Fix xgboost doc
    • SW-1313 - Rename the 'create_from_mojo' method of H2OMOJOModel and H2OMOJOPipelineModel to 'createFromMojo'

v2.4.11 (2019-05-17)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/11/index.html

  • Bug
    • SW-1256 - Fix constructor of H2OMojoModel
    • SW-1258 - Remove internal constructors & deprecate implicit constructor parameters for H2O Algo Spark Estimators (to be the same as in PySparkling)
    • SW-1270 - Fix version check in PySparkling shell
    • SW-1278 - Clean workspace on the hadoop node in integ tests
    • SW-1279 - Fix inconsistencies between H2OAutoML, H2OGridSearch & H2OAlgorithm
    • SW-1281 - Fix bad representation of predictionCol on H2OMOJOModel
    • SW-1282 - XGBoost can't be used in H2OGridSearch pipeline wrapper
    • SW-1283 - Correctly return mojo model in pysparkling after fit
  • Story
    • SW-1271 - Remove SparkContext from H2OSchemaUtils
    • SW-1273 - Upgrade to H2O 3.24.0.3
  • New Feature
    • SW-1248 - getFeaturesCols() should not return the fold column or weight column
    • SW-1249 - probability calibration does not work in Sparkling Water Dataframe API
  • Improvement
    • SW-369 - Override spark locality so we use only nodes on which h2o is running.
    • SW-1216 - Improve PySparkling README
    • SW-1261 - Remove binary H2O model from ML pipelines
    • SW-1263 - Don't require initializer call to be called during pysparkling pipelines
    • SW-1264 - Use default params reader in pipelines
    • SW-1269 - Remove six as dependency from PySparkling launcher (six is no longer a dependency)
    • SW-1275 - Remove unnecessary constructor in helper class
    • SW-1280 - Add predictionCol to mojo pipeline model

v2.4.10 (2019-04-26)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/10/index.html

  • Bug
    • SW-1186 - No need to pass properties defined in spark-defaults.conf to cli
    • SW-1189 - Fix Sparkling Water 2.1.x compile on Scala 2.10
    • SW-1194 - RSparkling Can't be used on Spark 2.4
    • SW-1195 - Disable gradle daemon via gradle.properties
    • SW-1196 - Fix org.apache.spark.ml.spark.models.PipelinePredictionTest
    • SW-1203 - Custom metric not evaluated in internal mode of Sparkling Water
    • SW-1227 - Change get-extended-jar to use https instead of http
    • SW-1230 - Fix typo in GLM API - getRemoteCollinearColumns, setRemoteCollinearColumns
    • SW-1232 - Fix RUnits after upgrading to Gradle 5.3.1
  • Story
    • SW-1198 - Introduce new annotation deprecating legacy methods in API
    • SW-1209 - Rename the 'predictionCol' model parameter to 'labelCol'
    • SW-1226 - Introduce mechanism for enabling backward compatibility of MOJO files when properties are renamed
  • New Feature
    • SW-1193 - Expose weights_column parameter
  • Improvement
    • SW-1188 - RSparkling: Add ability to add authentication details when calling h2o_context(sc)
    • SW-1190 - Improve hint description for disabling automatic usage of broadcast joins
    • SW-1199 - Improve memory efficiency of H2OMOJOPipelineModel
    • SW-1202 - Simplify Sparkling Water build
    • SW-1204 - Fix formatting in python tests
    • SW-1208 - Create pysparkling tests report file if it does not exist
    • SW-1210 - Add fold column to python and scala pipelines
    • SW-1211 - Automatically download H2O Wheel
    • SW-1213 - Upgrade to H2O 3.24.0.2
    • SW-1214 - Remove PySparkling six dependency as it was removed in H2O
    • SW-1215 - Automatically generate PySparkling README
    • SW-1217 - Automatically generate last pieces of doc subproject
    • SW-1219 - Remove support for testing external cluster in manual mode
    • SW-1221 - Remove unnecessary branch check
    • SW-1222 - Remove duplicate readme file (contains old info & the correct info is in doc)
    • SW-1223 - Remove confusing meetup dir
    • SW-1224 - Upgrade to Gradle 5.3.1
    • SW-1228 - Rename the 'ignoredColumns' parameter of H2OAutoML to 'ignoredCols'
    • SW-1236 - Reformat few python classes
    • SW-1238 - Parametrize EMR version in templates generation
    • SW-1239 - Remove old README and DEVEL doc files (not just pointer to new doc)
    • SW-1240 - Use minSupportedJava for source and target compatibility in build.gradle

v2.4.9 (2019-04-03)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/9/index.html

  • Bug
    • SW-1162 - Exception when there is a column with BOOLEAN type in dataset during H2OMOJOModel transformation
    • SW-1177 - In PySparkling script, setting --driver-class-path influences the environment
    • SW-1178 - Upgrade to H2O 3.24.0.1
    • SW-1180 - Use specific metrics in grid search, in the same way as H2O Grid
    • SW-1181 - Document off heap memory configuration for Spark in Standalone mode/IBM conductor
    • SW-1182 - Fix random project name generation in H2OAutoML Spark Wrapper
  • New Feature
    • SW-1167 - Expose search_criteria for H2OGridSearch
    • SW-1174 - expose H2OGridSearch models
    • SW-1183 - Add includeAlgos to H2O AutoML pipeline stage & ability to ignore XGBoost
  • Improvement
    • SW-1164 - Add Sparkling Water to Jupyter spark/pyspark kernels in EMR terraform template
    • SW-1171 - Upgrade build to Gradle 5.2.1
    • SW-1175 - Integrate with H2O native hive support

v2.4.8 (2019-03-15)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/8/index.html

  • Bug
    • SW-1163 - Expose missing variables in shared TF EMR SW template
  • Improvement
    • SW-1145 - Start jupyter notebook with Scala & Python Spark in AWS EMR Terraform template
    • SW-1165 - Upgrade to H2O 3.22.1.6

v2.4.7 (2019-03-07)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/7/index.html

  • Bug
    • SW-1150 - hc.stop() shows 'exit' not defined error
    • SW-1152 - Fix RSparkling in case the jars are being fetched from maven
    • SW-1156 - H2OXgboost pipeline stage does not define updateH2OParams method
    • SW-1159 - Unique project name in automl to avoid sharing one leaderboard
    • SW-1161 - Fix grid search pipeline step on pyspark side
  • Improvement
    • SW-1052 - Document Terraform scripts for AWS
    • SW-1089 - Document using Google Cloud Storage In Sparkling Water
    • SW-1135 - Speed up conversion between sparse spark vectors and h2o frames by using sparse new chunk
    • SW-1141 - Improve terraform templates for AWS EMR and make them part of the release process
    • SW-1149 - Allow login via ssh to created cluster using terraform
    • SW-1153 - Add H2OGridSearch pipeline stage to PySpark
    • SW-1155 - Test GBM Grid Search Scala pipeline step
    • SW-1158 - Generalize H2OGridSearch Pipeline step to support other available algos
    • SW-1160 - Upgrade to H2O 3.22.1.5

v2.4.6 (2019-02-18)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/6/index.html

  • Bug
    • SW-1136 - Fix bug affecting loading pipeline in python when stored in scala
    • SW-1138 - Fix several cases in spark vector -> h2o conversion
  • Improvement
    • SW-1134 - Add H2OGLM Wrapper to Sparkling Water
    • SW-1139 - Update mojo2 to 0.3.16
    • SW-1143 - Fix s3 bootstrap templates for nightly builds
    • SW-1144 - Upgrade to H2O 3.22.1.4

v2.4.5 (2019-01-29)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/5/index.html

  • Bug
    • SW-1133 - Upgrade to H2O 3.22.1.3

v2.4.4 (2019-01-21)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/4/index.html

  • Bug
    • SW-1129 - Fix support for unsupervised mojo models
  • Improvement
    • SW-1101 - Update code to work with latest jetty changes
    • SW-1127 - Upgrade H2O to 3.22.1.2

v2.4.3 (2019-01-17)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/3/index.html

  • Bug
    • SW-1116 - Cannot serialize DAI model
  • Improvement
    • SW-1113 - Update to H2O 3.22.0.5
    • SW-1115 - Enable tabs in the documentation based on the language
    • SW-1120 - Prepare Terraform scripts for Sparkling Water on EMR
    • SW-1121 - Use getTimestamp method instead of _timestamp directly

v2.4.2 (2019-01-08)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/2/index.html

  • Bug
    • SW-1107 - NullPointerException at water.H2ONode.openChan(H2ONode.java:417) after upgrade to H2O 3.22.0.3
    • SW-1110 - Fix test suite to test PySparkling YARN integration tests on external backend as well
  • Task
    • SW-1109 - Docs: Change copyright year in docs to include 2019
  • Improvement
    • SW-464 - Publish PySparkling as conda package
    • SW-1111 - Update H2O to 3.22.0.4

v2.4.1 (2018-12-27)

Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.4/1/index.html

  • Bug
    • SW-1084 - Documentation link does not work on the Nightly Bleeding Edge download page
    • SW-1100 - Fix Travis builds
    • SW-1102 - Fix Travis builds (test just scala unit tests)
  • Task
    • SW-857 - Make behaviour introduced by SW-851 as default in Spark 2.4 and up
  • Improvement
    • SW-464 - Publish PySparkling as conda package
    • SW-995 - Don't require implicit sqlContext parameter as part of asDataFrame as we can get it in spark session internally
    • SW-1079 - Upgrade to Spark 2.4 (without making use of the barrier API so far)
    • SW-1080 - Fix deprecation warning regarding automl -> AutoML
    • SW-1086 - Re-enable RSparkling tests for master & rel-2.4 when SparklyR supports Spark 2.4
    • SW-1090 - Upgrade shadowJar plugin
    • SW-1091 - Upgrade to Gradle 5.0
    • SW-1092 - Updates to streaming app
    • SW-1093 - Update to H2O 3.22.0.3
    • SW-1095 - Enable GCS in Sparkling Water
    • SW-1097 - Properly integrate GCS with Sparkling Water, including test in PySparkling
    • SW-1098 - Fix pyspark dependency for pysparkling for Spark 2.4
    • SW-1106 - Remove deprecated Gradle option in Gradle 5
  • Docs
    • SW-1083 - Add Installation and Starting instructions to the docs

v2.3.x (2018-03-29)

  • Sparkling Water 2.3 brings support of Spark 2.3.
  • For detailed changelog, please read rel-2.3/CHANGELOG.

v2.2.x (2017-08-17)

  • Sparkling Water 2.2 brings support of Spark 2.2.
  • For detailed changelog, please read rel-2.2/CHANGELOG.

v2.1.x (2017-03-02)

  • Sparkling Water 2.1 brings support of Spark 2.1.
  • For detailed changelog, please read rel-2.1/CHANGELOG.

v2.0.x (2016-09-26)

  • Sparkling Water 2.0 brings support of Spark 2.0.
  • For detailed changelog, please read rel-2.0/CHANGELOG.

v1.6.x (2016-03-15)

  • Sparkling Water 1.6 brings support of Spark 1.6.
  • For detailed changelog, please read rel-1.6/CHANGELOG.

v1.5.x (2015-09-28)

  • Sparkling Water 1.5 brings support of Spark 1.5.
  • For detailed changelog, please read rel-1.5/CHANGELOG.

v1.4.x (2015-07-06)

  • Sparkling Water 1.4 brings support of Spark 1.4.
  • For detailed changelog, please read rel-1.4/CHANGELOG.

v1.3.x (2015-05-25)

  • Sparkling Water 1.3 brings support of Spark 1.3.
  • For detailed changelog, please read rel-1.3/CHANGELOG.

v1.2.x (2015-05-18) and older

  • Sparkling Water 1.2 brings support of Spark 1.2.
  • For detailed changelog, please read rel-1.2/CHANGELOG.