Change Log
v2.1.55 (2019-06-03)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/55/index.html
- Bug
- SW-1259 - Unify ratio param across pipeline api
- SW-1287 - Use RPC endpoints to orchestrate cloud in internal mode
- SW-1290 - Fix doc
- SW-1301 - Fix class-loading for Sparkling Water assembly JAR in PySparkling
- SW-1311 - Add numpy as PySparkling dependency (it is required because of Spark but missing from the list of dependencies)
- SW-1312 - Warn that default value of convertUnknownCategoricalLevelsToNa will be changed to false on GridSearch & AutoML
- SW-1316 - Fix wrong fat jar name
- Task
- SW-1292 - Benchmarks: Subproject Skeleton
- Improvement
- SW-1212 - Make sure python zip/wheel is downloadable from our release s3
- SW-1274 - On download page -> list all supported minor versions
- SW-1286 - Remove Param propagation of MOJOModels from Python to Java
- SW-1288 - H2OCommonParams in pysparkling
- SW-1289 - Move shared params to H2OCommonParams
- SW-1298 - Don't use deprecated methods
- SW-1299 - Warn user that default value of predictionCol on H2OMOJOModel will change in the next major release to 'prediction'
- SW-1300 - Upgrade to H2O 3.24.0.4
- SW-1304 - Definition of assembly jar via transitive exclusions
- SW-1305 - Move ability to change behavior of MOJO models to MOJOLoader
- SW-1306 - Move make-dist logic to gradle
- SW-1307 - Expose binary model in spark pipeline stage
- SW-1309 - Fix xgboost doc
- SW-1313 - Rename the 'create_from_mojo' method of H2OMOJOModel and H2OMOJOPipelineModel to 'createFromMojo'
v2.1.54 (2019-05-17)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/54/index.html
- Bug
- SW-1256 - Fix constructor of H2OMojoModel
- SW-1258 - Remove internal constructors & Deprecate implicit constructor parameters for H2O Algo Spark Estimators (to be the same as in PySparkling)
- SW-1270 - Fix version check in PySparkling shell
- SW-1278 - Clean workspace on the hadoop node in integ tests
- SW-1279 - Fix inconsistencies between H2OAutoML, H2OGridSearch & H2OAlgorithm
- SW-1281 - Fix bad representation of predictionCol on H2OMOJOModel
- SW-1282 - XGBoost can't be used in H2OGridSearch pipeline wrapper
- SW-1283 - Correctly return mojo model in pysparkling after fit
- Story
- New Feature
- Improvement
- SW-369 - Override spark locality so we use only nodes on which h2o is running.
- SW-1216 - Improve PySparkling README
- SW-1261 - Remove binary H2O model from ML pipelines
- SW-1263 - Don't require initializer call to be called during pysparkling pipelines
- SW-1264 - Use default params reader in pipelines
- SW-1268 - Non-named columns have long been deprecated. Switch to named columns by default
- SW-1269 - Remove six as dependency from PySparkling launcher (six is no longer a dependency)
- SW-1275 - Remove unnecessary constructor in helper class
- SW-1280 - Add predictionCol to mojo pipeline model
v2.1.53 (2019-04-26)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/53/index.html
- Bug
- SW-1189 - Fix Sparkling Water 2.1.x compile on Scala 2.10
- SW-1194 - RSparkling Can't be used on Spark 2.4
- SW-1195 - Disable gradle daemon via gradle.properties
- SW-1196 - Fix org.apache.spark.ml.spark.models.PipelinePredictionTest
- SW-1203 - Custom metric not evaluated in internal mode of Sparkling Water
- SW-1227 - Change get-extended-jar to use https instead of http
- SW-1230 - Fix typo in GLM API - getRemoteCollinearColumns, setRemoteCollinearColumns
- SW-1232 - Fix RUnits after upgrading to Gradle 5.3.1
- SW-1234 - Deprecate asDataFrame with implicit argument
- Story
- New Feature
- SW-1193 - Expose weights_column parameter
- Improvement
- SW-1188 - RSparkling: Add ability to add authentication details when calling h2o_context(sc)
- SW-1190 - Improve hint description for disabling automatic usage of broadcast joins
- SW-1199 - Improve memory efficiency of H2OMOJOPipelineModel
- SW-1202 - Simplify Sparkling Water build
- SW-1204 - Fix formatting in python tests
- SW-1208 - Create pysparkling tests report file if it does not exist
- SW-1210 - Add fold column to python and scala pipelines
- SW-1211 - Automatically download H2O Wheel
- SW-1213 - Upgrade to H2O 3.24.0.2
- SW-1214 - Remove PySparkling six dependency as it was removed in H2O
- SW-1215 - Automatically generate PySparkling README
- SW-1217 - Automatically generate last pieces of doc subproject
- SW-1219 - Remove support for testing external cluster in manual mode
- SW-1221 - Remove unnecessary branch check
- SW-1222 - Remove duplicate readme file (contains old info & the correct info is in doc)
- SW-1223 - Remove confusing meetup dir
- SW-1224 - Upgrade to Gradle 5.3.1
- SW-1228 - Rename the 'ignoredColumns' parameter of H2OAutoML to 'ignoredCols'
- SW-1229 - Remove dependencies to Scala 2.10
- SW-1235 - Remove support for Python 2.6 on rel-2.1
- SW-1236 - Reformat few python classes
- SW-1238 - Parametrize EMR version in templates generation
- SW-1239 - Remove old README and DEVEL doc files (not just pointer to new doc)
- SW-1240 - Use minSupportedJava for source and target compatibility in build.gradle
v2.1.52 (2019-04-03)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/52/index.html
- Bug
- SW-1162 - Exception when there is a column with BOOLEAN type in dataset during H2OMOJOModel transformation
- SW-1177 - In PySparkling script, setting --driver-class-path influences the environment
- SW-1178 - Upgrade to H2O 3.24.0.1
- SW-1180 - Use specific metrics in grid search, in the same way as H2O Grid
- SW-1181 - Document off heap memory configuration for Spark in Standalone mode/IBM conductor
- SW-1182 - Fix random project name generation in H2OAutoML Spark Wrapper
- New Feature
- Improvement
v2.1.51 (2019-03-15)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/51/index.html
v2.1.50 (2019-03-07)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/50/index.html
- Bug
- SW-1150 - hc.stop() shows 'exit' not defined error
- SW-1152 - Fix RSparkling in case the jars are being fetched from maven
- SW-1156 - H2OXgboost pipeline stage does not define updateH2OParams method
- SW-1159 - Unique project name in automl to avoid sharing one leaderboard
- SW-1161 - Fix grid search pipeline step on pyspark side
- Improvement
- SW-1052 - Document Terraform scripts for AWS
- SW-1089 - Document using Google Cloud Storage In Sparkling Water
- SW-1135 - Speed up conversion between sparse spark vectors and h2o frames by using sparse new chunk
- SW-1141 - Improve terraform templates for AWS EMR and make them part of the release process
- SW-1149 - Allow login via ssh to created cluster using terraform
- SW-1153 - Add H2OGridSearch pipeline stage to PySpark
- SW-1155 - Test GBM Grid Search Scala pipeline step
- SW-1158 - Generalize H2OGridSearch Pipeline step to support other available algos
- SW-1160 - Upgrade to H2O 3.22.1.5
v2.1.49 (2019-02-18)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/49/index.html
- Bug
- Improvement
v2.1.48 (2019-01-29)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/48/index.html
- Bug
- SW-1133 - Upgrade to H2O 3.22.1.3
v2.1.47 (2019-01-21)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/47/index.html
v2.1.46 (2019-01-17)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/46/index.html
v2.1.45 (2019-01-08)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/45/index.html
- Bug
- Task
- SW-1109 - Docs: Change copyright year in docs to include 2019
- Improvement
v2.1.44 (2018-12-27)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/44/index.html
- Bug
- Improvement
- SW-464 - Publish PySparkling as conda package
- SW-1080 - Fix deprecation warning regarding automl -> AutoML
- SW-1092 - Updates to streaming app
- SW-1093 - Update to H2O 3.22.0.3
- SW-1094 - Upgrade gradle to 4.10.3
- SW-1095 - Enable GCS in Sparkling Water
- SW-1097 - Properly integrate GCS with Sparkling Water, including test in PySparkling
- Docs
- SW-1083 - Add Installation and Starting instructions to the docs
v2.1.43 (2018-11-27)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/43/index.html
- Improvement
- SW-1078 - Upgrade H2O to 3.22.0.2
v2.1.42 (2018-10-27)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/42/index.html
v2.1.41 (2018-10-17)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/41/index.html
- Bug
- Improvement
- SW-1057 - Sparkling shell ignores parameters after last updates
- SW-1058 - Automatic detection of client ip in external backend
- SW-1059 - Pysparkling in external backend, manual mode stops the backend cluster, but the cluster should be left intact
- SW-1060 - Create nightly release for 2.1, 2.2 and 2.3
- SW-1061 - Upgrade to Mojo 0.3.15
- SW-1062 - Don't expose mojo internal types
- SW-1063 - More explicit checks for valid values of Backend mode and external backend start mode
- SW-1064 - Expose run_as_user for External H2O Backend
- SW-1069 - Upgrade H2O to 3.20.0.10
v2.1.40 (2018-10-02)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/40/index.html
- Bug
- Improvement
v2.1.39 (2018-09-24)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/39/index.html
- New Feature
- Improvement
- SW-1024 - Remove call to ./gradlew --help in jenkins pipeline
- SW-1025 - Ensure that release does not depend on build id
- SW-1026 - Automatically update master after RSparkling release with latest version
- SW-1030 - [RSparkling] In case only path to SW jar file is specified, discover the version from JAR file instead of requiring it as parameter
- SW-1031 - Enable installation of RSparkling using devtools from Github repo
- SW-1032 - Upgrade mojo pipeline to 0.13.2
- SW-1033 - Document automatic certificate creation for Flow UI
- SW-1034 - PySparkling fails if we specify https argument as part of getOrCreate()
- SW-1035 - Document using s3a and s3n on Sparkling Water
- SW-1036 - Upgrade to H2O 3.20.0.8
- SW-1038 - The shell script bin/pysparkling should print missing dependencies
- SW-1039 - Upgrade Gradle to 4.10.2
- Docs
- SW-1018 - Fix link to Installing RSparkling on Windows
v2.1.38 (2018-09-14)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/38/index.html
v2.1.37 (2018-08-28)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/37/index.html
- Bug
- SW-270 - Add test for RDD[TimeStamp] -> H2OFrame[Time] -> RDD[Timestamp] conversion
- SW-319 - SVMModelTest is failing
- SW-986 - Fix links on RSparkling Readme page
- SW-996 - Fix typos in documentation
- SW-997 - Fix javadoc on JavaH2OContext
- SW-1000 - Setting context path in pysparkling fails to launch h2o
- SW-1001 - RSparkling does not respect context path
- SW-1002 - Automatically generate the keystore for H2O Flow ssl (self-signed certificates)
- SW-1003 - When running in Local mode, we ignore some configuration
- SW-1004 - Fix context path value checks
- SW-1005 - Use correct scheme in sparkling water when ssl on flow is enabled
- SW-1006 - Fix context path setting on RSparkling
- SW-1015 - Add context path after value of spark.ext.h2o.client.flow.baseurl.override when specified
- New Feature
- Task
- SW-988 - Add to docs that pysparkling has a new dependency pyspark
- Improvement
- SW-175 - JavaH2OContext#asRDD implementation is missing
- SW-920 - Sparkling Water/RSparkling needs to declare additional repository
- SW-989 - Improve Scala Doc API of the support classes
- SW-991 - Update Gradle Sphinx libraries - faster documentation builds
- SW-992 - Create abstract class for creating parameters from Enum for Sparkling Water pipelines
- SW-993 - [PySparkling] Fix Wrong H2O version detection on latest bundled H2Os
- SW-994 - Add timeouts & retries for docker pull
- SW-998 - Document using PySparkling on the edge node (EMR)
- SW-1007 - Upgrade H2O to 3.20.0.6
- SW-1011 - Fix EMR bootstrap scripts
- SW-1013 - Add option which can be used to change the flow address which is printed out after H2OContext started
- SW-1014 - Document how to run Sparkling Water on kerberized cluster
v2.1.36 (2018-08-09)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/36/index.html
- Bug
- SW-971 - Change maintainer of RSparkling to jakub@h2o.ai
- SW-972 - Fix Content of RSparkling release table
- SW-973 - Allow passing custom jars when running ./bin/sparkling/shell
- SW-975 - Fix CRAN issues of RSparkling
- SW-981 - Fix wrong comparison of versions when detecting other h2o versions in PySparkling
- SW-982 - Set up client_disconnect_timeout correctly in context on External backend, auto mode
- SW-983 - Fix missing mojo impl artifact when running pysparkling tests in jenkins
- Task
- SW-633 - Add to doc that 100 columns are displayed in the preview data by default
- Improvement
- SW-528 - Update PySparkling Notebooks to work for Python 3
- SW-548 - List nodes and driver memory in Spark UI - Sparkling Water Tab
- SW-910 - Use Mojo Pipeline API in Sparkling Water
- SW-969 - Port documentation for mojo pipeline on Spark to SW repo
- SW-970 - Upgrade Mojo 2 in SW to 0.11.0
- SW-976 - Upgrade H2O to 3.20.0.5
- SW-977 - Need ability to disable Flow UI for Sparkling-Water
- SW-979 - Verify that we are running on correct Spark for PySparkling at init time
- SW-984 - Cache also test and runtime dependencies in docker image
- Docs
- SW-946 - Add "How to" for using Sparkling Water on Google Cloud Dataproc
v2.1.35 (2018-08-01)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/35/index.html
- Bug
- SW-903 - Automate releases of RSparkling and create release pipeline for this release process
- SW-911 - Add missing repository to the documentation
- SW-944 - Fix Sphinx gradle plugin, the latest version does not work
- SW-945 - Stabilize releasing to Nexus Repository
- SW-953 - Do not stop external H2O backend in case of manual start mode
- SW-958 - Fix RSparkling README style issues
- SW-959 - Fix address for fetching H2O R package in nightly tests
- SW-961 - Add option to ignore SPARK_PUBLIC_DNS
- SW-962 - Add option which ensures that items in flatfile are translated to IP address
- SW-967 - Deprecate old behaviour of mojo pipeline output in SW
- Improvement
- SW-233 - Warn if user's h2o in python env is different than the one bundled in pysparkling
- SW-921 - Move RSparkling to Sparkling Water repo
- SW-941 - Upgrade Gradle to 4.9
- SW-952 - Fix issues when stopping Sparkling Water (Scala) in yarn-cluster mode for external Backend
- SW-957 - RSparkling should run tests in both, external and internal mode
- SW-963 - Upgrade H2O to 3.20.0.4
- SW-965 - Expose port offset in Sparkling Water
- SW-968 - Remove confusing message about stopping H2OContext in PySparkling
v2.1.34 (2018-07-16)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/34/index.html
- Bug
- New Feature
- SW-826 - Implement Synchronous and Asynchronous Scala cell behaviour
- Improvement
- SW-846 - Don't parse types again when passing data to mojo pipeline
- SW-886 - Several Scala cell improvements in H2O flow
- SW-887 - Make sure that we can use schemes unsupported by H2O in H2O Configuration
- SW-889 - Port AWS preparation scripts into SW codebase
- SW-894 - Add support for queuing of Scala cell jobs
- SW-914 - Wrong Spark version in documentation
- SW-915 - Upgrade to Spark 2.1.3
- SW-917 - Dockerize Sparkling Water release pipeline
- SW-919 - Clean gradle build with regards to mojo2
- SW-922 - Upgrade H2O to 3.20.0.3
- SW-928 - Expose AutoML max models
- Docs
- SW-878 - Add section for using Sparkling Water with AWS
v2.1.33 (2018-06-18)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/33/index.html
- Improvement
- SW-885 - Upgrade H2O to 3.20.0.2
v2.1.32 (2018-06-18)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/32/index.html
- Bug
- SW-861 - Upgrade Gradle to 4.8 (publishing plugin)
- SW-872 - Fix reference to local-cluster on download page
- SW-880 - Update Hadoop version on download page
- SW-881 - Fix Script tests on Dockerized Jenkins infrastructure
- SW-882 - Call h2oContext.stop after ham or spam Scala example
- SW-883 - Add missing description in publish.gradle
- Improvement
v2.1.31 (2018-06-13)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/31/index.html
- Bug
- SW-850 - Expose methods to get input/output names in H2OMOJOPipelineModel
- SW-859 - Print Warning when spark-home is defined on PATH
- SW-862 - Create & fix test in PySparkling for named mojo columns
- SW-864 - Fix & more readable test
- SW-865 - Better Naming of the UDF method to obtain predictions
- SW-869 - Add repository to build required by xgboost-predictor
- Story
- SW-856 - Upgrade Mojo2 to latest version
- Improvement
- SW-839 - Verify that Spark time column representation can be digested by Mojo2
- SW-848 - Document Kerberos on Sparkling Water
- SW-849 - Update use from maven on sparkling water download page
- SW-851 - Make use of output types when creating Spark DataFrame out of mojo2 predicted values
- SW-852 - Create spark UDF used to extract predicted values
- SW-853 - Sparkling Water py should require pyspark dependency
- SW-854 - Upgrade MojoPipeline to 0.10.0
- SW-855 - Upgrade H2O to 3.18.0.11
v2.1.30 (2018-05-23)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/30/index.html
v2.1.29 (2018-05-18)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/29/index.html
v2.1.28 (2018-05-15)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/28/index.html
- Bug
- Improvement
- SW-829 - Type checking in PySparkling pipelines
- SW-832 - Small refactoring in identifiers
- SW-833 - Explicitly set source and target java versions
- SW-837 - Upgrade H2O to 3.18.0.9
- SW-838 - Upgrade Mojo pipeline dependency to 0.9.8
- SW-840 - Add test checking column names and types between spark and mojo2
v2.1.27 (2018-05-02)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/27/index.html
- Bug
- New Feature
- SW-827 - Change color highlight in scala cell as it is too dark
- Improvement
- SW-815 - Upgrade H2O to 3.18.0.8
- SW-816 - Update Mojo2 dependency to one which is compatible with Java7
- SW-818 - Spark Pipeline imports do not work in PySparkling
- SW-819 - Add ability to convert specific columns to categoricals in Sparkling Water pipelines
- SW-820 - Sparkling Water pipelines add duplicate response column to the list of features
v2.1.26 (2018-04-19)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/26/index.html
- Bug
- SW-672 - Enable using sparkling water maven packages in databricks cloud
- SW-787 - Documentation fixes
- SW-790 - Add missing seed argument to H2OAutoml pipeline step
- SW-794 - Point to proper web-based docs
- SW-796 - Use parquet provided by Spark
- SW-797 - Automatically update redirect table as part of release pipeline
- SW-806 - Fix exporting and importing of pipeline steps and mojo models to and from HDFS
- Improvement
- SW-772 - Integrate & Test Mojo Pipeline with Sparkling Water
- SW-789 - Upgrade H2O to 3.18.0.7
- SW-791 - Expose context_path in Sparkling Water
- SW-793 - Create additional test verifying that the new light endpoint works as expected
- SW-798 - Additional link to documentation
- SW-800 - Remove references to Sparkling Water 2.0
- SW-804 - Reduce time of H2OAutoml step in pipeline tests to 1 minute
- SW-808 - Upgrade to Gradle 4.7
v2.1.25 (2018-03-29)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/25/index.html
- Bug
- SW-696 - Intermittent script test issue on external backend
- SW-726 - Mark Spark dependencies as provided on artefacts published to maven
- SW-740 - Increase timeout for conversion in pyunit test for external cluster
- SW-760 - Fix doc artefact publication
- SW-763 - Remove support for downloading H2O logs from Spark UI
- SW-766 - Fix coding style issue
- SW-769 - Fix import
- SW-776 - sparkling water from maven does not know the stacktrace_collector_interval option
- SW-778 - Handle nulls properly in H2OMojoModel
- New Feature
- SW-722 - [PySparkling] Check for correct data type as part of as_h2o_frame
- Improvement
- SW-733 - Parametrize pipeline scripts to be able to specify different algorithms
- SW-746 - Log chunk layout after the conversion of data to external H2O cluster
- SW-755 - Document GBM Grid Search Pipeline Step
- SW-765 - Remove test artefacts from the sparkling-water assembly
- SW-768 - Add missing import
- SW-773 - Don't use default value for output dir in external backend, it's not required
- SW-780 - Upgrade H2O to 3.18.0.5
- Docs
- SW-775 - Fix link for documentation on DEVEL.md
v2.1.24 (2018-03-08)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/24/index.html
- Bug
- SW-739 - Sparkling Water Doc artefact is still missing Scala version
- SW-742 - Fix setting up node network mask on external cluster
- SW-743 - Allow to set LDAP and different security options in external backend as well
- SW-747 - Fix bug in documentation for manual mode of external backend
- SW-757 - Fix tests after enabling the stack-trace collection
- Improvement
- SW-744 - Document how to use Sparkling Water with LDAP in Sparkling Water docs
- SW-745 - Expose Grid search as Spark pipeline step in the Scala API
- SW-748 - Upgrade to Gradle 4.6
- SW-752 - Collect stack traces on each h2o node as part of log collecting extension
- SW-754 - Upgrade H2O to 3.18.0.3
- SW-756 - Upgrade H2O to 3.18.0.4
- Docs
- SW-753 - Add "How to" for changing the default H2O port
v2.1.23 (2018-02-26)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/23/index.html
- Bug
- SW-723 - Sparkling water doc artefact is missing scala version
- SW-727 - Improve method for downloading H2O logs
- SW-728 - Use new light endpoint introduced in 3.18.0.1
- SW-734 - Make sure we use the unique key names in split method
- SW-736 - Document how to download logs on Databricks cluster
- SW-737 - Expose downloadH2OLogs on H2OContext in PySparkling
- SW-738 - Move spark.ext.h2o.node.network.mask setter to SharedArguments
- Improvement
v2.1.22 (2018-02-14)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/22/index.html
- Technical task
- SW-652 - Deliver SW documentation in HTML output
- Bug
- SW-685 - Fix Typo in documentation
- SW-695 - Make printHadoopDistributions gradle task available again for testing
- SW-701 - Kill the client when one of the h2o nodes went OOM in external mode
- SW-706 - Fix pysparkling.ml import for non-interactive sessions
- SW-707 - parquet import fails on HDP with Spark 2.0 (azure hdi cluster)
- SW-708 - Make sure H2OMojoModel does not require H2OContext to be initialized
- SW-709 - Fix mojo predictions tests
- SW-710 - In PySparkling pipelines, ensure that if users pass integer to double type we handle that correctly for all possible double values
- SW-713 - Write a simple test for parquet import in Sparkling Water
- SW-714 - Add option to H2OModel pipeline step allowing us to convert unknown categoricals to NAs
- SW-715 - Fix driverif configuration on the external backend
- Improvement
- SW-606 - Verify & Document run of RSparkling on top of Databricks Azure cluster
- SW-678 - Document how to change log location
- SW-683 - H2OContext can't be initialized on Databricks cloud
- SW-686 - Fix typo in documentation
- SW-687 - Upgrade Gradle to 4.5
- SW-688 - Update docs - SparklyR supports Spark 2.2.1 in the latest release
- SW-690 - Log Sparkling Water version during startup of Sparkling Water
- SW-693 - Allow to set driverIf on external H2O backend
- SW-694 - Fix creation of Extended JAR in gradle task
- SW-700 - Report Yarn App ID of spark application in H2OContext
- SW-703 - Upload generated sphinx documentation to S3
- SW-704 - Update links on the download page to point to the new docs
- SW-705 - Increase memory for JUNIT tests
- SW-718 - Upgrade to Gradle 4.5.1
- SW-719 - Upgrade to H2O 3.18.0.1
- SW-720 - Fix parquet import test on external backend
- Docs
v2.1.21 (2018-01-18)
Download at: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-2.1/21/index.html
- Bug
- SW-273 - Remove workaround introduced by SW-272 for yarn/cluster mode
- SW-551 - Remove hotfix introduced by [SW-541] and implement proper fix
- SW-662 - Remove extra files that got into the repo
- SW-666 - Kill the cluster when a new executor joins in the internal backend
- SW-668 - Generate download link as part of the release notes
- SW-669 - Remove mentions of local-cluster in public docs
- SW-670 - Deprecated call in H2OContextInitDemo
- SW-671 - Fix jenkinsfile for builds against specific h2o branches
- Improvement
v2.1.20 (2018-01-03)
- Bug
- SW-627 - [PySparkling] calling as_spark_frame for the second time results in exception
- SW-630 - Fix ham or spam flow to reflect latest changes in pipelines
- SW-631 - Ensure that we do not access RDDs in pipelines (to unblock the deployment)
- SW-646 - Fix inconsistencies in ham or spam examples between scala and python
- SW-648 - Fix ham or spam pipeline tests
- SW-649 - Fix ham or spam tests for deeplearning pipeline
- SW-661 - Always use the correct Spark version on the R download page
- Improvement
- SW-608 - Measure time of conversions to H2OFrame in debug mode
- SW-612 - Port all arguments available to Scala ML to PySparkling ML
- SW-617 - Support for exporting mojo to hdfs
- SW-632 - Dump full spark configuration during H2OContext.getOrCreate into DEBUG
- SW-635 - Fix wrong instruction at PySparkling download page
- SW-637 - Create new DataFrame with new schema only when it actually contains a dot in names
- SW-638 - Port release script into the sw repo
- SW-639 - Use persist layer for exportPOJOModel
- SW-640 - Export H2OMOJOModel.createFromMOJO to pysparkling
- SW-642 - Create test for mojo predictions in PySparkling
- SW-643 - Add tests for H2ODeeplearning in Scala and Python and Fix potential problems
- SW-644 - Log spark configuration to INFO level
- SW-650 - Upgrade Gradle to 4.4.1
- SW-656 - Upgrade ShadowJar to 2.0.2
v2.1.19 (2017-12-11)
- Bug
- SW-615 - pysparkling.__version__ incorrectly returns ‘SUBST_PROJECT_VERSION’
- SW-616 - PySparkling fails on Python 3.6 because the long type does not exist in Python 3.6
- SW-621 - PySparkling failing on Python 3.6
- SW-624 - Python build does not support H2O_PYTHON_WHEEL when building against h2o older than 3.16.0.1
- SW-628 - PySparkling fails when installed from pypi
- Improvement
- SW-626 - Upgrade Gradle to 4.4
v2.1.18 (2017-12-01)
v2.1.17 (2017-11-25)
- Bug
- SW-320 - H2OConfTest Python test blocks test run
- SW-499 - BinaryType handling is not implemented in SparkDataFrameConverter
- SW-535 - asH2OFrame gives error if column names have DOT in it
- SW-547 - Don’t use md5skip in external mode
- SW-569 - pysparkling: h2o on exit does not shut down cleanly
- SW-572 - Additional fix for [SW-571]
- SW-573 - Minor Gradle build improvements and fixes
- SW-575 - Incorrect comment in hamOrSpamMojo pipeline
- SW-576 - Cleanup pysparkling test infrastructure
- SW-577 - Fix conditions in jenkins file
- SW-580 - Fix composite build in Jenkins
- SW-581 - Fix H2OConf test on external cluster
- SW-582 - Opening Chicago Crime Demo Notebook errors on the first opening
- SW-584 - Create extended directory automatically
- SW-588 - Fix links in README
- SW-589 - Wrap stages in try finally in jenkins file
- SW-592 - Properly pass all parameters to algorithm
- SW-593 - H2Conf cannot be initialized on windows
- SW-594 - Gradle ml submodule reports success even though tests fail
- SW-595 - Fix ML tests
- New Feature
- SW-519 - Introduce SW Models into Spark python pipelines
- Task
- SW-609 - Upgrade H2O dependency to 3.16.0.1
- Improvement
- SW-318 - Keep H2O version inside sparkling-water-core.jar and provide utility to query it
- SW-420 - Shell scripts misleading error message
- SW-504 - Provides Sparkling Water Spark Uber package which can be used in --packages
- SW-570 - Stop previous jobs in jenkins in case of PR
- SW-571 - In PySparkling, getOrCreate(spark) still incorrectly complains that we should use spark session
- SW-583 - Upgrade to Gradle 4.3
- SW-585 - Add the custom commit status for internal and external pipelines
- SW-586 - [ML] Remove some duplication, enable mojo for deep learning
- SW-590 - Replace deprecated method call in ChicagoCrime python example
- SW-591 - Repl doesn’t require H2O dependencies to compile
- SW-596 - Minor build improvements
- SW-603 - Upgrade Gradle to 4.3.1
- SW-605 - addFiles doesn’t accept sparkSession
- SW-610 - Change default client mode to INFO, let user to change it at runtime
v2.1.16 (2017-10-23)
- Bug
- SW-555 - Fix documentation issue in PySparkling
- SW-558 - Increase default value for client connection retry timeout in
- SW-560 - SW documentation for nthreads is inconsistent with code
- SW-561 - Fix reporting artefacts in Jenkins and remove use of h2o-3-shared-lib
- SW-564 - Clean test workspace in jenkins
- SW-565 - Fix creation of extended jar in jenkins
- SW-567 - Fix failing tests on external backend
- SW-568 - Remove obsolete and failing idea configuration
- SW-559 - GLM fails to build model when weights are specified
- Improvement
- SW-557 - Create 2 jenkins files (for internal and external backend) backed by configurable pipeline
- SW-562 - Disable web on external H2O nodes in external cluster mode
- SW-563 - In external cluster mode, print also YARN job ID of the external cluster once context is available
- SW-566 - Upgrade H2O to 3.14.0.7
- SW-553 - Improve handling of sparse vectors in internal cluster
v2.1.15 (2017-10-10)
- Bug
- SW-423 - Tests of External Cluster mode fails
- SW-516 - External cluster improperly converts RDD[ml.linalg.Vector]
- SW-525 - Don’t use GPU nodes for sparkling water testing in Jenkins
- SW-526 - Add missing when clause to scripts test stage in Jenkinsfile
- SW-527 - Use dX cluster for Jenkins testing
- SW-529 - Code defect in Scala example
- SW-531 - Use code which is compatible between Scala 2.10 and 2.11
- SW-532 - Make auto mode in external cluster default for tests in jenkins
- SW-534 - Ensure that all tests run on both, internal and external backends
- SW-536 - Allow to test sparkling water against specific h2o branch
- SW-537 - Update Gradle to 4.2RC2
- SW-538 - Fix problem in Jenkinsfile where H2O_HOME has higher priority than H2O_PYTHON_WHEEL
- SW-539 - Fix PySparkling issue when running multiple times on the same node
- SW-541 - Model training hangs in SW
- SW-542 - sw does not support parquet import
- SW-552 - Fix documentation bug
- New Feature
- Improvement
v2.1.13 (2017-08-02)
- Bug
- Improvement
- SW-355 - Include H2O R client distribution in Sparkling Water binary
- SW-500 - Warehouse dir does not have to be set in tests on Spark from 2.1+
- SW-506 - Documentation for the backends should mention get-extended-h2o.sh instead of manual jar extending
- SW-507 - Upgrade to Gradle 4.0.2
- SW-508 - More robust get-extended-h2o.sh
- SW-509 - Add back DEVEL.md and CHANGELOG.md and redirect to new versions
v2.1.12 (2017-07-17)
- Improvement
- SW-490 - Upgrade Gradle to 4.0.1
- SW-491 - Increase default value for Write and Read confirmation timeout
- SW-492 - Remove dead code and deprecation warning in tests
- SW-493 - Enforce Scala Style rules
- SW-494 - Remove hard dependency on RequestServer by using RestApiContext
- SW-496 - Remove ignored empty “H2OFrame[Time] to DataFrame[TimeStamp]” test
- SW-498 - Upgrade H2O to 3.10.5.4
v2.1.11 (2017-07-12)
v2.1.10 (2017-06-29)
- Bug
- SW-469 - Remove accidentally added kerb.conf file
- SW-470 - Allow to pass sparkSession to Security.enableSSL and deprecate sparkContext
- SW-474 - Use deprecated HTTPClient as some CDH versions do not have the new method
- SW-475 - Handle duke library in case it’s loaded using --packages
- SW-479 - Fix CHANGELOG location in make-dist.sh
- Improvement
- SW-457 - Clean up windows scripts
- SW-466 - Separate Devel.md into multiple rst files
- SW-472 - Convert to rst README in gradle dir
- SW-473 - Upgrade to gradle 4.0
- SW-477 - Upgrade H2O to 3.10.5.2
- SW-480 - Bring back publishToMavenLocal task
- SW-482 - Updates to change log location
- SW-483 - Make rel-2.1 changelog consistent and also rst
v2.1.9 (2017-06-15)¶
- Technical task
- SW-211 - In PySparkling for spark 2.0 document how to build the package
- Bug
- SW-448 - Add missing jar into the assembly
- SW-450 - Fix instructions on the download site
- SW-453 - Use size method to get attr num
- SW-454 - Replace sparkSession with spark in backends documentation
- SW-456 - Make shell scripts safe
- SW-459 - Update PySparkling run-time dependencies
- SW-461 - Fix wrong getters and setters in pysparkling
- SW-467 - Fix typo in the FAQ documentation
- SW-468 - Fix make-dist
- New Feature
- SW-455 - Replace the remaining references to egg files
- Improvement
- SW-24 - Append tab on Sparkling Water download page - how to use Sparkling Water package
- SW-111 - Update FAQ with information about hive metastore location
- SW-112 - Sparkling Water Tuning doc: add heartbeat documentation
- SW-311 - Please report Application Type to Yarn Resource Manager
- SW-340 - Improve structure of SW README
- SW-426 - Allow to download sparkling water logs from the spark UI
- SW-444 - Remove references to Spark 1.5, 1.4 (as they are old) in README.rst and other docs
- SW-447 - Upgrade H2O to 3.10.5.1
- SW-452 - Add missing spaces after “,” in H2OContextImplicits
- SW-460 - Allow to configure flow dir location in SW
- SW-463 - Extract sparkling water configuration to extra doc in rst format
- SW-465 - Mark tensorflow demo as experimental
v2.1.8 (2017-05-25)¶
- Bug
- SW-263 - Cannot run build in parallel because of Python module
- SW-336 - Wrong documentation of PyPi h2o_pysparkling_2.0 package
- SW-430 - pysparkling: adding a column to a data frame does not work when the original frame is parsed in Spark
- SW-431 - Allow to pass additional arguments to run-python-script.sh
- SW-436 - Fix getting of sparkling water jar in pysparkling
- SW-437 - Don’t call atexit in case of pysparkling in cluster deploy mode
- SW-438 - store h2o logs into unique directories
- SW-439 - handle interrupted exception in H2ORuntimeInfoUIThread
- SW-335 - Cannot install pysparkling from PyPi
- Improvement
- SW-445 - Remove information from README.rst that pip cannot be used
- SW-341 - Support Python 3 distribution
- SW-380 - Define Jenkins pipeline via Jenkinsfile
- SW-433 - Add change logs link to the sw download page
- SW-435 - Upgrade shadow jar plugin to 2.0.0
- SW-440 - Sparkling Water cluster name should contain spark app id instead of random number
- SW-441 - Replace deprecated DefaultHTTPClient in AnnouncementService
- SW-442 - Get array size from metadata in case of ml.linalg.VectorUDT
- SW-443 - Upgrade H2O version to 3.10.4.8
v2.1.7 (2017-05-10)¶
- Bug
- SW-429 - Different cluster name between client and h2o nodes in case of external cluster
v2.1.6 (2017-05-09)¶
v2.1.5 (2017-04-27)¶
v2.1.4 (2017-04-20)¶
- Bug
- SW-65 - Add pysparkling instruction to download page
- SW-365 - Proper exit status handling of external cluster
- SW-398 - Use timeout for read/write confirmation in external cluster mode
- SW-400 - Fix stopping of H2OContext in case of running standalone application
- SW-401 - Add configuration property to external backend allowing to specify the maximal timeout the cloud will wait for watchdog client to connect
- SW-405 - Use correct quote in backend documentation
- SW-408 - Use kwargs for h2o.connect in pysparkling
- SW-409 - Fix stopping of python tests
- SW-410 - Honor --core Spark settings in H2O executors
- Improvement
- SW-231 - Sparkling Water download page is missing PySparkling/RSparkling info
- SW-404 - Upgrade H2O dependency to 3.10.4.4
- SW-406 - Download page should list available jars for external cluster.
- SW-411 - Migrate Pysparkling tests and examples to SparkSession
- SW-412 - Upgrade H2O dependency to 3.10.4.5
v2.1.3 (2017-04-07)¶
- Bug
- SW-334 - as_factor() ‘corrupts’ dataframe if it fails
- SW-353 - Kerberos for SW not loading JAAS module
- SW-364 - Repl session not set on scala 2.11
- SW-368 - bin/pysparkling.cmd is missing
- SW-371 - Fix MarkDown syntax
- SW-372 - Run negative test for PUBDEV-3808 multiple times to observe failure
- SW-375 - Documentation fix in external cluster manual
- SW-376 - Tests for DecimalType and DataType fail on external backend
- SW-377 - Implement stopping of external H2O cluster in external backend mode
- SW-383 - Update PySparkling README with info about SW-335 and using SW from Pypi
- SW-385 - Fix residual plot R code generator
- SW-386 - SW REPL cannot be used in combination with Spark Dataset
- SW-387 - Fix typo in setClientIp method
- SW-388 - Stop h2o when running inside standalone pysparkling job
- SW-389 - Extending h2o jar from SW doesn’t work when the jar is already downloaded
- SW-392 - Python in gradle is using wrong python - it doesn’t respect the PATH variable
- SW-393 - Allow to specify timeout for h2o cloud up in external backend mode
- SW-394 - Allow to specify log level to external h2o cluster
- SW-396 - Create setter in pysparkling conf for h2o client log level
- SW-397 - Better error message covering the most common case when cluster info file doesn’t exist
- Improvement
v2.1.2 (2017-03-20)¶
v2.1.1 (2017-03-18)¶
- Bug
- SW-308 - Intermittent failure in creating H2O cloud
- SW-321 - composite function fail when inner cbind()
- SW-342 - Environment detection does not work with Spark2.1
- SW-347 - Cannot start Sparkling Water at HDP Yarn cluster
- SW-349 - Sparkling Shell scripts for Windows do not work
- SW-350 - Fix command line environment for Windows
- SW-357 - PySparkling in Zeppelin environment using wrong class loader
- Improvement
- SW-333 - ApplicationMaster info in Yarn for external cluster
- SW-337 - Use h2o.connect in PySpark to connect to H2O cluster
- SW-345 - Create configuration manual for External cluster
- SW-356 - Improve documentation for spark.ext.h2o.fail.on.unsupported.spark.param
- SW-360 - Upgrade H2O dependency to 3.10.4.2
v2.1.0 (2017-03-02)¶
v2.0.x (2016-09-26)¶
- Sparkling Water 2.0 brings support of Spark 2.0.
- For detailed changelog, please read rel-2.0/CHANGELOG.
v1.6.x (2016-03-15)¶
- Sparkling Water 1.6 brings support of Spark 1.6.
- For detailed changelog, please read rel-1.6/CHANGELOG.
v1.5.x (2015-09-28)¶
- Sparkling Water 1.5 brings support of Spark 1.5.
- For detailed changelog, please read rel-1.5/CHANGELOG.
v1.4.x (2015-07-06)¶
- Sparkling Water 1.4 brings support of Spark 1.4.
- For detailed changelog, please read rel-1.4/CHANGELOG.
v1.3.x (2015-05-25)¶
- Sparkling Water 1.3 brings support of Spark 1.3.
- For detailed changelog, please read rel-1.3/CHANGELOG.
v1.2.x (2015-05-18) and older¶
- Sparkling Water 1.2 brings support of Spark 1.2.
- For detailed changelog, please read rel-1.2/CHANGELOG.