Change Log

v3.28.0.3-1 (2020-02-06)

Downloads:
  • Bug
    • SW-1559 - Cloud up of SW fails on EMR
    • SW-1650 - Sparkling Water forms an H2O cluster with only one node on Azure
    • SW-1830 - Support h2o3 mojo prediction in rsparkling
    • SW-1849 - Add missing 'rel-' prefix when suggesting correct H2O package to install in R
    • SW-1865 - Fix Typo in Backends Documentation
    • SW-1867 - Add Sparkling Water UI tab only in case the UI is enabled
    • SW-1871 - Use local maven in our test infra instead of --includeBuild
    • SW-1879 - Fix R tests
    • SW-1882 - Fix setNthreads method on H2OConf
    • SW-1887 - is_internal_secure_connections_enabled method needs to be in SharedBackendConf.py
    • SW-1895 - Fix jenkins pipeline so it can also run PRE_RELEASE_TESTS
  • Improvement
    • SW-1686 - Expose offset_column in XGBoost
    • SW-1790 - RSparkling in cran should be dummy code to point to our rsparkling in custom repo
    • SW-1828 - Ensure H2OContext in RSparkling is a class so we don't have to pass sc to methods asH2OFrame and asDataFrame
    • SW-1848 - Cleanup package.R in RSparkling
    • SW-1851 - [Proposal] Rename conversion methods to be consistent with other changes
    • SW-1860 - Fix ArrayIndexOutOfBoundsException on internal backend
    • SW-1861 - Remove extra import
    • SW-1863 - Mention in documentation that High Concurrency clusters are not yet supported
    • SW-1866 - Add option to specify full path to hadoop command
    • SW-1873 - Use ntrees instead of deprecated nEstimators on H2OXGBoost API
    • SW-1874 - Keep migration guide up-to-date
    • SW-1884 - Deprecate externalWriteConfirmationTimeout option
    • SW-1885 - Upgrade to H2O 3.28.0.3
    • SW-1889 - Make sure getters and setters on python ExternalBackendConf are consistent with scala counterpart
    • SW-1891 - Make sure getters and setters on python InternalBackendConf are consistent with scala counterpart
    • SW-1893 - Make sure getters and setters on python SharedBackendConf are consistent with scala counterpart
  • Engineering Story
    • SW-1852 - Add Tests Covering Scenarios with XGBoost and Offset Column

v3.28.0.2-1 (2020-01-23)

Downloads:
  • Bug
    • SW-1841 - Fix Examples in LDAP and Kerberos Tutorials
    • SW-1843 - The Second Call of H2OContext.getOrCreate Throws an Exception
  • New Feature
    • SW-1802 - Introduce stoppingRounds, stoppingMetric and stoppingTolerance Parameters on GBM, DRF, XGBoost and DeepLearning
    • SW-1835 - Enable to Specify Number of Partitions of Virtual Datasets Used in Benchmarks
  • Improvement
    • SW-1801 - Deprecate the r2stopping Parameter on GBM and DRF
    • SW-1826 - Deprecate using username and password in RSparkling in favor of the spark options used for this
    • SW-1829 - Remove strict version check argument in RSparkling
    • SW-1831 - Test Spark to H2O Conversions on Big Data
    • SW-1836 - Make Execution of Individual Backends Configurable in Benchmarks
    • SW-1842 - Upgrade to H2O 3.28.0.2
  • Engineering Story
    • SW-1838 - Move Model and Algorithm Tests to 'ai.h2o.sparkling.ml' Namespace
    • SW-1839 - Iterate over Transformed DataFrame in H2OFrameToDataFrameConversionBenchmark
  • Docs
    • SW-1832 - Update copyright year in conf.py file to include 2020

v3.28.0.1-1 (2019-12-19)

Downloads:
  • Bug
    • SW-1492 - [Spark-2.1] Switch minimal java version to Java 1.8
    • SW-1743 - Run rest api client tests only in external backend mode
    • SW-1747 - The option "-sw_ext_backend" must be enabled when REST-based client is used
    • SW-1748 - Try to fix NPE on Spark 2.1, 2.2 and 2.3 related to metadata
    • SW-1749 - Percentiles Are Not Propagated to Metadata
    • SW-1750 - Fix uploading nightlies when we build them against H2O branches
    • SW-1754 - H2OContext.getOrCreate Should Create Only One Cluster in Automatic Mode
    • SW-1762 - Fix unit_test_utils.assert_h2o_frames_are_identical for Python 2
    • SW-1771 - Fix documentation warnings
    • SW-1774 - Fix bug introduced in 3.26.11 - using flatfile for client connection in external backend
    • SW-1777 - Proper stopping of PySparkling in case of REST API
    • SW-1782 - Improve exception when the user is not authenticated in rest api client
    • SW-1789 - Fix build with latest AutoML changes
    • SW-1791 - Make ProxyStarter more robust
    • SW-1792 - Remove extra pre_create_hook and h2o_connect_hook parameters from H2OContext on PySparkling
    • SW-1793 - In rest api approach, auto mode, external backend keeps running even if we weren't able to authenticate
    • SW-1795 - RSparkling doc is missing library(rsparkling) step
    • SW-1798 - Check that timeout for pinging the backend is always smaller than the timeout for killing the external cluster
    • SW-1799 - Fix Invalid use of BasicClientConnManager error in client-less approach
    • SW-1800 - Add wait timeout to ConfigurationPropertiesTestSuite.testNotifyLocalPropertyCreatesFile
    • SW-1803 - Sparkling Water does not stop automatically in client-less mode when hc.stop is not defined
    • SW-1811 - Client needs to recognize itself as client
    • SW-1814 - Register shutdown hook after H2O is running to avoid NPE in case app is stopped during start of H2OContext
    • SW-1817 - Implement retry for rest api requests
  • Epic
    • SW-1529 - [PySparkling] Client Separation from Spark Driver
  • Story
    • SW-1647 - Convert PySpark DataFrame to H2OFrame without Client
  • New Feature
    • SW-1496 - Expose H2O-3 DRF in Sparkling Water
    • SW-1588 - Add DRF to grid search
    • SW-1606 - Create gradle task to create pysparkling docker image for kubernetes
    • SW-1607 - Create gradle task to create rsparkling docker image for kubernetes
    • SW-1689 - Document how to use Sparkling Water with Kubernetes
    • SW-1690 - Create script to generate kubernetes docker images
    • SW-1696 - Remove announcement service as it is not used
    • SW-1722 - Ensure external backend (rest api) is stopped in automatic mode if the spark app is killed (avoid zombie clusters)
    • SW-1724 - Create UDF for Ordinal Predictions
    • SW-1740 - Add parallelism option to GridSearch
  • Improvement
    • SW-1445 - Remove deprecated setters on algorithms which have an enum as argument
    • SW-1495 - Expose Number of Trees in SW MOJO
    • SW-1518 - Remove deprecated parameter colsampleBytree and related methods from H2OXGBoost
    • SW-1519 - Switch to single value in the predictionCol and put all the details on the detailedPredictionCol
    • SW-1545 - Use argumentbuilder to build arguments for the external h2o backend
    • SW-1572 - Remove deprecated option spark.ext.h2o.external.cluster.num.h2o.nodes and related methods
    • SW-1574 - Remove algos and features in deprecated org.apache.spark.h2o.ml.algos package
    • SW-1575 - Remove deprecated option spark.ext.h2o.external.read.confirmation.timeout and related getters and setters
    • SW-1586 - Remove deprecated getLambda & getAlpha getters and related setters
    • SW-1587 - Remove deprecated getter and setter SelectBestModelDecreasing on H2OGridSearch
    • SW-1628 - Create breaking changes document in doc listing the breaking changes so far in 3.28
    • SW-1636 - Enable to specify outputCols on H2OTargetEncoder
    • SW-1661 - Deprecate multicast search for cluster in external backend in manual mode
    • SW-1681 - Change sw version to include also patch within one h2o version
    • SW-1684 - Migration guide was missing several changes which are already resolved in 3.28
    • SW-1704 - Expose Extra Http Headers for H2O Nodes
    • SW-1729 - Remove deprecated option for ipBasedFlatfile
    • SW-1730 - Ensure worker nodes have always open Flow in the external Backend
    • SW-1758 - Reuse stopped field from scala backend
    • SW-1765 - Hide internal fields in H2OContext
    • SW-1766 - Move stacktrace extension to our sparkling water package ai.h2o.sparkling
    • SW-1768 - Get Default Values of AutoML Parameters synchronized with H2O-3
    • SW-1769 - Avoid Usage of AutoML Deprecations
    • SW-1773 - Mention in migration guide that worker nodes in manual mode need to have rest api available
    • SW-1775 - Add security tests for the client-less approach
    • SW-1776 - [TEST] Add test for download logs when using rest api client
    • SW-1780 - Deprecate set_h2o_driver_if, h2o_driver_if and scala counterparts
    • SW-1781 - Remove deprecated methods deprecated by [SW-1780]
    • SW-1783 - Unify passing of authentication information
    • SW-1785 - Remove deprecated set_user_name and user_name method on H2OConf in Python
    • SW-1787 - Version checks can be done via rest API in all cases on External backend
    • SW-1788 - Lock cloud in case of rest api client, auto mode
    • SW-1794 - Add test for the zombie cluster in client-less approach
    • SW-1796 - In client-less tests, verify that stopped cluster containers are shut down correctly
    • SW-1797 - Move several check threads under single backend heartbeat thread
    • SW-1804 - Ensure internal communication does not go through proxy, no reason for it
    • SW-1805 - Communicate always via leader node
    • SW-1806 - Upgrade to H2O 3.28.0.1
    • SW-1807 - Use Spark for logging on client in case of External backend
    • SW-1808 - Add test for automatic cluster stopping
    • SW-1809 - Obtaining nodes can be done via rest API in all cases on External backend
    • SW-1810 - Watchdog can be replaced by rest API on external backend
    • SW-1815 - Remove support for multicast cloud up in case of external H2O backend in manual standalone (no Hadoop) mode
    • SW-1816 - Remove out-dated check for duke library. Sparkling Water package is now fat jar so this issue does not exist
  • Engineering Story
    • SW-1698 - Remove unnecessary Log level change
    • SW-1733 - Tests for Clientless Conversion from H2OFrames to DataFrames
    • SW-1734 - Configuration File for Tests against H2O Branch
    • SW-1741 - Adapt internal Target encoding code to latest changes in H2O
    • SW-1744 - Tests for Clientless Conversion from DataFrames to H2OFrames
    • SW-1745 - Disable version check in tests on external backend
    • SW-1746 - spark.ext.h2o.external.disable.version.check was false also when we needed to run rest api tests
    • SW-1767 - Move remaining java classes under scala dir
    • SW-1772 - Add Python GBM Test Running with REST-based H2OContext
    • SW-1812 - Small refactor of ExternalH2OBackend class
    • SW-1813 - Fix script and integ tests

v3.26.11 (2019-12-06)

Downloads:
  • Bug
    • SW-1711 - Fix propagation of internal security conf in Sparkling Water
    • SW-1732 - Implement shutdown hook to ensure H2O will go down on normal stop of Spark
    • SW-1739 - Fix target encoder multiline doc descriptions
    • SW-1755 - Don't need to stop worker nodes in internal backend, spark takes care of it as it shuts down the executors
    • SW-1756 - externalCommunicationBlockSizeAsBytes is missing on H2OConf in python
    • SW-1757 - Fix PipelinePredictionTest and Regenerate Reference Results
    • SW-1761 - Fix deadlock when user explicitly calls hc.stop()
  • New Feature
    • SW-1764 - Upgrade H2O to 3.26.0.11
  • Improvement
    • SW-1646 - Run H2O Nodes With Security Parameters
    • SW-1687 - Expose Offset Column in Supervised Algorithms
    • SW-1705 - Deprecate spark.ext.h2o.client.flow.extra.http.headers
    • SW-1717 - Correctness Tests for Usage of 'offsetCol' with H2OGBM
    • SW-1737 - Ensure the client has full flatfile in external backend
  • Engineering Story
    • SW-1702 - Upgrade to Gradle 5.6.4
    • SW-1706 - Put h2o-security package into sparkling water assembly
    • SW-1708 - Simplify distribution of security files
    • SW-1710 - Enable client mode in Sparkling Water (needs to be done explicitly)
    • SW-1731 - Deprecate h2oNodeWebEnabled for external backend
    • SW-1735 - Add job to test against rel and master branch of H2O
    • SW-1736 - Create nightly job where we build sparkling water against h2o branches
    • SW-1738 - Ensure benchmarks are not run as part of regular build

v3.26.10 (2019-11-07)

Downloads:
  • Bug
    • SW-1685 - Fix Propagation of Extra Properties in Internal Backend
    • SW-1688 - Fix docker image generation for Scala Sparkling Water in Kubernetes Environment
    • SW-1697 - MOJO Cache Causes That Scoring Applications Don't Finish When Everything Is Done
    • SW-1700 - sparkling-water-utils is not published to maven
    • SW-1701 - Upgrade H2O to 3.26.0.10

v3.26.9 (2019-10-31)

Downloads:
  • Bug
    • SW-1676 - The getGridModelsParams() Method of H2OGridSearch Returns Incorrect Values for Nested Hyper-Parameter Types
    • SW-1678 - Scoring package is not published to nexus
    • SW-1680 - Docs page is always missing last changelog
  • Improvement
    • SW-1662 - Retry for conda upload in release pipeline
    • SW-1675 - Expose Base Port for Worker Nodes in External Backend
    • SW-1677 - Upgrade to H2O 3.26.0.9
    • SW-1682 - Enable Users to Specify Extra H2O Parameters

v3.26.8 (2019-10-18)

Downloads:
  • Bug
    • SW-1670 - Improve Synchronization in H2OMOJOBaseCache
  • New Feature
    • SW-1664 - Enable Users to Specify Extra Http Headers for H2O Flow as SW Parameter
    • SW-1667 - Enable Users to Specify Block Size of Communication in External Backend
    • SW-1671 - Expose Property for Setting Lifetime of MOJOs in Cache
  • Improvement
    • SW-1669 - Improve Variable Names in the ExternalBackendUtils Class
    • SW-1672 - Remove Relocation of com.google.protobuf in Assembly Jar
    • SW-1674 - Upgrade to H2O 3.26.0.8

v3.26.7 (2019-10-11)

Downloads:
  • Bug
    • SW-1635 - Update Documentation of Deploying SW to Azure HDI
    • SW-1656 - Ensure that after Cloud size X under Y failure the rest of the external cluster is killed
  • Improvement
    • SW-1658 - Figure out better way of caching MOJO Pipelines in H2OMOJOPipelineModel transformer
    • SW-1666 - Improve Performance of Loading Pipeline MOJO Files

v3.26.6 (2019-10-02)

Downloads:
  • Bug
    • SW-1618 - HamOrSpam tests return false for both predictions in scripts tests
    • SW-1623 - Fix intermittent NPE in PySparkling with rollups on external backend
    • SW-1624 - H2OTargetEncoderMOJOModel Returns Wrong Results If Input Cols Are Not Ordered According To Training Dataset
    • SW-1626 - Intermittent failure during conversion to h2o frame on External backend in PySparkling
    • SW-1642 - Prevent sending empty partitions to external H2O backend
    • SW-1643 - Fix script test - ham or spam pipeline on grid search
    • SW-1644 - Pysparkling 2.1 fails on parsing PySpark version
    • SW-1645 - Revert SW-1337
  • Task
    • SW-1617 - Benchmarks: Report Failure if Execution Goes Wrong
  • Improvement
    • SW-1619 - [Spark2.3] Upgrade to Spark 2.3.4
    • SW-1631 - Automatically increase client timeout on top of Azure
    • SW-1638 - Make port 9009 configurable on Azure
    • SW-1659 - Upgrade H2O to 3.26.0.6
  • Engineering Story
    • SW-1602 - Enable all TargetEncoder tests

v3.26.5 (2019-09-16)

Downloads:
  • Bug
    • SW-1570 - Fix typo contribution -> contributions on Shapley documentation page
    • SW-1580 - importing pysparkling in Zeppelin fails on SW for 3.26.3
    • SW-1583 - Add missing namedMojoOutputParameter to PySparkling Algo constructors
    • SW-1584 - Remove extra asserts for types already covered by type converters
    • SW-1585 - Improve H2OGridSearch internal handling of Algo + improve API of ordering
    • SW-1592 - [BUILD] Use numpy compatible with python 2 and python 3
    • SW-1596 - Jupyter notebook is unable to start kernel for Spark 2.4
    • SW-1598 - Deprecate cases in H2OMojoPrediction where the prediction column does not directly contain the predicted value.
    • SW-1600 - Fix IOOB when using calibrated probabilities on MOJO
    • SW-1604 - PySparkling fails to parse pyspark version with build number, such as 2.3.1.dev0
    • SW-1605 - Gradle reported success even though the python test failed
    • SW-1610 - Fix running python tests by changing the env directly
    • SW-1611 - Script tests are not being executed correctly - some tests are not being executed
    • SW-1614 - Fix running python integ tests on external backend
  • New Feature
    • SW-1547 - Update H2OTargetEncoder according to changes in H2O-3 3.26.0.4
    • SW-1593 - Upgrade to Spark 2.4.4
    • SW-1609 - Automatically configure H2OContext in case we run on DBC in order to correctly show Flow
  • Task
    • SW-1295 - Benchmarks: Configuration for External Backend
    • SW-1296 - Benchmarks: Code Clean up
    • SW-1553 - Benchmarks: Jenkins Pipeline
    • SW-1616 - Benchmarks: Automatic Cluster Shutdown after Timeout
  • Improvement
    • SW-1564 - Create gradle task to build scala image for kubernetes
    • SW-1573 - Remove use of deprecated option spark.ext.h2o.external.cluster.num.h2o.nodes
    • SW-1582 - Make sure H2OGLM API is consistent with others and does not use the lambda_ hack
    • SW-1591 - Don't use strings to define algo name
    • SW-1597 - Add a note to documentation informing that terraform template is available only for Spark 2.4
    • SW-1613 - Upgrade H2O to 3.26.0.4
    • SW-1615 - Remove examples from assembly jar
  • Engineering Story
    • SW-1567 - Upgrade version of h2o docker image from 58 to 64
    • SW-1571 - Upgrade CI docker image for 19 (build on latest hadoop docker image hd2.2:64)
    • SW-1576 - Retrain sms_pipeline_model for PipelinePredictionTest so it uses ai.h2o package
    • SW-1577 - Generic logic to test parameter passing to algo wrappers on PySparkling
    • SW-1594 - Upgrade to a docker image with Spark 2.4.4
    • SW-1599 - Target Encoder Tests Covering Various Order of Columns
    • SW-1603 - Install Terraform to Docker Image Running Tests
  • Docs
    • SW-1581 - Add Rule of Thumb for Data Conversion
    • SW-1595 - Document How to Generate Prediction Contributions from an Existing MOJO

v3.26.3 (2019-08-28)

Downloads:
  • Bug
    • SW-1477 - Fix bug with setting init on KMeans
    • SW-1478 - Don't need to start H2O to initialize algo on PySparkling side
    • SW-1486 - Remove extra argument on H2OAutoML pyspark wrapper
    • SW-1491 - Fix wrong statement regarding version in terraform documentation
    • SW-1493 - Properly apply type checks even for None values on all H2OAutoML parameters
    • SW-1498 - Check if Jar is already attached to the cluster in Initializer
    • SW-1499 - Extract zip file into temp directory owned by Spark and configured by spark.local.dir
    • SW-1509 - Rename colsampleBytree to colSampleByTree on H2OXGBoost in Scala & Python
    • SW-1512 - Fix broken link to Running Sparkling Water on supported platforms in the doc
    • SW-1516 - latest-stable/doc/deployment/pysparkling_pipeline.html is referring to old ratio and predictionCol parameters
    • SW-1517 - Update doc for RSparkling on Windows for latest RSparkling changes
    • SW-1520 - Fix warning 'H2OMOJOSettings' object has no attribute '_java_obj'
    • SW-1521 - Fix wrong link to latest spark version in main README
    • SW-1522 - createTempDir in SharedBackendUtils should create temp files in the Spark temp dir
    • SW-1523 - Fix IP based cloud-up on client side
    • SW-1524 - Fix getters on MOJO models
    • SW-1525 - Fix broken links in the documentation (pointing to old releases or nonexistent locations)
    • SW-1526 - PySparkling cannot parse embedded version.txt file.
    • SW-1527 - Verify version between H2O external back-end & H2O client on Spark driver
    • SW-1541 - Fix get-extended-h2o script to reflect new location of sparkling water releases
    • SW-1543 - Expose driver if, port, port range and extra memory percent configuration for external H2O cluster
    • SW-1546 - Fix SpreadRDDBuilder not serializable exception
    • SW-1548 - Move all pysparkling source files into single src dir
    • SW-1551 - Fix path to external jars generated by ./gradlew extendJar
    • SW-1555 - Fix obtaining the version when pysparkling is installed via pip
    • SW-1558 - Use absolute imports in tests as the relative ones are removed in python3
    • SW-1565 - startWorkerNodes and startClient was returning hostname instead of ip address
  • Story
    • SW-1530 - Conversion to H2OFrame needs to work without running H2O client
  • New Feature
    • SW-1557 - GLM no longer uses MissingValuesHandling enum from DeepLearning
  • Task
    • SW-1487 - Update examples/README file
    • SW-1506 - Benchmarks: Terraform template for running benchmarks in EMR
    • SW-1542 - Benchmarks: Name of result file should contain backend and master
    • SW-1552 - Benchmarks: Gradle Task for Execution of Benchmarks
  • Improvement
    • SW-1368 - MOJO deployment package
    • SW-1475 - Expose predict_contributions for H2OMOJOModel
    • SW-1481 - Deprecate H2OMOJOModel, H2OMOJOPipelineModel and H2OMOJOSettings in the org.apache.spark package
    • SW-1483 - Deprecate algos and features in org.apache.spark package
    • SW-1485 - Handle sortMetric param in H2OAutoML the same way as other enums
    • SW-1490 - Immutable projectName on H2OAutoML
    • SW-1494 - Upgrade Terraform Templates to AWS Provider 2.23
    • SW-1501 - Fix 'ai.h2o:sparkling-water-package_2.11:2.4.13'/'h2o_pysparkling_2.4' conflict on Azure Databricks
    • SW-1502 - Upgrade to mojo2 library v2.1.3
    • SW-1503 - Avoid null on AutoML include & exclude Algos params
    • SW-1504 - Apply type converters to rest of the PySparkling
    • SW-1515 - Upgrade to H2O 3.26.0.3
    • SW-1528 - Upgrade Gradle to Gradle 5.6
    • SW-1540 - Remove unnecessary read confirmation timeout
    • SW-1549 - Upgrade default instances in terraform templates to M5.xlarge
    • SW-1550 - Remove unsupported notebook (referencing dead deepwater)
  • Engineering Story
    • SW-1476 - Avoid duplication between mojo params and algo params
    • SW-1480 - Cleanup of PySparkling package -> moving to new package ai.h2o
    • SW-1511 - Remove unused init_scala_int_session() from PySparkling
    • SW-1539 - Avoid boilerplate code when introducing new test suite in PySparkling

v3.26.2 (2019-07-30)

Downloads:
  • Bug
    • SW-1337 - Restarting h2o cluster makes all Spark Sessions connected to it unusable
    • SW-1379 - Fix IOOB exception when converting H2OFrame to DataFrame
    • SW-1381 - Bad quotes in documentation
    • SW-1382 - Remove extra quote in exception on ExternalH2OBackend
    • SW-1383 - Fix cloud up in external backend manual mode
    • SW-1384 - Fix wrong statement in rsparkling documentation
    • SW-1390 - Fix NPE when reading modelDetails in Mojo
    • SW-1393 - Use Python formatting for Python in secured_flow.rst
    • SW-1396 - Fix wrong exception in H2OAutoML sort metric handling
    • SW-1397 - Use setClusterSize instead of deprecated setter in tests
    • SW-1400 - Nullability tests in DataFrameConverterTest should use data frames with an explicit schema
    • SW-1413 - Use VectorUDT in RowConverter
    • SW-1418 - Lower memory requirements in tests
    • SW-1439 - [Prototype] Switch to using String value on the setters & getters in the ml API on distribution param
    • SW-1441 - PySparkling can't be started after version change using pysparkling.sh
    • SW-1447 - Remove missingValuesHandling param from XGBoost wrapper
    • SW-1454 - It is no longer possible to specify predictionCol :(
    • SW-1462 - convertInvalidNumbersToNa missing on PySparkling
    • SW-1463 - Fix setters which accept both int and float
    • SW-1464 - Fix nullableArrayArray param for PySparkling
    • SW-1468 - Use absolute imports as the relative ones are removed in python3
    • SW-1470 - DatasetWrapper should use withColumn instead of withColumns method
    • SW-1472 - Fix tests after modifying allStringsToCategorical
  • New Feature
    • SW-1425 - Add Target Encoding to Sparkling Water Python API
    • SW-1446 - Implement H2OKmeans pipeline wrapper
    • SW-1455 - Introduce NullableDoubleArrayParam for KMeans
    • SW-1456 - Documentation of Target Encoder
  • Task
    • SW-1294 - Benchmarks: Infrastructure for Getting Information about Execution Details
  • Improvement
    • SW-1207 - Add Target Encoding to Sparkling Water Scala API
    • SW-1344 - Unify ml package across rel branches
    • SW-1351 - Unify jenkins scripts & create gradle profiles
    • SW-1375 - Single execution path for all spark->h2o frame conversions
    • SW-1387 - Handle vectors in SparkDataFrameConverter more explicitly
    • SW-1388 - Specify spark specific source dir per project, so they can differ in subprojects
    • SW-1392 - Document an example of training AutoML model
    • SW-1394 - Modify sw_xgboost.rst to use tabs for Python and Scala code
    • SW-1395 - ML Code simplifications & improvements
    • SW-1402 - [MAJOR_RELEASE] Remove deprecated methods
    • SW-1412 - Integrate generic conversion logic to data frame conversion to H2O frames
    • SW-1417 - Improve SNAPSHOT handling
    • SW-1419 - Jenkins file improvements -> publish nightly only if both External & internal tests pass for all Spark versions
    • SW-1421 - Upgrade to H2O 3.26.0.1
    • SW-1422 - Switch to one version of Sparkling Water
    • SW-1424 - Upgrade to H2O 3.26.0.2
    • SW-1430 - Use downloadLogs method from H2O and remove relevant methods on Sparkling Water side
    • SW-1436 - Fix warning in python package as SW version no longer starts with spark major version
    • SW-1437 - Remove duplicate spark version specifier on pysparkling
    • SW-1442 - Update build SW doc
    • SW-1443 - Ignore local-cluster failing tests
    • SW-1444 - Use string representations instead of enums on Pipeline API
    • SW-1448 - Refactor parameters into supervised & unsupervised
    • SW-1449 - Create Supervised & Unsupervised Algorithm
    • SW-1451 - Document H2OKmeans pipeline wrapper
    • SW-1452 - Refactor params to supervised and unsupervised on PySparkling side
    • SW-1453 - Put back constructor checks for Enums on PySparkling side (accidentally removed)
    • SW-1473 - Rename H2OTargetEncoderMojoModel to H2OTargetEncoderMOJOModel
  • Engineering Story
    • SW-1378 - Integration test for flattening logic
    • SW-1386 - Micro benchmark for conversion from a DataFrame to H2OFrame
    • SW-1404 - Unification of creating header page across different spark versions
    • SW-1457 - Test passing params to pipeline wrappers of H2O Algos
    • SW-1458 - No longer need to call H2OContext.getOrCreate in __init__ methods of pysparkling algo wrappers
    • SW-1459 - Avoid duplicating MojoParams on PySparkling side
    • SW-1460 - Infrastructure for prediction column with a simple prediction value
    • SW-1461 - Prepare ai.h2o.sparkling structure on PySparkling side
    • SW-1466 - Move logic for converting columns to categorical to prepareDatasetForFitting method

v2.1.56, v2.2.42, v2.3.31, v2.4.13 (2019-06-24)

Downloads:

  • Bug
    • SW-1140 - Add more logging to discover intermittent RSparkling Issue in jenkins tests
    • SW-1318 - Add back the asDataFrame(.., SQLContext) method to JavaH2OContext, but as deprecated
    • SW-1321 - Remove mention of H2O UDP from user documentation
    • SW-1322 - Fix wrong doc in ssl.rst -> val conf: H2OConf = // generate H2OConf file
    • SW-1323 - Model ID not available on our algo pipeline wrappers
    • SW-1338 - Follow up fixes after RSparkling change
    • SW-1339 - Use s3-cli instead of s3cmd because of performance reasons on nightlies
    • SW-1340 - Fix Sphinx warning
    • SW-1342 - Fix dist
    • SW-1343 - Fix dist structure
    • SW-1345 - Fix missing rsparkling in dist package
    • SW-1347 - Scaladoc not uploaded to S3 after porting make-dist to gradle
    • SW-1359 - Fix wrong links on nightly build page
    • SW-1360 - Explicitly send heartbeat after we have complete flatfile
    • SW-1361 - sparkling water package on maven should be assembly jar
    • SW-1362 - gradle.properties in distribution contains wrong version
    • SW-1364 - Rename SVM to SparkSVM
    • SW-1374 - Minor documentation fixes
  • New Feature
    • SW-1021 - Upload RSparkling to S3 in a form of R repository
    • SW-1353 - Introduce logic for flattening data frames with arbitrarily nested structures
  • Improvement
    • SW-554 - Include all used dependency licenses in the uber jar.
    • SW-1308 - Bundle Sparkling Water jar into rsparkling -> making rsparkling version dependent on specific sparkling water
    • SW-1317 - Unify repl across different rel branches
    • SW-1325 - Expose jks_alias in Sparkling Water
    • SW-1326 - Include SW version in more log statements
    • SW-1330 - Add additional log to H2O cloudup in internal backend mode
    • SW-1331 - Create local repo with RSparkling
    • SW-1332 - [RSparkling] Make installation from S3 the default recommended option
    • SW-1333 - Move the conversion logic from Spark Row to H2O RowData to a separate entity
    • SW-1334 - Store H2O models in transient lazy variables of SW Mojo models
    • SW-1335 - Make automl tests more deterministic by using max_models instead of max_runtime_secs
    • SW-1341 - Use readme as main dispatch for documentation
    • SW-1346 - Remove cache and unpersist call in SpreadRDDBuilder
    • SW-1348 - Switch to s3 cli on release pipelines
    • SW-1349 - Use withColumn instead of select in MOJO models
    • SW-1350 - Fix links to doc & scaladoc on nightly builds
    • SW-1352 - Upgrade H2O to 3.24.0.5
    • SW-1365 - Run only last build in jenkins
    • SW-1369 - Download page is missing one step on RSparkling tab -> library(rsparkling)
    • SW-1371 - Create maven repo on our s3 for each release and nightly
    • SW-1373 - Update DBC documentation with respect to latest RSparkling development

v2.1.55, v2.2.41, v2.3.30, v2.4.12 (2019-06-03)

Downloads:

  • Bug
    • SW-1259 - Unify ratio param across pipeline api
    • SW-1287 - Use RPC endpoints to orchestrate cloud in internal mode
    • SW-1290 - Fix doc
    • SW-1301 - Fix class-loading for Sparkling Water assembly JAR in PySparkling
    • SW-1311 - Add numpy as PySparkling dependency (it is required because of Spark but missing from list of dependencies)
    • SW-1312 - Warn that default value of convertUnknownCategoricalLevelsToNa will be changed to false on GridSearch & AutoML
    • SW-1316 - Fix wrong fat jar name
  • Task
    • SW-1292 - Benchmarks: Subproject Skeleton
  • Improvement
    • SW-1212 - Make sure python zip/wheel is downloadable from our release s3
    • SW-1274 - On download page -> list all supported minor versions
    • SW-1286 - Remove Param propagation of MOJOModels from Python to Java
    • SW-1288 - H2OCommonParams in pysparkling
    • SW-1289 - Move shared params to H2OCommonParams
    • SW-1298 - Don't use deprecated methods
    • SW-1299 - Warn user that default value of predictionCol on H2OMOJOModel will change in the next major release to 'prediction'
    • SW-1300 - Upgrade to H2O 3.24.0.4
    • SW-1304 - Definition of assembly jar via transitive exclusions
    • SW-1305 - Move ability to change behavior of MOJO models to MOJOLoader
    • SW-1306 - Move make-dist logic to gradle
    • SW-1307 - Expose binary model in spark pipeline stage
    • SW-1309 - Fix xgboost doc
    • SW-1313 - Rename the 'create_from_mojo' method of H2OMOJOModel and H2OMOJOPipelineModel to 'createFromMojo'

v2.1.54, v2.2.40, v2.3.29, v2.4.11 (2019-05-17)

Downloads:

  • Bug
    • SW-1256 - Fix constructor of H2OMojoModel
    • SW-1258 - Remove internal constructors & Deprecate implicit constructor parameters for H2O Algo Spark Estimators (to be the same as in PySparkling)
    • SW-1270 - Fix version check in PySparkling shell
    • SW-1278 - Clean workspace on the hadoop node in integ tests
    • SW-1279 - Fix inconsistencies between H2OAutoML, H2OGridSearch & H2OAlgorithm
    • SW-1281 - Fix bad representation of predictionCol on H2OMOJOModel
    • SW-1282 - XGBoost can't be used in H2OGridSearch pipeline wrapper
    • SW-1283 - Correctly return mojo model in pysparkling after fit
  • Story
    • SW-1271 - Remove SparkContext from H2OSchemaUtils
    • SW-1273 - Upgrade to H2O 3.24.0.3
  • New Feature
    • SW-1248 - getFeaturesCols() should not return the fold column or weight column
    • SW-1249 - probability calibration does not work in Sparkling Water Dataframe API
  • Improvement
    • SW-369 - Override spark locality so we use only nodes on which h2o is running.
    • SW-1216 - Improve PySparkling README
    • SW-1261 - Remove binary H2O model from ML pipelines
    • SW-1263 - Don't require initializer call to be called during pysparkling pipelines
    • SW-1264 - Use default params reader in pipelines
    • SW-1268 - Non-named columns are long time deprecated. Switch to named columns by default
    • SW-1269 - Remove six as dependency from PySparkling launcher (six is no longer a dependency)
    • SW-1275 - Remove unnecessary constructor in helper class
    • SW-1280 - Add predictionCol to mojo pipeline model

v2.1.53, v2.2.39, v2.3.28, v2.4.10 (2019-04-26)

Downloads:

  • Bug
    • SW-1189 - Fix Sparkling Water 2.1.x compile on Scala 2.10
    • SW-1194 - RSparkling Can't be used on Spark 2.4
    • SW-1195 - Disable gradle daemon via gradle.properties
    • SW-1196 - Fix org.apache.spark.ml.spark.models.PipelinePredictionTest
    • SW-1203 - Custom metric not evaluated in internal mode of Sparkling Water
    • SW-1227 - Change get-extended-jar to use https instead of http
    • SW-1230 - Fix typo in GLM API - getRemoteCollinearColumns, setRemoteCollinearColumns
    • SW-1232 - Fix RUnits after upgrading to Gradle 5.3.1
    • SW-1234 - Deprecate asDataFrame with implicit argument
  • Story
    • SW-1198 - Introduce new annotation deprecating legacy methods in API
    • SW-1209 - Rename the 'predictionCol' model parameter to 'labelCol'
    • SW-1226 - Introduce mechanism for enabling backward compatibility of MOJO files when properties are renamed
  • New Feature
    • SW-1193 - Expose weights_column parameter
  • Improvement
    • SW-1188 - RSparkling: Add ability to add authentication details when calling h2o_context(sc)
    • SW-1190 - Improve hint description for disabling automatic usage of broadcast joins
    • SW-1199 - Improve memory efficiency of H2OMOJOPipelineModel
    • SW-1202 - Simplify Sparkling Water build
    • SW-1204 - Fix formatting in python tests
    • SW-1208 - Create pysparkling tests report file if it does not exist
    • SW-1210 - Add fold column to python and scala pipelines
    • SW-1211 - Automatically download H2O Wheel
    • SW-1213 - Upgrade to H2O 3.24.0.2
    • SW-1214 - Remove PySparkling six dependency as it was removed in H2O
    • SW-1215 - Automatically generate PySparkling README
    • SW-1217 - Automatically generate last pieces of doc subproject
    • SW-1219 - Remove support for testing external cluster in manual mode
    • SW-1221 - Remove unnecessary branch check
    • SW-1222 - Remove duplicate readme file (contains old info & the correct info is in doc)
    • SW-1223 - Remove confusing meetup dir
    • SW-1224 - Upgrade to Gradle 5.3.1
    • SW-1228 - Rename the 'ignoredColumns' parameter of H2OAutoML to 'ignoredCols'
    • SW-1229 - Remove dependencies to Scala 2.10
    • SW-1235 - Remove support for Python 2.6 on rel-2.1
    • SW-1236 - Reformat few python classes
    • SW-1238 - Parametrize EMR version in templates generation
    • SW-1239 - Remove old README and DEVEL doc files (not just pointer to new doc)
    • SW-1240 - Use minSupportedJava for source and target compatibility in build.gradle

v2.1.52, v2.2.38, v2.3.27, v2.4.9 (2019-04-03)

Downloads:

  • Bug
    • SW-1162 - Exception when there is a column with BOOLEAN type in dataset during H2OMOJOModel transformation
    • SW-1177 - In PySparkling script, setting --driver-class-path influences the environment
    • SW-1178 - Upgrade to H2O 3.24.0.1
    • SW-1180 - Use specific metrics in grid search, in the same way as H2O Grid
    • SW-1181 - Document off heap memory configuration for Spark in Standalone mode/IBM conductor
    • SW-1182 - Fix random project name generation in H2OAutoML Spark Wrapper
  • New Feature
    • SW-1167 - Expose search_criteria for H2OGridSearch
    • SW-1174 - expose H2OGridSearch models
    • SW-1183 - Add includeAlgos to H2O AutoML pipeline stage & ability to ignore XGBoost
  • Improvement
    • SW-1164 - Add Sparkling Water to Jupyter spark/pyspark kernels in EMR terraform template
    • SW-1171 - Upgrade build to Gradle 5.2.1
    • SW-1175 - Integrate with H2O native hive support

v2.1.51, v2.2.37, v2.3.26, v2.4.8 (2019-03-15)

Downloads:

  • Bug
    • SW-1163 - Expose missing variables in shared TF EMR SW template
  • Improvement
    • SW-1145 - Start jupyter notebook with Scala & Python Spark in AWS EMR Terraform template
    • SW-1165 - Upgrade to H2O 3.22.1.6

v2.1.50, v2.2.36, v2.3.25, v2.4.7 (2019-03-07)

Downloads:

  • Bug
    • SW-1150 - hc.stop() shows 'exit' not defined error
    • SW-1152 - Fix RSparkling in case the jars are being fetched from maven
    • SW-1156 - H2OXgboost pipeline stage does not define updateH2OParams method
    • SW-1159 - Unique project name in automl to avoid sharing one leaderboard
    • SW-1161 - Fix grid search pipeline step on pyspark side
  • Improvement
    • SW-1052 - Document Terraform scripts for AWS
    • SW-1089 - Document using Google Cloud Storage In Sparkling Water
    • SW-1135 - Speed up conversion between sparse spark vectors and h2o frames by using sparse new chunk
    • SW-1141 - Improve terraform templates for AWS EMR and make them part of the release process
    • SW-1149 - Allow login via ssh to created cluster using terraform
    • SW-1153 - Add H2OGridSearch pipeline stage to PySpark
    • SW-1155 - Test GBM Grid Search Scala pipeline step
    • SW-1158 - Generalize H2OGridSearch Pipeline step to support other available algos
    • SW-1160 - Upgrade to H2O 3.22.1.5

v2.1.49, v2.2.35, v2.3.24, v2.4.6 (2019-02-18)

Downloads:

  • Bug
    • SW-1136 - Fix bug affecting loading pipeline in python when stored in scala
    • SW-1138 - Fix several cases in spark vector -> h2o conversion
  • Improvement
    • SW-1134 - Add H2OGLM Wrapper to Sparkling Water
    • SW-1139 - Update mojo2 to 0.3.16
    • SW-1143 - Fix s3 bootstrap templates for nightly builds
    • SW-1144 - Upgrade to H2O 3.22.1.4

v2.1.47, v2.2.33, v2.3.22, v2.4.4 (2019-01-21)

Downloads:

  • Bug
    • SW-1129 - Fix support for unsupervised mojo models
  • Improvement
    • SW-1101 - Update code to work with latest jetty changes
    • SW-1127 - Upgrade H2O to 3.22.1.2

v2.1.46, v2.2.32, v2.3.21, v2.4.3 (2019-01-17)

Downloads:

  • Bug
    • SW-1116 - Cannot serialize DAI model
  • Improvement
    • SW-1113 - Update to H2O 3.22.0.5
    • SW-1115 - Enable tabs in the documentation based on the language
    • SW-1120 - Prepare Terraform scripts for Sparkling Water on EMR
    • SW-1121 - Use getTimestamp method instead of _timestamp directly

v2.1.45, v2.2.31, v2.3.20, v2.4.2 (2019-01-08)

Downloads:

  • Bug
    • SW-1107 - NullPointerException at water.H2ONode.openChan(H2ONode.java:417) after upgrade to H2O 3.22.0.3
    • SW-1110 - Fix test suite to test PySparkling YARN integration tests on external backend as well
  • Task
    • SW-1109 - Docs: Change copyright year in docs to include 2019
  • Improvement
    • SW-464 - Publish PySparkling as conda package
    • SW-1111 - Update H2O to 3.22.0.4

v2.1.44, v2.2.30, v2.3.19, v2.4.1 (2018-12-27)

Downloads:

  • Bug
    • SW-1084 - Documentation link does not work on the Nightly Bleeding Edge download page
    • SW-1100 - Fix Travis builds
    • SW-1102 - Fix Travis builds (test just scala unit tests)
  • Improvement
    • SW-464 - Publish PySparkling as conda package
    • SW-1080 - Fix deprecation warning regarding automl -> AutoML
    • SW-1092 - Updates to streaming app
    • SW-1093 - Update to H2O 3.22.0.3
    • SW-1094 - Upgrade gradle to 4.10.3
    • SW-1095 - Enable GCS in Sparkling Water
    • SW-1097 - Properly integrate GCS with Sparkling Water, including test in PySparkling
  • Docs
    • SW-1083 - Add Installation and Starting instructions to the docs

v2.1.42, v2.2.28, v2.3.17 (2018-10-27)

Downloads:

  • Bug
    • SW-1071 - Fallback to original IP discovery in case we can't find the same network
    • SW-1072 - Fix handling time column for mojo pipeline
    • SW-1073 - Upgrade MOJO to 0.3.17
  • Improvement
    • SW-1045 - Upgrade H2O to 3.22.0.1

v2.1.41, v2.2.27, v2.3.16 (2018-10-17)

Downloads:

  • Bug
    • SW-930 - Enable AutoML tests in Sparkling Water
    • SW-1065 - Fix issue with empty queue name by default
    • SW-1066 - In PySparkling, don't reconnect if already connected
    • SW-1068 - Fix warning in doc
  • Improvement
    • SW-1057 - Sparkling shell ignores parameters after last updates
    • SW-1058 - Automatic detection of client ip in external backend
    • SW-1059 - Pysparkling in external backend, manual mode stops the backend cluster, but the cluster should be left intact
    • SW-1060 - Create nightly release for 2.1, 2.2 and 2.3
    • SW-1061 - Upgrade to Mojo 0.3.15
    • SW-1062 - Don't expose mojo internal types
    • SW-1063 - More explicit checks for valid values of Backend mode and external backend start mode
    • SW-1064 - Expose run_as_user for External H2O Backend
    • SW-1069 - Upgrade H2O to 3.20.0.10

v2.1.40, v2.2.26, v2.3.15 (2018-10-02)

Downloads:

  • Bug
    • SW-1041 - Fix passing --jars to sparkling-shell
    • SW-1042 - More robust check for python package in PySparkling shell
    • SW-1048 - Add missing six dependency to setup.py for PySparkling
  • Improvement
    • SW-1043 - Mojo pipeline with multiple output columns (and also with dots in the names) does not work in SW
    • SW-1054 - Upgrade H2O dependency to 3.20.0.9

v2.1.39, v2.2.25, v2.3.14 (2018-09-24)

Downloads:

  • New Feature
    • SW-1020 - Expose leaderboard on H2OAutoML
    • SW-1022 - Display Release creation date on the download page
  • Improvement
    • SW-1024 - Remove call to ./gradlew --help in jenkins pipeline
    • SW-1025 - Ensure that release does not depend on build id
    • SW-1026 - Automatically update master after RSparkling release with latest version
    • SW-1030 - [RSparkling] In case only path to SW jar file is specified, discover the version from JAR file instead of requiring it as parameter
    • SW-1031 - Enable installation of RSparkling using devtools from Github repo
    • SW-1032 - Upgrade mojo pipeline to 0.13.2
    • SW-1033 - Document automatic certificate creation for Flow UI
    • SW-1034 - PySparkling fails if we specify https argument as part of getOrCreate()
    • SW-1035 - Document using s3a and s3n on Sparkling Water
    • SW-1036 - Upgrade to H2O 3.20.0.8
    • SW-1038 - The shell script bin/pysparkling should print missing dependencies
    • SW-1039 - Upgrade Gradle to 4.10.2
  • Docs
    • SW-1018 - Fix link to Installing RSparkling on Windows

v2.1.38, v2.2.24, v2.3.13 (2018-09-14)

Downloads:

  • New Feature
    • SW-1023 - Upgrade Gradle to 4.10.1
  • Improvement
    • SW-1019 - Upgrade H2O to 3.20.0.7
    • SW-1027 - Revert Upgrade to Gradle 4.10.1 (bug in Gradle) and upgrade to Gradle 4.0
    • SW-1028 - Update docs and mention that ORC is supported
  • Docs
    • SW-1017 - Docs: Add Parquet to list of supported data formats

v2.1.37, v2.2.23, v2.3.12 (2018-08-28)

Downloads:

  • Bug
    • SW-270 - Add test for RDD[TimeStamp] -> H2OFrame[Time] -> RDD[Timestamp] conversion
    • SW-319 - SVMModelTest is failing
    • SW-986 - Fix links on RSparkling Readme page
    • SW-996 - Fix typos in documentation
    • SW-997 - Fix javadoc on JavaH2OContext
    • SW-1000 - Setting context path in pysparkling fails to launch h2o
    • SW-1001 - RSparkling does not respect context path
    • SW-1002 - Automatically generate the keystore for H2O Flow ssl (self-signed certificates)
    • SW-1003 - When running in Local mode, we ignore some configuration
    • SW-1004 - Fix context path value checks
    • SW-1005 - Use correct scheme in sparkling water when ssl on flow is enabled
    • SW-1006 - Fix context path setting on RSparkling
    • SW-1015 - Add context path after value of spark.ext.h2o.client.flow.baseurl.override when specified
  • New Feature
    • SW-980 - Integrate XGBoost in Sparkling Water
    • SW-1012 - Sparkling water External Backend Support in kerberized cluster
  • Task
    • SW-988 - Add to docs that pysparkling has a new dependency pyspark
  • Improvement
    • SW-175 - JavaH2OContext#asRDD implementation is missing
    • SW-920 - Sparkling Water/RSparkling needs to declare additional repository
    • SW-989 - Improve Scala Doc API of the support classes
    • SW-991 - Update Gradle Sphinx libraries - faster documentation builds
    • SW-992 - Create abstract class from creating parameters from Enum for Sparkling Water pipelines
    • SW-993 - [PySparkling] Fix Wrong H2O version detection on latest bundled H2Os
    • SW-994 - Add timeouts & retries for docker pull
    • SW-998 - Document using PySparkling on the edge node (EMR)
    • SW-1007 - Upgrade H2O to 3.20.0.6
    • SW-1011 - Fix EMR bootstrap scripts
    • SW-1013 - Add option which can be used to change the flow address which is printed out after H2OContext started
    • SW-1014 - Document how to run Sparkling Water on kerberized cluster

v2.1.36, v2.2.22, v2.3.11 (2018-08-09)

Downloads:

  • Bug
    • SW-971 - Change maintainer of RSparkling to jakub@h2o.ai
    • SW-972 - Fix Content of RSparkling release table
    • SW-973 - Allow passing custom jars when running ./bin/sparkling/shell
    • SW-975 - Fix CRAN issues of Rsparkling
    • SW-981 - Fix wrong comparison of versions when detecting other h2o versions in PySparkling
    • SW-982 - Set up client_disconnect_timeout correctly in context on External backend, auto mode
    • SW-983 - Fix missing mojo impl artifact when running pysparkling tests in jenkins
  • Task
    • SW-633 - Add to doc that 100 columns are displayed in the preview data by default
  • Improvement
    • SW-528 - Update PySparkling Notebooks to work for Python 3
    • SW-548 - List nodes and driver memory in Spark UI - Sparkling Water Tab
    • SW-910 - Use Mojo Pipeline API in Sparkling Water
    • SW-969 - Port documentation for mojo pipeline on Spark to SW repo
    • SW-970 - Upgrade Mojo 2 in SW to 0.11.0
    • SW-976 - Upgrade H2O to 3.20.0.5
    • SW-977 - Need ability to disable Flow UI for Sparkling-Water
    • SW-979 - Verify that we are running on correct Spark for PySparkling at init time
    • SW-984 - Cache also test and runtime dependencies in docker image
  • Docs
    • SW-946 - Add "How to" for using Sparkling Water on Google Cloud Dataproc

v2.1.35, v2.2.21, v2.3.10 (2018-08-01)

Downloads:

  • Bug
    • SW-903 - Automate releases of RSparkling and create release pipeline for this release process
    • SW-911 - Add missing repository to the documentation
    • SW-944 - Fix Sphinx gradle plugin, the latest version does not work
    • SW-945 - Stabilize releasing to Nexus Repository
    • SW-953 - Do not stop external H2O backend in case of manual start mode
    • SW-958 - Fix RSparkling README style issues
    • SW-959 - Fix address for fetching H2O R package in nightly tests
    • SW-961 - Add option to ignore SPARK_PUBLIC_DNS
    • SW-962 - Add option which ensures that items in flatfile are translated to IP address
    • SW-967 - Deprecate old behaviour of mojo pipeline output in SW
  • Improvement
    • SW-233 - Warn if user's h2o in python env is different than the one bundled in pysparkling
    • SW-921 - Move Rsparkling to Sparkling Water repo
    • SW-941 - Upgrade Gradle to 4.9
    • SW-952 - Fix issues when stopping Sparkling Water (Scala) in yarn-cluster mode for external Backend
    • SW-957 - RSparkling should run tests in both, external and internal mode
    • SW-963 - Upgrade H2O to 3.20.0.4
    • SW-965 - Expose port offset in Sparkling Water
    • SW-968 - Remove confusing message about stopping H2OContext in PySparkling

v2.1.34, v2.2.20, v2.3.9 (2018-07-16)

Downloads:

  • Bug
    • SW-902 - Upgrade Gradle to 4.8.1
    • SW-904 - Upgrade Mojo2 version to 0.10.7
    • SW-909 - Fix issues when stopping Sparkling Water (Scala) in yarn-cluster mode
    • SW-925 - Fix missing apostrophe in documentation
    • SW-929 - Disable temporarily AutoML tests in Sparkling Water
  • New Feature
    • SW-826 - Implement Synchronous and Asynchronous Scala cell behaviour
  • Improvement
    • SW-846 - Don't parse types again when passing data to mojo pipeline
    • SW-886 - Several Scala cell improvements in H2O flow
    • SW-887 - Make sure that we can use schemes unsupported by H2O in H2O Configuration
    • SW-889 - Port AWS preparation scripts into SW codebase
    • SW-894 - Add support for queuing of Scala cell jobs
    • SW-914 - Wrong Spark version in documentation
    • SW-915 - Upgrade to Spark 2.1.3
    • SW-917 - Dockerize Sparkling Water release pipeline
    • SW-919 - Clean gradle build with regards to mojo2
    • SW-922 - Upgrade H2O to 3.20.0.3
    • SW-928 - Expose AutoML max models
  • Docs
    • SW-878 - Add section for using Sparkling Water with AWS

v2.1.32, v2.2.18, v2.3.7 (2018-06-18)

Downloads:

  • Bug
    • SW-861 - Upgrade Gradle to 4.8 (publishing plugin)
    • SW-872 - Fix reference to local-cluster on download page
    • SW-880 - Update Hadoop version on download page
    • SW-881 - Fix Script tests on Dockerized Jenkins infrastructure
    • SW-882 - Call h2oContext.stop after ham or spam Scala example
    • SW-883 - Add missing description in publish.gradle
  • Improvement
    • SW-860 - Modify the hadoop launch command on download page
    • SW-873 - Upgrade H2O to 3.20.0.1
    • SW-874 - Update Mojo2 to 0.10.4
    • SW-879 - Print output of script tests

v2.1.31, v2.2.17, v2.3.6 (2018-06-13)

Downloads:

  • Bug
    • SW-850 - Expose methods to get input/output names in H2OMOJOPipelineModel
    • SW-859 - Print Warning when spark-home is defined on PATH
    • SW-862 - Create & fix test in PySparkling for named mojo columns
    • SW-864 - Fix & more readable test
    • SW-865 - Better Naming of the UDF method to obtain predictions
    • SW-869 - Add repository to build required by xgboost-predictor
  • Story
    • SW-856 - Upgrade Mojo2 to latest version
  • Improvement
    • SW-839 - Verify that Spark time column representation can be digested by Mojo2
    • SW-848 - Document Kerberos on Sparkling Water
    • SW-849 - Update use from maven on sparkling water download page
    • SW-851 - Make use of output types when creating Spark DataFrame out of mojo2 predicted values
    • SW-852 - Create spark UDF used to extract predicted values
    • SW-853 - Sparkling Water py should require pyspark dependency
    • SW-854 - Upgrade MojoPipeline to 0.10.0
    • SW-855 - Upgrade H2O to 3.18.0.11

v2.1.30, v2.2.16, v2.3.5 (2018-05-23)

Downloads:

  • Bug
    • SW-842 - Enforce system level properties in SW
  • Improvement
    • SW-845 - Upgrade H2O to 3.18.0.10
    • SW-847 - Remove GA from Sparkling Water

v2.1.29, v2.2.15, v2.3.4 (2018-05-18)

Downloads:

  • Bug
    • SW-836 - Add support for converting empty dataframe/RDD in Python and Scala to H2OFrame
    • SW-841 - Remove withCustomCommitsState in pipelines as it's now duplicating Github
    • SW-843 - Fix data obtaining for mojo pipeline
    • SW-844 - Upgrade Mojo pipeline to 0.9.9

v2.1.28, v2.2.14, v2.3.3 (2018-05-15)

Downloads:

  • Bug
    • SW-817 - Enable running MOJO spark pipeline without H2O init
    • SW-825 - Local creation of Sparkling Water does not work anymore.
    • SW-831 - Check shape of H2O frame after the conversion from Spark frame
    • SW-834 - External Backend stored sparse vector values incorrectly
  • Improvement
    • SW-829 - Type checking in PySparkling pipelines
    • SW-832 - Small refactoring in identifiers
    • SW-833 - Explicitly set source and target java versions
    • SW-837 - Upgrade H2O to 3.18.0.9
    • SW-838 - Upgrade Mojo pipeline dependency to 0.9.8
    • SW-840 - Add test checking column names and types between spark and mojo2

v2.1.27, v2.2.13, v2.3.2 (2018-05-02)

Downloads:

  • Bug
    • SW-574 - Process steam handle and use it for connection to external h2o cluster
    • SW-822 - Require correct colorama version
    • SW-823 - Fix Windows starting scripts
    • SW-824 - Fix NPE in mojo pipeline predictions
  • New Feature
    • SW-827 - Change color highlight in scala cell as it is too dark
  • Improvement
    • SW-815 - Upgrade H2O to 3.18.0.8
    • SW-816 - Update Mojo2 dependency to one which is compatible with Java7
    • SW-818 - Spark Pipeline imports do not work in PySparkling
    • SW-819 - Add ability to convert specific columns to categoricals in Sparkling Water pipelines
    • SW-820 - Sparkling Water pipelines add duplicate response column to the list of features

v2.1.26, v2.2.12, v2.3.1 (2018-04-19)

Downloads:

  • Bug
    • SW-672 - Enable using sparkling water maven packages in databricks cloud
    • SW-787 - Documentation fixes
    • SW-790 - Add missing seed argument to H2OAutoml pipeline step
    • SW-794 - Point to proper web-based docs
    • SW-796 - Use parquet provided by Spark
    • SW-797 - Automatically update redirect table as part of release pipeline
    • SW-806 - Fix exporting and importing of pipeline steps and mojo models to and from HDFS
  • Improvement
    • SW-772 - Integrate & Test Mojo Pipeline with Sparkling Water
    • SW-789 - Upgrade H2O to 3.18.0.7
    • SW-791 - Expose context_path in Sparkling Water
    • SW-793 - Create additional test verifying that the new light endpoint works as expected
    • SW-798 - Additional link to documentation
    • SW-800 - Remove references to Sparkling Water 2.0
    • SW-804 - Reduce time of H2OAutoml step in pipeline tests to 1 minute
    • SW-808 - Upgrade to Gradle 4.7

v2.1.25, v2.2.11, v2.3.0 (2018-03-29)

Downloads:

  • Bug
    • SW-696 - Intermittent script test issue on external backend
    • SW-726 - Mark Spark dependencies as provided on artefacts published to maven
    • SW-740 - Increase timeout for conversion in pyunit test for external cluster
    • SW-760 - Fix doc artefact publication
    • SW-763 - Remove support for downloading H2O logs from Spark UI
    • SW-766 - Fix coding style issue
    • SW-769 - Fix import
    • SW-776 - sparkling water from maven does not know the stacktrace_collector_interval option
    • SW-778 - Handle nulls properly in H2OMojoModel
  • New Feature
    • SW-722 - [PySparkling] Check for correct data type as part of as_h2o_frame
  • Improvement
    • SW-733 - Parametrize pipeline scripts to be able to specify different algorithms
    • SW-746 - Log chunk layout after the conversion of data to external H2O cluster
    • SW-755 - Document GBM Grid Search Pipeline Step
    • SW-765 - Remove test artefacts from the sparkling-water assembly
    • SW-768 - Add missing import
    • SW-773 - Don't use default value for output dir in external backend, it's not required
    • SW-780 - Upgrade H2O to 3.18.0.5
  • Docs
    • SW-775 - Fix link for documentation on DEVEL.md

v2.1.24, v2.2.10 (2018-03-08)

Downloads:

  • Bug
    • SW-739 - Sparkling Water Doc artefact is still missing Scala version
    • SW-742 - Fix setting up node network mask on external cluster
    • SW-743 - Allow to set LDAP and different security options in external backend as well
    • SW-747 - Fix bug in documentation for manual mode of external backend
    • SW-757 - Fix tests after enabling the stack-trace collection
  • Improvement
    • SW-744 - Document how to use Sparkling Water with LDAP in Sparkling Water docs
    • SW-745 - Expose Grid search as Spark pipeline step in the Scala API
    • SW-748 - Upgrade to Gradle 4.6
    • SW-752 - Collect stack traces on each h2o node as part of log collecting extension
    • SW-754 - Upgrade H2O to 3.18.0.3
    • SW-756 - Upgrade H2O to 3.18.0.4
  • Docs
    • SW-753 - Add "How to" for changing the default H2O port

v2.1.23, v2.2.9 (2018-02-26)

Downloads:

  • Bug
    • SW-723 - Sparkling water doc artefact is missing scala version
    • SW-727 - Improve method for downloading H2O logs
    • SW-728 - Use new light endpoint introduced in 3.18.0.1
    • SW-734 - Make sure we use the unique key names in split method
    • SW-736 - Document how to download logs on Databricks cluster
    • SW-737 - Expose downloadH2OLogs on H2OContext in PySparkling
    • SW-738 - Move spark.ext.h2o.node.network.mask setter to SharedArguments
  • Improvement
    • SW-702 - Create Spark Transformer for AutoML
    • SW-725 - Create an equivalent of h2o.download_all_logs in scala
    • SW-730 - Upgrade H2O to 3.18.0.2

v2.1.22, v2.2.8 (2018-02-14)

Downloads:

  • Technical task
    • SW-652 - Deliver SW documentation in HTML output
  • Bug
    • SW-685 - Fix Typo in documentation
    • SW-695 - Make printHadoopDistributions gradle task available again for testing
    • SW-701 - Kill the client when one of the h2o nodes went OOM in external mode
    • SW-706 - Fix pysparkling.ml import for non-interactive sessions
    • SW-707 - parquet import fails on HDP with Spark 2.0 (azure hdi cluster)
    • SW-708 - Make sure H2OMojoModel does not require H2OContext to be initialized
    • SW-709 - Fix mojo predictions tests
    • SW-710 - In PySparkling pipelines, ensure that if users pass integer to double type we handle that correctly for all possible double values
    • SW-713 - Write a simple test for parquet import in Sparkling Water
    • SW-714 - Add option to H2OModel pipeline step allowing us to convert unknown categoricals to NAs
    • SW-715 - Fix driverif configuration on the external backend
  • Improvement
    • SW-606 - Verify & Document run of RSparkling on top of Databricks Azure cluster
    • SW-678 - Document how to change log location
    • SW-683 - H2OContext can't be initialized on Databricks cloud
    • SW-686 - Fix typo in documentation
    • SW-687 - Upgrade Gradle to 4.5
    • SW-688 - Update docs - SparklyR supports Spark 2.2.1 in the latest release
    • SW-690 - Log Sparkling Water version during startup of Sparkling Water
    • SW-693 - Allow to set driverIf on external H2O backend
    • SW-694 - Fix creation of Extended JAR in gradle task
    • SW-700 - Report Yarn App ID of spark application in H2OContext
    • SW-703 - Upload generated sphinx documentation to S3
    • SW-704 - Update links on the download page to point to the new docs
    • SW-705 - Increase memory for JUNIT tests
    • SW-718 - Upgrade to Gradle 4.5.1
    • SW-719 - Upgrade to H2O 3.18.0.1
    • SW-720 - Fix parquet import test on external backend
  • Docs
    • SW-697 - Final updates for Sparkling Water html output
    • SW-698 - Update "Contributing" section in Sparkling Water

v2.1.21, v2.2.7 (2018-01-18)

Downloads:

  • Bug
    • SW-273 - Remove workaround introduced by SW-272 for yarn/cluster mode
    • SW-551 - Remove hotfix introduced by [SW-541] and implement proper fix
    • SW-662 - Remove extra files that got into the repo
    • SW-666 - Kill the cluster when a new executor joins in the internal backend
    • SW-668 - Generate download link as part of the release notes
    • SW-669 - Remove mentions of local-cluster in public docs
    • SW-670 - Deprecated call in H2OContextInitDemo
    • SW-671 - Fix jenkinsfile for builds against specific h2o branches
  • Improvement
    • SW-674 - Update H2O to 3.16.0.4
    • SW-675 - Tiny clean up of the release code
    • SW-679 - Cleaner release script
    • SW-680 - Ensure S3 in release pipeline does depend only on credentials provided from Jenkins
    • SW-681 - Separate releasing on Github and Publishing artifacts

v2.1.20, v2.2.6 (2018-01-03)

Downloads:

  • Bug
    • SW-627 - [PySparkling] calling as_spark_frame for the second time results in exception
    • SW-630 - Fix ham or spam flow to reflect latest changes in pipelines
    • SW-631 - Ensure that we do not access RDDs in pipelines (to unblock the deployment)
    • SW-646 - Fix inconsistencies in ham or spam examples between scala and python
    • SW-648 - Fix ham or spam pipeline tests
    • SW-649 - Fix ham or spam tests for deeplearning pipeline
    • SW-661 - Use always correct Spark version on the R download page
  • Improvement
    • SW-608 - Measure time of conversions to H2OFrame in debug mode
    • SW-612 - Port all arguments available to Scala ML to PySparkling ML
    • SW-617 - Support for exporting mojo to hdfs
    • SW-632 - Dump full spark configuration during H2OContext.getOrCreate into DEBUG
    • SW-635 - Fix wrong instruction at PySparkling download page
    • SW-637 - Create new DataFrame with new schema when it actually contains any dot in names
    • SW-638 - Port release script into the sw repo
    • SW-639 - Use persist layer for exportPOJOModel
    • SW-640 - Export H2OMOJOModel.createFromMOJO to pysparkling
    • SW-642 - Create test for mojo predictions in PySparkling
    • SW-643 - Add tests for H2ODeeplearning in Scala and Python and Fix potential problems
    • SW-644 - Log spark configuration to INFO level
    • SW-650 - Upgrade Gradle to 4.4.1
    • SW-656 - Upgrade ShadowJar to 2.0.2

v2.1.19, v2.2.5 (2017-12-11)

Downloads:

  • Bug
    • SW-615 - pysparkling.__version__ returns incorrectly ‘SUBST_PROJECT_VERSION’
    • SW-616 - PySparkling fails on python 3.6 because long type does not exist in python 3.6
    • SW-621 - PySparkling failing on Python3.6
    • SW-624 - Python build does not support H2O_PYTHON_WHEEL when building against h2o older than 3.16.0.1
    • SW-628 - PySparkling fails when installed from pypi
  • Improvement
    • SW-626 - Upgrade Gradle to 4.4

v2.1.18, v2.2.4 (2017-12-01)

Downloads:

  • Bug
    • SW-602 - conversion of sparse data DataFrame to H2OFrame is slow
    • SW-620 - Fix obtaining version from bundled h2o inside pysparkling
  • Improvement
    • SW-613 - Append dynamic allocation option into SW tuning documentation.
    • SW-618 - Integration with H2O 3.16.0.2

v2.1.17, v2.2.3 (2017-11-25)

Downloads:

  • Bug
    • SW-320 - H2OConfTest Python test blocks test run
    • SW-499 - BinaryType handling is not implemented in SparkDataFrameConverter
    • SW-535 - asH2OFrame gives error if column names have DOT in it
    • SW-547 - Don’t use md5skip in external mode
    • SW-569 - pysparkling: h2o on exit does not shut down cleanly
    • SW-572 - Additional fix for [SW-571]
    • SW-573 - Minor Gradle build improvements and fixes
    • SW-575 - Incorrect comment in hamOrSpamMojo pipeline
    • SW-576 - Cleanup pysparkling test infrastructure
    • SW-577 - Fix conditions in jenkins file
    • SW-580 - Fix composite build in Jenkins
    • SW-581 - Fix H2OConf test on external cluster
    • SW-582 - Opening Chicago Crime Demo Notebook errors on the first opening
    • SW-584 - Create extended directory automatically
    • SW-588 - Fix links in README
    • SW-589 - Wrap stages in try finally in jenkins file
    • SW-592 - Properly pass all parameters to algorithm
    • SW-593 - H2OConf cannot be initialized on windows
    • SW-594 - Gradle ml submodule reports success even though tests fail
    • SW-595 - Fix ML tests
  • New Feature
    • SW-519 - Introduce SW Models into Spark python pipelines
  • Task
    • SW-609 - Upgrade H2O dependency to 3.16.0.1
  • Improvement
    • SW-318 - Keep H2O version inside sparkling-water-core.jar and provide utility to query it
    • SW-420 - Shell scripts misleading error message
    • SW-504 - Provides Sparkling Water Spark Uber package which can be used in --packages
    • SW-570 - Stop previous jobs in jenkins in case of PR
    • SW-571 - In PySparkling, getOrCreate(spark) still incorrectly complains that we should use spark session
    • SW-583 - Upgrade to Gradle 4.3
    • SW-585 - Add the custom commit status for internal and external pipelines
    • SW-586 - [ML] Remove some duplicities, enable mojo for deep learning
    • SW-590 - Replace deprecated method call in ChicagoCrime python example
    • SW-591 - Repl doesn’t require H2O dependencies to compile
    • SW-596 - Minor build improvements
    • SW-603 - Upgrade Gradle to 4.3.1
    • SW-605 - addFiles doesn’t accept sparkSession
    • SW-610 - Change default client mode to INFO, let user change it at runtime

v2.1.16, v2.2.2 (2017-10-23)

Downloads:

  • Bug
    • SW-555 - Fix documentation issue in PySparkling
    • SW-558 - Increase default value for client connection retry timeout in
    • SW-560 - SW documentation for nthreads is inconsistent with code
    • SW-561 - Fix reporting artefacts in Jenkins and remove use of h2o-3-shared-lib
    • SW-564 - Clean test workspace in jenkins
    • SW-565 - Fix creation of extended jar in jenkins
    • SW-567 - Fix failing tests on external backend
    • SW-568 - Remove obsolete and failing idea configuration
    • SW-559 - GLM fails to build model when weights are specified
  • Improvement
    • SW-557 - Create 2 jenkins files (for internal and external backend) backed by configurable pipeline
    • SW-562 - Disable web on external H2O nodes in external cluster mode
    • SW-563 - In external cluster mode, also print the YARN job ID of the external cluster once context is available
    • SW-566 - Upgrade H2O to 3.14.0.7
    • SW-553 - Improve handling of sparse vectors in internal cluster

v2.1.15, v2.2.1 (2017-10-10)

Downloads:

  • Bug
    • SW-423 - Tests of External Cluster mode fail
    • SW-516 - External cluster improperly converts RDD[ml.linalg.Vector]
    • SW-525 - Don’t use GPU nodes for sparkling water testing in Jenkins
    • SW-526 - Add missing when clause to scripts test stage in Jenkinsfile
    • SW-527 - Use dX cluster for Jenkins testing
    • SW-529 - Code defect in Scala example
    • SW-531 - Use code which is compatible between Scala 2.10 and 2.11
    • SW-532 - Make auto mode in external cluster default for tests in jenkins
    • SW-534 - Ensure that all tests run on both internal and external backends
    • SW-536 - Allow to test sparkling water against specific h2o branch
    • SW-537 - Update Gradle to 4.2RC2
    • SW-538 - Fix problem in Jenkinsfile where H2O_HOME has higher priority than H2O_PYTHON_WHEEL
    • SW-539 - Fix PySparkling issue when running multiple times on the same node
    • SW-541 - Model training hangs in SW
    • SW-542 - sw does not support parquet import
    • SW-552 - Fix documentation bug
  • New Feature
    • SW-521 - Fix typo in documentation
    • SW-523 - Use linux label to determine which nodes are used for Jenkins testing
    • SW-533 - In external cluster, remove notification file at the end. This affects nothing, it is just cleanup.
  • Improvement
    • SW-543 - Upgrade Gradle to 4.2
    • SW-544 - Improve exception in ExternalH2OBackend
    • SW-545 - Stop H2O in afterAll in tests
    • SW-546 - Add sw version to name of h2odriver obtained using get-extended-h2o script
    • SW-549 - Upgrade gradle to 4.2.1
    • SW-550 - Upgrade H2O to 3.14.0.6

v2.1.14, v2.2.0 (2017-08-23)

Downloads:

  • Bug
    • SW-449 - Support Sparse Data during spark-h2o conversions
    • SW-510 - The link Demo Example from Git is broken on the download page
  • New Feature
  • Improvement
    • SW-514 - Upgrade H2O to 3.14.0.2
    • SW-395 - bin/sparkling-shell should fail if assembly jar file does not exist
    • SW-471 - Use mojo in pipelines if possible, remove H2OPipeline and OneTimeTransformers
    • SW-512 - Make JenkinsFile up-to-date with sparkling_yarn_branch
    • SW-513 - Upgrade to Gradle 4.1

v2.1.13 (2017-08-02)

Downloads:

  • Bug
    • SW-501 - Security Bug when using Security.enableSSL(spark)
    • SW-505 - Travis build is failing on missing OracleJdk7
  • Improvement
    • SW-355 - Include H2O R client distribution in Sparkling Water binary
    • SW-500 - Warehouse dir does not have to be set in tests on Spark from 2.1+
    • SW-506 - Documentation for the backends should mention get-extended-h2o.sh instead of manual jar extending
    • SW-507 - Upgrade to Gradle 4.0.2
    • SW-508 - More robust get-extended-h2o.sh
    • SW-509 - Add back DEVEL.md and CHANGELOG.md and redirect to new versions

v2.1.12 (2017-07-17)

Downloads:

  • Improvement
    • SW-490 - Upgrade Gradle to 4.0.1
    • SW-491 - Increase default value for Write and Read confirmation timeout
    • SW-492 - Remove dead code and deprecation warning in tests
    • SW-493 - Enforce Scala Style rules
    • SW-494 - Remove hard dependency on RequestServer by using RestApiContext
    • SW-496 - Remove ignored empty “H2OFrame[Time] to DataFrame[TimeStamp]” test
    • SW-498 - Upgrade H2O to 3.10.5.4

v2.1.11 (2017-07-12)

Downloads:

  • Bug
    • SW-407 - Make scala H2OConf consistent and allow to set and get all properties
  • Improvement
    • SW-485 - Update instructions for a new PYPI.org
    • SW-489 - Upgrade H2O to 3.10.5.3

v2.1.10 (2017-06-29)

Downloads:

  • Bug
    • SW-469 - Remove accidentally added kerb.conf file
    • SW-470 - Allow to pass sparkSession to Security.enableSSL and deprecate sparkContext
    • SW-474 - Use deprecated HTTPClient as some CDH versions do not have the new method
    • SW-475 - Handle duke library in case it’s loaded using --packages
    • SW-479 - Fix CHANGELOG location in make-dist.sh
  • Improvement
    • SW-457 - Clean up windows scripts
    • SW-466 - Separate Devel.md into multiple rst files
    • SW-472 - Convert to rst README in gradle dir
    • SW-473 - Upgrade to gradle 4.0
    • SW-477 - Upgrade H2O to 3.10.5.2
    • SW-480 - Bring back publishToMavenLocal task
    • SW-482 - Updates to change log location
    • SW-483 - Make rel-2.1 changelog consistent and also rst

v2.1.9 (2017-06-15)

Downloads:

  • Technical task
    • SW-211 - In PySparkling for spark 2.0 document how to build the package
  • Bug
    • SW-448 - Add missing jar into the assembly
    • SW-450 - Fix instructions on the download site
    • SW-453 - Use size method to get attr num
    • SW-454 - Replace sparkSession with spark in backends documentation
    • SW-456 - Make shell scripts safe
    • SW-459 - Update PySparkling run-time dependencies
    • SW-461 - Fix wrong getters and setters in pysparkling
    • SW-467 - Fix typo in the FAQ documentation
    • SW-468 - Fix make-dist
  • New Feature
    • SW-455 - Replace the remaining references to egg files
  • Improvement
    • SW-24 - Append tab on Sparkling Water download page - how to use Sparkling Water package
    • SW-111 - Update FAQ with information about hive metastore location
    • SW-112 - Sparkling Water Tuning doc: add heartbeat documentation
    • SW-311 - Please report Application Type to Yarn Resource Manager
    • SW-340 - Improve structure of SW README
    • SW-426 - Allow to download sparkling water logs from the spark UI
    • SW-444 - Remove references to Spark 1.5, 1.4 (as it’s old) in README.rst and other docs
    • SW-447 - Upgrade H2O to 3.10.5.1
    • SW-452 - Add missing spaces after “,” in H2OContextImplicits
    • SW-460 - Allow to configure flow dir location in SW
    • SW-463 - Extract sparkling water configuration to extra doc in rst format
    • SW-465 - Mark tensorflow demo as experimental

v2.1.8 (2017-05-25)

Downloads:

  • Bug
    • SW-263 - Cannot run build in parallel because of Python module
    • SW-336 - Wrong documentation of PyPi h2o_pysparkling_2.0 package
    • SW-430 - pysparkling: adding a column to a data frame does not work when the original frame is parsed in Spark
    • SW-431 - Allow to pass additional arguments to run-python-script.sh
    • SW-436 - Fix getting of sparkling water jar in pysparkling
    • SW-437 - Don’t call atexit in case of pysparkling in cluster deploy mode
    • SW-438 - store h2o logs into unique directories
    • SW-439 - handle interrupted exception in H2ORuntimeInfoUIThread
    • SW-335 - Cannot install pysparkling from PyPi
  • Improvement
    • SW-445 - Remove information from README.rst that pip cannot be used
    • SW-341 - Support Python 3 distribution
    • SW-380 - Define Jenkins pipeline via Jenkinsfile
    • SW-433 - Add change logs link to the sw download page
    • SW-435 - Upgrade shadow jar plugin to 2.0.0
    • SW-440 - Sparkling Water cluster name should contain spark app id instead of random number
    • SW-441 - Replace deprecated DefaultHTTPClient in AnnouncementService
    • SW-442 - Get array size from metadata in case of ml.linalg.VectorUDT
    • SW-443 - Upgrade H2O version to 3.10.4.8

v2.1.7 (2017-05-10)

Downloads:

  • Bug
    • SW-429 - Different cluster name between client and h2o nodes in case of external cluster

v2.1.6 (2017-05-09)

Downloads:

  • Improvement
    • SW-424 - Add SW tab in Spark History Server
    • SW-427 - Upgrade H2O dependency to 3.10.4.7

v2.1.5 (2017-04-27)

Downloads:

  • Bug
    • SW-421 - External cluster: Job is reporting exit status as FAILED even though all mappers return 0
  • Improvement
    • SW-422 - Upgrade H2O dependency to 3.10.4.6

v2.1.4 (2017-04-20)

Downloads:

  • Bug
    • SW-65 - Add pysparkling instruction to download page
    • SW-365 - Proper exit status handling of external cluster
    • SW-398 - Use timeout for read/write confirmation in external cluster mode
    • SW-400 - Fix stopping of H2OContext in case of running standalone application
    • SW-401 - Add configuration property to external backend allowing to specify the maximal timeout the cloud will wait for watchdog client to connect
    • SW-405 - Use correct quote in backend documentation
    • SW-408 - Use kwargs for h2o.connect in pysparkling
    • SW-409 - Fix stopping of python tests
    • SW-410 - Honor --core Spark settings in H2O executors
  • Improvement
    • SW-231 - Sparkling Water download page is missing PySparkling/RSparkling info
    • SW-404 - Upgrade H2O dependency to 3.10.4.4
    • SW-406 - Download page should list available jars for external cluster.
    • SW-411 - Migrate Pysparkling tests and examples to SparkSession
    • SW-412 - Upgrade H2O dependency to 3.10.4.5

v2.1.3 (2017-04-07)

Downloads:

  • Bug
    • SW-334 - as_factor() ‘corrupts’ dataframe if it fails
    • SW-353 - Kerberos for SW not loading JAAS module
    • SW-364 - Repl session not set on scala 2.11
    • SW-368 - bin/pysparkling.cmd is missing
    • SW-371 - Fix MarkDown syntax
    • SW-372 - Run negative test for PUBDEV-3808 multiple times to observe failure
    • SW-375 - Documentation fix in external cluster manual
    • SW-376 - Tests for DecimalType and DataType fail on external backend
    • SW-377 - Implement stopping of external H2O cluster in external backend mode
    • SW-383 - Update PySparkling README with info about SW-335 and using SW from Pypi
    • SW-385 - Fix residual plot R code generator
    • SW-386 - SW REPL cannot be used in combination with Spark Dataset
    • SW-387 - Fix typo in setClientIp method
    • SW-388 - Stop h2o when running inside standalone pysparkling job
    • SW-389 - Extending h2o jar from SW doesn’t work when the jar is already downloaded
    • SW-392 - Python in gradle is using wrong python - it doesn’t respect the PATH variable
    • SW-393 - Allow to specify timeout for h2o cloud up in external backend mode
    • SW-394 - Allow to specify log level to external h2o cluster
    • SW-396 - Create setter in pysparkling conf for h2o client log level
    • SW-397 - Better error message covering the most common case when cluster info file doesn’t exist
  • Improvement
    • SW-296 - H2OConf remove nulls and make it more Scala-like
    • SW-367 - Add task to Gradle build which prints all available Hadoop distributions for the corresponding h2o
    • SW-382 - Upgrade of H2O dependency to 3.10.4.3

v2.1.2 (2017-03-20)

Downloads:

  • Bug
    • SW-361 - Flow is not available in Sparkling Water
    • SW-362 - PySparkling does not work
  • Improvement
    • SW-344 - Use Spark public DNS if available to report Flow UI

v2.1.1 (2017-03-18)

Downloads:

  • Bug
    • SW-308 - Intermittent failure in creating H2O cloud
    • SW-321 - composite function fails when inner cbind()
    • SW-342 - Environment detection does not work with Spark 2.1
    • SW-347 - Cannot start Sparkling Water at HDP Yarn cluster
    • SW-349 - Sparkling Shell scripts for Windows do not work
    • SW-350 - Fix command line environment for Windows
    • SW-357 - PySparkling in Zeppelin environment using wrong class loader
  • Improvement
    • SW-333 - ApplicationMaster info in Yarn for external cluster
    • SW-337 - Use h2o.connect in PySpark to connect to H2O cluster
    • SW-345 - Create configuration manual for External cluster
    • SW-356 - Improve documentation for spark.ext.h2o.fail.on.unsupported.spark.param
    • SW-360 - Upgrade H2O dependency to 3.10.4.2

v2.1.0 (2017-03-02)

Downloads:

  • Bug
    • SW-331 - Security.enableSSL does not work
  • Improvement
    • SW-302 - Support Spark 2.1.0
    • SW-325 - Implement a generic announcement mechanism
    • SW-326 - Add support to Spark 2.1 in Sparkling Water
    • SW-327 - Enrich Spark UI with Sparkling Water specific tab