Combining Rows from Two Datasets
--------------------------------

You can use the ``rbind`` function to combine two similar datasets into a single large dataset. This can be used, for example, to create a larger dataset by combining data from a validation dataset with its training or testing dataset.

Note that when using ``rbind``, the two datasets must have the same set of columns.

.. example-code::
   .. code-block:: r
   
	> library(h2o)
	> h2o.init(nthreads=-1)
	
	# Import exsiting training and testing datasets
	> ecg1Path = "../../../smalldata/anomaly/ecg_discord_train.csv"
	> ecg1.hex = h2o.importFile(path=ecg1Path, destination_frame="ecg1.hex")
	> ecg2Path = "../../../smalldata/anomaly/ecg_discord_test.csv"
	> ecg2.hex = h2o.importFile(path=ecg2Path, destination_frame="ecg2.hex")

	# Combine the two datasets into a single, larger dataset
	> ecgCombine.hex <- h2o.rbind(ecg1.hex, ecg2.hex)

   .. code-block:: python

	>>> import h2o
	>>> import numpy as np
	>>> h2o.init()
	
	# Generate a random dataset with 100 rows 4 columns. Label the columns A, B, C, and D.
	>>> df1 = h2o.H2OFrame.from_python(np.random.randn(100,4).tolist(), column_names=list('ABCD'))
	>>> df1.describe
	          A           B            C            D
	-----------  ----------  -----------  -----------
	nan          nan         nan          nan
	 -0.148045     0.516651   -0.218871    -2.11336
	  0.818191    -1.07749    -0.303827     0.0234708
	 -0.894042    -1.83727     1.69621     -0.306524
	 -1.90056      0.528147   -0.745829     0.325673
	 -1.14653      0.146565   -1.12463     -1.39162
	  0.81608      0.21313    -0.122169     1.47247
	  0.419028     1.14975     0.913349     0.975779
	  0.419134    -1.63199     0.633799     0.482761
	  0.0366856   -1.09199    -0.0831492    2.17306
	
	[101 rows x 4 columns]
	
	# Generate a second random dataset with 100 rows and 4 columns. Again, label the columns, A, B, C, and D.
	>>> df2 = h2o.H2OFrame.from_python(np.random.randn(100,4).tolist(), column_names=list('ABCD'))
	>>> df2.describe
	          A            B           C           D
	-----------  -----------  ----------  ----------
	nan          nan          nan         nan
	  0.626459    -1.80634     -1.08245     1.29828
	  1.31526     -0.223264     0.172243   -0.76666
	  1.70095     -0.666482    -0.486086   -1.16518
	 -0.241271    -1.08439      1.75451     1.37618
	 -0.151067    -0.830386     0.7113     -0.979204
	 -2.18042     -1.85949     -0.466211    0.707786
	 -0.0657297   -0.0092001    1.3721     -0.570298
	  1.59816     -0.149408    -0.874023   -0.883033
	 -0.367047    -0.586965    -0.98553    -1.33043
	
	[101 rows x 4 columns]
	
	# Bind the rows from the second dataset into the first dataset.
	>>> df1.rbind(df2)
	>>> df1.describe
          	A           B            C            D
	-----------  ----------  -----------  -----------
	nan          nan         nan          nan
	 -0.148045     0.516651   -0.218871    -2.11336
	  0.818191    -1.07749    -0.303827     0.0234708
	 -0.894042    -1.83727     1.69621     -0.306524
	 -1.90056      0.528147   -0.745829     0.325673
	 -1.14653      0.146565   -1.12463     -1.39162
	  0.81608      0.21313    -0.122169     1.47247
	  0.419028     1.14975     0.913349     0.975779
	  0.419134    -1.63199     0.633799     0.482761
	  0.0366856   -1.09199    -0.0831492    2.17306
	
	[202 rows x 4 columns]