Fill NAs

Use this function to fill in NA values in a sequential manner up to a specified limit. When using this function, you will specify whether the method to fill the NAs should go forward (default) or backward, whether the NAs should be filled along rows (default) or columns, and the maximum number of consecutive NAs to fill (defaults to 1).

library(h2o)
h2o.init()

# Create a random data frame with 6 rows and 2 columns.
# Specify that no more than 70% of the values are NAs.
fr_with_nas = h2o.createFrame(categorical_fraction = 0.0,
                              missing_fraction = 0.7,
                              rows = 6,
                              cols = 2,
                              seed = 123)
fr_with_nas
         C1        C2
1       NaN       NaN
2 -77.10471 -93.64087
3 -13.65926  57.44389
4       NaN       NaN
5  39.10130       NaN
6       NaN  55.43136

[6 rows x 2 columns]

# Forward fill a row. In R, the values for axis are 1 (row-wise) and 2 (column-wise)
fr <- h2o.fillna(fr_with_nas, "forward", axis = 1, maxlen = 1L)
fr
         C1        C2
1       NaN       NaN
2 -77.10471 -93.64087
3 -13.65926  57.44389
4       NaN       NaN
5  39.10130  39.10130
6       NaN  55.43136

[6 rows x 2 columns]
import h2o
h2o.init()

# Create a random data frame with 10 rows and 3 columns.
# Specify that no more than 20% of the values are NAs.
df = h2o.create_frame(rows=10,
                      cols=3,
                      real_fraction=1.0,
                      real_range=100,
                      missing_fraction=0.2,
                      seed=123)
df
      C1        C2        C3
--------  --------  --------
nan       nan       -77.1047
-93.6409  -13.6593   57.4439
-93.71     25.4342   39.1013
-95.8291  -92.4271   55.4314
 84.6372  -43.4759   53.1715
-57.9583   27.4148  -26.9013
 83.0921  -62.7819  -91.9426
-77.9814   64.3228  -93.954
nan       -80.6142  nan
 27.1672   60.5492  -13.2275

[10 rows x 3 columns]

# Forward fill a row. In Python, the values for axis are 0 (row-wise) and 1 (column-wise)
filled = df.fillna(method="forward",axis=0,maxlen=1)
filled

filled
      C1        C2        C3
--------  --------  --------
nan       nan       -77.1047
-93.6409  -13.6593   57.4439
-93.71     25.4342   39.1013
-95.8291  -92.4271   55.4314
 84.6372  -43.4759   53.1715
-57.9583   27.4148  -26.9013
 83.0921  -62.7819  -91.9426
-77.9814   64.3228  -93.954
-77.9814  -80.6142  -93.954
27.1672   60.5492  -13.2275

[10 rows x 3 columns]