Fill NAs

Use this function to fill NA values sequentially, up to a specified limit. You specify the fill direction (forward, the default, or backward), whether the NAs are filled along rows (default) or columns, and the maximum number of consecutive NAs to fill (defaults to 1).
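The effect of the direction and the consecutive-NA limit can be sketched in plain Python, without an H2O cluster. This is a minimal illustration of the semantics described above, not H2O's implementation: the helper names `fill_forward` and `fill_backward` are hypothetical, and the sketch assumes the limit counts NAs from the last observed value.

```python
import math

def fill_forward(values, maxlen=1):
    """Copy the last observed value into at most `maxlen` consecutive NaNs."""
    filled = []
    last_seen = math.nan  # most recent non-NaN value; NaN until one is seen
    run = 0               # length of the current run of NaNs
    for value in values:
        if math.isnan(value):
            run += 1
            # Fill only while the run is within the limit.
            filled.append(last_seen if run <= maxlen else math.nan)
        else:
            last_seen = value
            run = 0
            filled.append(value)
    return filled

def fill_backward(values, maxlen=1):
    """Backward fill is a forward fill over the reversed sequence."""
    return fill_forward(values[::-1], maxlen)[::-1]

data = [math.nan, 1.0, math.nan, math.nan, 3.0]
print(fill_forward(data, maxlen=1))   # [nan, 1.0, 1.0, nan, 3.0]
print(fill_forward(data, maxlen=2))   # [nan, 1.0, 1.0, 1.0, 3.0]
print(fill_backward(data, maxlen=1))  # [1.0, 1.0, nan, 3.0, 3.0]
```

A leading NaN stays NaN under a forward fill because there is no earlier value to copy, which is also what the first row of the frames below shows.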

library(h2o)
h2o.init()

# Create a random data frame with 6 rows and 2 columns.
# Specify that no more than 70% of the values are NAs.
fr.with.nas <- h2o.createFrame(categorical_fraction = 0.0,
                               missing_fraction = 0.7,
                               rows = 6,
                               cols = 2,
                               seed = 123)
fr.with.nas
         C1        C2
1       NaN       NaN
2 -77.10471 -93.64087
3 -13.65926  57.44389
4       NaN       NaN
5  39.10130       NaN
6       NaN  55.43136

[6 rows x 2 columns]

# Forward fill along each row. In R, the values for axis are 1 (row-wise) and 2 (column-wise).
fr <- h2o.fillna(fr.with.nas, "forward", axis=1, maxlen=1L)
fr
         C1        C2
1       NaN       NaN
2 -77.10471 -93.64087
3 -13.65926  57.44389
4       NaN       NaN
5  39.10130  39.10130
6       NaN  55.43136

[6 rows x 2 columns]

import h2o
h2o.init()

# Create a random data frame with 10 rows and 3 columns.
# Specify that no more than 20% of the values are NAs.
df = h2o.create_frame(rows=10,
                      cols=3,
                      real_fraction=1.0,
                      real_range=100,
                      missing_fraction=0.2,
                      seed=123)
df
      C1        C2        C3
--------  --------  --------
nan       nan       -77.1047
-93.6409  -13.6593   57.4439
-93.71     25.4342   39.1013
-95.8291  -92.4271   55.4314
 84.6372  -43.4759   53.1715
-57.9583   27.4148  -26.9013
 83.0921  -62.7819  -91.9426
-77.9814   64.3228  -93.954
nan       -80.6142  nan
 27.1672   60.5492  -13.2275

[10 rows x 3 columns]

# Forward fill down each column. In Python, axis=0 fills column-wise (down the rows)
# and axis=1 fills row-wise (across the columns).
filled = df.fillna(method="forward", axis=0, maxlen=1)
filled

      C1        C2        C3
--------  --------  --------
nan       nan       -77.1047
-93.6409  -13.6593   57.4439
-93.71     25.4342   39.1013
-95.8291  -92.4271   55.4314
 84.6372  -43.4759   53.1715
-57.9583   27.4148  -26.9013
 83.0921  -62.7819  -91.9426
-77.9814   64.3228  -93.954
-77.9814  -80.6142  -93.954
 27.1672   60.5492  -13.2275

[10 rows x 3 columns]
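The two outputs above fill in different directions: in the R result, the NaN in row 5 takes the value to its left, while in the Python result, the NaNs in row 9 take the values directly above them. The sketch below reproduces both results on small excerpts of those frames using plain Python lists. It is only an illustration under stated assumptions: `fill_across_rows` and `fill_down_columns` are hypothetical helper names, not part of the H2O API.

```python
import math

NAN = math.nan

def fill_forward(values, maxlen=1):
    """Forward-fill a list, filling at most `maxlen` consecutive NaNs."""
    out, last, run = [], NAN, 0
    for v in values:
        run = run + 1 if math.isnan(v) else 0
        if run == 0:
            last = v  # remember the latest observed value
        out.append(last if run <= maxlen else NAN)
    return out

def fill_across_rows(grid, maxlen=1):
    """Fill each row from left to right."""
    return [fill_forward(row, maxlen) for row in grid]

def fill_down_columns(grid, maxlen=1):
    """Fill each column from top to bottom by transposing, filling, transposing back."""
    columns = [list(col) for col in zip(*grid)]
    filled = [fill_forward(col, maxlen) for col in columns]
    return [list(row) for row in zip(*filled)]

# Rows 4-6 of the R example frame.
r_rows = [[NAN, NAN], [39.10130, NAN], [NAN, 55.43136]]
across = fill_across_rows(r_rows)
print(across[1])  # [39.1013, 39.1013]

# Rows 8-10 of the Python example frame.
py_rows = [[-77.9814, 64.3228, -93.954],
           [NAN, -80.6142, NAN],
           [27.1672, 60.5492, -13.2275]]
down = fill_down_columns(py_rows)
print(down[1])  # [-77.9814, -80.6142, -93.954]
```

Rows that start with a NaN are left unchanged by the row-wise fill, since there is no value to the left to carry forward.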