use_all_factor_levels
- Available in: Deep Learning, PCA
- Hyperparameter: no
Description
This option specifies whether to use all factor levels in the possible set of predictors. It is disabled by default, so the first factor level of each categorical column is skipped when the column is expanded into indicator columns. If you enable this option, the first factor level is kept and every level gets its own indicator column. In the example below, patch.Ref1a appears in the eigenvectors only when the option is enabled. Note also that if you enable this option, then sufficient regularization is required.
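The effect on the expanded columns can be illustrated with a minimal pure-Python sketch. It is not H2O's implementation: the helper expand_factor is hypothetical, and it assumes the levels are ordered lexicographically so that the "first" level is the smallest one. It also hints at why regularization matters: when all levels are kept, the indicator columns of a factor sum to 1 in every row, so they are linearly dependent on a constant column.

```python
def expand_factor(values, use_all_factor_levels=False):
    """Expand one categorical column into 0/1 indicator columns.

    Hypothetical helper for illustration only. Levels are sorted; when
    use_all_factor_levels is False, the first level is dropped and acts
    as the implicit reference level.
    """
    levels = sorted(set(values))
    kept = levels if use_all_factor_levels else levels[1:]
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# A toy categorical column (level names borrowed from the Birds example)
patch = ["Ref1a", "Ref1b", "Ref1c", "Ref1a"]

all_levels, all_rows = expand_factor(patch, use_all_factor_levels=True)
skip_levels, skip_rows = expand_factor(patch, use_all_factor_levels=False)

print(all_levels)                    # ['Ref1a', 'Ref1b', 'Ref1c']
print(skip_levels)                   # ['Ref1b', 'Ref1c']

# With all levels kept, every row sums to 1 (redundant with a constant column)
print([sum(r) for r in all_rows])    # [1, 1, 1, 1]
# With the first level skipped, rows of the reference level sum to 0
print([sum(r) for r in skip_rows])   # [0, 1, 1, 0]
```

Skipping the first level removes that redundancy. Keeping it gives each level its own column, at the cost of a linearly dependent set of columns.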
Example
# Load the h2o package and start (or connect to) a local H2O cluster
library(h2o)
h2o.init()
# Load the Birds dataset
birds.hex <- h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/smalldata/pca_test/birds.csv")
# Train using all factor levels
birds.pca <- h2o.prcomp(training_frame = birds.hex, transform = "STANDARDIZE",
                        k = 3, pca_method = "Power", use_all_factor_levels = TRUE)
# View the importance of components
birds.pca@model$importance
Importance of components:
                            pc1      pc2      pc3
Standard deviation     1.546397 1.348276 1.055239
Proportion of Variance 0.300269 0.228258 0.139820
Cumulative Proportion  0.300269 0.528527 0.668347
# View the eigenvectors
birds.pca@model$eigenvectors
Rotation:
                  pc1       pc2       pc3
patch.Ref1a  0.009848 -0.005947  0.001061
patch.Ref1b -0.001628 -0.014739  0.001007
patch.Ref1c  0.004994 -0.009486  0.000523
patch.Ref1d  0.000117 -0.004400  0.004917
patch.Ref1e  0.003627 -0.001467  0.004268
---
                  pc1       pc2       pc3
S            0.515048  0.226915  0.123136
year        -0.066269 -0.069526 -0.971250
area         0.414050  0.344332 -0.149339
log.area.    0.497313  0.363609 -0.131261
ENN         -0.390235  0.545631  0.007944
log.ENN.    -0.345665  0.562834  0.002092
# Train again without using all factor levels
birds2.pca <- h2o.prcomp(training_frame = birds.hex, transform = "STANDARDIZE",
                         k = 3, pca_method = "Power", use_all_factor_levels = FALSE)
# View the importance of components
birds2.pca@model$importance
Importance of components:
                            pc1      pc2      pc3
Standard deviation     1.544463 1.342094 1.054848
Proportion of Variance 0.309387 0.233622 0.144320
Cumulative Proportion  0.309387 0.543008 0.687328
# View the eigenvectors
birds2.pca@model$eigenvectors
Rotation:
                  pc1       pc2       pc3
patch.Ref1b -0.001469  0.014976  0.000849
patch.Ref1c  0.005120  0.009480  0.000457
patch.Ref1d  0.000164  0.004468  0.004877
patch.Ref1e  0.003656  0.001399  0.004283
patch.Ref1g  0.005728  0.002821 -0.003653
---
                  pc1       pc2       pc3
S            0.510775 -0.233390  0.123700
year        -0.064706  0.068396 -0.973014
area         0.409889 -0.355035 -0.145441
log.area.    0.494189 -0.379361 -0.125400
ENN         -0.397489 -0.543776  0.012354
log.ENN.    -0.355681 -0.554631  0.002802