Principal component analysis of an H2O data frame

Principal components analysis of an H2O data frame using the power method to calculate the singular value decomposition of the Gram matrix.
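
The idea of applying a power method to the Gram matrix can be illustrated in base R. The following is a minimal sketch, not H2O's implementation: it assumes a small in-memory matrix (the built-in `iris` data), standardized columns, and recovers only the first principal component. The variable names (`gram`, `v`, `sdev1`) are chosen for illustration.

```r
# Power iteration on the Gram matrix t(X) %*% X, using base R only.
# Illustrative sketch; H2O's distributed implementation may differ.
set.seed(42)

# Standardize the numeric columns (zero mean, unit standard deviation).
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = TRUE)
gram <- crossprod(X)  # t(X) %*% X, a 4 x 4 symmetric matrix

# Start from a random unit vector and repeatedly multiply by the Gram matrix.
v <- rnorm(ncol(X))
v <- v / sqrt(sum(v^2))
for (i in seq_len(1000)) {
  w <- gram %*% v
  w <- w / sqrt(sum(w^2))
  converged <- sqrt(sum((w - v)^2)) < 1e-12
  v <- w
  if (converged) break
}
v <- as.vector(v)

# The Rayleigh quotient gives the leading eigenvalue of the Gram matrix;
# dividing by (n - 1) and taking the square root gives the standard
# deviation of the first principal component.
eigenvalue <- as.numeric(t(v) %*% gram %*% v)
sdev1 <- sqrt(eigenvalue / (nrow(X) - 1))

# Reference result from base R's prcomp() on the same standardized data.
ref <- prcomp(X, center = FALSE, scale. = FALSE)
```

Up to sign, `v` should agree with `ref$rotation[, 1]`, and `sdev1` with `ref$sdev[1]`.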

h2o.prcomp(training_frame, x, model_id = NULL, validation_frame = NULL,
  ignore_const_cols = TRUE, score_each_iteration = FALSE,
  transform = c("NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE"),
  pca_method = c("GramSVD", "Power", "Randomized", "GLRM"), k = 1,
  max_iterations = 1000, use_all_factor_levels = FALSE,
  compute_metrics = TRUE, impute_missing = FALSE, seed = -1,
  max_runtime_secs = 0)

Arguments

training_frame	Id of the training data frame.
x	A vector containing the `character` names of the predictors in the model.
model_id	Destination id for this model; auto-generated if not specified.
validation_frame	Id of the validation data frame.
ignore_const_cols	`Logical`. Ignore constant columns. Defaults to TRUE.
score_each_iteration	`Logical`. Whether to score during each iteration of model training. Defaults to FALSE.
transform	Transformation of training data. Must be one of: "NONE", "STANDARDIZE", "NORMALIZE", "DEMEAN", "DESCALE". Defaults to NONE.
pca_method	Method for computing PCA. Must be one of: "GramSVD", "Power", "Randomized", "GLRM". Defaults to GramSVD. Caution: GLRM is currently experimental and unstable.
k	Rank of matrix approximation. Defaults to 1.
max_iterations	Maximum number of training iterations. Defaults to 1000.
use_all_factor_levels	`Logical`. Whether the first factor level is included in each categorical expansion. Defaults to FALSE.
compute_metrics	`Logical`. Whether to compute metrics on the training data. Defaults to TRUE.
impute_missing	`Logical`. Whether to impute missing entries with the column mean. Defaults to FALSE.
seed	Seed for random number generation. Affects only the stochastic parts of the algorithm, which may or may not be enabled by default. Defaults to -1 (time-based random number).
max_runtime_secs	Maximum allowed runtime in seconds for model training. Use 0 to disable. Defaults to 0.
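
The `transform` options can be pictured with a base R sketch. The definitions below are an assumption based on the option names and the usual meaning of these terms (DEMEAN subtracts the column mean, DESCALE divides by the column standard deviation, STANDARDIZE does both, NORMALIZE subtracts the mean and divides by the column range); they should be checked against the H2O documentation before being relied on.

```r
# Assumed meaning of each `transform` option, applied to a plain matrix.
# This is an illustration in base R, not a call into H2O.
X <- as.matrix(iris[, 1:4])

col_mean  <- colMeans(X)
col_sd    <- apply(X, 2, sd)
col_range <- apply(X, 2, function(col) max(col) - min(col))

transforms <- list(
  NONE        = X,
  DEMEAN      = sweep(X, 2, col_mean, "-"),
  DESCALE     = sweep(X, 2, col_sd, "/"),
  STANDARDIZE = sweep(sweep(X, 2, col_mean, "-"), 2, col_sd, "/"),
  NORMALIZE   = sweep(sweep(X, 2, col_mean, "-"), 2, col_range, "/")
)

# Column means and standard deviations under each transformation.
summary_table <- t(sapply(transforms, function(m) {
  c(mean = mean(colMeans(m)), sd = mean(apply(m, 2, sd)))
}))
print(round(summary_table, 3))
```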

Value

Returns an object of class H2ODimReductionModel.

References

N. Halko, P.G. Martinsson, J.A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev., Survey and Review section, Vol. 53, No. 2, pp. 217-288, June 2011. http://arxiv.org/abs/0909.4061

Examples

# NOT RUN {
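library(h2o)
h2o.init()  # start or connect to a local H2O cluster

# Locate the sample data set shipped with the h2o package and upload it.
ausPath <- system.file("extdata", "australia.csv", package = "h2o")
australia.hex <- h2o.uploadFile(path = ausPath)

# Fit a rank-8 PCA on standardized columns.
h2o.prcomp(training_frame = australia.hex, k = 8, transform = "STANDARDIZE")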
# }
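
A fitted model is usually inspected after training. The sketch below assumes a running H2O cluster, and assumes that the returned H2ODimReductionModel exposes `importance` and `eigenvectors` entries in its `model` slot and that `h2o.predict()` returns the projected components; these details are not stated on this page and may vary between h2o versions. The names `pca_model` and `scores` are illustrative.

```r
library(h2o)
h2o.init()

ausPath <- system.file("extdata", "australia.csv", package = "h2o")
australia.hex <- h2o.uploadFile(path = ausPath)

# Keep the fitted model so it can be inspected afterwards.
pca_model <- h2o.prcomp(training_frame = australia.hex, k = 3,
                        transform = "STANDARDIZE")

# Assumed entries of the model slot: variance explained per component
# and the component loadings.
print(pca_model@model$importance)
print(pca_model@model$eigenvectors)

# Project the training frame onto the fitted components.
scores <- h2o.predict(pca_model, australia.hex)
print(head(scores))
```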
