mice.impute.rf

Imputation by random forests


Description

Imputes univariate missing data using random forests.

Usage

mice.impute.rf(y, ry, x, wy = NULL, ntree = 10, ...)

Arguments

y

Vector to be imputed

ry

Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. The ry generally distinguishes the observed (TRUE) and missing values (FALSE) in y.

x

Numeric design matrix with length(y) rows containing the predictors for y. Matrix x must not contain missing values.

wy

Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.

ntree

The number of trees to grow. The default is 10.

...

Other named arguments passed down to mice:::install.on.demand(), randomForest::randomForest() and randomForest:::randomForest.default().
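The shapes of these arguments can be checked on a small example. The following is a minimal sketch in base R using made-up toy values (the column names age and chl are only illustrative). It assumes that leaving wy = NULL is equivalent to wy = !ry, that is, imputations are created for the missing cases.

```r
# Toy data: five cases, the second and fourth values of y are missing.
y  <- c(21.5, NA, 30.1, NA, 27.4)
ry <- !is.na(y)   # TRUE where y is observed
wy <- !ry         # assumed behaviour of wy = NULL: impute the missing cases

# Numeric design matrix: one row per element of y, no missing values.
x <- cbind(age = c(1, 2, 1, 3, 2),
           chl = c(187, 204, 199, 218, 229))

# The argument contract described above.
stopifnot(length(ry) == length(y),
          length(wy) == length(y),
          nrow(x) == length(y),
          !anyNA(x))

n_fit    <- sum(ry)  # number of cases the imputation model is fitted on
n_impute <- sum(wy)  # number of imputations that will be returned
```

In normal use mice() builds y, ry, x and wy itself and calls the method once per incomplete variable, so constructing them by hand is mainly useful for testing.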

Details

Imputation of y by random forests. The method calls randomForest::randomForest(), which implements Breiman's random forest algorithm (based on Breiman and Cutler's original Fortran code) for classification and regression. See Appendix A.1 of Doove et al. (2014) for the definition of the algorithm used.
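The idea behind this kind of imputation can be illustrated with a toy sketch in base R. It assumes the scheme is: grow each tree on a bootstrap sample of the observed cases, collect for every case to be imputed the observed values ("donors") that share its leaf in each tree, pool those donors over all trees, and draw one at random. The function toy_rf_impute() is hypothetical and not part of mice; it uses one-split stumps instead of the trees grown by randomForest() and handles numeric y only, so it shows the donor-pooling step rather than the actual implementation.

```r
# Hypothetical helper, NOT part of mice: donor pooling with one-split stumps.
toy_rf_impute <- function(y, ry, x, wy = !ry, ntree = 10) {
  xobs <- x[ry, , drop = FALSE]
  xmis <- x[wy, , drop = FALSE]
  yobs <- y[ry]
  nobs <- length(yobs)

  # One "tree": bootstrap the observed cases, split on one randomly chosen
  # predictor at its bootstrap median, and return the donors per target case.
  one_stump <- function() {
    boot     <- sample.int(nobs, nobs, replace = TRUE)
    j        <- sample.int(ncol(xobs), 1)
    cut      <- median(xobs[boot, j])
    left_obs <- xobs[boot, j] <= cut
    lapply(xmis[, j] <= cut, function(goes_left) {
      donors <- yobs[boot][left_obs == goes_left]
      # Empty leaf: fall back to the whole bootstrap sample.
      if (length(donors) == 0) yobs[boot] else donors
    })
  }

  forest <- replicate(ntree, one_stump(), simplify = FALSE)

  # Pool the donors of each target case over all stumps, then draw one.
  vapply(seq_len(sum(wy)), function(i) {
    pool <- unlist(lapply(forest, `[[`, i))
    pool[sample.int(length(pool), 1)]
  }, FUN.VALUE = numeric(1))
}

# Usage on made-up data.
set.seed(1)
y  <- c(21.5, NA, 30.1, NA, 27.4, 22.0, 35.3)
ry <- !is.na(y)
x  <- cbind(age = c(1, 2, 1, 3, 2, 1, 3),
            chl = c(187, 204, 199, 218, 229, 184, 240))
imp <- toy_rf_impute(y, ry, x, ntree = 5)

y_completed <- y
y_completed[!ry] <- imp   # one imputation per missing case
```

Because every imputation is drawn from observed donor values, the imputed values always lie within the range of the observed data.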

Value

Vector with imputed data, same type as y, and of length sum(wy)

Note

An alternative implementation was independently developed by Shah et al. (2014). It was available as the functions CALIBERrfimpute::mice.impute.rfcat and CALIBERrfimpute::mice.impute.rfcont (now archived). Simulations by Shah (Feb 13, 2014) suggested that the quality of the imputation for 10 and 100 trees was identical, so mice 2.22 changed the default number of trees from ntree = 100 to ntree = 10.

Author(s)

Lisa Doove, Stef van Buuren, Elise Dusseldorp, 2012

References

Doove, L.L., van Buuren, S., Dusseldorp, E. (2014), Recursive partitioning for missing data imputation in the presence of interaction effects. Computational Statistics & Data Analysis, 72, 92-104.

Shah, A.D., Bartlett, J.W., Carpenter, J., Nicholas, O., Hemingway, H. (2014), Comparison of random forest and parametric imputation models for imputing missing data using MICE: A CALIBER study. American Journal of Epidemiology, doi: 10.1093/aje/kwt312.

Van Buuren, S. (2018). Flexible Imputation of Missing Data. Second Edition. Chapman & Hall/CRC. Boca Raton, FL.

See Also

Examples

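library("mice")     # provides mice() and the nhanes2 data
library("lattice")  # plotting backend used by plot(imp)

# Impute nhanes2 with random forests, growing 3 trees per forest
imp <- mice(nhanes2, meth = "rf", ntree = 3)
plot(imp)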

mice

Multivariate Imputation by Chained Equations

v3.13.0
GPL-2 | GPL-3
Authors
Stef van Buuren [aut, cre], Karin Groothuis-Oudshoorn [aut], Gerko Vink [ctb], Rianne Schouten [ctb], Alexander Robitzsch [ctb], Patrick Rockenschaub [ctb], Lisa Doove [ctb], Shahab Jolani [ctb], Margarita Moreno-Betancur [ctb], Ian White [ctb], Philipp Gaffert [ctb], Florian Meinfelder [ctb], Bernie Gray [ctb], Vincent Arel-Bundock [ctb]
Initial release
2021-01-26
