Imputation using Partial Least Squares for Dimension Reduction
This function imputes a variable with missing values using PLS regression (Mevik & Wehrens, 2007) for a dimension reduction of the predictor space.
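As a rough illustration of what reducing the predictor space to a few PLS factors means, here is a minimal base R sketch of single-response PLS (PLS1) with NIPALS-style deflation. It is not the code used inside mice.impute.pls. The helper name pls1_scores and the simulated data are hypothetical and serve only to show the idea.

```r
# Illustrative sketch only: PLS1 scores via NIPALS-style deflation.
# 'pls1_scores' is a hypothetical helper, not part of miceadds or pls.
pls1_scores <- function(X, y, ncomp) {
  X <- scale(X, center = TRUE, scale = FALSE)  # center predictors
  y <- y - mean(y)                             # center response
  scores <- matrix(0, nrow(X), ncomp)
  for (a in seq_len(ncomp)) {
    w <- crossprod(X, y)                       # weights proportional to cov(X, y)
    w <- w / sqrt(sum(w^2))                    # normalize to unit length
    t_a <- X %*% w                             # score vector (one PLS factor)
    p_a <- crossprod(X, t_a) / sum(t_a^2)      # loadings of X on the factor
    q_a <- sum(y * t_a) / sum(t_a^2)           # loading of y on the factor
    X <- X - t_a %*% t(p_a)                    # deflate X
    y <- y - q_a * t_a                         # deflate y
    scores[, a] <- t_a
  }
  scores
}

# simulated data (assumption): 50 predictors, 5 of them related to y
set.seed(1)
n <- 200
p <- 50
X <- matrix(rnorm(n * p), n, p)
y <- as.numeric(X[, 1:5] %*% rep(1, 5) + rnorm(n))
ry <- runif(n) > 0.3                           # TRUE = observed, FALSE = missing

# extract 3 PLS factors from the observed cases and regress y on them
Tobs <- pls1_scores(X[ry, ], y[ry], ncomp = 3)
fit <- lm(y[ry] ~ Tobs)
```

The regression on three factors replaces a regression on all 50 predictors. An imputation model would then be fitted on factors of this kind rather than on the raw predictors.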
mice.impute.pls(y, ry, x, type, pls.facs=NULL, pls.impMethod="pmm", donors=5,
    pls.impMethodArgs=NULL, pls.print.progress=TRUE,
    imputationWeights=rep(1, length(y)), pcamaxcols=1E+09, min.int.cor=0,
    min.all.cor=0, N.largest=0, pls.title=NULL, print.dims=TRUE,
    pls.maxcols=5000, use_boot=FALSE, envir_pos=NULL, extract_data=TRUE,
    remove_lindep=TRUE, ...)

mice.impute.2l.pls2(y, ry, x, type, pls.facs=NULL, pls.impMethod="pmm",
    pls.print.progress=TRUE, imputationWeights=rep(1, length(y)),
    pcamaxcols=1E+09, tricube.pmm.scale=NULL, min.int.cor=0, min.all.cor=0,
    N.largest=0, pls.title=NULL, print.dims=TRUE, pls.maxcols=5000,
    envir_pos=parent.frame(), ...)
y |
Incomplete data vector of length |
ry |
Vector of missing data pattern ( |
x |
Matrix ( |
type |
|
pls.facs |
Number of factors used in PLS regression. This argument can also be specified as a list defining different numbers of factors for all variables to be imputed. |
pls.impMethod |
Imputation method used in PLS estimation.
Any imputation method can be used except if |
donors |
Number of donors if predictive mean matching is used
( |
pls.impMethodArgs |
Arguments for imputation method
|
pls.print.progress |
Print progress during PLS regression. |
imputationWeights |
Vector of sample weights to be used in imputation models. |
pcamaxcols |
Amount of variance explained by principal components (must be a number between 0 and 1) |
min.int.cor |
Minimum absolute correlation for an interaction of two predictors to be included in the PLS regression model |
min.all.cor |
Minimum absolute correlation for inclusion in the PLS regression model. |
N.largest |
Number of variables with the largest absolute correlations to be included. |
pls.title |
Title for progress print in console output. |
print.dims |
An optional logical indicating whether dimensions of inputs should be printed. |
pls.maxcols |
Maximum number of interactions to be created. |
use_boot |
Logical indicating whether the Bayesian bootstrap should be used for drawing regression parameters. |
envir_pos |
Position of the environment from which the data should be extracted. |
extract_data |
Logical indicating whether input data should be extracted
from parent environment within |
remove_lindep |
Logical indicating whether linear dependencies should be detected automatically and the affected predictors removed. |
... |
Further arguments to be passed. |
tricube.pmm.scale |
Scale factor for tricube PMM imputation. |
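The arguments min.all.cor and N.largest describe a correlation-based screening of predictors. The following base R sketch shows one way such a screening could look. It is not the miceadds implementation: the helper screen_predictors is hypothetical, and the rule that both criteria must hold at once is an assumption, since the combination rule is not stated above.

```r
# Illustrative sketch only: keep predictors by absolute correlation with y.
# 'screen_predictors' is a hypothetical helper, not a miceadds function.
screen_predictors <- function(x, y, ry, min_cor = 0, n_largest = 0) {
  # correlations are computed on the observed cases only
  r <- abs(as.numeric(stats::cor(x[ry, , drop = FALSE], y[ry])))
  names(r) <- colnames(x)
  keep <- r >= min_cor                      # analogue of min.all.cor
  if (n_largest > 0) {
    # analogue of N.largest: restrict to the strongest predictors
    top <- names(sort(r, decreasing = TRUE))[seq_len(min(n_largest, length(r)))]
    keep <- keep & (names(r) %in% top)      # assumption: both criteria apply
  }
  colnames(x)[keep]
}

# simulated data (assumption): only V1 and V2 are related to y
set.seed(2)
x <- matrix(rnorm(300 * 6), 300, 6, dimnames = list(NULL, paste0("V", 1:6)))
y <- 2 * x[, "V1"] - x[, "V2"] + rnorm(300)
ry <- rep(c(TRUE, TRUE, FALSE), 100)        # TRUE = observed

sel <- screen_predictors(x, y, ry, min_cor = 0.1, n_largest = 2)
print(sel)
```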
A vector of length nmis=sum(!ry) with imputations if pls.impMethod != "xplsfacs". In case of pls.impMethod == "xplsfacs", a matrix with PLS factors is computed.
The mice.impute.2l.pls2 function is included only for backward compatibility with earlier miceadds versions.
Mevik, B. H., & Wehrens, R. (2007). The pls package: Principal component and partial least squares regression in R. Journal of Statistical Software, 18, 1-24. doi: 10.18637/jss.v018.i02
## Not run:
#############################################################################
# EXAMPLE 1: PLS imputation method for internet data
#############################################################################

data(data.internet)
dat <- data.internet

# specify predictor matrix
predictorMatrix <- matrix( 1, ncol(dat), ncol(dat) )
rownames(predictorMatrix) <- colnames(predictorMatrix) <- colnames(dat)
diag( predictorMatrix) <- 0

# use PLS imputation method for all variables
impMethod <- rep( "pls", ncol(dat) )
names(impMethod) <- colnames(dat)

# define predictors for interactions (entries with type 4 in predictorMatrix)
predictorMatrix[c("IN1","IN15","IN16"),c("IN1","IN3","IN10","IN13")] <- 4
# define predictors which should appear as linear and quadratic terms (type 5)
predictorMatrix[c("IN1","IN8","IN9","IN10","IN11"),c("IN1","IN2","IN7","IN5")] <- 5

# use 9 PLS factors for all variables
pls.facs <- as.list( rep( 9, length(impMethod) ) )
names(pls.facs) <- names(impMethod)
pls.facs$IN1 <- 15   # use 15 PLS factors for variable IN1

# choose norm or pmm imputation method
pls.impMethod <- as.list( rep("norm", length(impMethod) ) )
names(pls.impMethod) <- names(impMethod)
pls.impMethod[ c("IN1","IN6")] <- "pmm"

# some arguments for imputation method
pls.impMethodArgs <- list( "IN1"=list( "donors"=10 ),
                           "IN2"=list( "ridge2"=1E-4 ) )

# Model 1: Three parallel chains
imp1 <- mice::mice(data=dat, method=impMethod, m=3, maxit=5,
            predictorMatrix=predictorMatrix,
            pls.facs=pls.facs,                     # number of PLS factors
            pls.impMethod=pls.impMethod,           # imputation method in PLS imputation
            pls.impMethodArgs=pls.impMethodArgs,   # arguments for imputation method
            pls.print.progress=TRUE, ls.meth="ridge" )
summary(imp1)

# Model 2: One long chain
imp2 <- miceadds::mice.1chain(data=dat, method=impMethod, burnin=10, iter=21, Nimp=3,
            predictorMatrix=predictorMatrix,
            pls.facs=pls.facs, pls.impMethod=pls.impMethod,
            pls.impMethodArgs=pls.impMethodArgs, ls.meth="ridge" )
summary(imp2)

#*** example for using imputation function at the level of a variable

# extract first imputed dataset
data_imp1 <- mice::complete(imp1, action=1)
# restore the missing values in IN1 so that it can be imputed again
data_imp1[ is.na(dat$IN1), "IN1" ] <- NA

# define variables
y <- data_imp1$IN1
x <- data_imp1[, -1 ]
ry <- ! is.na(y)
cn <- colnames(dat)
p <- ncol(dat)
type <- rep(1,p)
names(type) <- cn
type["IN1"] <- 0

# imputation of variable 'IN1'
imp0 <- miceadds::mice.impute.pls(y=y, x=x, ry=ry, type=type, pls.facs=10,
            pls.impMethod="norm", ls.meth="ridge", extract_data=FALSE )

## End(Not run)
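The comments in the example mark type 4 entries as predictors for interactions and type 5 entries as predictors that enter as linear and quadratic terms. The base R sketch below shows what such an expansion of the predictor matrix could look like. It is not the miceadds implementation: the helper expand_predictors is hypothetical, and forming interactions with every other predictor is an assumption made for illustration.

```r
# Illustrative sketch only: expand predictors according to type codes.
# 'expand_predictors' is hypothetical and not the miceadds implementation.
expand_predictors <- function(x, type) {
  out <- x
  # type 5: add a squared term next to the linear term
  for (v in names(type)[type == 5]) {
    out <- cbind(out, x[, v]^2)
    colnames(out)[ncol(out)] <- paste0(v, "^2")
  }
  # type 4: add products with every other predictor (assumption)
  for (v in names(type)[type == 4]) {
    for (u in setdiff(colnames(x), v)) {
      out <- cbind(out, x[, v] * x[, u])
      colnames(out)[ncol(out)] <- paste0(v, ":", u)
    }
  }
  out
}

# toy data (assumption): three predictors with different type codes
x <- cbind(A = c(1, 2, 3), B = c(0, 1, 0), C = c(2, 2, 5))
type <- c(A = 4, B = 5, C = 1)
xe <- expand_predictors(x, type)
print(colnames(xe))
```

With many predictors the number of product terms grows quickly, which is the situation that arguments such as min.int.cor and pls.maxcols are described as limiting.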