mice: mice.impute.2lonly.pmm – R documentation

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

mice.impute.2lonly.pmm

Imputation at level 2 by predictive mean matching

Description

Imputes univariate missing data at level 2 using predictive mean matching. Variables are level 1 are aggregated at level 2. The group identifier at level 2 must be indicated by type = -2 in the predictorMatrix.

Usage

mice.impute.2lonly.pmm(y, ry, x, type, wy = NULL, ...)

Arguments

`y`	Vector to be imputed
`ry`	Logical vector of length `length(y)` indicating the the subset `y[ry]` of elements in `y` to which the imputation model is fitted. The `ry` generally distinguishes the observed (`TRUE`) and missing values (`FALSE`) in `y`.
`x`	Numeric design matrix with `length(y)` rows with predictors for `y`. Matrix `x` may have no missing values.
`type`	Group identifier must be specified by '-2'. Predictors must be specified by '1'.
`wy`	Logical vector of length `length(y)`. A `TRUE` value indicates locations in `y` for which imputations are created.
`...`	Other named arguments.

Details

This function allows in combination with mice.impute.2l.pan switching regression imputation between level 1 and level 2 as described in Yucel (2008) or Gelman and Hill (2007, p. 541).

The function checks for partial missing level-2 data. Level-2 data are assumed to be constant within the same cluster. If one or more entries are missing, then the procedure aborts with an error message that identifies the cluster with incomplete level-2 data. In such cases, one may first fill in the cluster mean (or mode) by the 2lonly.mean method to remove inconsistencies.

Value

A vector of length nmis with imputations.

Note

The extension to categorical variables transforms a dependent factor variable by means of the as.integer() function. This may make sense for categories that are approximately ordered, but less so for pure nominal measures.

For a more general approach, see miceadds::mice.impute.2lonly.function().

Author(s)

Alexander Robitzsch (IPN - Leibniz Institute for Science and Mathematics Education, Kiel, Germany), robitzsch@ipn.uni-kiel.de

References

Gelman, A. and Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge, Cambridge University Press.

Yucel, RM (2008). Multiple imputation inference for multivariate multilevel continuous data with ignorable non-response. Philosophical Transactions of the Royal Society A, 366, 2389-2404.

Van Buuren, S. (2018). Flexible Imputation of Missing Data. Second Edition. Chapman & Hall/CRC. Boca Raton, FL.

Examples

# simulate some data
# x,y ... level 1 variables
# v,w ... level 2 variables

G <- 250 # number of groups
n <- 20 # number of persons
beta <- .3 # regression coefficient
rho <- .30 # residual intraclass correlation
rho.miss <- .10 # correlation with missing response
missrate <- .50 # missing proportion
y1 <- rep(rnorm(G, sd = sqrt(rho)), each = n) + rnorm(G * n, sd = sqrt(1 - rho))
w <- rep(round(rnorm(G), 2), each = n)
v <- rep(round(runif(G, 0, 3)), each = n)
x <- rnorm(G * n)
y <- y1 + beta * x + .2 * w + .1 * v
dfr0 <- dfr <- data.frame("group" = rep(1:G, each = n), "x" = x, "y" = y, "w" = w, "v" = v)
dfr[rho.miss * x + rnorm(G * n, sd = sqrt(1 - rho.miss)) < qnorm(missrate), "y"] <- NA
dfr[rep(rnorm(G), each = n) < qnorm(missrate), "w"] <- NA
dfr[rep(rnorm(G), each = n) < qnorm(missrate), "v"] <- NA

# empty mice imputation
imp0 <- mice(as.matrix(dfr), maxit = 0)
predM <- imp0$predictorMatrix
impM <- imp0$method

# multilevel imputation
predM1 <- predM
predM1[c("w", "y", "v"), "group"] <- -2
predM1["y", "x"] <- 1 # fixed x effects imputation
impM1 <- impM
impM1[c("y", "w", "v")] <- c("2l.pan", "2lonly.norm", "2lonly.pmm")

# turn v into a categorical variable
dfr$v <- as.factor(dfr$v)
levels(dfr$v) <- LETTERS[1:4]

# y ... imputation using pan
# w ... imputation at level 2 using norm
# v ... imputation at level 2 using pmm

# skip imputation on solaris
is.solaris <- function() grepl("SunOS", Sys.info()["sysname"])
if (!is.solaris()) {
  imp <- mice(dfr,
    m = 1, predictorMatrix = predM1,
    method = impM1, maxit = 1, paniter = 500
  )
}

mice

Multivariate Imputation by Chained Equations

v3.13.0

GPL-2 | GPL-3

Authors

Stef van Buuren [aut, cre], Karin Groothuis-Oudshoorn [aut], Gerko Vink [ctb], Rianne Schouten [ctb], Alexander Robitzsch [ctb], Patrick Rockenschaub [ctb], Lisa Doove [ctb], Shahab Jolani [ctb], Margarita Moreno-Betancur [ctb], Ian White [ctb], Philipp Gaffert [ctb], Florian Meinfelder [ctb], Bernie Gray [ctb], Vincent Arel-Bundock [ctb]

Initial release

2021-01-26