xyplot.mids

Scatterplot of observed and imputed data


Description

Plotting methods for imputed data using lattice. xyplot() produces conditional scatterplots. The function automatically separates the observed (blue) and imputed (red) data, and extends the usual features of lattice.

Usage

## S3 method for class 'mids'
xyplot(
  x,
  data,
  na.groups = NULL,
  groups = NULL,
  as.table = TRUE,
  theme = mice.theme(),
  allow.multiple = TRUE,
  outer = TRUE,
  drop.unused.levels = lattice::lattice.getOption("drop.unused.levels"),
  ...,
  subscripts = TRUE,
  subset = TRUE
)

Arguments

x

A mids object, typically created by mice() or mice.mids().

data

Formula that selects the data to be plotted. This argument follows the lattice rules for formulas, describing the primary variables (used for the per-panel display) and the optional conditioning variables (which define the subsets plotted in different panels) to be used in the plot.

The formula is evaluated on the complete data set in the long form. Legal variable names for the formula include names(x$data) plus the two administrative factors .imp and .id.

Extended formula interface: The primary variable terms (both the LHS y and RHS x) may consist of multiple terms separated by a ‘+’ sign, e.g., y1 + y2 ~ x | a * b. This formula is taken to mean that the user wants to plot both y1 ~ x | a * b and y2 ~ x | a * b, with y1 ~ x and y2 ~ x in separate panels. This behavior differs from standard lattice. Only combine terms of the same type, i.e., only factors or only numerical variables. Mixing numerical and categorical data occasionally produces odd labeling of the vertical axis.

na.groups

An expression evaluating to a logical vector indicating which two groups are distinguished (e.g. using different colors) in the display. The environment in which this expression is evaluated is the response indicator is.na(x$data).

The default na.groups = NULL contrasts the observed and missing data in the LHS y variable of the display, i.e. groups created by is.na(y). The expression y creates the groups according to is.na(y). The expression y1 & y2 creates groups by is.na(y1) & is.na(y2), y1 | y2 creates groups by is.na(y1) | is.na(y2), and so on.

groups

This is the usual groups argument in lattice. It differs from na.groups because it evaluates in the completed data data.frame(complete(x, "long", inc=TRUE)) (as usual), whereas na.groups evaluates in the response indicator. See xyplot for more details. When both na.groups and groups are specified, na.groups takes precedence, and groups is ignored.

as.table

See xyplot.

theme

A named list containing the graphical parameters. The default function mice.theme produces a short list of default colors, line widths, and so on. The full list may be obtained from trellis.par.get(). Global graphical parameters like col or cex in high-level calls are still honored, so experiment with the global parameters first. Many settings consist of a pair. For example, mice.theme defines two symbol colors: the first is for the observed data, the second for the imputed data. The theme settings only exist during the call, and do not affect the trellis graphical parameters.

allow.multiple

See xyplot.

outer

See xyplot.

drop.unused.levels

See xyplot.

...

Further arguments, usually not directly processed by the high-level functions documented here, but instead passed on to other functions.

subscripts

See xyplot.

subset

See xyplot.
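
The extended formula interface described under the data argument can be sketched as follows. This is a minimal illustration, not part of the original help page: the choices m = 2, seed = 1 and the variables hgt, wgt and age from the boys data are assumptions made only to keep the example small and reproducible.

```r
library(mice)

# Small imputation run; m, maxit and seed are illustrative choices
imp <- mice(boys, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# Two numeric LHS terms: hgt ~ age and wgt ~ age are drawn in separate
# panels, each further conditioned on the imputation number .imp
p_multi <- xyplot(imp, hgt + wgt ~ age | .imp)
print(p_multi)
```

Both LHS terms are numeric here, in line with the advice to combine only terms of the same type.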

Details

The argument na.groups may be used to specify (combinations of) missingness in any of the variables. The argument groups can be used to specify groups based on the variable values themselves. Only one of the two may be active at a time. When both are specified, na.groups takes precedence over groups.
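
As a sketch of how na.groups can combine missingness in several variables, the call below colors points by whether wgt or hc is missing. The variable pair and the imputation settings (m = 2, seed = 1) are assumptions chosen for illustration only.

```r
library(mice)

imp <- mice(boys, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# na.groups is evaluated in the response indicator is.na(imp$data),
# so `wgt | hc` means is.na(wgt) | is.na(hc)
p_na <- xyplot(imp, hgt ~ age | .imp, na.groups = wgt | hc,
               pch = c(1, 20), cex = c(1, 1.5))
print(p_na)
```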

Use subset and na.groups together to plot parts of the data. For example, select the first imputed data set with subset = .imp == 1.
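
A minimal sketch of this use of subset, assuming an imputation object built from the boys data with illustrative settings (m = 2, seed = 1):

```r
library(mice)

imp <- mice(boys, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# Plot only the first imputed data set; .imp is one of the two
# administrative factors available in the formula and in subset
p_first <- xyplot(imp, hgt ~ age, subset = .imp == 1)
print(p_first)
```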

Graphical parameters like col, pch and cex can be specified in the arguments list to alter the plotting symbols. If length(col)==2, the color specification defines the observed and missing groups: col[1] is the color of the 'observed' data, col[2] is the color of the missing or imputed data. A convenient color choice is col=mdc(1:2), a transparent blue color for the observed data and a transparent red color for the imputed data. A good overall choice is col=mdc(1:2), pch=20, cex=1.5. These choices can be set for the duration of the session by running mice.theme().
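
The suggested symbol settings can be tried out as below. This is an illustrative sketch; the imputation settings (m = 2, seed = 1) are assumptions and not part of the original page.

```r
library(mice)

imp <- mice(boys, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# mdc(1:2) returns the mice colors for observed (first) and
# missing/imputed (second) data
cols <- mdc(1:2)

p_col <- xyplot(imp, hgt ~ age | .imp, col = cols, pch = 20, cex = 1.5)
print(p_col)
```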

Value

The high-level functions documented here, as well as other high-level Lattice functions, return an object of class "trellis". The update method can be used to subsequently update components of the object, and the print method (usually called by default) will plot it on an appropriate plotting device.
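
Because the return value is a "trellis" object, it can be stored, modified and printed later. The sketch below assumes illustrative imputation settings and made-up title and axis label text.

```r
library(mice)

imp <- mice(boys, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# Store the plot instead of printing it immediately
p <- xyplot(imp, hgt ~ age | .imp)

# update() returns a modified trellis object; the labels are examples
p2 <- update(p, main = "Height by age", xlab = "Age (years)")

# Printing draws the plot on the current device
print(p2)
```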

Note

The first two arguments (x and data) are reversed compared to the standard Trellis syntax implemented in lattice. This reversal was necessary in order to benefit from automatic method dispatch.

In mice the argument x is always a mids object, whereas in lattice the argument x is always a formula.

In mice the argument data is always a formula object, whereas in lattice the argument data is usually a data frame.

All other arguments have identical interpretation.
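
The reversed argument order can be seen by placing the two calls side by side. This is a sketch using the boys data; the imputation settings are illustrative assumptions.

```r
library(mice)

imp <- mice(boys, m = 2, maxit = 1, seed = 1, printFlag = FALSE)

# Standard lattice: formula first, data frame second
p_lattice <- lattice::xyplot(hgt ~ age, data = boys)

# mice method: mids object first, formula second (matched to `data`)
p_mids <- xyplot(imp, hgt ~ age)

print(p_lattice)
print(p_mids)
```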

Author(s)

Stef van Buuren

References

Sarkar, Deepayan (2008) Lattice: Multivariate Data Visualization with R, Springer.

van Buuren S and Groothuis-Oudshoorn K (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. https://www.jstatsoft.org/v45/i03/

See Also

mice, stripplot, densityplot, bwplot, lattice for an overview of the package, as well as xyplot, panel.xyplot, print.trellis, trellis.par.set

Examples

library(mice)
imp <- mice(boys, maxit = 1)

# xyplot: scatterplot by imputation number
# observe the erroneous outlying imputed values
# (caused by imputing hgt from bmi)
xyplot(imp, hgt ~ age | .imp, pch = c(1, 20), cex = c(1, 1.5))

# same, but label with missingness of wgt (four cases)
xyplot(imp, hgt ~ age | .imp, na.groups = wgt, pch = c(1, 20), cex = c(1, 1.5))

mice

Multivariate Imputation by Chained Equations

v3.13.0
GPL-2 | GPL-3
Authors
Stef van Buuren [aut, cre], Karin Groothuis-Oudshoorn [aut], Gerko Vink [ctb], Rianne Schouten [ctb], Alexander Robitzsch [ctb], Patrick Rockenschaub [ctb], Lisa Doove [ctb], Shahab Jolani [ctb], Margarita Moreno-Betancur [ctb], Ian White [ctb], Philipp Gaffert [ctb], Florian Meinfelder [ctb], Bernie Gray [ctb], Vincent Arel-Bundock [ctb]
Initial release
2021-01-26