Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

covPlot

Robust Distance Plots


Description

Shows the Mahalanobis distances based on robust and classical estimates of the location and the covariance matrix in different plots. The following plots are available:

  • index plot of the robust and mahalanobis distances

  • distance-distance plot

  • Chisquare QQ-plot of the robust and mahalanobis distances

  • plot of the tolerance ellipses (robust and classic)

  • Scree plot - Eigenvalues comparison plot

Usage

## S3 method for class 'mcd'
plot(x,
     which = c("all", "dd", "distance", "qqchi2",
               "tolEllipsePlot", "screeplot"),
     classic = FALSE, ask = (which[1] == "all" && dev.interactive()),
     cutoff, id.n, labels.id = rownames(x$X), cex.id = 0.75,
     label.pos = c(4,2), tol = 1e-7, ...)

covPlot(x,
     which = c("all", "dd", "distance", "qqchi2",
               "tolEllipsePlot", "screeplot"),
     classic = FALSE, ask = (which[1] == "all" && dev.interactive()),
     m.cov = covMcd(x),
     cutoff = NULL, id.n, labels.id = rownames(x), cex.id = 0.75,
     label.pos = c(4,2), tol = 1e-07, ...)

Arguments

x

For the plot() method, a mcd object, typically result of covMcd.
For covPlot(), the numeric data matrix such as the X component as returned from covMcd.

which

string indicating which plot to show. See the Details section for a description of the options. Defaults to "all".

.

classic

whether to plot the classical distances too. Defaults to FALSE.

.

ask

logical indicating if the user should be asked before each plot, see par(ask=.). Defaults to which == "all" && dev.interactive().

cutoff

the cutoff value for the distances.

id.n

number of observations to be identified by a label. If not supplied, the number of observations with distance larger than cutoff is used.

labels.id

vector of labels, from which the labels for extreme points will be chosen. NULL uses observation numbers.

cex.id

magnification of point labels.

label.pos

positioning of labels, for the left half and right half of the graph respectively (used as text(.., pos=*)).

tol

tolerance to be used for computing the inverse, see solve. Defaults to tol = 1e-7.

m.cov

an object similar to those of class "mcd"; however only its components center and cov will be used. If missing, the MCD will be computed (via covMcd()).

...

other parameters to be passed through to plotting functions.

Details

These functions produce several plots based on the robust and classical location and covariance matrix. Which of them to select is specified by the attribute which. The plot method for "mcd" objects is calling covPlot() directly, whereas covPlot() should also be useful for plotting other (robust) covariance estimates. The possible options are:

distance

index plot of the robust distances

dd

distance-distance plot

qqchi2

a qq-plot of the robust distances versus the quantiles of the chi-squared distribution

tolEllipsePlot

a tolerance ellipse plot, via tolEllipsePlot()

screeplot

an eigenvalues comparison plot - screeplot

The Distance-Distance Plot, introduced by Rousseeuw and van Zomeren (1990), displays the robust distances versus the classical Mahalanobis distances. The dashed line is the set of points where the robust distance is equal to the classical distance. The horizontal and vertical lines are drawn at values equal to the cutoff which defaults to square root of the 97.5% quantile of a chi-squared distribution with p degrees of freedom. Points beyond these lines can be considered outliers.

References

P. J. Rousseeuw and van Zomeren, B. C. (1990). Unmasking Multivariate Outliers and Leverage Points. Journal of the American Statistical Association 85, 633–639.

P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.

See Also

Examples

data(Animals, package ="MASS")
brain <- Animals[c(1:24, 26:25, 27:28),]
mcd <- covMcd(log(brain))

plot(mcd, which = "distance", classic = TRUE)# 2 plots
plot(mcd, which = "dd")
plot(mcd, which = "tolEllipsePlot", classic = TRUE)
op <- par(mfrow = c(2,3))
plot(mcd) ## -> which = "all" (5 plots)
par(op)

## same plots for another robust Cov estimate:
data(hbk)
hbk.x <- data.matrix(hbk[, 1:3])
cOGK <- covOGK(hbk.x, n.iter = 2, sigmamu = scaleTau2,
               weight.fn = hard.rejection)
covPlot(hbk.x, m.cov = cOGK, classic = TRUE)

robustbase

Basic Robust Statistics

v0.93-7
GPL (>= 2)
Authors
Martin Maechler [aut, cre] (<https://orcid.org/0000-0002-8685-9910>), Peter Rousseeuw [ctb] (Qn and Sn), Christophe Croux [ctb] (Qn and Sn), Valentin Todorov [aut] (most robust Cov), Andreas Ruckstuhl [aut] (nlrob, anova, glmrob), Matias Salibian-Barrera [aut] (lmrob orig.), Tobias Verbeke [ctb, fnd] (mc, adjbox), Manuel Koller [aut] (mc, lmrob, psi-func.), Eduardo L. T. Conceicao [aut] (MM-, tau-, CM-, and MTL- nlrob), Maria Anna di Palma [ctb] (initial version of Comedian)
Initial release
2021-01-04

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.