outlier

Find and graph Mahalanobis squared distances to detect outliers


Description

The squared Mahalanobis distance is D^2 = (x-μ)' Σ^-1 (x-μ), where μ is the vector of variable means and Σ is the covariance matrix of x. D2 can be used to detect multivariate outliers: large D2 values, compared to the expected chi-square values, indicate an unusual response pattern. The mahalanobis function in stats does not handle missing data.
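As a rough illustration of the formula above, the following sketch computes D2 row by row from simulated complete data and checks it against mahalanobis from stats. The simulated matrix and the variable names (x, mu, S, xc) are assumptions made for this example; this is not the code used inside outlier.

```r
# Sketch only: squared Mahalanobis distances from the definition,
# on simulated data with no missing values (an assumption for illustration).
set.seed(42)
x  <- matrix(rnorm(200 * 3), ncol = 3)  # 200 cases, 3 variables
mu <- colMeans(x)                       # sample estimate of the mean vector
S  <- cov(x)                            # sample estimate of the covariance matrix
xc <- sweep(x, 2, mu)                   # subtract the column means from each row

# (x - mu)' S^-1 (x - mu), evaluated for every row at once
d2_manual <- rowSums((xc %*% solve(S)) * xc)

# the same quantity from the stats function
d2_stats <- mahalanobis(x, center = mu, cov = S)

all.equal(d2_manual, d2_stats)
```

Because the example data are complete, the two results agree; with missing values in x, the stats function would need the incomplete cases handled first.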

Usage

outlier(x, plot = TRUE, bad = 5, na.rm = TRUE, xlab, ylab, ...)

Arguments

x

A data matrix or data.frame

plot

Plot the resulting QQ graph

bad

Number of worst cases (those with the largest D2 values) to label on the plot

na.rm

Should missing data be deleted?

xlab

Label for x axis

ylab

Label for y axis

...

More graphic parameters, e.g., cex=.8

Details

Adapted from the mahalanobis function and help page from stats.

Value

The D2 values for each case
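One possible way to use the returned D2 values is to compare them with chi-square quantiles whose degrees of freedom equal the number of variables. The sketch below uses simulated data and mahalanobis from stats so that it runs without psych; the planted extreme case, the 0.999 cutoff, and the names cutoff, flagged, and qq are assumptions for illustration, not part of outlier.

```r
# Sketch only: flag cases whose D2 exceeds a chi-square cutoff.
set.seed(1)
x <- matrix(rnorm(300 * 4), ncol = 4)   # 300 cases, 4 variables (simulated)
x[1, ] <- c(6, -6, 6, -6)               # plant one deliberately extreme case

p  <- ncol(x)
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))

# hypothetical cutoff: the 99.9th percentile of chi-square with p df
cutoff  <- qchisq(0.999, df = p)
flagged <- which(d2 > cutoff)           # row numbers of suspect cases

# pair the sorted D2 values with their expected chi-square quantiles;
# these are the coordinates a QQ plot of D2 would display
expected <- qchisq(ppoints(length(d2)), df = p)
qq <- data.frame(expected = expected, observed = sort(d2))
tail(qq)                                # the largest values sit in the upper tail
```

The cutoff is a judgment call; a stricter or looser percentile changes which cases are flagged.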

Author(s)

William Revelle

References

Yuan, Ke-Hai and Zhong, Xiaoling, (2008) Outliers, Leverage Observations, and Influential Cases in Factor Analysis: Using Robust Procedures to Minimize Their Effect, Sociological Methodology, 38, 329-368.

See Also

Examples

library(psych)  # outlier() and pairs.panels() come from psych
# first, just find and graph the outliers
d2 <- outlier(sat.act)
# combine with the data frame and plot it with the outliers highlighted in blue
sat.d2 <- data.frame(sat.act, d2)
pairs.panels(sat.d2, bg = c("yellow", "blue")[(d2 > 25) + 1], pch = 21)

psych

Procedures for Psychological, Psychometric, and Personality Research

v2.1.3
GPL (>= 2)
Authors
William Revelle [aut, cre] (<https://orcid.org/0000-0003-4880-9610>)
Initial release
2021-03-21