
locoutNeighbor

Diagnostic plot for identifying local outliers with varying size of neighborhood


Description

Computes global and pairwise Mahalanobis distances for visualizing global and local multivariate outliers. The size of the neighborhood (number of neighbors) varies, while the fraction of neighbors is held fixed.

Usage

locoutNeighbor(dat, X, Y, propneighb = 0.1, variant = c("dist", "knn"), usemax = 1/3, 
   npoints = 50, chisqqu = 0.975, indices = NULL, xlab = NULL, ylab = NULL, 
   colall = gray(0.7), colsel = 1, ...)

Arguments

dat

multivariate data set (without coordinates)

X

X coordinates of the data points

Y

Y coordinates of the data points

propneighb

proportion of neighbors to be included in tolerance ellipse

variant

search for neighbors either according to the Euclidean distance ("dist") or according to k-nearest neighbors ("knn")

usemax

for either variant: the fraction of points (max Dist) used for the plot

npoints

computations are carried out for at most npoints points

chisqqu

quantile of the chi-square distribution used for splitting the plot

indices

if this is not NULL, these should be indices of observations to be highlighted

xlab

x-axis label for plot

ylab

y-axis label for plot

colall

color for lines if indices is NULL

colsel

color for lines if indices are selected

...

additional parameters for plotting

Details

For this diagnostic tool, the number of neighbors is varied up to a fraction usemax of the observations. The proportion propneighb (called beta) is held fixed, and for each observation the degree of isolation from a fraction 1-beta of its neighbors is computed. Neighborhoods can be defined either via the Euclidean distance or via k-nearest neighbors (kNN). For computational reasons, all computations are carried out for at most npoints points. The critical value for outliers is the chisqqu quantile of the chi-square distribution. Indices of observations can also be supplied via the indices argument; the corresponding lines in the plots are then highlighted.
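The two neighborhood definitions and the pairwise distances mentioned above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the names xc, yc, vals, k and dmax are hypothetical, and the classical covariance of all observations is used where the package may rely on a different (for example robust) estimate.

```r
# Sketch only: two ways to define the neighbors of observation i from its
# coordinates, then pairwise squared Mahalanobis distances to those neighbors.
set.seed(42)
n <- 100
xc <- runif(n)                            # hypothetical X coordinates
yc <- runif(n)                            # hypothetical Y coordinates
vals <- matrix(rnorm(n * 2), ncol = 2)    # hypothetical bivariate data

D <- as.matrix(dist(cbind(xc, yc)))       # Euclidean distances between locations
i <- 1

# kNN-style neighborhood: the k closest locations, excluding i itself
k <- 10
nb_knn <- order(D[i, ])[2:(k + 1)]

# Distance-style neighborhood: all locations within radius dmax of i
dmax <- 0.2
nb_dist <- setdiff(which(D[i, ] <= dmax), i)

# Pairwise squared Mahalanobis distances between observation i and its kNN
# neighbors, using the classical covariance of all data (an assumption).
S <- cov(vals)
pair_md2 <- mahalanobis(vals[nb_knn, , drop = FALSE], center = vals[i, ], cov = S)

# Neighbors ordered from most to least similar in the data space
sort(pair_md2)
```

With a kNN neighborhood every observation has the same number of neighbors, whereas a distance-based neighborhood can be small or even empty in sparsely sampled regions.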

Value

indices.reg

indices of the (selected) observations that are regular observations

indices.out

indices of the (selected) observations that are global outliers
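As a rough illustration of how a split into regular observations and global outliers can be obtained from a chi-square quantile, the following base R sketch compares squared Mahalanobis distances with qchisq(chisqqu, p). The toy data and the names toy, indices_reg and indices_out are hypothetical, and classical mean and covariance estimates are assumed here for simplicity; the package may use different estimators.

```r
# Sketch only: split observations into regular ones and global outliers by
# comparing squared Mahalanobis distances with a chi-square quantile.
set.seed(1)
p <- 3
toy <- matrix(rnorm(200 * p), ncol = p)   # hypothetical data, 200 observations
toy[1, ] <- c(8, 8, 8)                    # plant one obvious outlier

chisqqu <- 0.975
# mahalanobis() returns squared distances, so they are compared with the
# chi-square quantile directly (p degrees of freedom).
md2 <- mahalanobis(toy, center = colMeans(toy), cov = cov(toy))
cutoff <- qchisq(chisqqu, df = p)

indices_out <- which(md2 > cutoff)        # candidates for global outliers
indices_reg <- which(md2 <= cutoff)       # regular observations
```

Because the cutoff is the 0.975 quantile, a small share of clean observations is expected to exceed it by chance, so flagged points are best treated as candidates for inspection.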

Author(s)

References

P. Filzmoser, A. Ruiz-Gazen, and C. Thomas-Agnan: Identification of local multivariate outliers. Submitted for publication, 2012.

See Also

Examples

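# load the package that provides locoutNeighbor() and the example data
library(mvoutlier)

# use data from illustrative example in paper:
data(X)    # X coordinates
data(Y)    # Y coordinates
data(dat)  # multivariate data set (without coordinates)

# kNN neighborhoods, highlighting observations 1, 11, 24 and 36
res <- locoutNeighbor(dat, X, Y, variant = "knn", usemax = 1, chisqqu = 0.975,
                      indices = c(1, 11, 24, 36), propneighb = 0.1, npoints = 100)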

mvoutlier

Multivariate Outlier Detection Based on Robust Methods

v2.0.9
GPL (>= 3)
Authors
Peter Filzmoser <P.Filzmoser@tuwien.ac.at> and Moritz Gschwandtner <e0125439@student.tuwien.ac.at>
Initial release
2018-02-08
