outCoDa

Outlier detection for compositional data


Description

Outlier detection for compositional data using standard and robust statistical methods.

Usage

outCoDa(x, quantile = 0.975, method = "robust", alpha = 0.5, coda = TRUE)

## S3 method for class 'outCoDa'
print(x, ...)

## S3 method for class 'outCoDa'
plot(x, y, ..., which = 1)

Arguments

x

compositional data

quantile

probability level of the chi-squared quantile used as the cut-off value for outlier identification (it corresponds to a significance level of 1 - quantile): observations with a larger (squared) robust Mahalanobis distance are considered potential outliers.

method

either “robust” (default) or “standard”

alpha

the size of the subsets for the robust covariance estimation according to the MCD estimator, for which the determinant is minimized; see covMcd.

coda

if TRUE, the data are transformed to a coordinate representation before outlier detection. The Examples also pass a function (coda = log) in place of TRUE.

...

additional parameters for print and plot method passed through

y

unused second plot argument for the plot method

which

which plot to draw: 1 = Mahalanobis distances against the index, 2 = distance-distance plot

Details

The outlier detection procedure is based on (robust) Mahalanobis distances in isometric logratio coordinates. Observations with a squared Mahalanobis distance greater than or equal to a certain quantile of the chi-squared distribution are marked as outliers.
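As a rough illustration of the procedure described above, the following base-R sketch computes isometric logratio coordinates, squared Mahalanobis distances and the chi-squared cut-off by hand. The helper ilr_coords, the simulated data, the Helmert-based basis and the choice of D - 1 degrees of freedom are assumptions made for this sketch and are not taken from the package source, which may use a different (but equivalent, as far as Mahalanobis distances are concerned) orthonormal basis. Classical mean and covariance estimates are used here, so the sketch mirrors the non-robust variant only.

```r
# Hypothetical helper (not part of robCompositions): isometric logratio
# coordinates of a matrix of strictly positive parts.
ilr_coords <- function(x) {
  x <- as.matrix(x)
  logx <- log(x)
  # Centred logratio: subtract each row's mean log from that row
  clr <- logx - rowMeans(logx)
  # Orthonormal basis of the clr hyperplane built from Helmert contrasts
  V <- contr.helmert(ncol(x))
  V <- sweep(V, 2, sqrt(colSums(V^2)), "/")
  clr %*% V
}

# Simulated 3-part compositions (illustrative data, not from the package)
set.seed(1)
n <- 50
x <- matrix(exp(rnorm(n * 3)), ncol = 3)
x[1, ] <- c(200, 1, 0.005)  # one observation with extreme logratios

z <- ilr_coords(x)                                   # n x (D - 1) coordinates
d2 <- mahalanobis(z, center = colMeans(z), cov = cov(z))  # squared distances
cutoff <- qchisq(0.975, df = ncol(z))                # assumed df: D - 1
outlier <- d2 >= cutoff                              # "greater than or equal"
which(outlier)
```

Because logratios ignore the overall scale of each row, multiplying x by a constant leaves z, and therefore the flagged observations, unchanged.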

If method “robust” is chosen, the outlier detection is based on the homogeneous majority of the compositional data set. If method “standard” is used, standard measures of location and scatter are applied during the outlier detection procedure.
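The difference between the two methods can be sketched outside the package by comparing MCD-based and classical estimates on the same coordinates. This is an illustration under stated assumptions: the simulated matrix z stands in for data that are already in coordinate representation, the shift applied to the first ten rows is arbitrary, and the code calls robustbase::covMcd directly rather than reproducing what outCoDa does internally.

```r
library(robustbase)  # provides covMcd()

set.seed(42)
z <- matrix(rnorm(200), ncol = 2)   # 100 simulated observations in 2 coordinates
z[1:10, ] <- z[1:10, ] + 6          # shift 10% of the rows away from the bulk

# Robust location and scatter from the MCD estimator
mcd <- covMcd(z, alpha = 0.5)
d2_robust <- mahalanobis(z, center = mcd$center, cov = mcd$cov)

# Standard (classical) location and scatter
d2_standard <- mahalanobis(z, center = colMeans(z), cov = cov(z))

cutoff <- qchisq(0.975, df = ncol(z))

# Number of observations flagged by each variant
c(robust = sum(d2_robust >= cutoff), standard = sum(d2_standard >= cutoff))
```

Classical estimates are themselves pulled towards the shifted rows, which is one reason the robust variant is usually preferred for outlier detection.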

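plot method: the Mahalanobis distances are plotted against the index. The dashed line indicates the cut-off, i.e. the quantile of the chi-squared distribution selected by the quantile argument (one minus the significance level; this is not the alpha argument, which controls the MCD subset size). Observations with a Mahalanobis distance greater than this cut-off could be considered compositional outliers.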
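A minimal sketch of requesting the two displays listed under the which argument, assuming the robCompositions package is installed and using the expenditures data from the Examples:

```r
library(robCompositions)

data(expenditures)
oD <- outCoDa(expenditures)

plot(oD, which = 1)  # Mahalanobis distances against the observation index
plot(oD, which = 2)  # distance-distance plot
```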

Value

mahalDist

resulting Mahalanobis distance

limit

quantile of the Chi-squared distribution

outlierIndex

logical vector indicating outliers and non-outliers

method

method used
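The components listed above can be inspected directly on the returned object. The following is a small usage sketch, assuming the robCompositions package is installed; it relies only on the component names documented in this section.

```r
library(robCompositions)

data(expenditures)
oD <- outCoDa(expenditures, quantile = 0.975, method = "robust")

oD$limit                 # cut-off derived from the chi-squared distribution
head(oD$mahalDist)       # first few Mahalanobis distances
which(oD$outlierIndex)   # row numbers of the flagged observations
table(oD$outlierIndex)   # counts of outliers and non-outliers
oD$method                # method that was used
```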

Note

It is highly recommended to use the robust version of the procedure.

Author(s)

Matthias Templ, Karel Hron

References

Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G., Barcelo-Vidal, C. (2003) Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35 (3), 279-300.

Filzmoser, P., Hron, K. (2008) Outlier detection for compositional data using robust methods. Mathematical Geosciences, 40, 233-248.

Rousseeuw, P.J., Van Driessen, K. (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41, 212-223.

See Also

Examples

library(robCompositions)
data(expenditures)
## robust outlier detection with the default settings
oD <- outCoDa(expenditures)
## print the result
oD
## providing a function:
oD <- outCoDa(expenditures, coda = log)

robCompositions

Compositional Data Analysis

v2.3.0
GPL (>= 2)
Authors
Matthias Templ [aut, cre] (<https://orcid.org/0000-0002-8638-5276>), Karel Hron [aut] (<https://orcid.org/0000-0002-1847-6598>), Peter Filzmoser [aut] (<https://orcid.org/0000-0002-8014-4682>), Kamila Facevicova [ctb], Petra Kynclova [ctb], Jan Walach [ctb], Veronika Pintar [ctb], Jiajia Chen [ctb], Dominika Miksova [ctb], Bernhard Meindl [ctb], Alessandra Menafoglio [ctb] (<https://orcid.org/0000-0003-0682-6412>), Alessia Di Blasi [ctb], Federico Pavone [ctb], Gianluca Zeni [ctb]
Initial release
2020-11-18
