BACON: Blocked Adaptive Computationally-Efficient Outlier Nominators
This function applies an outlier identification algorithm to the data in the x array [n x p], following the BACON outlier procedure described by Billor, Hadi, and Velleman (2000).
mvBACON(x, collect = 4, m = min(collect * p, n * 0.5), alpha = 0.95, init.sel = c("Mahalanobis", "dUniMedian", "random", "manual"), man.sel, maxsteps = 100, allowSingular = FALSE, verbose = TRUE)
x |
numeric matrix (of dimension [n x p]), not supposed to contain missing values. |
collect |
a multiplication factor c, when |
m |
integer in |
alpha |
significance level for the chisq cutoff, used to define the next iteration's basic subset. |
init.sel |
character string, specifying the initial selection mode; implemented modes are "Mahalanobis", "dUniMedian", "random", and "manual". |
man.sel |
only when |
maxsteps |
maximal number of iteration steps. |
allowSingular |
logical indicating whether a solution should still be sought when no matrix of rank p is found. |
verbose |
logical indicating if messages are printed which trace progress of the algorithm. |
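The default for m shown in the usage line can be checked by hand. The following is a minimal sketch in base R; `default_m` is a hypothetical helper introduced only for this illustration and is not part of the package. The sizes used are those of the two example data sets (starsCYG with n = 47 and 2 columns, coleman.x with 20 rows and 6 columns).

```r
# Hypothetical helper (not a package function): reproduces the default
#   m = min(collect * p, n * 0.5)
# from the usage line.
default_m <- function(n, p, collect = 4) min(collect * p, n * 0.5)

default_m(n = 47, p = 2)   # starsCYG:  min(8, 23.5) = 8
default_m(n = 20, p = 6)   # coleman.x: min(24, 10)  = 10
```

For small n relative to p, as with coleman.x, the `n * 0.5` term is the binding one, so the initial basic subset holds half of the observations.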
a list with components
subset |
logical vector of length |
dis |
numeric vector of length |
cov |
p x p matrix, the corresponding robust estimate of covariance. |
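To make the roles of alpha, maxsteps, and the returned components more concrete, here is a deliberately simplified BACON-style iteration in base R. It is a sketch under stated assumptions, not the code of mvBACON: `bacon_sketch` is a hypothetical name, the cutoff is the plain chi-square quantile `qchisq(alpha, p)` applied to squared Mahalanobis distances (an assumption; the published procedure uses a corrected cutoff), only a Mahalanobis-type start is shown, and singular covariance matrices are not handled. The `dis` component here holds the sketch's own Mahalanobis distances.

```r
# Simplified BACON-style iteration (illustrative only; NOT robustX::mvBACON).
bacon_sketch <- function(x, collect = 4, alpha = 0.95, maxsteps = 100) {
  x <- as.matrix(x)
  n <- nrow(x); p <- ncol(x)
  m <- min(collect * p, n * 0.5)          # default size from the usage line

  # Initial basic subset: the m points closest to the classical centre.
  d0 <- mahalanobis(x, colMeans(x), cov(x))
  subset <- rank(d0, ties.method = "first") <= m

  # ASSUMPTION: uncorrected chi-square cutoff on squared distances.
  cutoff <- qchisq(alpha, df = p)

  for (step in seq_len(maxsteps)) {
    xs <- x[subset, , drop = FALSE]
    S  <- cov(xs)                         # p x p covariance of the basic subset
    d2 <- mahalanobis(x, colMeans(xs), S) # squared distances of all n points
    new_subset <- d2 < cutoff             # next basic subset
    if (identical(new_subset, subset)) break  # converged: subset is stable
    subset <- new_subset
  }
  # On convergence, S and d2 were computed from the returned subset.
  list(subset = subset, dis = sqrt(d2), cov = S)
}

# Usage: 50 clean points plus 5 planted, far-away outliers (rows 51 to 55).
set.seed(1)
x <- rbind(matrix(rnorm(100), ncol = 2),
           matrix(rnorm(10, mean = 20), ncol = 2))
res <- bacon_sketch(x)
which(!res$subset)   # indices of the points not retained in the final subset
```

Because the cutoff is uncorrected, this sketch may also flag some clean points; it is meant to show the shape of the iteration, not to reproduce the results of mvBACON.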
Ueli Oetliker, Swiss Federal Statistical Office, for S-plus 5.1; port to R, testing, etc. by Martin Maechler.
Billor, N., Hadi, A. S., and Velleman, P. F. (2000). BACON: Blocked Adaptive Computationally-Efficient Outlier Nominators. Computational Statistics and Data Analysis 34, 279–298. doi: 10.1016/S0167-9473(99)00101-2
require(robustbase) # for example data and covMcd():

## simple 2D example :
plot(starsCYG, main = "starsCYG data (n=47)")
B.st <- mvBACON(starsCYG)
points(starsCYG[ ! B.st$subset,], pch = 4, col = 2, cex = 1.5)
stopifnot(identical(which(!B.st$subset), c(7L,9L,11L,14L,20L,30L,34L)))
## finds the clear outliers (and 3 "borderline")

## 'coleman' from pkg 'robustbase'
coleman.x <- data.matrix(coleman[, 1:6])
Cc <- covMcd (coleman.x) # truly robust
summary(Cc) # -> 6 outliers (1,3,10,12,17,18)
Cb1 <- mvBACON(coleman.x) ##-> subset is all TRUE hmm??
Cb2 <- mvBACON(coleman.x, init.sel = "dUniMedian")
stopifnot(all.equal(Cb1, Cb2))
Cb.r <- lapply(1:20, function(i) {
    set.seed(i)
    mvBACON(coleman.x, init.sel="random", verbose=FALSE)
})
nm <- names(Cb.r[[1]]); nm <- nm[nm != "steps"]
all(eqC <- sapply(Cb.r[-1], function(CC) all.equal(CC[nm], Cb.r[[1]][nm]))) # TRUE
## --> BACON always breaks down, i.e., does not see the outliers here
## breaks down even when manually starting with all the non-outliers:
Cb.man <- mvBACON(coleman.x, init.sel = "manual",
                  man.sel = setdiff(1:20, c(1,3,10,12,17,18)))
which( ! Cb.man$subset) # the outliers according to mvBACON : _none_