Robust distance-based observation orderings based on the robust "Six pack"
Compute six initial robust estimators of multivariate location and
“scatter” (scale); then, for each estimator j, compute the distances
d_ij of the observations i = 1..n and take the h (h > n/2) observations
with the smallest distances. Then compute the statistical distances based
on these h observations.
Return, for each estimator, the indices of the observations sorted by
increasing distance.
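The select-then-recompute step described above can be sketched in base R for a single initial estimate. This is only an illustration under stated assumptions: the initial estimate used here (coordinatewise median with a diagonal MAD-based scatter) is a hypothetical stand-in and is not one of the six estimators, and the variable names are made up for the example.

```r
set.seed(1)
n <- 50; p <- 3; h <- 30
x <- matrix(rnorm(n * p), n, p)
x[1:5, ] <- x[1:5, ] + 10          # plant five obvious outliers

# Hypothetical initial estimate (an assumption for this sketch):
# coordinatewise median and a diagonal scatter built from the MAD
mu0 <- apply(x, 2, median)
S0  <- diag(apply(x, 2, mad)^2)

# Step 1: distances under the initial estimate; keep the h smallest
d0   <- mahalanobis(x, center = mu0, cov = S0)
keep <- order(d0)[1:h]

# Step 2: location and scatter from those h observations,
# then distances of all n observations
mu1 <- colMeans(x[keep, , drop = FALSE])
S1  <- cov(x[keep, , drop = FALSE])
d1  <- mahalanobis(x, center = mu1, cov = S1)

# Step 3: observation indices sorted by increasing distance
ord <- order(d1)
head(ord, h)
```

Repeating this for six different initial estimates and binding the resulting orderings column-wise would give a matrix of the shape described on this page.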
r6pack(x, h, full.h, scaled = TRUE, scalefn = rrcov.control()$scalefn)
x: n x p data matrix.
h: integer, typically around (and slightly larger than) n/2.
full.h: logical specifying if the full (length n) observation ordering should be returned; otherwise only the first h indices are returned.
scaled: logical indicating if the data
scalefn: a
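As a rough guide to picking h, a conventional subset size in the MCD literature is floor((n + p + 1) / 2), which is slightly larger than n/2. The helper below is hypothetical (it is not part of the package) and only illustrates the arithmetic; any h > n/2 chosen by the caller would serve equally well.

```r
# Hypothetical helper, not part of the package:
# a conventional subset size slightly above n/2
default.h <- function(n, p) (n + p + 1) %/% 2

n <- 62; p <- 8          # dimensions of the pulpfiber data used in the examples
default.h(n, p)          # 35
```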
The six initial estimators are
Hyperbolic tangent of standardized data
Spearman correlation matrix
Tukey normal scores
Spatial sign covariance matrix
BACON
Raw OGK estimate for scatter
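Four of these estimators can be sketched with base R alone. The following is a rough illustration, not the package's code: it assumes the columns are standardized by median and MAD (the function itself uses scalefn), and it uses one common form of the normal-scores transform. BACON and the raw OGK estimate are omitted because they need more machinery.

```r
set.seed(2)
n <- 40; p <- 3
x <- matrix(rnorm(n * p), n, p)

# Assumption for this sketch: standardize each column by median and MAD
z <- scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad))

# 1. Correlation of the hyperbolic tangent of the standardized data
R.tanh <- cor(tanh(z))

# 2. Spearman (rank) correlation matrix
R.spearman <- cor(z, method = "spearman")

# 3. Correlation of normal scores computed from the column-wise ranks
ranks    <- apply(z, 2, rank)
R.nscore <- cor(qnorm((ranks - 1/3) / (n + 1/3)))

# 4. Spatial sign covariance: scale each row to unit length,
#    then average the outer products
u      <- z / sqrt(rowSums(z^2))
S.sign <- crossprod(u) / n
```

Each result is a p x p matrix that could serve as an initial scatter estimate; because every row of u has unit length, the trace of S.sign is 1.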
An h' x 6 matrix of observation indices, i.e., with values from 1..n. If full.h is true, h' = n; otherwise h' = h.
Valentin Todorov, based on the original Matlab code by
Tim Verdonck and Mia Hubert. Martin Maechler for tweaks
(performance etc.), and full.h.
Hubert, M., Rousseeuw, P. J. and Verdonck, T. (2012) A deterministic algorithm for robust location and scatter. Journal of Computational and Graphical Statistics 21, 618–637.
library(robustbase)  # provides r6pack() and the pulpfiber data
data(pulpfiber)
dim(m.pulp <- data.matrix(pulpfiber))                     # 62 x 8
dim(fr6  <- r6pack(m.pulp, h = 40, full.h = FALSE))       # h x 6 = 40 x 6
dim(fr6F <- r6pack(m.pulp, h = 40, full.h = TRUE ))       # n x 6 = 62 x 6
stopifnot(identical(fr6, fr6F[1:40,]))