Medcouple, a Robust Measure of Skewness
Compute the ‘medcouple’, a robust concept and estimator of skewness. The medcouple is defined as a scaled median difference of the left and right halves of a distribution, and hence is not based on the third moment, unlike the classical skewness.
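To make the definition concrete, here is a naive O(n^2) sketch in base R. It is not the fast algorithm behind mc(); the name naive_mc is hypothetical, and the kernel and its tie rule follow the definition in Brys, Hubert and Struyf (2004) as understood here. For every pair x_i <= m <= x_j, with m the sample median, it evaluates h(x_i, x_j) = ((x_j - m) - (m - x_i)) / (x_j - x_i) and returns the median of these values.

```r
# Naive medcouple: median of the kernel h over all pairs x_i <= m <= x_j.
# Illustration only; quadratic in time and memory.
naive_mc <- function(x) {
  x <- sort(x[!is.na(x)])
  m <- median(x)
  lo <- x[x <= m]   # left half, including values equal to the median
  hi <- x[x >= m]   # right half, including values equal to the median
  # Regular kernel; pairs with x_i == x_j == m give 0/0 = NaN and are fixed below
  h <- outer(lo, hi, function(xi, xj) ((xj - m) - (m - xi)) / (xj - xi))
  k <- sum(x == m)  # number of observations tied with the median
  if (k > 0) {
    idx <- seq_len(k)
    # Special kernel for ties: -1, 0 or +1 depending on the index sum
    h[length(lo) - k + idx, idx] <- sign(outer(idx, idx, "+") - 1 - k)
  }
  median(as.vector(h))
}

naive_mc(1:5)                 # 0 for a symmetric sample
naive_mc(c(1, 2, 7, 9, 10))   # -1/3
```

Both values agree with the results quoted for mc() in the examples on this page.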
mc(x, na.rm = FALSE, doReflect = (length(x) <= 100),
   doScale = TRUE,     # <- chg default to 'FALSE' ?
   eps1 = 1e-14, eps2 = 1e-15,   # << new in 0.93-2 (2018-07..)
   maxit = 100, trace.lev = 0, full.result = FALSE)
x: a numeric vector.
na.rm: logical indicating how missing values (NAs) should be dealt with.
doReflect: logical indicating if the internal MC should also be computed on the reflected sample -x.
doScale: logical indicating if the internal algorithm should also scale the data (using the most distant value from the median, which is unrobust and numerically dangerous); scaling has been the hardwired default in the original algorithm and in R's …
eps1, eps2: tolerances in the algorithm; …
maxit: maximal number of iterations; typically a few should be sufficient.
trace.lev: integer specifying how much diagnostic output the algorithm (in C) should produce. No output by default, most output for …
full.result: logical indicating if the full return values (from C) should be returned as a list via attr(*, "mcComp").
a number between -1 and 1, which is the medcouple, MC(x).
For r <- mc(x, full.result = TRUE, ....), attr(r, "mcComp") is a list with components:
medc: the medcouple mc.(x).
medc2: the medcouple mc.(-x) if doReflect = TRUE.
eps: tolerances used.
iter, iter2: number of iterations used.
converged, converged2: logical specifying “convergence”.
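A short sketch of how these components might be inspected. It assumes that the mc() documented here is the one exported by the robustbase package, and it only runs the call when that package is installed; the sample x1 is the one used in the examples on this page.

```r
x1 <- c(1, 2, 7, 9, 10)

# Assumption: mc() is robustbase::mc(); skip quietly if the package is missing
if (requireNamespace("robustbase", quietly = TRUE)) {
  r <- robustbase::mc(x1, full.result = TRUE)
  comp <- attr(r, "mcComp")   # list of the internal results
  print(as.numeric(r))        # the medcouple itself
  str(comp)                   # medc, medc2, eps, iter, iter2, converged, ...
}
```

Checking converged (and converged2 when the reflected sample is used) is a cheap way to see whether a surprising value stems from a convergence problem.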
For extreme cases there are convergence problems. Some of them can be alleviated by “loosening” the tolerances eps1 and eps2. For others, with peculiar values, notably many almost-ties with the median, it can help considerably to replace mc(x, *) by mc(jitter(x), *), or also just mc(signif(x), *).
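The effect of the two workarounds on almost-ties can be seen with base R alone. The sample below is hypothetical: two values differ from the median 3 by only 1e-12. signif() (6 significant digits by default) turns them into exact ties, while jitter() adds small uniform noise and pulls them apart. The final call assumes mc() is robustbase::mc() and is skipped when that package is not installed.

```r
set.seed(1)  # jitter() is random; fix the seed for reproducibility

# Hypothetical sample with two values that almost tie with the median 3
x <- c(1, 2, 3 - 1e-12, 3, 3 + 1e-12, 4, 5)

xs <- signif(x)   # near-ties collapse to exact ties with the median
xj <- jitter(x)   # near-ties are separated by random noise

sum(xs == median(xs))   # number of exact ties with the median after signif()
min(diff(sort(xj)))     # smallest gap between neighbours after jitter()

# Assumption: mc() is robustbase::mc(); skip quietly if the package is missing
if (requireNamespace("robustbase", quietly = TRUE)) {
  print(c(signif = robustbase::mc(xs), jitter = robustbase::mc(xj)))
}
```

Note that jitter() chooses its noise level from the spacing of the data, so its amount argument may need tuning when the perturbation should stay small relative to the data.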
Also, the algorithm not only centers the data around the median but also scales them by the extremes. This may have a negative effect: for example, when an extreme outlier is made even more extreme, the result changes, wrongly; see the 'mc10x' example.
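One way to look at the influence of scaling is to run the same outlier experiment with doScale = TRUE and doScale = FALSE and compare both against the limit 7/12 quoted in the examples. This is a sketch only: it assumes mc() is robustbase::mc(), runs only when that package is installed, and wraps each call in tryCatch() because convergence for such extreme values is not guaranteed. The helper safe_mc is hypothetical.

```r
# Skewed sample of size 10 whose largest value X is varied
dX10 <- function(X) c(1:5, 7, 10, 15, 25, X)
X <- 100^(1:20)

# Assumption: mc() is robustbase::mc(); skip quietly if the package is missing
if (requireNamespace("robustbase", quietly = TRUE)) {
  # Hypothetical helper: return NA instead of stopping on an error
  safe_mc <- function(v, scale)
    tryCatch(robustbase::mc(dX10(v), doScale = scale),
             error = function(e) NA_real_)
  scaled   <- vapply(X, safe_mc, numeric(1), scale = TRUE)
  unscaled <- vapply(X, safe_mc, numeric(1), scale = FALSE)
  print(cbind(X, scaled, unscaled, limit = 7/12))
}
```

Since only the largest observation changes, any drift of a column away from 7/12 as X grows is a numerical artifact rather than a property of the data.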
Guy Brys; modifications by Tobias Verbeke and bug fixes and extensions by Manuel Koller and Martin Maechler.
Brys, G., Hubert, M. and Struyf, A. (2004). A robust measure of skewness, Journal of Computational and Graphical Statistics 13(4), 996–1017.
Hubert, M. and Vandervieren, E. (2008). An adjusted boxplot for skewed distributions, Computational Statistics and Data Analysis 52, 5186–5201.
Qn for a robust measure of scale (aka “dispersion”), ....
mc(1:5) # 0 for a symmetric sample

x1 <- c(1, 2, 7, 9, 10)
mc(x1) # = -1/3

data(cushny)
mc(cushny) # 0.125

stopifnot(mc(c(-20, -5, -2:2, 5, 20)) == 0,
          mc(x1, doReflect=FALSE) == -mc(-x1, doReflect=FALSE),
          all.equal(mc(x1, doReflect=FALSE), -1/3, tolerance = 1e-12))

## Susceptibility of the current algorithm to large outliers :
dX10 <- function(X) c(1:5,7,10,15,25, X) # generate skewed size-10 with 'X'
x <- c(10,20,30, 100^(1:20))
(mc10x <- vapply(x, function(X) mc(dX10(X)), 1))
## limit X -> Inf should be 7/12 = 0.58333... but that "breaks down a bit" :
plot(x, mc10x, type="b", main = "mc( c(1:5,7,10,15,25, X) )", xlab="X", log="x")