Tuning Parameters for lmrob() and Auxiliaries
Tuning parameters for lmrob, the MM-type regression estimator and the associated S-, M- and D-estimators. Using setting="KS2011" sets the defaults as suggested by Koller and Stahel (2011) and analogously for "KS2014".
lmrob.control(setting, seed = NULL, nResample = 500,
              tuning.chi = NULL, bb = 0.5, tuning.psi = NULL,
              max.it = 50, groups = 5, n.group = 400,
              k.fast.s = 1, best.r.s = 2,
              k.max = 200, maxit.scale = 200, k.m_s = 20,
              refine.tol = 1e-7, rel.tol = 1e-7,
              scale.tol = 1e-10, solve.tol = 1e-7,
              trace.lev = 0, mts = 1000,
              subsampling = c("nonsingular", "simple"),
              compute.rd = FALSE, method = "MM", psi = "bisquare",
              numpoints = 10, cov = NULL,
              split.type = c("f", "fi", "fii"), fast.s.large.n = 2000,
              eps.outlier = function(nobs) 0.1 / nobs,
              eps.x = function(maxx) .Machine$double.eps^(.75)*maxx,
              compute.outlier.stats = method,
              warn.limit.reject = 0.5,
              warn.limit.meanrw = 0.5, ...)

.Mchi.tuning.defaults
.Mchi.tuning.default(psi)
.Mpsi.tuning.defaults
.Mpsi.tuning.default(psi)
setting |
a string specifying alternative default values. Leave
empty for the defaults or use |
seed |
|
nResample |
number of re-sampling candidates to be used to find the initial S-estimator. Currently defaults to 500 which works well in most situations (see references). |
tuning.chi |
tuning constant vector for the S-estimator. If
|
bb |
expected value under the normal model of the
“chi” (rather rho) function with tuning
constant equal to |
tuning.psi |
tuning constant vector for the redescending
M-estimator. If |
max.it |
integer specifying the maximum number of IRWLS iterations. |
groups |
(for the fast-S algorithm): Number of random subsets to use when the data set is large. |
n.group |
(for the fast-S algorithm): Size of each of the
|
k.fast.s |
(for the fast-S algorithm): Number of local improvement steps (“I-steps”) for each re-sampling candidate. |
k.m_s |
(for the M-S algorithm): specifies after how many unsuccessful refinement steps the algorithm stops. |
best.r.s |
(for the fast-S algorithm): Number of best candidates to be iterated further (i.e., “refined”); denoted t in Salibian-Barrera & Yohai (2006). |
k.max |
(for the fast-S algorithm): maximal number of refinement steps for the “fully” iterated best candidates. |
maxit.scale |
integer specifying the maximum number of C level
|
refine.tol |
(for the fast-S algorithm): relative convergence tolerance for the fully iterated best candidates. |
rel.tol |
(for the RWLS iterations of the MM algorithm): relative convergence tolerance for the parameter vector. |
scale.tol |
(for the scale estimation iterations of the S algorithm): relative
convergence tolerance for the |
solve.tol |
(for the S algorithm): relative
tolerance for inversion. Hence, this corresponds to
|
trace.lev |
integer indicating if the progress of the MM-algorithm
should be traced (increasingly); default |
mts |
maximum number of samples to try in subsampling algorithm. |
subsampling |
type of subsampling to be used, a string:
|
compute.rd |
logical indicating if robust distances (based on
the MCD robust covariance estimator |
method |
string specifying the estimator-chain. |
psi |
string specifying the type of ψ-function
used. See Details of |
numpoints |
number of points used in Gauss quadrature. |
cov |
function or string with function name to be used to
calculate covariance matrix estimate. The default is
|
split.type |
determines how categorical and continuous variables
are split. See |
fast.s.large.n |
minimum number of observations required to switch from ordinary “fast S” algorithm to an efficient “large n” strategy. |
eps.outlier |
limit on the robustness weight below which an observation
is considered to be an outlier.
Either a numeric(1) or a function that takes the number of observations as
an argument. Used in |
eps.x |
limit on the absolute value of the elements of the design matrix below which an element is considered zero. Either a numeric(1) or a function that takes the maximum absolute value in the design matrix as an argument. |
compute.outlier.stats |
vector of |
warn.limit.reject |
limit of ratio
# rejected / # obs in level
above (>=) which a warning is produced.
Set to |
warn.limit.meanrw |
limit of the mean robustness per factor level
below which (<=) a warning is produced.
Set to |
... |
further arguments to be added as |
The option setting="KS2011" alters the default arguments. They are changed to method = "SMDM", psi = "lqq", max.it = 500, k.max = 2000, cov = ".vcov.w". The defaults of all the remaining arguments are not changed.

The option setting="KS2014" builds upon setting="KS2011". More arguments are changed to best.r.s = 20, k.fast.s = 2, nResample = 1000. This setting should produce more stable estimates for designs with factors.
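The effect of the two settings can be inspected directly. The following is a small sketch, assuming the robustbase package is installed; the helper vectors ks11.args and ks14.args are names made up for this illustration and simply list the arguments mentioned above.

```r
library(robustbase)

c0  <- lmrob.control()           # plain defaults
c11 <- lmrob.control("KS2011")
c14 <- lmrob.control("KS2014")

# Arguments that the text above lists as changed by each setting
# (illustrative helper vectors, not part of robustbase)
ks11.args <- c("method", "psi", "max.it", "k.max", "cov")
ks14.args <- c("best.r.s", "k.fast.s", "nResample")

# unclass() so that plain list subsetting is used whatever class the result has
str(unclass(c0)[ks11.args])
str(unclass(c11)[ks11.args])
str(unclass(c14)[c(ks11.args, ks14.args)])
```

Explicit arguments can still be combined with a setting, e.g. lmrob.control("KS2014", max.it = 1000).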
By default, and in .Mpsi.tuning.default() and .Mchi.tuning.default(), tuning.chi and tuning.psi are set to yield an MM-estimate with breakdown point 0.5 and efficiency of 95% at the normal.

If numeric tuning.chi or tuning.psi are specified, say cc, for psi = "ggw" or "lqq", .psi.const(cc, psi) is used, see its help page.

To get the defaults, e.g., .Mpsi.tuning.default(psi) is equivalent to but more efficient than the formerly widely used lmrob.control(psi = psi)$tuning.psi.
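A short sketch of the two routes to the default constant, assuming robustbase is installed. Only psi = "bisquare" is compared here, and attributes are ignored in the comparison as a precaution; the variable names direct and via.ctl are illustrative.

```r
library(robustbase)

psi <- "bisquare"

direct  <- .Mpsi.tuning.default(psi)            # asks for the constant only
via.ctl <- lmrob.control(psi = psi)$tuning.psi  # builds a full control list first

# Both routes should give the same number
all.equal(direct, via.ctl, check.attributes = FALSE)

# The same pattern applies to the S-estimator constant
.Mchi.tuning.default(psi)
```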
These defaults are:

psi | tuning.chi | tuning.psi
bisquare | 1.54764 | 4.685061
welsh | 0.5773502 | 2.11
ggw | c(-0.5, 1.5, NA, 0.5) | c(-0.5, 1.5, 0.95, NA)
lqq | c(-0.5, 1.5, NA, 0.5) | c(-0.5, 1.5, 0.95, NA)
optimal | 0.4047 | 1.060158
hampel | c(1.5, 3.5, 8)*0.2119163 | c(1.5, 3.5, 8)*0.9014
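The bisquare tuning.chi value can be sanity-checked in base R against bb = 0.5 from the usage. The sketch below assumes the bisquare rho function is scaled to have maximum 1 and evaluates its expectation under the standard normal; rho.bisq and E.rho are illustrative helpers, not robustbase functions.

```r
# Bisquare rho, scaled so that its maximum is 1 (an assumption of this sketch)
rho.bisq <- function(x, cc) ifelse(abs(x) <= cc, 1 - (1 - (x / cc)^2)^3, 1)

# E[rho(Z)] for Z ~ N(0, 1): integrate over [-cc, cc],
# then add the two tails, where rho is identically 1
E.rho <- function(cc) {
  inner <- integrate(function(x) rho.bisq(x, cc) * dnorm(x), -cc, cc)$value
  inner + 2 * pnorm(-cc)
}

E.rho(1.54764)   # close to 0.5
```

Under this scaling, an expected value of 0.5 corresponds to the breakdown point of 0.5 stated for the defaults.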
The values for the tuning constant for the ggw and lqq psi functions are specified differently here by a vector with four elements: minimal slope, b (controlling the bend at the maximum of the curve), efficiency, breakdown point. Use NA for an unspecified value of either efficiency or breakdown point, see examples in the tables (above and below). For these table examples, the respective “inner constants” are stored precomputed, see .psi.lqq.findc for more.
The constants for the "hampel" psi function are chosen to have a redescending slope of -1/3. Constants for a slope of -1/2 would be

psi | tuning.chi | tuning.psi
"hampel" | c(2, 4, 8) * 0.1981319 | c(2, 4, 8) * 0.690794
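The redescending slope follows from the three constants alone. As a sketch, assuming the usual three-part Hampel psi with knots (a, b, r), where psi is constant at a between a and b and then falls linearly to 0 at r, the slope is -a / (r - b); hampel.slope is an illustrative helper.

```r
# Slope of the descending segment of a three-part Hampel psi with knots
# k = (a, b, r): psi falls linearly from a at |x| = b to 0 at |x| = r
hampel.slope <- function(k) -k[1] / (k[3] - k[2])

hampel.slope(c(1.5, 3.5, 8) * 0.9014)    # -1/3
hampel.slope(c(2,   4,   8) * 0.690794)  # -1/2
```

Because the common multiplier cancels, the slope depends only on the base vector; the multiplier sets breakdown point or efficiency.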
Alternative coefficients for an efficiency of 85% at the normal are given in the table below.
psi | tuning.psi
bisquare | 3.443689
welsh | 1.456
ggw, lqq | c(-0.5, 1.5, 0.85, NA)
optimal | 0.8684
hampel (-1/3) | c(1.5, 3.5, 8) * 0.5704545
hampel (-1/2) | c(2, 4, 8) * 0.4769578
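The bisquare entries for 95% and 85% efficiency can be checked numerically in base R. The sketch below uses the standard asymptotic efficiency of a location M-estimator at the normal, (E psi')^2 / E psi^2; the helper functions are illustrative and not part of robustbase.

```r
# Bisquare psi and its derivative for tuning constant cc
# (both are zero outside [-cc, cc], so integration is restricted to that range)
psi.bisq  <- function(x, cc) x * (1 - (x / cc)^2)^2
dpsi.bisq <- function(x, cc) (1 - (x / cc)^2) * (1 - 5 * (x / cc)^2)

# Asymptotic efficiency at the standard normal: (E psi')^2 / E psi^2
eff.bisq <- function(cc) {
  E <- function(f) integrate(function(x) f(x, cc) * dnorm(x), -cc, cc)$value
  E(dpsi.bisq)^2 / E(function(x, cc) psi.bisq(x, cc)^2)
}

eff.bisq(4.685061)  # approximately 0.95
eff.bisq(3.443689)  # approximately 0.85
```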
lmrob.control() returns a named list with over twenty components, corresponding to the arguments, where tuning.psi and tuning.chi are typically computed, as .Mpsi.tuning.default(psi) or .Mchi.tuning.default(psi), respectively.
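A brief sketch of inspecting the returned list and handing it to lmrob(), assuming robustbase is installed. The stackloss data set from base R's datasets package is used only as a convenient example.

```r
library(robustbase)

ctrl <- lmrob.control(psi = "bisquare", max.it = 100)

is.list(ctrl)    # TRUE
length(ctrl)     # number of components
ctrl$max.it      # 100, as supplied
ctrl$tuning.psi  # filled in from the default for psi = "bisquare"

# Pass the list on via the 'control' argument of lmrob()
fit <- lmrob(stack.loss ~ ., data = stackloss, control = ctrl)
coef(fit)
```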
Matias Salibian-Barrera, Martin Maechler and Manuel Koller
Koller, M. and Stahel, W.A. (2011) Sharpening Wald-type inference in robust regression for small samples. Computational Statistics & Data Analysis 55(8), 2504–2515.
Koller, M. and Stahel, W.A. (2017) Nonsingular subsampling for regression S estimators with categorical predictors. Computational Statistics 32(2), 631–646. doi:10.1007/s00180-016-0679-x. Referred to as "KS2014" everywhere in robustbase; a shorter first version, Koller (2012), has been available from https://arxiv.org/abs/1208.5595.
library(robustbase)

## Show the default settings:
str(lmrob.control())

## Artificial data for a simple "robust t test":
set.seed(17)
y <- y0 <- rnorm(200)
y[sample(200,20)] <- 100*rnorm(20)
gr <- as.factor(rbinom(200, 1, prob = 1/8))
lmrob(y0 ~ 0+gr)

## Use Koller & Stahel(2011)'s recommendation but a larger 'max.it':
str(ctrl <- lmrob.control("KS2011", max.it = 1000))

str(.Mpsi.tuning.defaults)
stopifnot(identical(.Mpsi.tuning.defaults,
                    sapply(names(.Mpsi.tuning.defaults),
                           .Mpsi.tuning.default)))
## Containing (names!) all our (pre-defined) redescenders:
str(.Mchi.tuning.defaults)

## Difference between settings:
C11 <- lmrob.control("KS2011")
C14 <- lmrob.control("KS2014")
str(C14)
## Apart from `setting` itself, they only differ in three places:
diffC <- names(which(!mapply(identical, C11,C14, ignore.environment=TRUE)))
cbind(KS11 = unlist(C11[diffC[-1]]),
      KS14 = unlist(C14[diffC[-1]]))
##           KS11 KS14
## nResample  500 1000
## best.r.s     2   20
## k.fast.s     1    2