Robust Covariance/Correlation Matrix Estimation
Compute robust estimates of multivariate location and scatter.
covRob(data, corr = FALSE, distance = TRUE, na.action = na.fail, estim = "auto", control = covRob.control(estim, ...), ...)
data |
a numeric matrix or data frame containing the data. |
corr |
a logical flag. If |
distance |
a logical flag. If |
na.action |
a function to filter missing data. The default |
estim |
a character string specifying the robust estimator to be used. The choices are: "mcd" for the Fast MCD algorithm of Rousseeuw and Van Driessen, "weighted" for the Reweighted MCD, "donostah" for the Donoho-Stahel projection based estimator, "M" for the constrained M estimator provided by Rocke, "pairwiseQC" for the orthogonalized quadrant correlation pairwise estimator, and "pairwiseGK" for the Orthogonalized Gnanadesikan-Kettenring pairwise estimator. The default "auto" selects from "donostah", "mcd", and "pairwiseQC" with the goal of producing a good estimate in a reasonable amount of time. |
control |
a list of control parameters to be used in the numerical algorithms. See |
... |
control parameters may be passed directly when |
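The arguments above can be illustrated with a few calls. This is a minimal sketch, assuming the robust package is installed and using the built-in stackloss data; the object names fit_mcd, fit_cor, and fit_nodist are purely illustrative.

```r
library(robust)
data(stackloss)

# Request a specific estimator instead of relying on the "auto" choice.
fit_mcd <- covRob(stackloss, estim = "mcd")

# Estimate a robust correlation matrix rather than a covariance matrix.
fit_cor <- covRob(stackloss, corr = TRUE)

# Skip the distance computation when only the scatter estimate is needed.
fit_nodist <- covRob(stackloss, distance = FALSE)
```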
When estim = "auto", the covRob function selects a robust covariance estimator that is likely to provide a good estimate in a reasonable amount of time. Presently this selection is based on the problem size. The Donoho-Stahel estimator is used if there are either fewer than 1000 observations and fewer than 10 variables, or fewer than 5000 observations and fewer than 5 variables. Otherwise, if there are fewer than 50000 observations and fewer than 20 variables, the MCD is used. For larger problems, the Orthogonalized Quadrant Correlation estimator is used.
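The size-based selection rule can be written out as a small function. The helper below, choose_estimator, is hypothetical: it only mirrors the thresholds stated in the text and is not the package's internal code.

```r
# Hypothetical helper mirroring the documented "auto" selection rule.
# n: number of observations, p: number of variables.
choose_estimator <- function(n, p) {
  if ((n < 1000 && p < 10) || (n < 5000 && p < 5)) {
    "donostah"      # Donoho-Stahel for small problems
  } else if (n < 50000 && p < 20) {
    "mcd"           # Fast MCD for medium-sized problems
  } else {
    "pairwiseQC"    # orthogonalized quadrant correlation otherwise
  }
}

choose_estimator(21, 4)        # stackloss-sized problem: "donostah"
choose_estimator(20000, 15)    # "mcd"
choose_estimator(100000, 30)   # "pairwiseQC"
```

Under this rule the stackloss data (21 observations, 4 variables) falls in the Donoho-Stahel range.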
The M estimate (estim = "M") is computed using the covMest function in the rrcov package. For historical reasons the Robust Library uses the MCD to compute the initial estimate.
The Donoho-Stahel (estim = "donostah") estimator is computed using the CovSde function provided in the rrcov package.
The pairwise estimators (estim = "pairwisegk" and estim = "pairwiseqc") are computed using the CovOgk function in the rrcov package.
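The underlying rrcov functions can also be called directly. The sketch below assumes the rrcov package is installed and uses its default settings; covRob may pass its own control parameters, so the results need not match those of covRob exactly.

```r
library(rrcov)
data(stackloss)
x <- as.matrix(stackloss)

sde <- CovSde(x)   # Stahel-Donoho estimator
ogk <- CovOgk(x)   # orthogonalized Gnanadesikan-Kettenring estimator

getCenter(ogk)     # robust location estimate
getCov(ogk)        # robust scatter estimate
getCov(sde)
```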
an object of class "covRob" with components:
call |
an image of the call that produced the object with all the arguments named. |
cov |
a numeric matrix containing the final robust estimate of the covariance/correlation matrix. |
center |
a numeric vector containing the final robust estimate of the location vector. |
dist |
a numeric vector containing the squared Mahalanobis distances computed using robust estimates of covariance and location contained in |
raw.cov |
a numeric matrix containing the initial robust estimate of the covariance/correlation matrix. If there is no initial robust estimate then this element is set to |
raw.center |
a numeric vector containing the initial robust estimate of the location vector. If there is no initial robust estimate then this element is set to |
raw.dist |
a numeric vector containing the squared Mahalanobis distances computed using the initial robust estimates of covariance and location contained in |
corr |
a logical flag. If |
estim |
a character string containing the name of the robust estimator. |
control |
a list containing the control parameters used by the robust estimator. |
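The components listed above can be read off the returned object with the usual $ accessor. The following is a minimal sketch, assuming the robust package is installed. The chi-squared cutoff used to flag outlying observations is a common convention and not something covRob does itself.

```r
library(robust)
data(stackloss)

fit <- covRob(stackloss)   # estim = "auto", distance = TRUE by default

fit$estim    # name of the estimator that was used
fit$center   # robust location estimate
fit$cov      # robust covariance estimate

# Flag observations whose squared robust distance exceeds the 97.5%
# chi-squared quantile (an assumed convention, not part of covRob).
p <- ncol(stackloss)
cutoff <- qchisq(0.975, df = p)
which(fit$dist > cutoff)
```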
As of version 0.3-8 of the Robust Library, all of the functions originally contributed by the S-Plus Robust Library have been replaced by dependencies on the robustbase and rrcov packages. Computed results may differ from earlier versions of the Robust Library. In particular, the MCD estimators are now adjusted by a small sample size correction factor. Additionally, a bug was fixed where the final MCD covariance estimate produced with estim = "mcd" was not rescaled for consistency.
R. A. Maronna and V. J. Yohai (1995) The Behavior of the Stahel-Donoho Robust Multivariate Estimator. Journal of the American Statistical Association 90 (429), 330–341.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.
D. L. Woodruff and D. M. Rocke (1994) Computable robust estimation of multivariate location and shape in high dimension using compound estimators. Journal of the American Statistical Association 89, 888–896.
R. A. Maronna and R. H. Zamar (2002) Robust estimates of location and dispersion for high-dimensional datasets. Technometrics 44 (4), 307–317.
data(stackloss)
covRob(stackloss)