Winsorize (Replace Extreme Values by Less Extreme Ones)
Winsorizing a vector means that a predefined share of the smallest and/or largest values is replaced by less extreme values. The substitute values are the most extreme retained values.
Winsorize(x, minval = NULL, maxval = NULL, probs = c(0.05, 0.95), na.rm = FALSE, type = 7)
x | a numeric vector to be winsorized.
minval | the lower border; all values below it are replaced by this value. The default is the 5% quantile of x.
maxval | the upper border; all values above it are replaced by this value. The default is the 95% quantile of x.
probs | numeric vector of probabilities with values in [0,1] as used in
na.rm | should NAs be omitted when calculating the quantiles?
type | an integer between 1 and 9 selecting one of the nine quantile algorithms detailed in
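As a rough illustration of how quantile-based borders can be derived from probs, na.rm and type, here is a minimal base R sketch. The helper winsorize_sketch is hypothetical and written only for this example; it is not the package's own code, and it assumes both borders are taken from the two entries of probs.

```r
# Hypothetical helper (an assumption for illustration, not the package code):
# clamp x to borders taken from the requested quantiles of x.
winsorize_sketch <- function(x, probs = c(0.05, 0.95), na.rm = FALSE, type = 7) {
  # Lower and upper borders from the quantiles of x
  q <- quantile(x, probs = probs, na.rm = na.rm, type = type, names = FALSE)
  # Values below q[1] become q[1]; values above q[2] become q[2]
  pmin(pmax(x, q[1]), q[2])
}

set.seed(1234)
x <- rnorm(10)
x[1] <- x[1] * 10        # introduce an outlier
w <- winsorize_sketch(x)
range(w)                 # bounded by the 5% and 95% quantiles of x
```

Building the clamp from pmin and pmax keeps the result the same length as x and leaves values inside the borders untouched.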
The winsorized vector is obtained by

$$\operatorname{wins}(x) = \begin{cases} -c & \text{if } x < -c \\ c & \text{if } x > c \\ x & \text{otherwise} \end{cases}$$
You may also want to consider standardizing (possibly robustly) the data before you perform a winsorization.
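One possible way to combine robust standardization with the symmetric rule wins(x) is sketched below in base R. Centring by the median and scaling by the MAD is only one choice of robust standardization, and the cutoff c = 3 (named c_cut here) is an arbitrary assumption made for the example.

```r
# Illustrative sketch only; the cutoff value 3 is an arbitrary assumption.
set.seed(1234)
x <- rnorm(10)
x[1] <- x[1] * 10                      # introduce an outlier

# Robust standardization: centre by the median, scale by the MAD
z <- (x - median(x)) / mad(x)

# Apply wins(z): -c if z < -c, c if z > c, z otherwise
c_cut <- 3
z_wins <- pmax(pmin(z, c_cut), -c_cut)

# Optionally map the result back to the original scale
x_wins <- z_wins * mad(x) + median(x)
```

After standardization a single symmetric cutoff is meaningful, because the values are expressed in robust units of spread around the centre.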
A vector of the same length as the original data x containing the winsorized data.
Andri Signorell <andri@signorell.net>
Winsorize from the package robustHD contains an option to winsorize multivariate data.
library(DescTools)

## generate data
set.seed(1234)        # for reproducibility
x <- rnorm(10)        # standard normal
x[1] <- x[1] * 10     # introduce outlier

## Winsorize data
x
Winsorize(x)

# use Large and Small if a fixed number of values should be winsorized (here k=3):
Winsorize(x, minval=tail(Small(x, k=3), 1), maxval=head(Large(x, k=3), 1))