Outlier
Return outliers according to Tukey's boxplot rule or Hampel's median/MAD rule.
Outlier(x, method = c("boxplot", "hampel"), value = TRUE, na.rm = FALSE)
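x: a (non-empty) numeric vector of data values.
method: the method to be used. So far Tukey's boxplot rule ("boxplot") and Hampel's rule ("hampel") are implemented.
value: logical. If TRUE, the outlying values themselves are returned; if FALSE, their indices in x. Defaults to TRUE.
na.rm: logical. Should missing values be removed? Defaults to FALSE.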
Outlier detection is a tricky problem and should be handled with care. Tukey's boxplot rule is implemented as a rough way of spotting extreme values: a value is flagged if it lies beyond the whiskers of a boxplot.
Hampel considers values outside of median +/- 3 * (median absolute deviation) to be outliers.
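The two rules can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code behind Outlier(): the helper names tukey_out and hampel_out are hypothetical, the boxplot rule is taken from boxplot.stats() with its default coef = 1.5, and the median absolute deviation is assumed to be stats::mad() with its default scaling constant.

```r
# Hypothetical helpers illustrating the two rules; not the package source.

# Tukey: values beyond the boxplot whiskers (1.5 * hinge spread by default).
tukey_out <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  boxplot.stats(x)$out
}

# Hampel: values outside median +/- 3 * MAD.
# Assumption: mad() is used with its default constant (1.4826).
hampel_out <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  x[abs(x - median(x)) > 3 * mad(x)]
}

x <- c(2, 3, 3, 4, 4, 5, 5, 6, 30)
tukey_out(x)    # 30
hampel_out(x)   # 30
```

For this small vector both rules flag only the value 30. On skewed or heavy-tailed data the two rules can disagree, which is one reason to compare them before discarding anything.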
The values of x lying outside the whiskers of a boxplot, or their indices if value = FALSE.
Andri Signorell <andri@signorell.net>
Hampel, F. R. (1974). The influence curve and its role in robust estimation. Journal of the American Statistical Association, 69, 382-393.
Outlier(d.pizza$temperature, na.rm=TRUE)

# it's the same as the result from boxplot
sort(d.pizza$temperature[Outlier(d.pizza$temperature, value=FALSE, na.rm=TRUE)])
b <- boxplot(d.pizza$temperature, plot=FALSE)
sort(b$out)

# nice to find the corresponding rows
d.pizza[Outlier(d.pizza$temperature, value=FALSE, na.rm=TRUE), ]

# compare to Hampel's rule
Outlier(d.pizza$temperature, method="hampel", na.rm=TRUE)

# outliers for each driver
tapply(d.pizza$temperature, d.pizza$driver, Outlier, na.rm=TRUE)

# the same as:
boxplot(temperature ~ driver, d.pizza)$out