Rosner's Test for Outliers
Perform Rosner's generalized extreme Studentized deviate test for up to k potential outliers in a dataset, assuming the data without any outliers come from a normal (Gaussian) distribution.
rosnerTest(x, k = 3, alpha = 0.05, warn = TRUE)
x |
numeric vector of observations.
Missing ( |
k |
positive integer indicating the number of suspected outliers. The argument |
alpha |
numeric scalar between 0 and 1 indicating the Type I error associated with the
test of hypothesis. The default value is alpha=0.05.
warn |
logical scalar indicating whether to issue a warning ( |
Let x_1, x_2, …, x_n denote the n observations. We assume that n-k of these observations come from the same normal (Gaussian) distribution, and that the k most “extreme” observations may or may not represent observations from a different distribution. Let x^{*}_1, x^{*}_2, …, x^{*}_{n-i} denote the n-i observations left after omitting the i most extreme observations, where i = 0, 1, …, k-1. Let \bar{x}^{(i)} and s^{(i)} denote the mean and standard deviation, respectively, of the n-i observations in the data that remain after removing the i most extreme observations. Thus, \bar{x}^{(0)} and s^{(0)} denote the mean and standard deviation for the full sample, and in general
\bar{x}^{(i)} = \frac{1}{n-i}∑_{j=1}^{n-i} x^{*}_j \;\;\;\;\;\; (1)
s^{(i)} = √{\frac{1}{n-i-1} ∑_{j=1}^{n-i} (x^{*}_j - \bar{x}^{(i)})^2} \;\;\;\;\;\; (2)
For a specified value of i, the most extreme observation x^{(i)} is the one that is the greatest distance from the mean for that data set, i.e.,
x^{(i)} = x^{*}_m, \;\; m = \arg\max_{j=1,2,…,n-i} |x^{*}_j - \bar{x}^{(i)}| \;\;\;\;\;\; (3)
Thus, an extreme observation may be the smallest or the largest one in that data set.
Rosner's test is based on the k statistics R_1, R_2, …, R_k, which represent the extreme Studentized deviates computed from successively reduced samples of size n, n-1, …, n-k+1:
R_{i+1} = \frac{|x^{(i)} - \bar{x}^{(i)}|}{s^{(i)}} \;\;\;\;\;\; (4)
Critical values for R_{i+1} are denoted λ_{i+1} and are computed as:
λ_{i+1} = \frac{t_{p, n-i-2} (n-i-1)}{√{(n-i-2 + t^2_{p, n-i-2}) (n-i)}} \;\;\;\;\;\; (5)
where t_{p, ν} denotes the p'th quantile of Student's t-distribution with ν degrees of freedom, and in this case
p = 1 - \frac{α/2}{n - i} \;\;\;\;\;\; (6)
where α denotes the Type I error level.
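As a rough illustration of equations (5) and (6), the critical values can be computed in base R with qt. This is a minimal sketch, not the EnvStats source: the function name rosner_lambda is hypothetical, and the t quantile is squared in the denominator, following Rosner (1983).

```r
# Hypothetical helper: critical value lambda_{i+1} for a sample of size n
# after the i most extreme observations have been removed.
rosner_lambda <- function(n, i, alpha = 0.05) {
  m  <- n - i                                # size of the reduced sample
  p  <- 1 - (alpha / 2) / m                  # equation (6)
  tq <- qt(p, df = m - 2)                    # t_{p, n-i-2}
  tq * (m - 1) / sqrt((m - 2 + tq^2) * m)    # equation (5)
}

# Critical values for n = 33 and i = 0, 1, 2, 3
rosner_lambda(n = 33, i = 0:3)
```

With n = 33 these values should agree with the lambda.i+1 column printed in the first example on this page (2.951949, 2.938048, 2.923571, 2.908473).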
The algorithm for determining the number of outliers is as follows:
Compare R_k with λ_k. If R_k > λ_k then conclude the k most extreme values are outliers.
If R_k ≤ λ_k then compare R_{k-1} with λ_{k-1}. If R_{k-1} > λ_{k-1} then conclude the k-1 most extreme values are outliers.
Continue in this fashion until a certain number of outliers have been identified or Rosner's test finds no outliers at all.
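The steps above can be sketched in base R as follows. This is an illustrative reimplementation under stated assumptions, not the code used by rosnerTest: the function name rosner_sketch and its return structure are hypothetical, the input is assumed to contain no missing values, and k is assumed to be small enough that every reduced sample has at least three observations. The number of outliers is taken to be the largest i for which R_i exceeds λ_i.

```r
# Hypothetical sketch of the generalized ESD procedure.
rosner_sketch <- function(x, k = 3, alpha = 0.05) {
  R <- lambda <- value <- numeric(k)
  remaining <- x
  for (i in 0:(k - 1)) {
    m   <- length(remaining)                  # n - i
    dev <- abs(remaining - mean(remaining))
    j   <- which.max(dev)                     # index of most extreme value
    value[i + 1] <- remaining[j]
    R[i + 1] <- dev[j] / sd(remaining)        # equation (4)
    tq <- qt(1 - (alpha / 2) / m, df = m - 2) # equation (6)
    lambda[i + 1] <- tq * (m - 1) / sqrt((m - 2 + tq^2) * m)  # equation (5)
    remaining <- remaining[-j]                # drop it and repeat
  }
  exceed <- which(R > lambda)
  n_out  <- if (length(exceed) > 0) max(exceed) else 0
  list(stats = data.frame(i = 0:(k - 1), value, R, lambda),
       n_outliers = n_out)
}

# Naphthalene values (ppb) as listed in the second example on this page,
# ordered by well and then by quarter.
naph <- c(3.34, 5.39, 5.74, 6.88, 5.85,
          5.59, 5.96, 1.47, 2.57, 5.39,
          1.91, 1.74, 23.23, 1.82, 2.02,
          6.12, 6.05, 5.18, 4.43, 1.00,
          8.64, 5.34, 5.53, 4.42, 35.45)
res <- rosner_sketch(naph, k = 2)
res$stats
res$n_outliers
```

Because the decision uses the largest i with R_i > λ_i, an earlier statistic that falls below its critical value does not stop the procedure, which is how the test avoids masking.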
Based on a study using N=1,000 simulations, Rosner's (1983) Table 1 shows the estimated true Type I error of declaring at least one outlier when none exists for various sample sizes n ranging from 10 to 100, and the declared maximum number of outliers k ranging from 1 to 10. Based on that table, Rosner (1983) declared that for an assumed Type I error level of 0.05, as long as n ≥ 25, the estimated α levels are quite close to 0.05, and that similar results were obtained assuming a Type I error level of 0.01. However, the table below is an expanded version of Rosner's (1983) Table 1 and shows results based on N=10,000 simulations. You can see that for an assumed Type I error of 0.05, the test maintains the Type I error fairly well for sample sizes as small as n = 3 as long as k = 1, and for n ≥ 15, as long as k ≤ 2. Also, for an assumed Type I error of 0.01, the test maintains the Type I error fairly well for sample sizes as small as n = 15 as long as k ≤ 7.
Based on these results, when warn=TRUE, a warning is issued for the following cases, indicating that the assumed Type I error may not be correct:
alpha is greater than 0.01, the sample size is less than 15, and k is greater than 1.
alpha is greater than 0.01, the sample size is at least 15 and less than 25, and k is greater than 2.
alpha is less than or equal to 0.01, the sample size is less than 15, and k is greater than 1.
k is greater than 10, or greater than the floor of half of the sample size (i.e., greater than the greatest integer less than or equal to half of the sample size). A warning is given for this case because simulations have not been done for this case.
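The four warning cases listed above can be expressed as a single logical check. The following is only a sketch of those rules as stated here; the function name rosner_warning_needed is hypothetical and the code is not taken from the EnvStats source.

```r
# Hypothetical helper: TRUE when the listed cases call for a warning.
rosner_warning_needed <- function(n, k, alpha) {
  # Cases 1 and 3 together: n < 15 and k > 1, whatever the value of alpha
  small_n  <- n < 15 && k > 1
  # Case 2: alpha > 0.01, 15 <= n < 25, and k > 2
  mid_n    <- alpha > 0.01 && n >= 15 && n < 25 && k > 2
  # Case 4: combinations not covered by the simulations
  untested <- k > 10 || k > floor(n / 2)
  small_n || mid_n || untested
}

rosner_warning_needed(n = 20, k = 3, alpha = 0.05)  # case 2 applies
rosner_warning_needed(n = 20, k = 2, alpha = 0.05)  # no case applies
```

Note that the first and third cases differ only in the condition on alpha, so together they cover every value of alpha when the sample size is below 15 and k exceeds 1.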
Table 1a. Observed Type I Error Levels based on 10,000 Simulations, n = 3 to 5.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
3 | 1 | 0.047 | 0.043 | 0.051 | 0.009 | 0.007 | 0.01 |
4 | 1 | 0.049 | 0.045 | 0.053 | 0.010 | 0.008 | 0.012 |
2 | 0.107 | 0.101 | 0.113 | 0.021 | 0.018 | 0.024 | |
5 | 1 | 0.048 | 0.044 | 0.053 | 0.008 | 0.006 | 0.009 |
2 | 0.095 | 0.090 | 0.101 | 0.020 | 0.018 | 0.023 |
Table 1b. Observed Type I Error Levels based on 10,000 Simulations, n = 6 to 10.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
6 | 1 | 0.048 | 0.044 | 0.053 | 0.010 | 0.009 | 0.012 |
2 | 0.085 | 0.080 | 0.091 | 0.017 | 0.015 | 0.020 | |
3 | 0.141 | 0.134 | 0.148 | 0.028 | 0.025 | 0.031 | |
7 | 1 | 0.048 | 0.044 | 0.053 | 0.013 | 0.011 | 0.015 |
2 | 0.080 | 0.075 | 0.086 | 0.017 | 0.015 | 0.020 | |
3 | 0.112 | 0.106 | 0.118 | 0.022 | 0.019 | 0.025 | |
8 | 1 | 0.048 | 0.044 | 0.053 | 0.011 | 0.009 | 0.013 |
2 | 0.080 | 0.074 | 0.085 | 0.017 | 0.014 | 0.019 | |
3 | 0.102 | 0.096 | 0.108 | 0.020 | 0.017 | 0.023 | |
4 | 0.143 | 0.136 | 0.150 | 0.028 | 0.025 | 0.031 | |
9 | 1 | 0.052 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 |
2 | 0.069 | 0.064 | 0.074 | 0.014 | 0.012 | 0.016 | |
3 | 0.097 | 0.091 | 0.103 | 0.018 | 0.015 | 0.021 | |
4 | 0.120 | 0.114 | 0.126 | 0.024 | 0.021 | 0.027 | |
10 | 1 | 0.051 | 0.047 | 0.056 | 0.010 | 0.008 | 0.012 |
2 | 0.068 | 0.063 | 0.073 | 0.012 | 0.010 | 0.014 | |
3 | 0.085 | 0.080 | 0.091 | 0.015 | 0.013 | 0.017 | |
4 | 0.106 | 0.100 | 0.112 | 0.021 | 0.018 | 0.024 | |
5 | 0.135 | 0.128 | 0.142 | 0.025 | 0.022 | 0.028 |
Table 1c. Observed Type I Error Levels based on 10,000 Simulations, n = 11 to 15.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
11 | 1 | 0.052 | 0.048 | 0.056 | 0.012 | 0.010 | 0.014 |
2 | 0.070 | 0.065 | 0.075 | 0.014 | 0.012 | 0.017 | |
3 | 0.082 | 0.077 | 0.088 | 0.014 | 0.011 | 0.016 | |
4 | 0.101 | 0.095 | 0.107 | 0.019 | 0.016 | 0.021 | |
5 | 0.116 | 0.110 | 0.123 | 0.022 | 0.019 | 0.024 | |
12 | 1 | 0.052 | 0.047 | 0.056 | 0.011 | 0.009 | 0.013 |
2 | 0.067 | 0.062 | 0.072 | 0.011 | 0.009 | 0.013 | |
3 | 0.074 | 0.069 | 0.080 | 0.016 | 0.013 | 0.018 | |
4 | 0.088 | 0.082 | 0.093 | 0.016 | 0.014 | 0.019 | |
5 | 0.099 | 0.093 | 0.105 | 0.016 | 0.013 | 0.018 | |
6 | 0.117 | 0.111 | 0.123 | 0.021 | 0.018 | 0.023 | |
13 | 1 | 0.048 | 0.044 | 0.052 | 0.010 | 0.008 | 0.012 |
2 | 0.064 | 0.059 | 0.069 | 0.014 | 0.012 | 0.016 | |
3 | 0.070 | 0.065 | 0.075 | 0.013 | 0.011 | 0.015 | |
4 | 0.079 | 0.074 | 0.084 | 0.014 | 0.012 | 0.017 | |
5 | 0.088 | 0.083 | 0.094 | 0.015 | 0.013 | 0.018 | |
6 | 0.109 | 0.103 | 0.115 | 0.020 | 0.017 | 0.022 | |
14 | 1 | 0.046 | 0.042 | 0.051 | 0.009 | 0.007 | 0.011 |
2 | 0.062 | 0.057 | 0.066 | 0.012 | 0.010 | 0.014 | |
3 | 0.069 | 0.064 | 0.074 | 0.012 | 0.010 | 0.014 | |
4 | 0.077 | 0.072 | 0.082 | 0.015 | 0.013 | 0.018 | |
5 | 0.084 | 0.079 | 0.090 | 0.016 | 0.013 | 0.018 | |
6 | 0.091 | 0.085 | 0.097 | 0.017 | 0.014 | 0.019 | |
7 | 0.107 | 0.101 | 0.113 | 0.018 | 0.016 | 0.021 | |
15 | 1 | 0.054 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 |
2 | 0.057 | 0.053 | 0.062 | 0.010 | 0.008 | 0.012 | |
3 | 0.065 | 0.060 | 0.069 | 0.013 | 0.011 | 0.016 | |
4 | 0.073 | 0.068 | 0.078 | 0.014 | 0.011 | 0.016 | |
5 | 0.074 | 0.069 | 0.079 | 0.012 | 0.010 | 0.014 | |
6 | 0.086 | 0.081 | 0.092 | 0.015 | 0.013 | 0.017 | |
7 | 0.099 | 0.094 | 0.105 | 0.018 | 0.015 | 0.020 |
Table 1d. Observed Type I Error Levels based on 10,000 Simulations, n = 16 to 20.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
16 | 1 | 0.052 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 |
2 | 0.055 | 0.051 | 0.059 | 0.011 | 0.009 | 0.013 | |
3 | 0.068 | 0.063 | 0.073 | 0.011 | 0.009 | 0.013 | |
4 | 0.074 | 0.069 | 0.079 | 0.015 | 0.013 | 0.017 | |
5 | 0.077 | 0.072 | 0.082 | 0.015 | 0.013 | 0.018 | |
6 | 0.075 | 0.070 | 0.080 | 0.013 | 0.011 | 0.016 | |
7 | 0.087 | 0.082 | 0.093 | 0.017 | 0.014 | 0.020 | |
8 | 0.096 | 0.090 | 0.101 | 0.016 | 0.014 | 0.019 | |
17 | 1 | 0.047 | 0.043 | 0.051 | 0.008 | 0.007 | 0.010 |
2 | 0.059 | 0.054 | 0.063 | 0.011 | 0.009 | 0.013 | |
3 | 0.062 | 0.057 | 0.067 | 0.012 | 0.010 | 0.014 | |
4 | 0.070 | 0.065 | 0.075 | 0.012 | 0.009 | 0.014 | |
5 | 0.069 | 0.064 | 0.074 | 0.012 | 0.010 | 0.015 | |
6 | 0.071 | 0.066 | 0.076 | 0.015 | 0.012 | 0.017 | |
7 | 0.081 | 0.076 | 0.087 | 0.014 | 0.012 | 0.016 | |
8 | 0.083 | 0.078 | 0.088 | 0.015 | 0.013 | 0.017 | |
18 | 1 | 0.051 | 0.047 | 0.055 | 0.010 | 0.008 | 0.012 |
2 | 0.056 | 0.052 | 0.061 | 0.012 | 0.010 | 0.014 | |
3 | 0.065 | 0.060 | 0.070 | 0.012 | 0.010 | 0.015 | |
4 | 0.065 | 0.060 | 0.070 | 0.013 | 0.011 | 0.015 | |
5 | 0.069 | 0.064 | 0.074 | 0.012 | 0.010 | 0.014 | |
6 | 0.068 | 0.063 | 0.073 | 0.014 | 0.011 | 0.016 | |
7 | 0.072 | 0.067 | 0.077 | 0.014 | 0.011 | 0.016 | |
8 | 0.076 | 0.071 | 0.081 | 0.012 | 0.010 | 0.014 | |
9 | 0.081 | 0.076 | 0.086 | 0.012 | 0.010 | 0.014 | |
19 | 1 | 0.051 | 0.046 | 0.055 | 0.008 | 0.006 | 0.010 |
2 | 0.059 | 0.055 | 0.064 | 0.012 | 0.010 | 0.014 | |
3 | 0.059 | 0.054 | 0.064 | 0.011 | 0.009 | 0.013 | |
4 | 0.061 | 0.057 | 0.066 | 0.012 | 0.010 | 0.014 | |
5 | 0.067 | 0.062 | 0.072 | 0.013 | 0.010 | 0.015 | |
6 | 0.066 | 0.061 | 0.071 | 0.011 | 0.009 | 0.013 | |
7 | 0.069 | 0.064 | 0.074 | 0.013 | 0.011 | 0.015 | |
8 | 0.074 | 0.069 | 0.079 | 0.012 | 0.010 | 0.014 | |
9 | 0.082 | 0.077 | 0.087 | 0.015 | 0.013 | 0.018 | |
20 | 1 | 0.053 | 0.048 | 0.057 | 0.011 | 0.009 | 0.013 |
2 | 0.056 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
3 | 0.060 | 0.056 | 0.065 | 0.009 | 0.007 | 0.011 | |
4 | 0.063 | 0.058 | 0.068 | 0.012 | 0.010 | 0.014 | |
5 | 0.063 | 0.059 | 0.068 | 0.014 | 0.011 | 0.016 | |
6 | 0.063 | 0.058 | 0.067 | 0.011 | 0.009 | 0.013 | |
7 | 0.065 | 0.061 | 0.070 | 0.011 | 0.009 | 0.013 | |
8 | 0.070 | 0.065 | 0.076 | 0.012 | 0.010 | 0.014 | |
9 | 0.076 | 0.070 | 0.081 | 0.013 | 0.011 | 0.015 | |
10 | 0.081 | 0.076 | 0.087 | 0.012 | 0.010 | 0.014 |
Table 1e. Observed Type I Error Levels based on 10,000 Simulations, n = 21 to 25.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
21 | 1 | 0.054 | 0.049 | 0.058 | 0.013 | 0.011 | 0.015 |
2 | 0.054 | 0.049 | 0.058 | 0.012 | 0.010 | 0.014 | |
3 | 0.058 | 0.054 | 0.063 | 0.012 | 0.010 | 0.014 | |
4 | 0.058 | 0.054 | 0.063 | 0.011 | 0.009 | 0.013 | |
5 | 0.064 | 0.059 | 0.069 | 0.013 | 0.011 | 0.016 | |
6 | 0.066 | 0.061 | 0.071 | 0.012 | 0.010 | 0.015 | |
7 | 0.063 | 0.058 | 0.068 | 0.013 | 0.011 | 0.015 | |
8 | 0.066 | 0.061 | 0.071 | 0.010 | 0.008 | 0.012 | |
9 | 0.073 | 0.068 | 0.078 | 0.013 | 0.011 | 0.015 | |
10 | 0.071 | 0.066 | 0.076 | 0.012 | 0.010 | 0.014 | |
22 | 1 | 0.047 | 0.042 | 0.051 | 0.010 | 0.008 | 0.012 |
2 | 0.058 | 0.053 | 0.062 | 0.012 | 0.010 | 0.015 | |
3 | 0.056 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
4 | 0.059 | 0.055 | 0.064 | 0.012 | 0.010 | 0.014 | |
5 | 0.061 | 0.057 | 0.066 | 0.009 | 0.008 | 0.011 | |
6 | 0.063 | 0.058 | 0.068 | 0.013 | 0.010 | 0.015 | |
7 | 0.065 | 0.060 | 0.070 | 0.013 | 0.010 | 0.015 | |
8 | 0.065 | 0.060 | 0.070 | 0.014 | 0.012 | 0.016 | |
9 | 0.065 | 0.060 | 0.070 | 0.012 | 0.010 | 0.014 | |
10 | 0.067 | 0.062 | 0.072 | 0.012 | 0.009 | 0.014 | |
23 | 1 | 0.051 | 0.047 | 0.056 | 0.008 | 0.007 | 0.010 |
2 | 0.056 | 0.052 | 0.061 | 0.010 | 0.009 | 0.012 | |
3 | 0.056 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
4 | 0.062 | 0.057 | 0.066 | 0.011 | 0.009 | 0.013 | |
5 | 0.061 | 0.056 | 0.065 | 0.010 | 0.009 | 0.012 | |
6 | 0.060 | 0.055 | 0.064 | 0.012 | 0.010 | 0.014 | |
7 | 0.062 | 0.057 | 0.066 | 0.011 | 0.009 | 0.013 | |
8 | 0.063 | 0.058 | 0.068 | 0.012 | 0.010 | 0.014 | |
9 | 0.066 | 0.061 | 0.071 | 0.012 | 0.010 | 0.014 | |
10 | 0.068 | 0.063 | 0.073 | 0.014 | 0.012 | 0.017 | |
24 | 1 | 0.051 | 0.046 | 0.055 | 0.010 | 0.008 | 0.012 |
2 | 0.056 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
3 | 0.058 | 0.053 | 0.062 | 0.010 | 0.008 | 0.012 | |
4 | 0.060 | 0.056 | 0.065 | 0.013 | 0.011 | 0.015 | |
5 | 0.057 | 0.053 | 0.062 | 0.012 | 0.010 | 0.014 | |
6 | 0.065 | 0.060 | 0.069 | 0.011 | 0.009 | 0.013 | |
7 | 0.062 | 0.057 | 0.066 | 0.012 | 0.010 | 0.014 | |
8 | 0.060 | 0.055 | 0.065 | 0.012 | 0.010 | 0.014 | |
9 | 0.066 | 0.061 | 0.071 | 0.012 | 0.010 | 0.014 | |
10 | 0.064 | 0.059 | 0.068 | 0.012 | 0.010 | 0.015 | |
25 | 1 | 0.054 | 0.050 | 0.059 | 0.012 | 0.009 | 0.014 |
2 | 0.055 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
3 | 0.057 | 0.052 | 0.062 | 0.011 | 0.009 | 0.013 | |
4 | 0.055 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
5 | 0.060 | 0.055 | 0.065 | 0.012 | 0.010 | 0.014 | |
6 | 0.060 | 0.055 | 0.064 | 0.011 | 0.009 | 0.013 | |
7 | 0.057 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
8 | 0.062 | 0.058 | 0.067 | 0.011 | 0.009 | 0.013 | |
9 | 0.058 | 0.053 | 0.062 | 0.012 | 0.010 | 0.014 | |
10 | 0.061 | 0.057 | 0.066 | 0.010 | 0.008 | 0.012 |
Table 1f. Observed Type I Error Levels based on 10,000 Simulations, n = 26 to 30.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
26 | 1 | 0.051 | 0.047 | 0.055 | 0.012 | 0.010 | 0.014 |
2 | 0.057 | 0.053 | 0.062 | 0.013 | 0.011 | 0.015 | |
3 | 0.055 | 0.050 | 0.059 | 0.012 | 0.010 | 0.014 | |
4 | 0.055 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
5 | 0.058 | 0.054 | 0.063 | 0.011 | 0.009 | 0.013 | |
6 | 0.061 | 0.056 | 0.066 | 0.012 | 0.010 | 0.014 | |
7 | 0.059 | 0.054 | 0.064 | 0.011 | 0.009 | 0.013 | |
8 | 0.060 | 0.056 | 0.065 | 0.010 | 0.008 | 0.012 | |
9 | 0.060 | 0.056 | 0.065 | 0.011 | 0.009 | 0.013 | |
10 | 0.061 | 0.056 | 0.065 | 0.011 | 0.009 | 0.013 | |
27 | 1 | 0.050 | 0.046 | 0.054 | 0.009 | 0.007 | 0.011 |
2 | 0.054 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
3 | 0.062 | 0.057 | 0.066 | 0.012 | 0.010 | 0.014 | |
4 | 0.063 | 0.058 | 0.068 | 0.011 | 0.009 | 0.013 | |
5 | 0.051 | 0.047 | 0.055 | 0.010 | 0.008 | 0.012 | |
6 | 0.058 | 0.053 | 0.062 | 0.011 | 0.009 | 0.013 | |
7 | 0.060 | 0.056 | 0.065 | 0.010 | 0.008 | 0.012 | |
8 | 0.056 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
9 | 0.061 | 0.056 | 0.066 | 0.012 | 0.010 | 0.014 | |
10 | 0.055 | 0.051 | 0.060 | 0.008 | 0.006 | 0.010 | |
28 | 1 | 0.049 | 0.045 | 0.053 | 0.010 | 0.008 | 0.011 |
2 | 0.057 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
3 | 0.056 | 0.052 | 0.061 | 0.012 | 0.009 | 0.014 | |
4 | 0.057 | 0.053 | 0.062 | 0.011 | 0.009 | 0.013 | |
5 | 0.057 | 0.053 | 0.062 | 0.010 | 0.008 | 0.012 | |
6 | 0.056 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
7 | 0.057 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
8 | 0.058 | 0.054 | 0.063 | 0.011 | 0.009 | 0.013 | |
9 | 0.054 | 0.050 | 0.058 | 0.011 | 0.009 | 0.013 | |
10 | 0.062 | 0.057 | 0.067 | 0.011 | 0.009 | 0.013 | |
29 | 1 | 0.049 | 0.045 | 0.053 | 0.011 | 0.009 | 0.013 |
2 | 0.053 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 | |
3 | 0.056 | 0.051 | 0.060 | 0.010 | 0.009 | 0.012 | |
4 | 0.055 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
5 | 0.056 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
6 | 0.057 | 0.053 | 0.062 | 0.012 | 0.010 | 0.014 | |
7 | 0.055 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
8 | 0.057 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
9 | 0.056 | 0.051 | 0.061 | 0.011 | 0.009 | 0.013 | |
10 | 0.057 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
30 | 1 | 0.050 | 0.046 | 0.054 | 0.009 | 0.007 | 0.011 |
2 | 0.054 | 0.049 | 0.058 | 0.011 | 0.009 | 0.013 | |
3 | 0.056 | 0.052 | 0.061 | 0.012 | 0.010 | 0.015 | |
4 | 0.054 | 0.049 | 0.058 | 0.010 | 0.008 | 0.012 | |
5 | 0.058 | 0.053 | 0.063 | 0.012 | 0.010 | 0.014 | |
6 | 0.062 | 0.058 | 0.067 | 0.012 | 0.010 | 0.014 | |
7 | 0.056 | 0.052 | 0.061 | 0.012 | 0.010 | 0.014 | |
8 | 0.059 | 0.054 | 0.064 | 0.011 | 0.009 | 0.013 | |
9 | 0.056 | 0.052 | 0.061 | 0.010 | 0.009 | 0.012 | |
10 | 0.058 | 0.053 | 0.062 | 0.012 | 0.010 | 0.015 |
Table 1g. Observed Type I Error Levels based on 10,000 Simulations, n = 31 to 35.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
31 | 1 | 0.051 | 0.047 | 0.056 | 0.009 | 0.007 | 0.011 |
2 | 0.054 | 0.050 | 0.059 | 0.010 | 0.009 | 0.012 | |
3 | 0.053 | 0.049 | 0.058 | 0.010 | 0.008 | 0.012 | |
4 | 0.055 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
5 | 0.053 | 0.049 | 0.057 | 0.011 | 0.009 | 0.013 | |
6 | 0.055 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
7 | 0.055 | 0.050 | 0.059 | 0.012 | 0.010 | 0.014 | |
8 | 0.056 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
9 | 0.057 | 0.053 | 0.062 | 0.011 | 0.009 | 0.013 | |
10 | 0.058 | 0.053 | 0.062 | 0.011 | 0.009 | 0.013 | |
32 | 1 | 0.054 | 0.049 | 0.058 | 0.010 | 0.008 | 0.012 |
2 | 0.054 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
3 | 0.052 | 0.047 | 0.056 | 0.009 | 0.007 | 0.011 | |
4 | 0.056 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
5 | 0.056 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
6 | 0.055 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
7 | 0.055 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
8 | 0.055 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
9 | 0.057 | 0.053 | 0.062 | 0.012 | 0.010 | 0.014 | |
10 | 0.054 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
33 | 1 | 0.051 | 0.046 | 0.055 | 0.011 | 0.009 | 0.013 |
2 | 0.055 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
3 | 0.056 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
4 | 0.052 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 | |
5 | 0.055 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
6 | 0.058 | 0.053 | 0.062 | 0.011 | 0.009 | 0.013 | |
7 | 0.057 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
8 | 0.058 | 0.054 | 0.063 | 0.011 | 0.009 | 0.013 | |
9 | 0.057 | 0.053 | 0.062 | 0.012 | 0.010 | 0.014 | |
10 | 0.055 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
34 | 1 | 0.052 | 0.048 | 0.056 | 0.009 | 0.007 | 0.011 |
2 | 0.053 | 0.049 | 0.058 | 0.011 | 0.009 | 0.013 | |
3 | 0.055 | 0.050 | 0.059 | 0.012 | 0.010 | 0.014 | |
4 | 0.056 | 0.052 | 0.061 | 0.010 | 0.008 | 0.012 | |
5 | 0.053 | 0.048 | 0.057 | 0.009 | 0.007 | 0.011 | |
6 | 0.055 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
7 | 0.052 | 0.048 | 0.057 | 0.012 | 0.010 | 0.014 | |
8 | 0.055 | 0.050 | 0.059 | 0.009 | 0.008 | 0.011 | |
9 | 0.055 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
10 | 0.054 | 0.049 | 0.058 | 0.010 | 0.008 | 0.012 | |
35 | 1 | 0.051 | 0.046 | 0.055 | 0.010 | 0.009 | 0.012 |
2 | 0.054 | 0.049 | 0.058 | 0.010 | 0.009 | 0.012 | |
3 | 0.055 | 0.050 | 0.059 | 0.010 | 0.009 | 0.012 | |
4 | 0.053 | 0.048 | 0.057 | 0.011 | 0.009 | 0.013 | |
5 | 0.056 | 0.051 | 0.061 | 0.011 | 0.009 | 0.013 | |
6 | 0.055 | 0.051 | 0.059 | 0.012 | 0.010 | 0.014 | |
7 | 0.054 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
8 | 0.054 | 0.049 | 0.058 | 0.011 | 0.009 | 0.013 | |
9 | 0.061 | 0.056 | 0.066 | 0.012 | 0.010 | 0.014 | |
10 | 0.053 | 0.048 | 0.057 | 0.011 | 0.009 | 0.013 |
Table 1h. Observed Type I Error Levels based on 10,000 Simulations, n = 36 to 40.
n | k | \hat{α} (assumed α=0.05) | 95% LCL | 95% UCL | \hat{α} (assumed α=0.01) | 95% LCL | 95% UCL |
36 | 1 | 0.047 | 0.043 | 0.051 | 0.010 | 0.008 | 0.012 |
2 | 0.058 | 0.053 | 0.062 | 0.012 | 0.010 | 0.015 | |
3 | 0.052 | 0.047 | 0.056 | 0.009 | 0.007 | 0.011 | |
4 | 0.052 | 0.048 | 0.056 | 0.012 | 0.010 | 0.014 | |
5 | 0.052 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 | |
6 | 0.055 | 0.051 | 0.059 | 0.012 | 0.010 | 0.014 | |
7 | 0.053 | 0.048 | 0.057 | 0.011 | 0.009 | 0.013 | |
8 | 0.056 | 0.051 | 0.060 | 0.012 | 0.010 | 0.014 | |
9 | 0.056 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
10 | 0.056 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
37 | 1 | 0.050 | 0.046 | 0.055 | 0.010 | 0.008 | 0.012 |
2 | 0.054 | 0.049 | 0.058 | 0.011 | 0.009 | 0.013 | |
3 | 0.054 | 0.049 | 0.058 | 0.011 | 0.009 | 0.013 | |
4 | 0.054 | 0.050 | 0.058 | 0.010 | 0.008 | 0.012 | |
5 | 0.054 | 0.049 | 0.058 | 0.010 | 0.008 | 0.012 | |
6 | 0.054 | 0.050 | 0.058 | 0.011 | 0.009 | 0.013 | |
7 | 0.055 | 0.051 | 0.060 | 0.010 | 0.008 | 0.012 | |
8 | 0.055 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
9 | 0.053 | 0.049 | 0.058 | 0.011 | 0.009 | 0.013 | |
10 | 0.049 | 0.045 | 0.054 | 0.009 | 0.007 | 0.011 | |
38 | 1 | 0.049 | 0.045 | 0.053 | 0.009 | 0.007 | 0.011 |
2 | 0.052 | 0.047 | 0.056 | 0.008 | 0.007 | 0.010 | |
3 | 0.054 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
4 | 0.055 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
5 | 0.056 | 0.052 | 0.061 | 0.012 | 0.010 | 0.014 | |
6 | 0.055 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
7 | 0.049 | 0.045 | 0.053 | 0.009 | 0.007 | 0.011 | |
8 | 0.052 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 | |
9 | 0.054 | 0.050 | 0.059 | 0.010 | 0.009 | 0.012 | |
10 | 0.055 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
39 | 1 | 0.047 | 0.043 | 0.051 | 0.010 | 0.008 | 0.012 |
2 | 0.055 | 0.051 | 0.059 | 0.010 | 0.008 | 0.012 | |
3 | 0.053 | 0.049 | 0.057 | 0.010 | 0.008 | 0.012 | |
4 | 0.053 | 0.049 | 0.058 | 0.010 | 0.009 | 0.012 | |
5 | 0.052 | 0.048 | 0.057 | 0.010 | 0.008 | 0.012 | |
6 | 0.053 | 0.049 | 0.058 | 0.010 | 0.008 | 0.012 | |
7 | 0.057 | 0.052 | 0.061 | 0.011 | 0.009 | 0.013 | |
8 | 0.057 | 0.053 | 0.062 | 0.012 | 0.010 | 0.014 | |
9 | 0.050 | 0.046 | 0.055 | 0.010 | 0.008 | 0.012 | |
10 | 0.056 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
40 | 1 | 0.049 | 0.045 | 0.054 | 0.010 | 0.008 | 0.012 |
2 | 0.052 | 0.048 | 0.057 | 0.010 | 0.009 | 0.012 | |
3 | 0.055 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
4 | 0.054 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
5 | 0.054 | 0.050 | 0.059 | 0.010 | 0.008 | 0.012 | |
6 | 0.049 | 0.045 | 0.053 | 0.010 | 0.008 | 0.012 | |
7 | 0.056 | 0.051 | 0.060 | 0.011 | 0.009 | 0.013 | |
8 | 0.054 | 0.050 | 0.059 | 0.011 | 0.009 | 0.013 | |
9 | 0.047 | 0.043 | 0.052 | 0.010 | 0.008 | 0.011 | |
10 | 0.058 | 0.054 | 0.063 | 0.010 | 0.008 | 0.012 |
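Observed Type I error levels like those tabulated above could be approximated with a simulation along the following lines. This is a sketch under assumptions: the function names are hypothetical, the default number of simulations is kept small for speed, and the confidence limits use a normal approximation to the binomial proportion, which may not be how the limits in the tables were obtained.

```r
# Hypothetical helper: TRUE if at least one R_{i+1} exceeds lambda_{i+1},
# i.e. the test declares at least one outlier.
declares_outlier <- function(x, k, alpha) {
  for (i in 0:(k - 1)) {
    m   <- length(x)                           # n - i
    dev <- abs(x - mean(x))
    j   <- which.max(dev)
    tq  <- qt(1 - (alpha / 2) / m, df = m - 2)
    lambda <- tq * (m - 1) / sqrt((m - 2 + tq^2) * m)
    if (dev[j] / sd(x) > lambda) return(TRUE)
    x <- x[-j]                                 # remove and continue
  }
  FALSE
}

# Estimate the Type I error from N standard normal samples with no outliers.
simulate_type1 <- function(n, k, alpha = 0.05, N = 2000) {
  hits <- replicate(N, declares_outlier(rnorm(n), k, alpha))
  est  <- mean(hits)
  half <- qnorm(0.975) * sqrt(est * (1 - est) / N)
  c(alpha.hat = est, LCL = est - half, UCL = est + half)
}

set.seed(1)
simulate_type1(n = 30, k = 1)
```

Increasing N to 10,000 would match the number of simulations reported for the tables, at the cost of a longer run time.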
A list of class "gofOutlier" containing the results of the hypothesis test. See the help file for gofOutlier.object for details.
Rosner's test is a commonly used test for “outliers” when you are willing to assume that the data without outliers follow a normal (Gaussian) distribution. It is designed to avoid masking, which occurs when an outlier goes undetected because it is close in value to another outlier.
Rosner's test is a kind of discordancy test (Barnett and Lewis, 1995). The test statistic of a discordancy test is usually a ratio: the numerator is the difference between the suspected outlier and some summary statistic of the data set (e.g., mean, next largest observation, etc.), while the denominator is always a measure of spread within the data (e.g., standard deviation, range, etc.). Both USEPA (2009) and USEPA (2013a,b) discuss two commonly used discordancy tests: Dixon's test and Rosner's test. Both of these tests assume that all of the data that are not outliers come from a normal (Gaussian) distribution.
There are many forms of Dixon's test (Barnett and Lewis, 1995). The one presented in USEPA (2009) and USEPA (2013a,b) assumes just one outlier (Dixon, 1953). This test is vulnerable to "masking" in which the presence of several outliers masks the fact that even one outlier is present. There are also other forms of Dixon's test that allow for more than one outlier based on a sequence of sub-tests, but these tests are also vulnerable to masking.
Rosner's test allows you to test for several possible outliers and avoids the problem of
masking. Rosner's test requires you to set the number of suspected outliers, k,
in advance. As in the case of Dixon's test, there are several forms of Rosner's test,
so you need to be aware of which one you are using. The form of Rosner's test presented in
USEPA (2009) is based on the extreme Studentized deviate (ESD) (Rosner, 1975), whereas the
form of Rosner's test performed by the EnvStats function rosnerTest and
presented in USEPA (2013a,b) is based on the generalized ESD (Rosner, 1983; Gilbert, 1987).
USEPA (2013a, p. 190) cites both Rosner (1975) and Rosner (1983), but presents only the
test given in Rosner (1983). Rosner's test based on the ESD has the appropriate Type I
error level if there are no outliers in the dataset, but if there are actually say m
outliers, where m < k, then the ESD version of Rosner's test tends to declare
more than m outliers with a probability that is greater than the stated Type I
error level (referred to as “swamping”). Rosner's test based on the
generalized ESD fixes this problem. USEPA (2013a, pp. 17, 191) incorrectly states that
the generalized ESD version of Rosner's test is vulnerable to masking. Surprisingly,
the well-known book on statistical outliers by Barnett and Lewis (1995) does not
discuss Rosner's generalized ESD test.
As noted, using Rosner's test requires specifying the number of suspected outliers, k, in advance. USEPA (2013a, pp.190-191) states: “A graphical display (Q-Q plot) can be used to identify suspected outliers needed to perform the Rosner test”, and USEPA (2009, p. 12-11) notes: “A potential drawback of Rosner's test is that the user must first identify the maximum number of potential outliers (k) prior to running the test. Therefore, this requirement makes the test ill-advised as an automatic outlier screening tool, and somewhat reliant on the user to identify candidate outliers.”
When observations contain non-detect values (NDs), USEPA (2013a, p. 191) states:
“one may replace the NDs by their respective detection limits (DLs), DL/2, or may
just ignore them ....” This is bad advice, as this method of dealing with non-detects
will produce Type I error rates that are not correct.
OUTLIERS ARE NOT NECESSARILY INCORRECT VALUES
Whether an observation is an “outlier” depends on the underlying assumed
statistical model. McBean and Rovers (1992) state:
“It may be possible to ignore the outlier if a physical rationale is available but,
failing that, the value must be included .... Note that the use of statistics does not
interpret the facts, it simply makes the facts easier to see. Therefore, it is incumbent
on the analyst to identify whether or not the high value ... is truly representative of
the chemical being monitored or, instead, is an outlier for reasons such as a result of
sampling or laboratory error.”
USEPA (2006, p.51) states:
“If scientific reasoning does not explain the outlier, it should not be
discarded from the data set.”
Finally, an editorial by the Editor-in-Chief of the journal Science deals with this topic (McNutt, 2014).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Barnett, V., and T. Lewis. (1995). Outliers in Statistical Data. Third Edition. John Wiley & Sons, Chichester, UK, pp. 235–236.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, NY, pp.188–191.
McBean, E.A., and F.A. Rovers. (1992). Estimation of the Probability of Exceedance of Contaminant Concentrations. Ground Water Monitoring Review Winter, pp. 115–119.
McNutt, M. (2014). Raising the Bar. Science 345(6192), p. 9.
Rosner, B. (1975). On the Detection of Many Outliers. Technometrics 17, 221–227.
Rosner, B. (1983). Percentage Points for a Generalized ESD Many-Outlier Procedure. Technometrics 25, 165–172.
USEPA. (2006). Data Quality Assessment: A Reviewer's Guide. EPA QA/G-9R. EPA/240/B-06/002, February 2006. Office of Environmental Information, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C., pp. 12-10 to 12-14.
USEPA. (2013a). ProUCL Version 5.0.00 Technical Guide. EPA/600/R-07/041, September 2013. Office of Research and Development. U.S. Environmental Protection Agency, Washington, D.C., pp. 190–195.
USEPA. (2013b). ProUCL Version 5.0.00 User Guide. EPA/600/R-07/041, September 2013. Office of Research and Development. U.S. Environmental Protection Agency, Washington, D.C., pp. 190–195.
# Combine 30 observations from a normal distribution with mean 3 and
# standard deviation 2, with 3 observations from a normal distribution
# with mean 10 and standard deviation 1, then run Rosner's Test on these
# data, specifying k=4 potential outliers based on looking at the
# normal Q-Q plot.
# (Note: the call to set.seed simply allows you to reproduce
# this example.)

set.seed(250)

dat <- c(rnorm(30, mean = 3, sd = 2), rnorm(3, mean = 10, sd = 1))

dev.new()
qqPlot(dat)

rosnerTest(dat, k = 4)

#Results of Outlier Test
#-------------------------
#
#Test Method:                     Rosner's Test for Outliers
#
#Hypothesized Distribution:       Normal
#
#Data:                            dat
#
#Sample Size:                     33
#
#Test Statistics:                 R.1 = 2.848514
#                                 R.2 = 3.086875
#                                 R.3 = 3.033044
#                                 R.4 = 2.380235
#
#Test Statistic Parameter:        k = 4
#
#Alternative Hypothesis:          Up to 4 observations are not
#                                 from the same Distribution.
#
#Type I Error:                    5%
#
#Number of Outliers Detected:     3
#
#  i   Mean.i     SD.i      Value Obs.Num    R.i+1 lambda.i+1 Outlier
#1 0 3.549744 2.531011 10.7593656      33 2.848514   2.951949    TRUE
#2 1 3.324444 2.209872 10.1460427      31 3.086875   2.938048    TRUE
#3 2 3.104392 1.856109  8.7340527      32 3.033044   2.923571    TRUE
#4 3 2.916737 1.560335 -0.7972275      25 2.380235   2.908473   FALSE

#----------

# Clean up
rm(dat)
graphics.off()

#--------------------------------------------------------------------

# Example 12-4 of USEPA (2009, page 12-12) gives an example of
# using Rosner's test to test for outliers in naphthalene measurements (ppb)
# taken at 5 background wells over 5 quarters. The data for this example
# are stored in EPA.09.Ex.12.4.naphthalene.df.

EPA.09.Ex.12.4.naphthalene.df
#   Quarter Well Naphthalene.ppb
#1        1 BW.1            3.34
#2        2 BW.1            5.39
#3        3 BW.1            5.74
# ...
#23       3 BW.5            5.53
#24       4 BW.5            4.42
#25       5 BW.5           35.45

longToWide(EPA.09.Ex.12.4.naphthalene.df, "Naphthalene.ppb", "Quarter", "Well",
  paste.row.name = TRUE)
#          BW.1 BW.2  BW.3 BW.4  BW.5
#Quarter.1 3.34 5.59  1.91 6.12  8.64
#Quarter.2 5.39 5.96  1.74 6.05  5.34
#Quarter.3 5.74 1.47 23.23 5.18  5.53
#Quarter.4 6.88 2.57  1.82 4.43  4.42
#Quarter.5 5.85 5.39  2.02 1.00 35.45

# Look at Q-Q plots for both the raw and log-transformed data
#------------------------------------------------------------

dev.new()
with(EPA.09.Ex.12.4.naphthalene.df,
  qqPlot(Naphthalene.ppb, add.line = TRUE,
    main = "Figure 12-6. Naphthalene Probability Plot"))

dev.new()
with(EPA.09.Ex.12.4.naphthalene.df,
  qqPlot(Naphthalene.ppb, dist = "lnorm", add.line = TRUE,
    main = "Figure 12-7. Log Naphthalene Probability Plot"))

# Test for 2 potential outliers on the original scale:
#-----------------------------------------------------

with(EPA.09.Ex.12.4.naphthalene.df, rosnerTest(Naphthalene.ppb, k = 2))

#Results of Outlier Test
#-------------------------
#
#Test Method:                     Rosner's Test for Outliers
#
#Hypothesized Distribution:       Normal
#
#Data:                            Naphthalene.ppb
#
#Sample Size:                     25
#
#Test Statistics:                 R.1 = 3.930957
#                                 R.2 = 4.160223
#
#Test Statistic Parameter:        k = 2
#
#Alternative Hypothesis:          Up to 2 observations are not
#                                 from the same Distribution.
#
#Type I Error:                    5%
#
#Number of Outliers Detected:     2
#
#  i  Mean.i     SD.i Value Obs.Num    R.i+1 lambda.i+1 Outlier
#1 0 6.44240 7.379271 35.45      25 3.930957   2.821681    TRUE
#2 1 5.23375 4.325790 23.23      13 4.160223   2.801551    TRUE

#----------

# Clean up
graphics.off()