Tolerance Interval for a Poisson Distribution
Construct a β-content or β-expectation tolerance interval for a Poisson distribution.
tolIntPois(x, coverage = 0.95, cov.type = "content", ti.type = "two-sided", conf.level = 0.95)
x |
numeric vector of observations, or an object resulting from a call to an
estimating function that assumes a Poisson distribution
(e.g., epois).
coverage |
a scalar between 0 and 1 indicating the desired coverage of the tolerance interval.
The default value is coverage=0.95.
cov.type |
character string specifying the coverage type for the tolerance interval.
The possible values are "content" (β-content; the default) and "expectation" (β-expectation).
ti.type |
character string indicating what kind of tolerance interval to compute.
The possible values are "two-sided" (the default), "lower", and "upper".
conf.level |
a scalar between 0 and 1 indicating the confidence level associated with the tolerance
interval. The default value is conf.level=0.95.
If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.
A tolerance interval for some population is an interval on the real line constructed so as to contain 100β% of the population (i.e., 100β% of all future observations), where 0 < β < 1. The quantity 100β% is called the coverage.
There are two kinds of tolerance intervals (Guttman, 1970):
A β-content tolerance interval with confidence level 100(1-α)% is constructed so that it contains at least 100β% of the population (i.e., the coverage is at least 100β%) with probability 100(1-α)%, where 0 < α < 1. The quantity 100(1-α)% is called the confidence level or confidence coefficient associated with the tolerance interval.
A β-expectation tolerance interval is constructed so that the average coverage of the interval is 100β%.
Note: A β-expectation tolerance interval with coverage 100β% is equivalent to a prediction interval for one future observation with associated confidence level 100β%. There is no explicit confidence level associated with a β-expectation tolerance interval. If a β-expectation tolerance interval is treated as a β-content tolerance interval, the associated confidence level is usually around 50% (e.g., Guttman, 1970, Table 4.2, p.76).
Because of the discrete nature of the Poisson distribution, even true tolerance intervals (tolerance intervals based on the true value of λ) will usually not contain exactly 100β% of the population. For example, for the Poisson distribution with parameter lambda=2, the interval [0, 4] contains 94.7% of this distribution and the interval [0, 5] contains 98.3% of this distribution. Thus, no interval can contain exactly 95% of this distribution.
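The coverage figures quoted for lambda=2 can be reproduced with the base R function ppois. This is only an illustrative check; the variable names are chosen here and are not part of EnvStats.

```r
# Cumulative probabilities Pr(X <= k) for a Poisson distribution with lambda = 2
lambda <- 2
k <- 0:6
coverage <- ppois(k, lambda)
print(round(data.frame(upper.limit = k, coverage = coverage), 3))

# Coverage of the intervals [0, 4] and [0, 5], which bracket 95%
cov.0.to.4 <- ppois(4, lambda)  # about 0.947
cov.0.to.5 <- ppois(5, lambda)  # about 0.983
```

Because the coverage jumps from about 94.7% to about 98.3% between consecutive integers, a nominal 95% interval has to settle for the next attainable coverage above 95%.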
β-Content Tolerance Intervals for a Poisson Distribution
Zacks (1970) showed that for monotone likelihood ratio (MLR) families of discrete distributions, a uniformly most accurate upper 100β% β-content tolerance interval with associated confidence level 100(1-α)% is constructed by finding the upper 100(1-α)% confidence limit for the parameter associated with the distribution, and then computing the β'th quantile of the distribution assuming the true value of the parameter is equal to the upper confidence limit. This idea can be extended to one-sided lower and two-sided tolerance limits.
It can be shown that all distributions that are one-parameter exponential families have the MLR property. The Poisson distribution is a one-parameter exponential family, so the method of Zacks (1970) can be applied to it.
Let X denote a Poisson random variable with parameter lambda=λ. Let x_{p|λ} denote the p'th quantile of this distribution. That is,
Pr(X < x_{p|λ}) ≤ p ≤ Pr(X ≤ x_{p|λ})      (1)
Note that due to the discrete nature of the Poisson distribution, a whole range of values of p is associated with one value of X. For example, for λ=2, the value 1 is the p'th quantile for any value of p between 0.135 (Pr(X ≤ 0)) and 0.406 (Pr(X ≤ 1)).
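Equation (1) can be checked numerically with the base R functions ppois and qpois. The sketch below uses λ=2 and a few probabilities picked here for illustration.

```r
lambda <- 2

# Pr(X < 1) = Pr(X <= 0) and Pr(X <= 1) bound the values of p whose quantile is 1
p.lower <- ppois(0, lambda)  # about 0.135
p.upper <- ppois(1, lambda)  # about 0.406

# qpois returns the smallest integer x with Pr(X <= x) >= p,
# so every p strictly between the two bounds maps to the same quantile
q.inside <- qpois(c(0.14, 0.25, 0.40), lambda)

# Probabilities just outside the bounds map to the neighbouring integers
q.below <- qpois(0.13, lambda)
q.above <- qpois(0.41, lambda)
```

Here q.inside is 1 for all three probabilities, while q.below is 0 and q.above is 2.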
Let \underline{x} denote a vector of n observations from a Poisson distribution with parameter lambda=λ.
When ti.type="upper", the first step is to compute the one-sided upper 100(1-α)% confidence limit for λ based on the observations \underline{x} (see the help file for epois). Denote this upper confidence limit by UCL. The one-sided upper 100β% tolerance interval is then given by:
[0, x_{β | λ = UCL}]      (2)
Similarly, when ti.type="lower", the first step is to compute the one-sided lower 100(1-α)% confidence limit for λ based on the observations \underline{x}. Denote this lower confidence limit by LCL. The one-sided lower 100β% tolerance interval is then given by:
[x_{1-β | λ = LCL}, ∞)      (3)
Finally, when ti.type="two-sided", the first step is to compute the two-sided 100(1-α)% confidence limits for λ based on the observations \underline{x}. Denote these confidence limits by LCL and UCL. The two-sided 100β% tolerance interval is then given by:
[x_{\frac{1-β}{2} | λ = LCL}, x_{\frac{1+β}{2} | λ = UCL}]      (4)
Note that the function tolIntPois uses the exact confidence limits for λ when computing β-content tolerance limits (see epois).
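Equations (2) through (4) can be sketched in base R as follows. The helper zacks.tol.int is hypothetical and is not part of EnvStats. It assumes that the "exact" confidence limits for λ are the usual chi-square based limits computed from the sum of the observations; see the help file for epois for the method tolIntPois actually uses. The data vector is made up so that its mean is 1.8 with n = 20.

```r
# Hypothetical helper (not an EnvStats function): beta-content tolerance
# limits for Poisson counts following equations (2)-(4).
# ASSUMPTION: exact confidence limits for lambda are taken to be
#   LCL = qchisq(a, 2 * sum(x)) / (2 * n)
#   UCL = qchisq(1 - a, 2 * sum(x) + 2) / (2 * n)
zacks.tol.int <- function(x, coverage = 0.95, conf.level = 0.95,
                          ti.type = c("two-sided", "lower", "upper")) {
  ti.type <- match.arg(ti.type)
  x <- x[is.finite(x)]  # drop NA, NaN, Inf and -Inf values
  n <- length(x)
  s <- sum(x)
  alpha <- 1 - conf.level
  # One-sided limits use alpha; two-sided limits split alpha between the tails
  a <- if (ti.type == "two-sided") alpha / 2 else alpha
  lcl <- qchisq(a, 2 * s) / (2 * n)
  ucl <- qchisq(1 - a, 2 * s + 2) / (2 * n)
  switch(ti.type,
    "upper" = c(LTL = 0, UTL = qpois(coverage, ucl)),
    "lower" = c(LTL = qpois(1 - coverage, lcl), UTL = Inf),
    "two-sided" = c(LTL = qpois((1 - coverage) / 2, lcl),
                    UTL = qpois((1 + coverage) / 2, ucl)))
}

# Made-up counts: 20 observations summing to 36, so the sample mean is 1.8
x <- rep(c(1, 2, 3), times = c(8, 8, 4))

two.sided <- zacks.tol.int(x, conf.level = 0.9)
upper <- zacks.tol.int(x, conf.level = 0.9, ti.type = "upper")
lower <- zacks.tol.int(x, conf.level = 0.9, ti.type = "lower")
```

Under these assumptions the two-sided call gives LTL = 0 and UTL = 6. Because the limits depend on the data only through the sample size and the sum of the counts, any 20 counts with mean 1.8 yield the same interval.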
β-Expectation Tolerance Intervals for a Poisson Distribution
As stated above, a β-expectation tolerance interval with coverage 100β% is equivalent to a prediction interval for one future observation with associated confidence level 100β%. This is because the probability that any single future observation will fall into this interval is 100β%, so the distribution of the number of N future observations that will fall into this interval is binomial with parameters size=N and prob=β. Hence the expected proportion of future observations that fall into this interval is 100β% and is independent of the value of N. See the help file for predIntPois for information on how these intervals are constructed.
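The binomial argument can be verified directly with dbinom. This sketch uses β = 0.95 and a few arbitrary values of N; it illustrates only the expectation calculation, not how predIntPois builds the interval.

```r
beta <- 0.95
N <- c(1, 10, 100)

# Expected proportion of N future observations inside the interval:
# E[B / N], where B is binomial with size = N and prob = beta
expected.prop <- sapply(N, function(n) {
  sum((0:n) * dbinom(0:n, size = n, prob = beta)) / n
})
print(expected.prop)
```

Each element of expected.prop equals β up to floating-point error, whatever the value of N.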
If x is a numeric vector, tolIntPois returns a list of class "estimate" containing the estimated parameters, a component called interval containing the tolerance interval information, and other information. See estimate.object for details.
If x is the result of calling an estimation function, tolIntPois returns a list whose class is the same as x. The list contains the same components as x. If x already has a component called interval, this component is replaced with the tolerance interval information.
Tolerance intervals have long been applied to quality control and life testing problems (Hahn, 1970b,c; Hahn and Meeker, 1991; Krishnamoorthy and Mathew, 2009). References that discuss tolerance intervals in the context of environmental monitoring include: Berthouex and Brown (2002, Chapter 21), Gibbons et al. (2009), Millard and Neerchal (2001, Chapter 6), Singh et al. (2010b), and USEPA (2009).
Gibbons (1987b) used the Poisson distribution to model the number of detected compounds per scan of the 32 volatile organic priority pollutants (VOC), and also to model the distribution of chemical concentration (in ppb). He explained the derivation of a one-sided upper β-content tolerance limit for a Poisson distribution based on the work of Zacks (1970) using the Pearson-Hartley approximation to the confidence limits for the mean parameter λ (see the help file for epois). Note that there are several typographical errors in the derivation and examples on page 575 of Gibbons (1987b): the value of β (the coverage) and the value of 1-α (the confidence level) are confused with each other in places. Gibbons et al. (2009, pp.103-104) give correct formulas.
Steven P. Millard (EnvStats@ProbStatInfo.com)
Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572–580.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.
Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.
Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York.
Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.
Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.
Zacks, S. (1970). Uniformly Most Accurate Upper Tolerance Limits for Monotone Likelihood Ratio Families of Discrete Distributions. Journal of the American Statistical Association 65, 307–316.
# Generate 20 observations from a Poisson distribution with parameter
# lambda=2. The interval [0, 4] contains 94.7% of this distribution and
# the interval [0, 5] contains 98.3% of this distribution. Thus, because
# of the discrete nature of the Poisson distribution, no interval contains
# exactly 95% of this distribution. Use tolIntPois to estimate the mean
# parameter of the true distribution, and construct a two-sided 95%
# beta-content tolerance interval with associated confidence level 90%.
# (Note: the call to set.seed simply allows you to reproduce this example.)

set.seed(250)
dat <- rpois(20, 2)
tolIntPois(dat, conf.level = 0.9)

#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution:            Poisson
#
#Estimated Parameter(s):          lambda = 1.8
#
#Estimation Method:               mle/mme/mvue
#
#Data:                            dat
#
#Sample Size:                     20
#
#Tolerance Interval Coverage:     95%
#
#Coverage Type:                   content
#
#Tolerance Interval Method:       Zacks
#
#Tolerance Interval Type:         two-sided
#
#Confidence Level:                90%
#
#Tolerance Interval:              LTL = 0
#                                 UTL = 6

#------

# Clean up
rm(dat)