Expected Value of Order Statistics for Random Sample from Standard Normal Distribution
Compute the expected values of order statistics for a random sample from a standard normal distribution.
evNormOrdStats(n = 1, method = "royston", lower = -9, inc = 0.025,
  warn = TRUE, alpha = 3/8, nmc = 2000, seed = 47, approximate = NULL)

evNormOrdStatsScalar(r = 1, n = 1, method = "royston", lower = -9,
  inc = 0.025, warn = TRUE, alpha = 3/8, nmc = 2000, conf.level = 0.95,
  seed = 47, approximate = NULL)
n |
positive integer indicating the sample size. |
r |
positive integer between |
method |
character string indicating what method to use. The possible values are:
See the DETAILS section below. |
lower |
numeric scalar ≤ -9 defining the lower bound used for approximating the
integral when |
inc |
numeric scalar between |
warn |
logical scalar indicating whether to issue a warning when
|
alpha |
numeric scalar between 0 and 0.5 that determines the constant used when |
nmc |
integer ≥ 100 denoting the number of Monte Carlo simulations to use
when |
conf.level |
numeric scalar between 0 and 1 denoting the confidence level of
the confidence interval for the expected value of the normal
order statistic when |
seed |
integer between -(2^31 - 1) and 2^31 - 1 specifying
the argument to |
approximate |
logical scalar included for backwards compatibility with versions of
EnvStats prior to version 2.3.0.
When |
Let \underline{z} = z_1, z_2, …, z_n denote a vector of n observations from a normal distribution with parameters mean=0 and sd=1. That is, \underline{z} denotes a vector of n observations from a standard normal distribution. Let z_{(r)} denote the r'th order statistic of \underline{z}, for r = 1, 2, …, n. The probability density function of z_{(r)} is given by:
f_{r,n}(t) = \frac{n!}{(r-1)!(n-r)!} [Φ(t)]^{r-1} [1 - Φ(t)]^{n-r} φ(t) \;\;\;\;\;\; (1)
where Φ and φ denote the cumulative distribution function and probability density function of the standard normal distribution, respectively (Johnson et al., 1994, p.93). Thus, the expected value of z_{(r)} is given by:
E(r, n) = E[z_{(r)}] = \int_{-∞}^{∞} t f_{r,n}(t) dt \;\;\;\;\;\; (2)
It can be shown that if n is odd, then
E[(n+1)/2, n] = 0 \;\;\;\;\;\; (3)
Also, for all values of n,
E(r, n) = -E(n-r+1, n) \;\;\;\;\;\; (4)
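As a rough check on Equations (1) through (4), the expected values can be computed by direct numerical integration in base R. This is only a sketch: the helper names ord_stat_density and ev_ord_stat are hypothetical and are not part of EnvStats, and the use of integrate() is an assumption made for illustration, not the method the package uses.

```r
# Density of the r'th order statistic of n standard normal draws, Equation (1).
# Assembled on the log scale so the factorial ratio stays finite for larger n.
ord_stat_density <- function(t, r, n) {
  log_const <- lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1)
  exp(log_const +
        (r - 1) * pnorm(t, log.p = TRUE) +
        (n - r) * pnorm(t, lower.tail = FALSE, log.p = TRUE) +
        dnorm(t, log = TRUE))
}

# Expected value E(r, n) from Equation (2), using adaptive quadrature
# (hypothetical helper, not an EnvStats function).
ev_ord_stat <- function(r, n) {
  integrate(function(t) t * ord_stat_density(t, r, n),
            lower = -Inf, upper = Inf, rel.tol = 1e-8)$value
}

e10 <- sapply(1:10, ev_ord_stat, n = 10)  # all order statistics for n = 10
e5_median <- ev_ord_stat(3, 5)            # middle order statistic for odd n

# Equation (4): e10 should be antisymmetric about its centre.
max(abs(e10 + rev(e10)))
# Equation (3): e5_median should be numerically zero.
e5_median
```

The values in e10 can be compared with the evNormOrdStats(10) output listed in the examples of this help page.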
The function evNormOrdStatsScalar computes the value of E(r,n) for user-specified values of r and n.
The function evNormOrdStats computes the values of E(r,n) for all values of r (i.e., for r = 1, 2, …, n) for a user-specified value of n.
Exact Method Based on Royston's Approximation to the Integral (method="royston")
When method="royston", the integral in Equation (2) above is approximated by computing the value of the integrand between the values of lower and -lower using increments of inc, then summing these values and multiplying by inc. In particular, the integrand is restructured as:
t \; f_{r,n}(t) = t \; exp\{log(n!) - log[(r-1)!] - log[(n-r)!] + (r-1)log[Φ(t)] + (n-r)log[1 - Φ(t)] + log[φ(t)]\} \;\;\; (5)
By default, as per Royston (1982), the integrand is evaluated between -9 and 9 in increments of 0.025. The approximation is computed this way for values of r between 1 and [n/2], where [x] denotes the floor of x. If r > [n/2], then the approximation is computed for E(n-r+1, n) and Equation (4) is used.
Note that Equation (1) in Royston (1982) differs from Equations (1) and (2) above because Royston's paper is based on the r^{th} largest value, not the r^{th} order statistic.
Royston (1982) states that this algorithm “is accurate to at least seven decimal places on a 36-bit machine,” that it has been validated up to a sample size of n=2000, and that the accuracy for n > 2000 may be improved by reducing the value of the argument inc. Note that making inc smaller will increase the computation time.
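The summation described in this section can be sketched in a few lines of base R. The function name ev_royston_sketch is hypothetical, and the code follows only the description given here (a grid from lower to -lower in steps of inc, the log-restructured integrand, and the symmetry in Equations (3) and (4)). It is not the EnvStats source and makes no claim to reproduce Algorithm AS 177 exactly.

```r
# Sketch of the grid-sum approximation to Equation (2); assumes r and n are
# positive integers with r <= n (no argument checking is done here).
ev_royston_sketch <- function(r, n, lower = -9, inc = 0.025) {
  if (r > n %/% 2) {
    # Middle order statistic for odd n is exactly 0, Equation (3).
    if (n %% 2 == 1 && r == (n + 1) / 2) return(0)
    # Otherwise reflect to the lower half using Equation (4).
    return(-ev_royston_sketch(n - r + 1, n, lower, inc))
  }
  t <- seq(lower, -lower, by = inc)
  # Log of the density f_{r,n}(t), term by term as in the restructured integrand.
  log_f <- lgamma(n + 1) - lgamma(r) - lgamma(n - r + 1) +
    (r - 1) * pnorm(t, log.p = TRUE) +
    (n - r) * pnorm(t, lower.tail = FALSE, log.p = TRUE) +
    dnorm(t, log = TRUE)
  # Sum the integrand over the grid and multiply by the increment.
  inc * sum(t * exp(log_f))
}

ev_min_10 <- ev_royston_sketch(1, 10)
ev_all_10 <- sapply(1:10, ev_royston_sketch, n = 10)
```

Working on the log scale avoids overflow in n! for large n, which is presumably why the integrand is restructured this way. Halving inc doubles the number of grid points, which illustrates the accuracy and run-time trade-off mentioned above.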
Approximation Based on Blom's Method (method="blom")
When method="blom", the following approximation to E(r,n), proposed by Blom (1958, pp. 68-75), is used:
E(r, n) \approx Φ^{-1}(\frac{r - α}{n - 2α + 1}) \;\;\;\;\;\; (6)
By default, α = 3/8 = 0.375. This approximation is quite accurate. For example, for n ≥ 2, the approximation is accurate to the first decimal place, and for n ≥ 9 it is accurate to the second decimal place.
Harter (1961) discusses appropriate values of α for various sample sizes
n and values of r.
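Blom's approximation is a single call to qnorm in base R. The sketch below uses a hypothetical helper blom_scores (not an EnvStats function) and compares the result with the exact values for n = 10, which are copied from the evNormOrdStats(10) output in the examples of this help page.

```r
# Blom scores for all order statistics of a sample of size n
# (hypothetical helper; alpha = 3/8 is the documented default).
blom_scores <- function(n, alpha = 3/8) {
  r <- seq_len(n)
  qnorm((r - alpha) / (n - 2 * alpha + 1))
}

blom <- blom_scores(10)

# First five exact values for n = 10, taken from the documented example output.
exact_lower_half <- c(-1.5387527, -1.0013570, -0.6560591, -0.3757647, -0.1226678)

# Largest absolute discrepancy between Blom's scores and the exact values.
max_abs_diff <- max(abs(blom[1:5] - exact_lower_half))
```

In this case the largest discrepancy occurs at the extreme order statistics (r = 1 and, by symmetry, r = n).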
Approximation Based on Monte Carlo Simulation (method="mc")
When method="mc", Monte Carlo simulation is used to estimate the expected value of the r^{th} order statistic. That is, N = nmc trials are run in which, for each trial, a random sample of n standard normal observations is generated and the r^{th} order statistic is computed. Then, the average value of this order statistic over all N trials is computed, along with a confidence interval for the expected value, assuming an approximately normal distribution for the mean of the order statistic (the confidence interval is computed by supplying the simulated values of the r^{th} order statistic to the function enorm).
NOTE: This method has not been optimized for large sample sizes n (i.e., large values of the argument n) and/or a large number of Monte Carlo trials N (i.e., large values of the argument nmc) and may take a long time to execute in these cases.
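The simulation described above can be sketched with base R alone. The helper mc_ord_stat is hypothetical, and the interval is a plain t-based interval for a mean, used here as a stand-in for the interval that enorm would return. Because the random number stream is consumed differently, the numbers are not expected to match the output of evNormOrdStatsScalar even with the same seed.

```r
# Monte Carlo estimate of E(r, n) with a t-based confidence interval
# (hypothetical helper, not the EnvStats implementation).
mc_ord_stat <- function(r, n, nmc = 2000, conf.level = 0.95, seed = 47) {
  set.seed(seed)
  # One trial: draw n standard normal values and keep the r'th smallest.
  sims <- replicate(nmc, sort(rnorm(n))[r])
  est <- mean(sims)
  # Half-width of the interval for the mean of the simulated order statistics.
  half_width <- qt(1 - (1 - conf.level) / 2, df = nmc - 1) * sd(sims) / sqrt(nmc)
  list(estimate = est,
       conf.int = c(LCL = est - half_width, UCL = est + half_width),
       nmc = nmc)
}

res <- mc_ord_stat(r = 1, n = 10, nmc = 10000)
res$estimate
res$conf.int
```

The half-width shrinks in proportion to 1/sqrt(nmc), so each additional decimal place of accuracy costs roughly 100 times as many trials.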
For evNormOrdStats: a numeric vector of length n containing the expected values of all the order statistics for a random sample of n standard normal deviates.
For evNormOrdStatsScalar: a numeric scalar containing the expected value of the r'th order statistic from a random sample of n standard normal deviates. When method="mc", the returned object also has a confint attribute that contains the 95% confidence interval for the expected value (at the default conf.level), and a nmc attribute indicating the number of Monte Carlo trials run.
The expected values of normal order statistics are used to construct normal quantile-quantile (Q-Q) plots (see qqPlot) and to compute goodness-of-fit statistics (see gofTest). Usually, however, approximations are used instead of exact values. The functions evNormOrdStats and evNormOrdStatsScalar have been included mainly because evNormOrdStatsScalar is called by elnorm3 and predIntNparSimultaneousTestPower.
Steven P. Millard (EnvStats@ProbStatInfo.com)
Blom, G. (1958). Statistical Estimates and Transformed Beta Variables. John Wiley and Sons, New York.
Harter, H. L. (1961). Expected Values of Normal Order Statistics. Biometrika 48, 151–165.
Johnson, N. L., S. Kotz, and N. Balakrishnan. (1994). Continuous Univariate Distributions, Volume 1. Second Edition. John Wiley and Sons, New York, pp. 93–99.
Royston, J.P. (1982). Algorithm AS 177. Expected Normal Order Statistics (Exact and Approximate). Applied Statistics 31, 161–165.
# Compute the expected value of the minimum for a random sample of size 10
# from a standard normal distribution:

# Based on method="royston"
#--------------------------
evNormOrdStatsScalar(r = 1, n = 10)
#[1] -1.538753

# Based on method="blom"
#-----------------------
evNormOrdStatsScalar(r = 1, n = 10, method = "blom")
#[1] -1.546635

# Based on method="mc" with 10,000 Monte Carlo trials
#----------------------------------------------------
evNormOrdStatsScalar(r = 1, n = 10, method = "mc", nmc = 10000)
#[1] -1.544318
#attr(,"confint")
#   95%LCL    95%UCL
#-1.555838 -1.532797
#attr(,"nmc")
#[1] 10000

#====================

# Compute the expected values of all of the order statistics
# for a random sample of size 10 from a standard normal distribution
# based on Royston's (1982) method:
#--------------------------------------------------------------------
evNormOrdStats(10)
#[1] -1.5387527 -1.0013570 -0.6560591 -0.3757647 -0.1226678
#[6]  0.1226678  0.3757647  0.6560591  1.0013570  1.5387527

# Compare the above with Blom (1958) scores:
#-------------------------------------------
evNormOrdStats(10, method = "blom")
#[1] -1.5466353 -1.0004905 -0.6554235 -0.3754618 -0.1225808
#[6]  0.1225808  0.3754618  0.6554235  1.0004905  1.5466353