The Zero-Modified Normal Distribution
Density, distribution function, quantile function, and random generation for the zero-modified normal distribution with parameters mean, sd, and p.zero.
The zero-modified normal distribution is the mixture of a normal distribution with a positive probability mass at 0.
dzmnorm(x, mean = 0, sd = 1, p.zero = 0.5)
pzmnorm(q, mean = 0, sd = 1, p.zero = 0.5)
qzmnorm(p, mean = 0, sd = 1, p.zero = 0.5)
rzmnorm(n, mean = 0, sd = 1, p.zero = 0.5)
x | vector of quantiles.
q | vector of quantiles.
p | vector of probabilities between 0 and 1.
n | sample size. If
mean | vector of means of the normal (Gaussian) part of the distribution. The default is mean=0.
sd | vector of (positive) standard deviations of the normal (Gaussian) part of the distribution. The default is sd=1.
p.zero | vector of probabilities between 0 and 1 indicating the probability the random variable equals 0. For
Let f(x; μ, σ) denote the density of a normal (Gaussian) random variable X with parameters mean=μ and sd=σ. The density function of a zero-modified normal random variable Y with parameters mean=μ, sd=σ, and p.zero=p, denoted h(y; μ, σ, p), is given by:
h(y; μ, σ, p) = p                     for y = 0
h(y; μ, σ, p) = (1 - p) f(y; μ, σ)    for y ≠ 0
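The piecewise definition can be sketched in base R as follows. The helper name zmnorm_density is hypothetical and is not part of the package; it only mirrors the formula using dnorm and assumes scalar mean, sd, and p.zero.

```r
# Hypothetical helper mirroring h(y; mu, sigma, p); this is an illustration,
# not the package's dzmnorm.
zmnorm_density <- function(y, mean = 0, sd = 1, p.zero = 0.5) {
  stopifnot(sd > 0, p.zero >= 0, p.zero <= 1)
  # The point mass applies only at y = 0 and only when p.zero is positive,
  # so that p.zero = 0 falls back to the plain normal density everywhere.
  at_zero <- (y == 0) & (p.zero > 0)
  ifelse(at_zero, p.zero, (1 - p.zero) * dnorm(y, mean = mean, sd = sd))
}

# Mass at 0, scaled normal density elsewhere
zmnorm_density(c(0, 0.5, 1), mean = 2)
```

Note that the value returned at 0 is a probability, while the values elsewhere are densities, so the two are not directly comparable on a plot.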
Note that μ is not the mean of the zero-modified normal distribution; it is the mean of the normal part of the distribution. Similarly, σ is not the standard deviation of the zero-modified normal distribution; it is the standard deviation of the normal part of the distribution.
Let γ and δ denote the mean and standard deviation of the overall zero-modified normal distribution. Aitchison (1955) shows that:
E(Y) = γ = (1 - p) μ
Var(Y) = δ^2 = (1 - p) σ^2 + p (1-p) μ^2
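As a rough check of these two formulas, the sketch below computes γ and δ directly and compares them with a simulated sample built from base R generators. The function name zmnorm_moments and the parameter values are illustrative assumptions, not part of the package.

```r
# Hypothetical helper: overall mean and sd from the formulas above.
zmnorm_moments <- function(mean = 0, sd = 1, p.zero = 0.5) {
  gamma  <- (1 - p.zero) * mean
  delta2 <- (1 - p.zero) * sd^2 + p.zero * (1 - p.zero) * mean^2
  c(mean = gamma, sd = sqrt(delta2))
}

m <- zmnorm_moments(mean = 3, sd = 1, p.zero = 0.4)

# Monte Carlo check: a zero with probability p.zero, a normal draw otherwise.
set.seed(1)
n <- 1e5
is_zero <- rbinom(n, size = 1, prob = 0.4) == 1
y <- ifelse(is_zero, 0, rnorm(n, mean = 3, sd = 1))

rbind(formula = m, simulated = c(mean(y), sd(y)))
```

With mean=3, sd=1, and p.zero=0.4 the formulas give γ = 1.8 and δ^2 = 2.76; the simulated values should be close to these but will vary with the seed and sample size.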
Note that when p.zero=p=0, the zero-modified normal distribution simplifies to the normal distribution.
dzmnorm gives the density, pzmnorm gives the distribution function, qzmnorm gives the quantile function, and rzmnorm generates random deviates.
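To make the distribution and quantile functions concrete, here is a minimal base R sketch. It assumes the distribution function implied by the density, H(y) = (1 - p) Φ((y - μ)/σ) + p for y ≥ 0 and (1 - p) Φ((y - μ)/σ) for y < 0, where Φ is the standard normal distribution function. The names zmnorm_cdf and zmnorm_quantile are hypothetical, the parameters are assumed scalar, and the package functions may handle edge cases differently.

```r
# Hypothetical helpers for illustration; not the package's pzmnorm/qzmnorm.
zmnorm_cdf <- function(q, mean = 0, sd = 1, p.zero = 0.5) {
  # Normal part scaled by (1 - p), plus the point mass once q reaches 0.
  (1 - p.zero) * pnorm(q, mean, sd) + p.zero * (q >= 0)
}

zmnorm_quantile <- function(p, mean = 0, sd = 1, p.zero = 0.5) {
  lower <- (1 - p.zero) * pnorm(0, mean, sd)  # CDF just below 0
  upper <- lower + p.zero                     # CDF at 0
  out <- numeric(length(p))                   # probabilities in the jump map to 0
  below <- p < lower
  above <- p > upper
  out[below] <- qnorm(p[below] / (1 - p.zero), mean, sd)
  out[above] <- qnorm((p[above] - p.zero) / (1 - p.zero), mean, sd)
  out
}

zmnorm_cdf(4, mean = 3, sd = 2, p.zero = 0.1)
zmnorm_quantile(0.5, mean = 3, sd = 1, p.zero = 0.1)
```

Under this assumption the two calls agree, to the printed precision, with the pzmnorm and qzmnorm values shown in the examples on this page (0.7223162 and 2.86029).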
The zero-modified normal distribution is sometimes used to model chemical concentrations for which some observations are reported as “Below Detection Limit”. See, for example, USEPA (1992c, pp.27-34) and Gibbons et al. (2009, Chapter 12). Note, however, that USEPA (1992c) has been superseded by USEPA (2009), which recommends this strategy only in specific situations (see Chapter 15 of the document). This strategy is strongly discouraged by Helsel (2012, Chapter 1).
In cases where you want to model chemical concentrations for which some observations are reported as “Below Detection Limit” and you want to treat the non-detects as equal to 0, it will usually be more appropriate to model the data with a zero-modified lognormal (delta) distribution since chemical concentrations are bounded below at 0 (e.g., Gilliom and Helsel, 1986; Owen and DeRouen, 1980).
One way to assess whether a zero-modified lognormal (delta), zero-modified normal, censored normal, or censored lognormal is the best model for the data is to construct both censored and detects-only probability plots (see qqPlotCensored).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Aitchison, J. (1955). On the Distribution of a Positive Random Variable Having a Discrete Probability Mass at the Origin. Journal of the American Statistical Association 50, 901-908.
Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135-146.
Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring. Second Edition. John Wiley and Sons, Hoboken, NJ.
Helsel, D.R. (2012). Statistics for Censored Environmental Data Using Minitab and R. Second Edition. John Wiley and Sons, Hoboken, NJ, Chapter 1.
Johnson, N. L., S. Kotz, and A.W. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, p.312.
Owen, W., and T. DeRouen. (1980). Estimation of the Mean for Lognormal Data Containing Zeros and Left-Censored Values, with Applications to the Measurement of Worker Exposure to Air Contaminants. Biometrics 36, 707-719.
USEPA. (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, Permits and State Programs Division, US Environmental Protection Agency, Washington, D.C.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
# Density of the zero-modified normal distribution with parameters
# mean=2, sd=1, and p.zero=0.5, evaluated at 0, 0.5, 1, 1.5, and 2:

dzmnorm(seq(0, 2, by = 0.5), mean = 2)
#[1] 0.5000000 0.0647588 0.1209854 0.1760327 0.1994711

#----------

# The cdf of the zero-modified normal distribution with parameters
# mean=3, sd=2, and p.zero=0.1, evaluated at 4:

pzmnorm(4, 3, 2, .1)
#[1] 0.7223162

#----------

# The median of the zero-modified normal distribution with parameters
# mean=3, sd=1, and p.zero=0.1:

qzmnorm(0.5, 3, 1, 0.1)
#[1] 2.86029

#----------

# Random sample of 3 observations from the zero-modified normal distribution
# with parameters mean=3, sd=1, and p.zero=0.4.
# (Note: The call to set.seed simply allows you to reproduce this example.)

set.seed(20)
rzmnorm(3, 3, 1, 0.4)
#[1] 0.000000 0.000000 3.073168