Overdispersed Poisson log-linear models
This function estimates overdispersed Poisson log-linear models using the approach discussed by Breslow N.E. (1984).
glm.poisson.disp(object, maxit = 30, verbose = TRUE)
object |
an object of class |
maxit |
integer giving the maximal number of iterations for the model fitting procedure. |
verbose |
logical, if |
Breslow (1984) proposed an iterative algorithm for fitting overdispersed Poisson log-linear models. The method is similar to that proposed by Williams (1982) for handling overdispersion in logistic regression models (see glm.binomial.disp
).
Suppose we observe n independent responses such that
y_i | λ_i ~ Poisson(λ_i n_i)
for i = 1, …, n. The response variable y_i may be an event counts variable observed over a period of time (or in the space) of length n_i, whereas λ_i is the rate parameter. Then,
E(y_i | λ_i) = μ_i = λ_i n_i = exp(log(n_i) + log(λ_i))
where log(n_i) is an offset and log(λ_i) = β'x_i expresses the dependence of the Poisson rate parameter on a set of, say p, predictors. If the periods of time are all of the same length, we can set n_i = 1 for all i so the offset is zero.
The Poisson distribution has E(y_i | λ_i) = V(y_i | λ_i), but it may happen that the actual variance exceeds the nominal variance under the assumed probability model.
Suppose that θ_i = λ_i n_i is a random variable distributed according to
θ_i ~ Gamma (μ_i, 1/φ)
where E(θ_i) = μ_i and V(θ_i) = μ_i^2 φ. Thus, it can be shown that the unconditional mean and variance of y_i are given by
E(y_i) = μ_i
and
V(y_i) = μ_i + μ_i^2 φ = μ_i (1 + μ_iφ)
Hence, for φ > 0 we have overdispersion. It is interesting to note that the same mean and variance arise also if we assume a negative binomial distribution for the response variable.
The method proposed by Breslow uses an iterative algorithm for estimating the dispersion parameter φ and hence the necessary weights 1/(1 + μ_i \hat{φ}) (for details see Breslow, 1984).
The function returns an object of class "glm"
with the usual information and the added components:
dispersion |
the estimated dispersion parameter. |
disp.weights |
the final weights used to fit the model. |
Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)
Breslow, N.E. (1984), Extra-Poisson variation in log-linear models, Applied Statistics, 33, 38–44.
## Salmonella TA98 data data(salmonellaTA98) salmonellaTA98 <- within(salmonellaTA98, logx10 <- log(x+10)) mod <- glm(y ~ logx10 + x, data = salmonellaTA98, family = poisson(log)) summary(mod) mod.disp <- glm.poisson.disp(mod) summary(mod.disp) mod.disp$dispersion # compute predictions on a grid of x-values... x0 <- with(salmonellaTA98, seq(min(x), max(x), length=50)) eta0 <- predict(mod, newdata = data.frame(logx10 = log(x0+10), x = x0), se=TRUE) eta0.disp <- predict(mod.disp, newdata = data.frame(logx10 = log(x0+10), x = x0), se=TRUE) # ... and plot the mean functions with variability bands plot(y ~ x, data = salmonellaTA98) lines(x0, exp(eta0$fit)) lines(x0, exp(eta0$fit+2*eta0$se), lty=2) lines(x0, exp(eta0$fit-2*eta0$se), lty=2) lines(x0, exp(eta0.disp$fit), col=3) lines(x0, exp(eta0.disp$fit+2*eta0.disp$se), lty=2, col=3) lines(x0, exp(eta0.disp$fit-2*eta0.disp$se), lty=2, col=3) ## Holford's data data(holford) mod <- glm(incid ~ offset(log(pop)) + Age + Cohort, data = holford, family = poisson(log)) summary(mod) mod.disp <- glm.poisson.disp(mod) summary(mod.disp) mod.disp$dispersion
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.