QL/MM Estimation of Generalized Linear Models for Overdispersed Count Data
From clustered data, the function fits generalized linear models containing an over-dispersion parameter Φ using quasi-likelihood estimating equations for the mean μ and a moment estimator for Φ.
For binomial-type models, data have the form {(n_1, m_1), (n_2, m_2), ..., (n_N, m_N)}, where n_i is the size of cluster i, m_i the number of “successes”, and N the number of clusters. The response is the proportion y = m/n.
For Poisson-type models, data can be of two forms. When modeling “simple counts”, data have the form {m_1, m_2, ..., m_N}, where m_i is the number of occurences of the event under study. When modeling rates (e.g. hazard rates), data have the same form as for the BB model, where n_i is the denominator of the rate for cluster i (considered as an “offset”, i.e. a constant known value) and m_i the number of occurences of the event. For both forms of data, the response is the count y = m.
aodql(formula, data, family = c("qbin", "qpois"), link = c("logit", "cloglog", "probit"), method = c("chisq", "dev"), phi = NULL, tol = 1e-5, ...) ## S3 method for class 'aodql' anova(object, ...) ## S3 method for class 'aodql' coef(object, ...) ## S3 method for class 'aodql' deviance(object, ...) ## S3 method for class 'aodql' df.residual(object, ...) ## S3 method for class 'aodql' fitted(object, ...) ## S3 method for class 'aodql' logLik(object, ...) ## S3 method for class 'aodql' predict(object, ...) ## S3 method for class 'aodql' print(x, ...) ## S3 method for class 'aodql' residuals(object, ...) ## S3 method for class 'aodql' summary(object, ...) ## S3 method for class 'aodql' vcov(object, ...)
formula |
A formula for the mean μ, defining the parameter vector b (see details). For binomial-type models, the left-hand side of the formula must be of the form |
data |
A data frame containing the response ( |
family |
Define the model which is fitted: “qbin” for binomial-type models and “qpois” for Poisson-type models. |
link |
For binomial-type models only. Define the link function g for the mean μ: “cloglog”, “logit” (default) or “probit”. For Poisson-type models, |
method |
For function |
phi |
An optional value defining the over-dispersion parameter Φ if it is set as constant. Default to NULL (in that case, Φ is estimated). |
tol |
A positive scalar (default to 0.001). The algorithm stops at iteration r + 1 when the condition χ{^2}[r+1] - χ{^2}[r] <= tol is met by the chi-squared statistics . |
... |
Further arguments to passed to the appropriate functions. |
object |
An object of class “aodql”. |
x |
An object of class “aodql”. |
Binomial-type models
For a given cluster (n, m), the model is
m | λ,n \sim Binomial(n, λ)
where λ follows a random variable of mean E[λ] = μ and variance Var[λ] = Φ * μ * (1 - μ). The marginal mean and variance of m are
E[m] = n * μ
Var[m] = n * μ * (1 - μ) * (1 + (n - 1) * Φ)
The response in aodql
is y = m/n. The mean is E[y] = μ, defined such as μ = g^{-1}(X * b) = g^{-1}(ν), where g is the link function, X is a design-matrix, b a vector of fixed effects and ν = X * b is the corresponding linear predictor. The variance is Var[y] = (1 / n) * μ * (1 - μ) * (1 + (n - 1) * Φ).
Poisson-type models
—— Simple counts (model with no offset)
For a given cluster (m), the model is
y | λ \sim Poisson(λ)
where λ follows a random distribution of mean μ and variance Φ * μ^2. The mean and variance of the marginal distribution of m are
E[m] = μ
Var[m] = μ + Φ * μ^2
The response in aodql
is y = m. The mean is E[y] = μ, defined such as μ = exp(X * b) = exp(ν). The variance is Var[y] = μ + Φ * μ^2.
—— Rates (model with offset)
For a given cluster (n, m), the model is
m | λ,n \sim Poisson(λ)
where λ follows the same random distribution as for the case with no offset. The marginal mean and variance are
E[m | n] = μ
Var[m | n] = μ + Φ * μ^2
The response in aodql
is y = m. The mean is E[y] = μ, defined such as μ = exp(X * b + log(n)) = exp(ν + log(n)) = exp(η), where log(n) is the offset. The variance is Var[y] = μ + Φ * μ^2.
Other details
Vector b and parameter Φ are estimated iteratively, using procedures referred to as "Model I" in Williams (1982) for binomial-type models, and "Procedure II" in Breslow (1984) for Poisson-type models.
Iterations are as follows. Quasi-likelihood estimating equations (McCullagh & Nelder, 1989) are used to estimate b (using function glm
and its weights
argument), Φ being set to a constant value. Then, Φ is calculated by the moment estimator, obtained by equalizing the goodness-of-fit statistic (chi-squared X2
or deviance D
) of the model to its degrees of freedom.
Parameter Φ can be set as constant, using argument phi
. In that case, only b is estimated.
An object of class aodql
, printed and summarized by various functions.
Breslow, N.E., 1984. Extra-Poisson variation in log-linear models. Appl. Statist. 33, 38-44.
Moore, D.F., 1987, Modelling the extraneous variance in the presence of extra-binomial variation.
Appl. Statist. 36, 8-14.
Moore, D.F., Tsiatis, A., 1991. Robust estimation of the variance in moment methods for extra-binomial
and extra-poisson variation. Biometrics 47, 383-401.
McCullagh, P., Nelder, J. A., 1989, 2nd ed. Generalized linear models. New York, USA: Chapman and Hall.
Williams, D.A., 1982, Extra-binomial variation in logistic linear models. Appl. Statist. 31, 144-148.
#------ Binomial-type models data(orob2) fm <- aodql(cbind(m, n - m) ~ seed, data = orob2, family = "qbin") coef(fm) vcov(fm) summary(fm) # chi2 tests of the seed factor in fm wald.test(b = coef(fm), varb = vcov(fm), Terms = 2) # chi-2 vs. deviance statistic to estimate phi fm1 <- aodql(cbind(m, n - m) ~ seed + root, data = orob2, family = "qbin") fm2 <- aodql(cbind(m, n - m) ~ seed + root, data = orob2, family = "qbin", method = "dev") coef(fm1) coef(fm2) fm1$phi fm2$phi vcov(fm1) vcov(fm2) gof(fm1) gof(fm2) # estimate with fixed phi fm <- aodql(cbind(m, n - m) ~ seed, data = orob2, family = "qbin", phi = 0.05) coef(fm) vcov(fm) summary(fm) #------ Poisson-type models data(salmonella) fm <- aodql(m ~ log(dose + 10) + dose, data = salmonella, family = "qpois") coef(fm) vcov(fm) summary(fm) # chi2 tests of the "log(dose + 10) + dose" factors wald.test(b = coef(fm), varb = vcov(fm), Terms = 2:3) # chi-2 vs. deviance statistic to estimate phi fm1 <- aodql(m ~ log(dose + 10) + dose, data = salmonella, family = "qpois") fm2 <- aodql(m ~ log(dose + 10) + dose, data = salmonella, family = "qpois", method = "dev") coef(fm1) coef(fm2) fm1$phi fm2$phi vcov(fm1) vcov(fm2) gof(fm1) gof(fm2) # estimate with fixed phi fm <- aodql(m ~ log(dose + 10) + dose, data = salmonella, family = "qpois", phi = 0.05) coef(fm) vcov(fm) summary(fm) # modelling a rate data(dja) # rate "m / trisk" fm <- aodql(formula = m ~ group + offset(log(trisk)), data = dja, family = "qpois") summary(fm)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.