GAM Tweedie families
Tweedie families, designed for use with gam from the mgcv library. Restricted to variance function powers between 1 and 2. A useful alternative to quasi when a full likelihood is desirable. Tweedie is for use with fixed p. tw is for use when p is to be estimated during fitting. For fixed p between 1 and 2 the Tweedie is an exponential family distribution with variance given by the mean to the power p.
Tweedie(p=1, link = power(0))
tw(theta = NULL, link = "log", a=1.01, b=1.99)
p | the variance of an observation is proportional to its mean to the power p. |
link | The link function. The defaults are power(0) for Tweedie and "log" for tw. |
theta | Related to the Tweedie power parameter by p=(a+b*exp(theta))/(1+exp(theta)). If this is supplied as a positive value then it is taken as the fixed value for p. |
a | lower limit on p. |
b | upper limit on p. |
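The mapping between theta and p given above can be written out directly. The following is a minimal sketch in base R, not code taken from mgcv: the helper names theta_to_p and p_to_theta are hypothetical, and the defaults for a and b simply copy those shown in the tw usage line.

```r
# Hypothetical helpers illustrating p = (a + b*exp(theta)) / (1 + exp(theta)).
# The defaults a = 1.01 and b = 1.99 are copied from the tw() usage line.
theta_to_p <- function(theta, a = 1.01, b = 1.99) {
  (a + b * exp(theta)) / (1 + exp(theta))
}

# Inverse mapping, obtained by solving the formula above for theta.
p_to_theta <- function(p, a = 1.01, b = 1.99) {
  log((p - a) / (b - p))
}

theta <- c(-5, 0, 5)
p <- theta_to_p(theta)   # every value lies strictly between a and b
p_to_theta(p)            # recovers theta up to rounding error
```

Because the mapping is a scaled logistic function, any real theta gives a p strictly inside (a, b), which is what allows p to be estimated without constraints on theta.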
A Tweedie random variable with 1<p<2 is a sum of N gamma random variables where N has a Poisson distribution. The p=1 case is a generalization of a Poisson distribution and is a discrete distribution supported on integer multiples of the scale parameter. For 1<p<2 the distribution is supported on the positive reals with a point mass at zero. p=2 is a gamma distribution. As p gets very close to 1 the continuous distribution begins to converge on the discretely supported limit at p=1, and is therefore highly multimodal. See ldTweedie for more on this behaviour.
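The Poisson sum of gammas construction for 1<p<2 can be illustrated by simulation. This is a sketch in base R under the standard compound Poisson-gamma parameterization (Poisson rate mu^(2-p)/(phi*(2-p)), gamma shape (2-p)/(p-1), gamma scale phi*(p-1)*mu^(p-1)); the function name r_cpg is hypothetical and this is not how rTweedie is implemented.

```r
# Hypothetical simulator for a Tweedie variable with scalar mean mu,
# power 1 < p < 2 and scale phi, built as a Poisson sum of gammas.
r_cpg <- function(n, mu, p, phi) {
  lambda <- mu^(2 - p) / (phi * (2 - p))   # Poisson rate for N
  shape  <- (2 - p) / (p - 1)              # shape of each gamma summand
  scale  <- phi * (p - 1) * mu^(p - 1)     # scale of each gamma summand
  N <- rpois(n, lambda)
  y <- numeric(n)                          # N == 0 gives an exact zero
  pos <- N > 0
  # A sum of N iid gamma(shape, scale) variables is gamma(N * shape, scale).
  y[pos] <- rgamma(sum(pos), shape = N[pos] * shape, scale = scale)
  y
}

set.seed(1)
mu <- 2; p <- 1.5; phi <- 0.5
y <- r_cpg(1e5, mu, p, phi)
c(mean = mean(y), var = var(y), target_var = phi * mu^p)
c(zero_frac = mean(y == 0), target_zero = exp(-mu^(2 - p) / (phi * (2 - p))))
```

The sample mean and variance should be close to mu and phi*mu^p, and the fraction of exact zeros should be close to exp(-lambda), the point mass at zero described above.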
Tweedie is based partly on the poisson family, and partly on tweedie from the statmod package. It includes extra components to work with all mgcv GAM fitting methods as well as an aic function.
The Tweedie density involves a normalizing constant with no closed form, so this is evaluated using the series evaluation method of Dunn and Smyth (2005), with extensions to also compute the derivatives w.r.t. p and the scale parameter.
Without restricting p to (1,2) the calculation of Tweedie densities is more difficult, and there does not currently seem to be an implementation which offers any benefit over quasi. If you need this case then the tweedie package is the place to start.
For Tweedie, an object inheriting from class family, with additional elements
dvar | the function giving the first derivative of the variance function w.r.t. the mean. |
d2var | the function giving the second derivative of the variance function w.r.t. the mean. |
ls | A function returning a 3 element array: the saturated log likelihood followed by its first 2 derivatives w.r.t. the scale parameter. |
For tw, an object of class extended.family.
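For reference, the variance function is V(mu) = mu^p, so its first two derivatives w.r.t. the mean have simple closed forms. The sketch below uses base R and hypothetical helper names (it does not call the dvar and d2var elements of the family object) and checks the closed forms against central finite differences.

```r
# Hypothetical helpers: Tweedie variance function and its derivatives in mu.
tweedie_var   <- function(mu, p) mu^p
tweedie_dvar  <- function(mu, p) p * mu^(p - 1)
tweedie_d2var <- function(mu, p) p * (p - 1) * mu^(p - 2)

mu <- c(0.5, 1, 3); p <- 1.5; h <- 1e-4

# Central finite-difference approximations to the first and second derivatives.
fd1 <- (tweedie_var(mu + h, p) - tweedie_var(mu - h, p)) / (2 * h)
fd2 <- (tweedie_var(mu + h, p) - 2 * tweedie_var(mu, p) +
        tweedie_var(mu - h, p)) / h^2

max_err_d1 <- max(abs(fd1 - tweedie_dvar(mu, p)))
max_err_d2 <- max(abs(fd2 - tweedie_d2var(mu, p)))
c(max_err_d1 = max_err_d1, max_err_d2 = max_err_d2)   # both should be tiny
```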
Simon N. Wood simon.wood@r-project.org.
Dunn, P.K. and G.K. Smyth (2005) Series evaluation of Tweedie exponential dispersion model densities. Statistics and Computing 15:267-280.
Tweedie, M.C.K. (1984) An index which distinguishes between some important exponential families. Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference (Eds. J.K. Ghosh and J. Roy), pp. 579-604. Calcutta: Indian Statistical Institute.
Wood, S.N., N. Pya and B. Saefken (2016) Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association 111:1548-1575. doi: 10.1080/01621459.2016.1180986
library(mgcv)
set.seed(3)
n <- 400
## Simulate data...
dat <- gamSim(1,n=n,dist="poisson",scale=.2)
dat$y <- rTweedie(exp(dat$f),p=1.3,phi=.5) ## Tweedie response

## Fit a fixed p Tweedie, with wrong link ...
b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=Tweedie(1.25,power(.1)),
         data=dat)
plot(b,pages=1)
print(b)

## Same by approximate REML...
b1 <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=Tweedie(1.25,power(.1)),
          data=dat,method="REML")
plot(b1,pages=1)
print(b1)

## estimate p as part of fitting
b2 <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),family=tw(),
          data=dat,method="REML")
plot(b2,pages=1)
print(b2)

rm(dat)