ilc: rhdata – R documentation


rhdata

Data formatting utility for the extended (Stratified) LC model function

Description

Creates an object of class rhdata suitable for fitting the extended (stratified) Lee-Carter (SLC) model with the elca.rh iterative fitting method. It transforms two-dimensional survival data into three-dimensional arrays of population (exposure) and mortality rates indexed by age, calendar time and additional covariate(s).

Usage

rhdata(dat, covar, xbreaks = 60:96, xlabels = 60:95, 
		ybreaks = mdy.date(1, 1, 1999:2008), ylabels = 1999:2007, 
		name = NULL, label = NULL)

Arguments

`dat`	`data.frame` containing individual survival records together with the values of the additional covariate(s). The data set must contain the following named columns: - 'event' = binary value for the survival event (1 = fail/death, 0 = survive); - 'dob' = Julian date of birth (or origin) of the survival time; - 'dev' = Julian date of the event (or end) of the survival time. In addition, there must be at least one extra column holding the observations of an additional covariate (e.g. a socio-economic factor).
`covar`	(partial) covariate name(s) or position number(s) in the `dat` data set. The covariate(s) must be of class 'factor'.
`xbreaks`	a sequence of age break points (including the starting and closing values) used to sub-group the input data set `dat` when calculating age-specific exposures and mortality rates. The default, `60:96`, corresponds to integer ages from 60 to 95.
`xlabels`	a sequence of age labels to be used for the sequence defined in `xbreaks`.
`ybreaks`	a sequence of year break points (as Julian calendar dates) used to sub-group the input data set `dat` when calculating year-specific exposures and mortality rates. The default, `mdy.date(1, 1, 1999:2008)`, places the break points at 1 January of each year from 1999 to 2008, giving the whole calendar years 1999 to 2007.
`ylabels`	a sequence of year labels to be used for the sequence defined in `ybreaks`.
`name`	name of subset data series (e.g. male, female or total)
`label`	label (name) of overall data source (e.g. CMI)
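
As an illustration of the argument descriptions above, the following base R sketch lays out a toy input table and checks how break points relate to labels. The column `ses`, the toy records and the helper `julian_day` are hypothetical. The day counts follow the convention of `mdy.date` in the date package (days since 1 January 1960). The use of `cut` with left-closed intervals is only an assumed stand-in for the grouping done inside rhdata, which this page does not describe.

```r
# Hypothetical toy records with the columns that rhdata() expects.
# Dates are day counts from 1 Jan 1960 (the convention of mdy.date()
# in the 'date' package); base R is used here to avoid dependencies.
origin <- as.Date("1960-01-01")
julian_day <- function(x) as.numeric(as.Date(x) - origin)

toy <- data.frame(
  event = c(1, 0, 1),
  dob   = julian_day(c("1935-03-10", "1940-07-01", "1928-11-23")),
  dev   = julian_day(c("2003-05-02", "2007-12-31", "2001-02-14")),
  ses   = factor(c("low", "high", "low"))  # hypothetical covariate (a factor)
)

# n + 1 break points delimit n intervals, so there is one label fewer.
xbreaks <- 60:96
xlabels <- 60:95
stopifnot(length(xlabels) == length(xbreaks) - 1)

# Approximate age (in years) at the end of observation, then its age group.
# ASSUMPTION: left-closed intervals [60, 61), [61, 62), ...
age_at_end <- (toy$dev - toy$dob) / 365.25
age_group  <- cut(age_at_end, breaks = xbreaks, labels = xlabels, right = FALSE)
```

The same counting rule applies to `ybreaks` and `ylabels`: the ten default year break points delimit nine labelled years.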

Details

While the rhdata function can sub-group the input data by more than one additional covariate (possibly useful for other preliminary analysis), the fitting method implemented in elca.rh can only handle a single additional covariate. Also, there are currently no generic methods to plot or to extract parts of an rhdata class object, but the examples below illustrate how this might be done.
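
Since no extraction method is provided, one covariate level can be pulled out by plain array indexing. The sketch below works on a mock list that only mimics the component names listed under Value. The helper `extract_level` and all numbers are hypothetical and are not part of the ilc package.

```r
# Mock list with the component names of an rhdata object (made-up numbers).
ages  <- 60:62
years <- 2000:2001
levs  <- c("0", "1")
dims  <- c(length(ages), length(years), length(levs))

deaths <- array(1:12, dim = dims)   # age x year x covariate
pop    <- array(100,  dim = dims)
mock <- list(year = years, age = ages, covariates = levs,
             deaths = deaths, pop = pop, mu = deaths / pop,
             label = "toy", name = "M")

# Hypothetical helper: keep the k-th covariate level as age-by-year matrices.
extract_level <- function(x, k) {
  list(year = x$year, age = x$age, covariate = x$covariates[k],
       deaths = x$deaths[, , k],
       pop    = x$pop[, , k],
       mu     = x$mu[, , k])
}

lev2 <- extract_level(mock, 2)
dim(lev2$deaths)   # ages by years
```

Indexing by position mirrors the `mdat$pop[,,1]` style used in the Examples section; whether a real rhdata object carries dimension names is not stated here, so the sketch does not rely on them.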

Value

A list of class rhdata with the following components:

`year`	vector of year labels
`age`	vector of age labels
`covariates`	vector of levels of the additional covariate
`deaths`	3-dimensional array of number of deaths (by age-year-covariate)
`pop`	3-dimensional array of population (exposure) (by age-year-covariate)
`mu`	3-dimensional array of central mortality rates (by age-year-covariate)
`label`	label (name) of overall data source
`name`	name of subset data series
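
The three arrays share the same age-year-covariate layout, so they can be collapsed over the covariate dimension with `apply`. The sketch below uses made-up arrays rather than real rhdata output, and it assumes the usual definition of a central mortality rate as deaths divided by exposure; this page does not state how `mu` is computed.

```r
# Made-up arrays in the age x year x covariate layout (2 ages, 3 years, 2 levels).
dims   <- c(2, 3, 2)
deaths <- array(c(2, 4, 1, 3, 5, 2,    # covariate level 1
                  1, 1, 2, 2, 3, 1),   # covariate level 2
                dim = dims)
pop    <- array(c(rep(100, 6), rep(50, 6)), dim = dims)

# Sum over the third (covariate) dimension, keeping age and year.
total_deaths <- apply(deaths, c(1, 2), sum)
total_pop    <- apply(pop,    c(1, 2), sum)

# ASSUMPTION: central rate = deaths / exposure.
total_mu <- total_deaths / total_pop
```

Rates are recomputed from the summed deaths and exposures instead of being averaged across levels, because levels with different exposures would otherwise be weighted equally.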

Author(s)

Z. Butt and S. Haberman and H. L. Shang

References

Renshaw, A. E. and Haberman, S. (2003a), “Lee-Carter mortality forecasting: a parallel generalised linear modelling approach for England and Wales mortality projections”, Journal of the Royal Statistical Society, Series C, 52(1), 119-137.

Renshaw, A. E. and Haberman, S. (2003b), “Lee-Carter mortality forecasting with age specific enhancement”, Insurance: Mathematics and Economics, 33, 255-272.

Renshaw, A. E. and Haberman, S. (2006), “A cohort-based extension to the Lee-Carter model for mortality reduction factors”, Insurance: Mathematics and Economics, 38, 556-570.

Renshaw, A. E. and Haberman, S. (2008), “On simulation-based approaches to risk measurement in mortality with specific reference to Poisson Lee-Carter modelling”, Insurance: Mathematics and Economics, 42(2), 797-816.

Renshaw, A. E. and Haberman, S. (2009), “On age-period-cohort parametric mortality rate projections”, Insurance: Mathematics and Economics, 45(2), 255-270.

Examples

# See data set 'tab' provided in the ilc package
# names(tab)
# [1] "refno" "dob"   "dev"   "event" "cov1"  "cov2"
# Get multidimensional survival data: 
mdat <- rhdata(tab, covar='cov2', xbreaks=60:96, xlabels=60:95,
  ybreaks=mdy.date(1,1,2000:2006), ylabels=2000:2005, name='M', label='CMI')
# Warning: although rhdata() can group by more than one covariate, e.g.
#   covar=c('cov1', 'cov2'), the SLC fitting currently works only with
#   a single extra covariate.

# print data summary:
mdat
#Multidimensional Mortality data for: MDat [M] 
#Across covariates:
#         years: 2000 - 2005
#         ages:  60 - 95
#         cov2: 0, 1, 2, 3
# Graphical illustrations of mdat data levels (by the additional factor):
# plot of exposures:
matplot(mdat$age, mdat$pop[,,1], type='l', xlab='Age', ylab='Ec', main='Base Level')
matplot(mdat$age, mdat$pop[,,2], type='l', xlab='Age', ylab='Ec', main='Level 1')
# plot of deaths:
matplot(mdat$age, mdat$deaths[,,1], type='l', xlab='Age', ylab='D', main='Base Level')
matplot(mdat$age, mdat$deaths[,,2], type='l', xlab='Age', ylab='D', main='Level 1')
# plot of log mortality rates:
matplot(mdat$age, log(mdat$mu[,,1]), type='l', xlab='Age', ylab='log(mu)', main='Base Level')
matplot(mdat$age, log(mdat$mu[,,2]), type='l', xlab='Age', ylab='log(mu)', main='Level 1')

ilc

Lee-Carter Mortality Models using Iterative Fitting Algorithms

v1.0

GPL (>= 2)

Authors

Zoltan Butt, Steven Haberman and Han Lin Shang

Initial release

2014-11-19