Breast cancer data set used in Royston and Altman (2013)
The rotterdam data set includes 2982 primary breast cancer patients whose records were included in the Rotterdam tumor bank.
rotterdam
data(cancer, package="survival")
A data frame with 2982 observations on the following 15 variables.
pid
patient identifier
year
year of surgery
age
age at surgery
meno
menopausal status (0= premenopausal, 1= postmenopausal)
size
tumor size, a factor with levels <=20, 20-50, >50
grade
differentiation grade
nodes
number of positive lymph nodes
pgr
progesterone receptors (fmol/l)
er
estrogen receptors (fmol/l)
hormon
hormonal treatment (0=no, 1=yes)
chemo
chemotherapy
rtime
days to relapse or last follow-up
recur
0= no relapse, 1= relapse
dtime
days to death or last follow-up
death
0= alive, 1= dead
The Rotterdam and GBSG data sets are used in the paper by Royston and Altman: the Rotterdam data to fit a model, and the GBSG data to validate it. The paper gives references for the data sources.
Patrick Royston and Douglas Altman, External validation of a Cox prognostic model: principles and methods. BMC Medical Research Methodology 2013, 13:33
rfstime <- pmin(rotterdam$rtime, rotterdam$dtime)
status <- pmax(rotterdam$recur, rotterdam$death)
fit1 <- coxph(Surv(rfstime, status) ~ pspline(age) + meno + size +
                  pspline(nodes) + er,
              data=rotterdam, subset = (nodes > 0))
# Royston and Altman used fractional polynomials for the nonlinear terms
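As a complementary check on the combined endpoint used above, the following sketch tabulates the two event indicators and summarizes recurrence-free survival by tumor size category. It assumes the survival package is installed; the names rfs_time, rfs_status and km are illustrative and do not appear in the package.

```r
library(survival)
data(cancer, package = "survival")  # makes rotterdam available

# Combined endpoint: time to the first of relapse or death (illustrative names)
rfs_time   <- pmin(rotterdam$rtime, rotterdam$dtime)
rfs_status <- pmax(rotterdam$recur, rotterdam$death)

# Cross-tabulate the component indicators; a subject has an event
# in the combined endpoint when either indicator equals 1
table(recur = rotterdam$recur, death = rotterdam$death)

# Kaplan-Meier estimate of recurrence-free survival for each size level
km <- survfit(Surv(rfs_time, rfs_status) ~ size, data = rotterdam)
print(km)
```

The cross-tabulation shows how many subjects contribute through relapse, death, or both, which is a useful sanity check before fitting a model to the combined endpoint.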