rms Special Transformation Functions
This is a series of functions (asis, pol, lsp, rcs, catg, scored, strat,
matrx, gTrans, and %ia%) that set up special attributes (such as knots and
nonlinear term indicators) that are carried through to fits (using, for
example, lrm, cph, ols, psm). anova.rms, summary.rms, Predict, survplot,
fastbw, validate, specs, which.influence, nomogram, and latex.rms use these
attributes to automate certain analyses (e.g., automatic tests of linearity
for each predictor are done by anova.rms). Many of the functions are called
implicitly. Some S functions such as ns derive data-dependent
transformations that are not always "remembered" when predicted values are
later computed, so the predictions may be incorrect. The functions listed
here solve that problem when used in the rms context.
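The "remembering" problem can be illustrated with a minimal base R sketch. It does not use rms; lin_spline is a hypothetical helper written only for this illustration, and the knot choice (tertiles) is an assumption. It shows that a basis whose knots are recomputed from new data differs from one built with the knots stored at fit time.

```r
# Hypothetical helper (not an rms function): linear spline basis for given knots
lin_spline <- function(x, knots) {
  cbind(x = x, outer(x, knots, function(a, k) pmax(a - k, 0)))
}

x_train <- seq(0, 10, length.out = 201)  # predictor values seen at fit time
x_new   <- c(8, 9, 10)                   # new data concentrated in the upper range

# Data-dependent knots: tertiles of the training predictor
knots_train <- quantile(x_train, c(1/3, 2/3), names = FALSE)

# Intended behavior: reuse the knots stored at fit time
basis_ok <- lin_spline(x_new, knots_train)

# Failure mode: knots silently recomputed from the new data
knots_new <- quantile(x_new, c(1/3, 2/3), names = FALSE)
basis_bad <- lin_spline(x_new, knots_new)

# Same x values, different design matrices, hence different predictions
basis_differs <- !isTRUE(all.equal(basis_ok, basis_bad))
```

Storing the knots with the fit, as the functions on this page do through their attributes, is what keeps the two bases from diverging.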
asis is the identity transformation, pol is an ordinary (non-orthogonal)
polynomial, and rcs is a linear tail-restricted cubic spline function
(natural spline), for which the rcspline.eval function generates the design
matrix. The presence of system option rcspc causes rcspline.eval to be
invoked with pc=TRUE, and the presence of system option fractied causes this
value to be passed to rcspline.eval as the fractied argument. catg is for a
categorical variable, scored is for an ordered categorical variable, strat
is for a stratification factor in a Cox model, matrx is for a matrix
predictor, and %ia% represents restricted interactions in which products
involving nonlinear effects on both variables are not included in the model.
asis, catg, scored, and matrx are seldom invoked explicitly by the user
(usually only to specify label or name).
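The idea behind a restricted interaction can be sketched in base R. This is not the rms implementation of %ia%: restricted_products is a hypothetical helper, and quadratic polynomial bases stand in for spline bases purely to keep the example short. It forms all cross-products of two bases except those in which both factors are nonlinear terms.

```r
# Hypothetical helper (not the rms implementation): cross-products of basis
# columns, omitting products in which both factors are nonlinear terms
restricted_products <- function(a, b, nl_a, nl_b) {
  out <- list()
  for (i in seq_len(ncol(a))) {
    for (j in seq_len(ncol(b))) {
      if (nl_a[i] && nl_b[j]) next  # doubly nonlinear product: skip
      nm <- paste(colnames(a)[i], colnames(b)[j], sep = " * ")
      out[[nm]] <- a[, i] * b[, j]
    }
  }
  do.call(cbind, out)
}

x1 <- c(1, 2, 3, 4)
x2 <- c(0.5, 1.5, 2.5, 3.5)

# Quadratic bases used as an assumed stand-in for spline bases
a <- cbind(x1 = x1, "x1^2" = x1^2)
b <- cbind(x2 = x2, "x2^2" = x2^2)

# Second column of each basis is flagged as nonlinear
ia <- restricted_products(a, b, nl_a = c(FALSE, TRUE), nl_b = c(FALSE, TRUE))
colnames(ia)
```

Of the four possible products, only the one that is nonlinear in both x1 and x2 is dropped, so the interaction spends three columns instead of four.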
gTrans is a general multiple-parameter transformation function. It can be
used to specify new polynomial bases, smooth relationships with a
discontinuity at one or more values of x, and grouped categorical variables,
e.g., a categorical variable with 5 levels where you want to combine two of
the levels to spend only 3 degrees of freedom in all, but still see plots of
predicted values in which the two combined categories are kept separate yet
have equal effect estimates. The first argument to gTrans is a regular
numeric, character, or factor variable. The next argument is a function that
transforms a vector into a matrix. If the basis functions are to include a
linear term, it is up to the user to include the original x as one of the
columns. Column names are assigned automatically, but any column names
specified by the user will override the default names. If you want to signal
which terms correspond to linear and which correspond to nonlinear effects
for the purpose of running anova.rms, add an integer vector attribute
nonlinear to the resulting matrix. This vector specifies the column numbers
corresponding to nonlinear effects. The default is to assume a column is a
linear effect. The parms attribute stored with a gTrans result is a
character vector version of the function, so as to not waste space carrying
along any environment information.
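As an illustration of the grouped categorical case, the following base R sketch defines the kind of vector-to-matrix function that could be supplied as the second argument to gTrans. The function name grp5, the level names "a" through "e", and the choice of "a" as the reference level are assumptions made for this example only.

```r
# Hypothetical transformation function: 5 levels, with "d" and "e" pooled so
# that only 3 columns (3 degrees of freedom) are produced; "a" is the reference
grp5 <- function(x) {
  x <- as.character(x)
  z <- cbind(b = x == "b", c = x == "c", de = x %in% c("d", "e"))
  z * 1  # convert logical indicators to 0/1, keeping the column names
}

x <- factor(c("a", "b", "c", "d", "e", "d"))
z <- grp5(x)
z

# In an rms model formula this would be supplied as gTrans(x, grp5)
```

No nonlinear attribute is set here, so under the default described above every column would be treated as a linear effect.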
In the list below, functions asis through gTrans can have arguments
x, parms, label, name except that parms does not apply to asis, matrx, strat.
asis(...)
matrx(...)
pol(...)
lsp(...)
rcs(...)
catg(...)
scored(...)
strat(...)
gTrans(...)
x1 %ia% x2
...: The arguments ... above contain the following.
x1, x2: two continuous variables for which to form a non-doubly-nonlinear interaction
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
## Not run:
library(rms)   # provides lrm, rcs, lsp, gTrans, %ia%

options(knots=4, poly.degree=2)
# To get the old behavior of rcspline.eval knot placement (which didn't handle
# clumping at the lowest or highest value of the predictor very well):
# options(fractied = 1.0)   # see rcspline.eval for details

country <- factor(country.codes)
blood.pressure <- cbind(sbp=systolic.bp, dbp=diastolic.bp)
fit <- lrm(Y ~ sqrt(x1)*rcs(x2) + rcs(x3,c(5,10,15)) +
           lsp(x4,c(10,20)) + country + blood.pressure + poly(age,2))
# sqrt(x1) is an implicit asis variable, but limits of x1, not sqrt(x1)
#       are used for later plotting and effect estimation
# x2 fitted with restricted cubic spline with 4 default knots
# x3 fitted with r.c.s. with 3 specified knots
# x4 fitted with linear spline with 2 specified knots
# country is an implied catg variable
# blood.pressure is an implied matrx variable
# since poly is not an rms function (pol is), it creates a
#       matrx type variable with no automatic linearity testing
#       or plotting

f1 <- lrm(y ~ rcs(x1) + rcs(x2) + rcs(x1) %ia% rcs(x2))
# %ia% restricts interactions. Here it removes terms nonlinear in
# both x1 and x2
f2 <- lrm(y ~ rcs(x1) + rcs(x2) + x1 %ia% rcs(x2))
# interaction linear in x1
f3 <- lrm(y ~ rcs(x1) + rcs(x2) + x1 %ia% x2)
# simple product interaction (doubly linear)
# Use x1 %ia% x2 instead of x1:x2 because x1 %ia% x2 triggers
# anova to pool x1*x2 term into x1 terms to test total effect
# of x1

#
# Examples of gTrans
#
# Linear relationship with a discontinuity at zero:
ldisc <- function(x) {z <- cbind(x == 0, x); attr(z, 'nonlinear') <- 1; z}
gTrans(x, ldisc)

# Duplicate pol(x, 2):
pol2 <- function(x) {z <- cbind(x, x^2); attr(z, 'nonlinear') <- 2; z}
gTrans(x, pol2)

# Linear spline with a knot at x=10 with the new slope taking effect
# until x=20 and the spline turning flat at that point but with a
# discontinuous vertical shift
dspl <- function(x) {
  z <- cbind(x, pmax(pmin(x, 20) - 10, 0), x > 20)
  attr(z, 'nonlinear') <- 2:3
  z
}
gTrans(x, dspl)

## End(Not run)