rms

rms Methods and Generic Functions


Description

This is a series of special transformation functions (asis, pol, lsp, rcs, catg, scored, strat, matrx), fitting functions (e.g., lrm, cph, psm, or ols), and generic analysis functions (anova.rms, summary.rms, Predict, plot.Predict, ggplot.Predict, survplot, fastbw, validate, calibrate, specs.rms, which.influence, latexrms, nomogram, datadist, gendata) that help automate many analysis steps, e.g., fitting restricted interactions and multiple stratification variables, analysis of variance (with tests of linearity of each factor and pooled tests), plotting effects of variables in the model, estimating and graphing effects of variables that appear nonlinearly in the model (using, e.g., inter-quartile-range hazard ratios), bootstrapping model fits, and constructing nomograms for obtaining predictions manually. Behind the scenes is the Design function, which is called by the fitting routines and is not intended to be called by users. Design causes detailed design attributes and descriptions of the distribution of predictors to be stored in an attribute of the terms component called Design.

modelData is a much streamlined replacement for model.frame.default that prepares data for Design(). If a second formula is present, modelData ensures that missing data deletions are the same for both formulas, and returns a second model frame for formula2 as the data2 attribute of the main returned data frame.
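The idea of keeping missing-data deletions identical across two formulas can be sketched in base R. This is only an illustration of the concept, not how modelData is implemented; the data frame d and the formulas f1 and f2 are made up for the example.

```r
# Toy data with missing values in different predictors (hypothetical)
d <- data.frame(y  = c(1, 2, 3, 4, 5),
                x1 = c(1, NA, 3, 4, 5),
                x2 = c(5, 4, NA, 2, 1))

f1 <- y ~ x1   # main model formula
f2 <- ~ x2     # second formula

# Keep only rows that are complete on every variable used by either formula
vars <- union(all.vars(f1), all.vars(f2))
keep <- complete.cases(d[, vars])

# Build both model frames from the same retained rows
mf1 <- model.frame(f1, data = d[keep, ], na.action = na.fail)
mf2 <- model.frame(f2, data = d[keep, ], na.action = na.fail)

# Attach the second frame as an attribute, mirroring the data2 convention
attr(mf1, "data2") <- mf2

rownames(mf1)                 # rows retained in the main frame
rownames(attr(mf1, "data2"))  # the same rows in the second frame
```

Because the row filter is computed once from the union of variables, a row dropped for a missing x2 is also dropped from the frame for y ~ x1, so the two frames stay aligned.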

Usage

modelData(data=environment(formula), formula, formula2=NULL,
          weights, subset, na.action=na.delete, dotexpand=TRUE,
          callenv=parent.frame(n=2))

Design(mf, formula=NULL, specials=NULL, allow.offset=TRUE, intercept=1)
# not to be called by the user; called by fitting routines
# dist <- datadist(x1,x2,sex,age,race,bp)   
# or dist <- datadist(my.data.frame)
# Can omit call to datadist if not using summary.rms, Predict,
# survplot.rms, or if all variable settings are given to them
# options(datadist="dist")
# f <- fitting.function(formula = y ~ rcs(x1,4) + rcs(x2,5) + x1%ia%x2 +
#                       rcs(x1,4)%ia%rcs(x2,5) +
#                       strat(sex)*age + strat(race)*bp)
# See rms.trans for rcs, strat, etc.
# %ia% is restricted interaction - not doubly nonlinear
# for x1 by x2 this uses the simple product only, but pools x1*x2
# effect with nonlinear function for overall tests
# specs(f)
# anova(f)
# summary(f)
# fastbw(f)
# pred <- predict(f, newdata=expand.grid(x1=1:10,x2=3,sex="male",
#                 age=50,race="black"))
# pred <- predict(f, newdata=gendata(f, x1=1:10, x2=3, sex="male"))
# This leaves unspecified variables set to reference values from datadist
# pred.combos <- gendata(f, nobs=10)   # Use X-windows to edit predictor settings
# predict(f, newdata=pred.combos)
# plot(Predict(f, x1))  # or ggplot(...)
# latex(f)
# nomogram(f)
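As a rough illustration of what the restricted interaction operator %ia% buys, the sketch below fits a full spline-by-spline interaction and a restricted one on simulated data and compares the number of coefficients. The data, seed, and object names (full, restricted) are assumptions made up for this example; ols is used only because it is the simplest fitter.

```r
library(rms)

# Simulated data (hypothetical): two continuous predictors with an interaction
set.seed(2)
n  <- 300
x1 <- runif(n)
x2 <- runif(n)
y  <- x1 + x2 + x1 * x2 + rnorm(n, sd = 0.5)

# Full interaction: every spline term of x1 crossed with every spline term of x2
full <- ols(y ~ rcs(x1, 4) * rcs(x2, 4))

# Restricted interaction: main effects plus %ia%, which omits the
# doubly nonlinear product terms
restricted <- ols(y ~ rcs(x1, 4) + rcs(x2, 4) + rcs(x1, 4) %ia% rcs(x2, 4))

# The restricted fit should use fewer coefficients than the full fit
length(coef(full))
length(coef(restricted))

anova(restricted)   # includes pooled tests for the interaction
```

The restricted form trades some flexibility for fewer parameters, which can matter when the effective sample size is modest.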

Arguments

data

a data frame or calling environment

formula

model formula

formula2

an optional second model formula (see for example ppo in blrm)

weights

a weight variable or expression

subset

a subsetting expression evaluated in the calling frame or data

na.action

NA handling function, ideally one such as na.delete that stores extra information about data omissions

specials

a character vector specifying which function evaluations appearing in formula are "special" in the model.frame sense

dotexpand

set to FALSE to prevent . on right hand side of model formula from expanding into all variables in data; used for cph

callenv

the parent frame that called the fitting function

mf

a model frame

allow.offset

set to TRUE if model fitter allows an offset term

intercept

1 if an ordinary intercept is present, 0 otherwise
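The meaning of "special" in the model.frame sense can be seen with base R's terms() alone. The sketch below is a generic illustration, not a trace of what Design() does internally; the formula is made up, and strat does not need to be defined for terms() to flag it.

```r
# Mark calls to strat() as special when parsing the formula
tt <- terms(y ~ strat(sex) + age, specials = "strat")

# Position of the special term among the model variables
# (the response counts as the first variable)
idx <- attr(tt, "specials")$strat
idx

# Recover the flagged call; element 1 of the variables call is `list`,
# hence the offset of one
special_call <- deparse(attr(tt, "variables")[[idx + 1]])
special_call
```

A fitting function can use these positions to treat the flagged terms differently from ordinary predictors, for example as stratification factors.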

Value

a data frame augmented with additional information about the predictors and model formulation

Author(s)

Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com

See Also

Examples

## Not run: 
require(rms)
dist <- datadist(data=2)     # can omit if not using summary, (gg)plot, survplot,
                             # or if all variable values are specified to them.
                             # Can also defer. data=2: get distribution summaries
                             # for all variables in search position 2;
                             # run datadist once, for all candidate variables
dist <- datadist(age,race,bp,sex,height)   # alternative
options(datadist="dist")
f <- cph(Surv(d.time, death) ~ rcs(age,4)*strat(race) +
         bp*strat(sex)+lsp(height,60),x=TRUE,y=TRUE)
anova(f)
anova(f,age,height)          # Joint test of 2 vars
fastbw(f)
summary(f, sex="female")     # Adjust sex to "female" when testing
                             # interacting factor bp
bplot(Predict(f, age, height))   # 3-D plot
ggplot(Predict(f, age=10:70, height=60))
latex(f)                     # LaTeX representation of fit


f <- lm(y ~ x)               # Can use with any fitting function that
                             # calls model.frame.default, e.g. lm, glm
specs.rms(f)                 # Use .rms since class(f)="lm"
anova(f)                     # Works since Varcov(f) (=Varcov.lm(f)) works
fastbw(f)
options(datadist=NULL)
f <- ols(y ~ x1*x2)          # Saves enough information to do fastbw, anova
anova(f)                     # Will not do Predict since distributions
fastbw(f)                    # of predictors not saved
plot(Predict(f, x1=seq(100,300,by=.5), x2=.5))
                             # all values defined - don't need datadist
dist <- datadist(x1,x2)      # Equivalent to datadist(f)
options(datadist="dist")
plot(Predict(f, x1, x2=.5))  # Now you can do plot, summary
plot(nomogram(f, interact=list(x2=c(.2,.7))))

## End(Not run)
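The examples above are not run because they rely on variables that are not defined on this page. A self-contained variant on simulated data might look like the following sketch. The data frame d, its variables, the seed, and the model are all assumptions made up for illustration.

```r
library(rms)

# Simulated data (hypothetical)
set.seed(1)
n <- 200
d <- data.frame(
  age = rnorm(n, 50, 10),
  bp  = rnorm(n, 120, 15),
  sex = factor(sample(c("female", "male"), n, replace = TRUE))
)
d$y <- 0.05 * d$age + 0.02 * d$bp + (d$sex == "male") + rnorm(n)

# Store predictor distribution summaries once, for all candidate variables
dd <- datadist(d)
options(datadist = "dd")

# Fit with a restricted cubic spline in age and a bp by sex interaction
f <- ols(y ~ rcs(age, 4) + bp * sex, data = d)

specs(f)                      # design specifications
anova(f)                      # tests of association and nonlinearity
summary(f, sex = "female")    # effects with sex adjusted to "female"

# Predicted values over age for each sex, other predictors at reference values
p <- Predict(f, age, sex)

# Predictions at chosen settings; bp is left at its datadist reference value
nd   <- gendata(f, age = c(40, 60), sex = "male")
pred <- predict(f, newdata = nd)
pred
```

Calling plot(p) or ggplot(p) on the Predict object would then draw the estimated age effect for each sex.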

rms

Regression Modeling Strategies

v6.2-0
GPL (>= 2)
Authors
Frank E Harrell Jr <fh@fharrell.com>
Initial release
2021-03-17