Generate Data Frame with Predictor Combinations
If nobs is not specified, gendata lets the user specify predictor settings directly, e.g. age=50, sex="male"; any omitted predictors are set to reference values (by default the median for continuous variables and the first level for categorical ones; see datadist). If any predictor is given more than one value, expand.grid is called to generate all possible combinations of values, unless expand=FALSE.
If nobs is given, a data frame is first generated containing nobs duplicated rows of adjust-to values. An editor window is then opened in which the user can subset the variable names down to those she intends to vary (this streamlines the data.ed step). Next, if any of the predictors kept are discrete and viewvals=TRUE, a window (using page) is opened listing the possible values of this subset, to facilitate data editing. The data.ed function is then invoked to allow interactive overriding of predictor settings in the nobs rows. Finally, the edited subset of variables is combined with the other predictors that were not displayed with data.ed, and a full data frame is returned.
gendata is most useful for creating a newdata data frame to pass to predict.
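The difference between expanding and not expanding multiple predictor values can be illustrated with base R alone. This is only a sketch of the idea, not gendata's internal code: the `settings` list is a hypothetical stand-in for values passed through ..., and treating expand=FALSE as row-by-row pairing is an assumption based on the description above.

```r
# Hypothetical predictor settings, as they might be supplied through ...
settings <- list(age = c(50, 60), race = c("b", "a"))

# Analogous to expand=TRUE: all combinations of the supplied values (2 x 2 = 4 rows)
crossed <- expand.grid(settings, stringsAsFactors = FALSE)

# Assumed analogue of expand=FALSE: values are paired row by row (2 rows)
paired <- data.frame(settings, stringsAsFactors = FALSE)

print(crossed)
print(paired)
```

With the crossed version, every age appears with every race; with the paired version, only the combinations (50, "b") and (60, "a") are present.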
gendata(fit, ..., nobs, viewvals=FALSE, expand=TRUE, factors)
fit | a fit object created with
... | predictor settings, if
nobs | number of observations to create if doing it interactively using X-windows
viewvals | if
expand | set to
factors | a list containing predictor settings with their names. This is an alternative to specifying the variables separately in .... Unlike the usage of ..., variables getting default ranges in
If you have a variable in ... that is named n, no, or nob, add nobs=FALSE to the invocation to prevent that variable from being misrecognized as nobs.
A data frame with all predictors, and an attribute names.subset if nobs is specified. This attribute contains the vector of variable names for predictors which were passed to de and hence were allowed to vary. If neither nobs nor any predictor settings were given, returns a data frame with adjust-to values.
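As a rough illustration of how the names.subset attribute can be used, the sketch below builds a small data frame by hand and attaches the attribute manually. The data frame and the choice of varied variables are hypothetical stand-ins for what an interactive gendata call might return; no rms functions are called.

```r
# Hypothetical stand-in for a data frame returned by an interactive gendata call
d <- data.frame(age  = c(50, 50),
                sex  = c("male", "female"),
                race = c("a", "a"))

# Assumed for illustration: only age and sex were offered for editing
attr(d, "names.subset") <- c("age", "sex")

# Keep only the predictors that were allowed to vary
varied <- d[, attr(d, "names.subset"), drop = FALSE]
print(varied)
```

Using drop = FALSE keeps the result a data frame even when only one variable was allowed to vary.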
Optionally writes to the terminal, opens X-windows, and generates a temporary file using sink.
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
library(rms)   # provides datadist, lrm and gendata
set.seed(1)
age  <- rnorm(200, 50, 10)
sex  <- factor(sample(c('female','male'), 200, TRUE))
race <- factor(sample(c('a','b','c','d'), 200, TRUE))
y    <- sample(0:1, 200, TRUE)
dd <- datadist(age, sex, race)
options(datadist="dd")
f <- lrm(y ~ age*sex + race)
gendata(f)
gendata(f, age=50)
d <- gendata(f, age=50, sex="female")            # leave race=reference category
d <- gendata(f, age=c(50,60), race=c("b","a"))   # 4 obs.
d$Predicted <- predict(f, d, type="fitted")
d      # Predicted column prints at the far right
options(datadist=NULL)
## Not run:
d <- gendata(f, nobs=5, viewvals=TRUE)   # 5 interactively defined obs.
d[,attr(d,"names.subset")]               # print variables which varied
predict(f, d)
## End(Not run)