Predictive Ability using Resampling
predab.resample is a general-purpose function that is used by functions for specific models. It computes estimates of optimism of, and bias-corrected estimates of, a vector of indexes of predictive accuracy for a model with a specified design matrix, with or without fast backward step-down of predictors. If bw=TRUE, the design matrix x must have been created by ols, lrm, or cph. If bw=TRUE, predab.resample stores as the kept attribute a logical matrix encoding which factors were selected at each repetition.
predab.resample(fit.orig, fit, measure,
                method=c("boot","crossvalidation",".632","randomization"),
                bw=FALSE, B=50, pr=FALSE, prmodsel=TRUE,
                rule="aic", type="residual", sls=.05, aics=0,
                tol=1e-12, force=NULL, estimates=TRUE,
                non.slopes.in.x=TRUE, kint=1,
                cluster, subset, group=NULL,
                allow.varying.intercepts=FALSE, debug=FALSE, ...)
fit.orig |
object containing the original full-sample fit, with the |
fit |
a function to fit the model, either the original model fit, or a fit in a
sample. fit has as arguments |
measure |
a function to compute a vector of indexes of predictive accuracy for a given fit.
For |
method |
The default is |
bw |
Set to |
B |
Number of repetitions, default=50. For |
pr |
|
prmodsel |
set to |
rule |
Stopping rule for fastbw, |
type |
Type of statistic to use in stopping rule for fastbw, |
sls |
Significance level for stopping in fastbw if |
aics |
Stopping criteria for |
tol |
Tolerance for singularity checking. Is passed to |
force |
see |
estimates |
see |
non.slopes.in.x |
set to |
kint |
For multiple intercept models such as the ordinal logistic model, you may
specify which intercept to use as |
cluster |
Vector containing cluster identifiers. This can be specified only if
|
subset |
specify a vector of positive or negative integers or a logical vector when
you want to have the |
group |
a grouping variable used to stratify the sample upon bootstrapping. This allows one to handle k-sample problems, i.e., each bootstrap sample will be forced to select the same number of observations from each level of group as the number appearing in the original dataset. |
allow.varying.intercepts |
set to |
debug |
set to |
... |
The user may add other arguments here that are passed to |
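The stratified resampling described for the group argument can be illustrated with a small base R sketch. This is not the code used by predab.resample; the helper strat_boot and the toy grouping variable are hypothetical and serve only to show the idea of drawing, with replacement, the same number of observations from each level as in the original data.

```r
# Illustrative sketch only; strat_boot is a hypothetical helper,
# not part of rms or of predab.resample.
set.seed(2)
group <- factor(rep(c("a", "b", "c"), times = c(5, 3, 4)))

strat_boot <- function(group) {
  # Row indices belonging to each level of the grouping variable
  rows <- split(seq_along(group), group)
  # Resample with replacement within each level, keeping its size fixed
  unlist(lapply(rows, function(i) i[sample.int(length(i), replace = TRUE)]),
         use.names = FALSE)
}

idx <- strat_boot(group)
# Counts per level in the bootstrap sample match those in the original data
table(group[idx])
table(group)
```

Indexing with i[sample.int(length(i), ...)] rather than sample(i, ...) avoids the well-known surprise when a level contains a single observation.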
For method=".632", the program stops with an error unless every observation is omitted at least once from a bootstrap sample. Efron's ".632" method was developed for measures that are formulated in terms of per-observation contributions. In general, error measures (e.g., ROC areas) cannot be written in this way, so this function uses a heuristic extension to Efron's formulation in which it is assumed that the average error measure omitting the ith observation is the same as the average error measure omitting any other observation. Then weights are derived for each bootstrap repetition, and weighted averages over the B repetitions can easily be computed.
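The requirement that every observation be omitted at least once can be checked directly. The following base R sketch is not taken from predab.resample, and the object names (idx, omitted, times_omitted) are made up for illustration. It draws B bootstrap samples, records which observations are absent from each, and stops if some observation was never left out.

```r
# Illustrative sketch only; not the predab.resample implementation.
set.seed(3)
n <- 30
B <- 50

# n x B matrix of bootstrap row indices, one column per repetition
idx <- replicate(B, sample.int(n, replace = TRUE))

# omitted[i, b] is TRUE when observation i is absent from repetition b
omitted <- sapply(seq_len(B), function(b) !(seq_len(n) %in% idx[, b]))

# Number of repetitions from which each observation was omitted
times_omitted <- rowSums(omitted)
if (any(times_omitted == 0))
  stop("some observation was never omitted; consider increasing B")

# Overall proportion of omissions across all repetitions
mean(omitted)
```

The probability that a given observation is absent from one bootstrap sample is (1 - 1/n)^n, which approaches exp(-1), about 0.368, as n grows; its complement is the 0.632 that gives the method its name.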
a matrix of class "validate" with rows corresponding to indexes computed by measure, and the following columns:
index.orig |
indexes in original overall fit |
training |
average indexes in training samples |
test |
average indexes in test samples |
optimism |
average |
index.corrected |
|
n |
number of successful repetitions with the given index non-missing |
Also contains an attribute keepinfo if measure returned such an attribute when run on the original fit.
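To show how the columns listed above relate to one another, here is a minimal base R sketch of the bootstrap optimism correction. It assumes the usual definitions, namely that optimism is the average training index minus the average test index and that the corrected index is the original index minus the optimism. It uses lm and a hypothetical R-squared helper (rsq) in place of the fit and measure functions, so it should be read as an illustration rather than as the code in predab.resample.

```r
# Illustrative sketch only; lm and rsq stand in for the fit and measure
# functions, and the simulated data are made up.
set.seed(1)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 1 + 0.5 * d$x1 + rnorm(n)

# Hypothetical accuracy index: R^2 of a fitted model evaluated on newd
rsq <- function(fit, newd) {
  pred <- predict(fit, newdata = newd)
  1 - sum((newd$y - pred)^2) / sum((newd$y - mean(newd$y))^2)
}

full <- lm(y ~ x1 + x2, data = d)
index.orig <- rsq(full, d)                # index in the original overall fit

B <- 50
training <- test <- numeric(B)
for (b in seq_len(B)) {
  idx <- sample(n, replace = TRUE)        # bootstrap sample of rows
  fb <- lm(y ~ x1 + x2, data = d[idx, ])
  training[b] <- rsq(fb, d[idx, ])        # index in the bootstrap (training) sample
  test[b] <- rsq(fb, d)                   # same fit evaluated on the original data
}

optimism <- mean(training) - mean(test)
res <- c(index.orig = index.orig,
         training = mean(training),
         test = mean(test),
         optimism = optimism,
         index.corrected = index.orig - optimism,
         n = B)
print(round(res, 3))
```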
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
Efron B, Tibshirani R (1997). Improvements on cross-validation: The .632+ bootstrap method. JASA 92:548–560.
# See the code for validate.ols for an example of the use of
# predab.resample
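Because predab.resample is normally reached through a model-specific validate method, a call through validate on an ols fit may be the simplest way to see its output. The sketch below assumes that the rms package is installed and uses simulated data invented for the illustration; it is skipped when rms is unavailable.

```r
# Illustrative sketch assuming rms is installed; the data are simulated.
v <- NULL
if (requireNamespace("rms", quietly = TRUE)) {
  suppressPackageStartupMessages(library(rms))
  set.seed(4)
  d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
  d$y <- 1 + 0.5 * d$x1 + rnorm(200)

  # x=TRUE, y=TRUE store the design matrix and response for resampling
  f <- ols(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

  # validate dispatches to the ols method, which uses predab.resample
  v <- validate(f, method = "boot", B = 50)
  print(v)
}
```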