Inferential Assessments About Model Performance
Methods for making inferences about differences between models
## S3 method for class 'resamples'
diff(
  x,
  models = x$models,
  metric = x$metrics,
  test = t.test,
  confLevel = 0.95,
  adjustment = "bonferroni",
  ...
)

## S3 method for class 'diff.resamples'
summary(object, digits = max(3, getOption("digits") - 3), ...)

compare_models(a, b, metric = a$metric[1])
x | an object generated by resamples
models | a character string for which models to compare
metric | a character string for which metrics to compare
test | a function to compute differences. The output of this function should have scalar outputs called estimate and p.value
confLevel | confidence level to use for the comparisons; see the note on the Bonferroni correction below
adjustment | any p-value adjustment method to pass to p.adjust
... | further arguments to pass to test
object | an object generated by diff.resamples
digits | the number of significant digits to display when printing
a, b | two objects of class train
The ideas and methods here are based on Hothorn et al. (2005) and Eugster et al. (2008).
For each metric, all pair-wise differences are computed and tested to assess if the difference is equal to zero.
When a Bonferroni correction is used, the confidence level is changed from confLevel to 1-((1-confLevel)/p), where p is the number of pair-wise comparisons being made. For other correction methods, no such change is made.
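The pair-wise procedure and the Bonferroni adjustment of the confidence level can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the matrix perf and its values are simulated, and names such as adj_level and results are hypothetical.

```r
# Hypothetical resampled accuracy values: 10 resamples (rows) x 3 models (columns).
set.seed(1)
perf <- cbind(
  CART = rnorm(10, mean = 0.80, sd = 0.02),
  CondInfTree = rnorm(10, mean = 0.81, sd = 0.02),
  MARS = rnorm(10, mean = 0.84, sd = 0.02)
)

conf_level <- 0.95
pairs <- combn(colnames(perf), 2)  # one column per pair-wise comparison
p <- ncol(pairs)                   # number of comparisons (3 for 3 models)

# Bonferroni: each interval uses the level 1 - ((1 - conf_level) / p).
adj_level <- 1 - ((1 - conf_level) / p)

results <- lapply(seq_len(p), function(i) {
  d <- perf[, pairs[1, i]] - perf[, pairs[2, i]]  # per-resample differences
  t.test(d, conf.level = adj_level)               # H0: mean difference is zero
})
names(results) <- paste(pairs[1, ], pairs[2, ], sep = " - ")

# The matching adjustment of the p-values themselves.
raw_p <- vapply(results, function(r) r$p.value, numeric(1))
adj_p <- p.adjust(raw_p, method = "bonferroni")
```

With three models there are three comparisons, so a nominal 0.95 level becomes roughly 0.983 for each individual interval.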
compare_models is a shorthand function to compare two models using a single metric. It returns the results of t.test on the differences.
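As a rough illustration of what a t.test on the differences means, the base R sketch below shows that a one-sample test on per-resample differences agrees with a paired two-sample test. The vectors rmse_a and rmse_b are simulated stand-ins for two models' resampled metric values and are not produced by the package.

```r
# Hypothetical per-resample RMSE values for two models on the same 10 resamples.
set.seed(2)
rmse_a <- rnorm(10, mean = 2.0, sd = 0.2)
rmse_b <- rmse_a + rnorm(10, mean = 0.1, sd = 0.1)

on_diffs <- t.test(rmse_a - rmse_b)              # one-sample test on the differences
paired <- t.test(rmse_a, rmse_b, paired = TRUE)  # equivalent paired test

# Both calls give the same test statistic and p-value.
same_stat <- isTRUE(all.equal(unname(on_diffs$statistic), unname(paired$statistic)))
same_p <- isTRUE(all.equal(on_diffs$p.value, paired$p.value))

# The result is an ordinary "htest" object.
class(on_diffs)
```

Because the models are evaluated on the same resamples, working with the differences accounts for the resample-to-resample pairing.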
An object of class "diff.resamples" with elements:
call | the call
difs | a list for each metric being compared. Each list contains a matrix with differences in columns and resamples in rows
statistics | a list of results generated by test
adjustment | the p-value adjustment used
models | a character string for which models were compared
metrics | a character string of performance metrics that were used
or...
An object of class "summary.diff.resamples" with elements:
call | the call
table | a list of tables that show the differences and p-values
...or (for compare_models) an object of class htest resulting from t.test.
Max Kuhn
Hothorn et al. The design and analysis of benchmark experiments. Journal of Computational and Graphical Statistics (2005) vol. 14 (3) pp. 675-699
Eugster et al. Exploratory and inferential analysis of benchmark experiments. Ludwig-Maximilians-Universität München, Department of Statistics, Tech. Rep. (2008) vol. 30
## Not run:
#load(url("http://topepo.github.io/caret/exampleModels.RData"))

resamps <- resamples(list(CART = rpartFit,
                          CondInfTree = ctreeFit,
                          MARS = earthFit))

difs <- diff(resamps)

difs

summary(difs)

compare_models(rpartFit, ctreeFit)

## End(Not run)