A General Framework For Bagging
bag provides a framework for bagging classification or regression models. The user can provide their own functions for model building, prediction and aggregation of predictions (see Details below).
bag(x, ...)

bagControl(
  fit = NULL,
  predict = NULL,
  aggregate = NULL,
  downSample = FALSE,
  oob = TRUE,
  allowParallel = TRUE
)

## Default S3 method:
bag(x, y, B = 10, vars = ncol(x), bagControl = NULL, ...)

## S3 method for class 'bag'
predict(object, newdata = NULL, ...)

## S3 method for class 'bag'
print(x, ...)

## S3 method for class 'bag'
summary(object, ...)

## S3 method for class 'summary.bag'
print(x, digits = max(3, getOption("digits") - 3), ...)

ldaBag

plsBag

nbBag

ctreeBag

svmBag

nnetBag
x | a matrix or data frame of predictors
... | arguments to pass to the model function
fit | a function that has arguments
predict | a function that generates predictions for each sub-model. The function should have arguments
aggregate | a function with arguments
downSample | logical: for classification, should the data set be randomly sampled so that each class has the same number of samples as the smallest class?
oob | logical: should out-of-bag statistics be computed and the predictions retained?
allowParallel | logical: if a parallel backend is loaded and available, should the function use it?
y | a vector of outcomes
B | the number of bootstrap samples to train over.
vars | an integer. If this argument is not
bagControl | a list of options.
object | an object of class
newdata | a matrix or data frame of samples for prediction. Note that this argument must have a non-null value
digits | minimal number of significant digits.
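The downSample option can be pictured with a small base R sketch. This is only an illustration of down-sampling every class to the size of the smallest class; the helper name down_sample_rows is hypothetical, and it is not how the package itself implements the option.

```r
# Hypothetical helper (not part of the package): return row indices such that
# every class keeps as many rows as the smallest class has.
down_sample_rows <- function(y) {
  y <- droplevels(as.factor(y))
  n_min <- min(table(y))
  # For each class, draw n_min of its rows without replacement
  idx <- lapply(levels(y), function(lv) {
    rows <- which(y == lv)
    rows[sample.int(length(rows), n_min)]
  })
  sort(unlist(idx))
}

set.seed(1)
y <- factor(rep(c("a", "b", "c"), times = c(50, 20, 5)))
keep <- down_sample_rows(y)
balanced <- table(y[keep])
print(balanced)
```

In a bagging loop, a step like this would be applied to the rows of each resampled data set before the model is fit.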
An object of class list of length 3.
The function is basically a framework where users can plug in any model to assess the effect of bagging. Example functions can be found in ldaBag, plsBag, nbBag, svmBag and nnetBag. Each has elements fit, pred and aggregate.
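As a rough sketch of what such a list of three functions might look like, the block below defines a hypothetical lmBagSketch for bagging linear regressions and exercises it with a manual bootstrap loop in base R. The argument lists (x, y, ... for fit; object, x for pred; x, type for aggregate) are assumptions made for illustration and should be checked against the packaged examples such as ldaBag before passing anything to bagControl.

```r
# Hypothetical plug-in functions for bagging linear regression models.
# The signatures below are assumptions, not taken from the package docs.
lmBagSketch <- list(
  fit = function(x, y, ...) {
    dat <- as.data.frame(x)
    dat$.outcome <- y
    lm(.outcome ~ ., data = dat, ...)
  },
  pred = function(object, x) {
    unname(predict(object, newdata = as.data.frame(x)))
  },
  aggregate = function(x, type = "class") {
    # x is a list of prediction vectors, one per sub-model;
    # type is unused for regression and kept only for a uniform signature
    preds <- do.call(cbind, x)
    apply(preds, 1, median)
  }
)

# Manual bagging loop in base R to exercise the three functions
set.seed(42)
x <- data.frame(a = rnorm(100), b = rnorm(100))
y <- 1 + 2 * x$a - x$b + rnorm(100, sd = 0.1)
B <- 10

fits <- lapply(seq_len(B), function(i) {
  rows <- sample(nrow(x), replace = TRUE)   # bootstrap sample of the rows
  lmBagSketch$fit(x[rows, , drop = FALSE], y[rows])
})
preds <- lapply(fits, lmBagSketch$pred, x = x)
bagged <- lmBagSketch$aggregate(preds)
```

The median is used here to combine the sub-model predictions; a mean would be an equally reasonable choice for this sketch.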
One note: when vars is not NULL, the sub-setting occurs before the fit and predict functions are called. In this way, the user probably does not need to account for the change in predictors in their functions.
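The idea of sub-setting predictors before the model functions run can be sketched in base R as follows. This is an illustration under assumptions, not the package's internal code: each iteration records which columns it sampled, and the same columns are selected again at prediction time, so the model-fitting code never has to know about the sub-setting.

```r
# Hypothetical illustration of per-iteration predictor sub-setting
set.seed(7)
x <- as.data.frame(matrix(rnorm(200 * 6), ncol = 6))
names(x) <- paste0("p", 1:6)
y <- x$p1 - x$p2 + rnorm(200, sd = 0.2)
vars <- 3   # number of predictors sampled in each iteration
B <- 5

models <- lapply(seq_len(B), function(i) {
  rows <- sample(nrow(x), replace = TRUE)   # bootstrap rows
  cols <- sample(names(x), vars)            # random predictor subset
  dat <- cbind(x[rows, cols, drop = FALSE], .outcome = y[rows])
  # The fitting code only ever sees the already-reduced data
  list(fit = lm(.outcome ~ ., data = dat), vars = cols)
})

# At prediction time the stored column names select the matching predictors
new_x <- x[1:5, ]
preds <- sapply(models, function(m) {
  predict(m$fit, newdata = new_x[, m$vars, drop = FALSE])
})
```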
When using bag with train, classification models should use type = "prob" inside of the predict function so that predict.train(object, newdata, type = "prob") will work.
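To show why probability output matters, here is a minimal base R sketch of pooling class probabilities from several sub-models. The function pool_probs is hypothetical and averages the probability matrices; the packaged aggregate functions may pool differently (for example with a different summary statistic).

```r
# Hypothetical aggregation of class probabilities from several sub-models.
# prob_list: list of n x k matrices with class names as column names.
pool_probs <- function(prob_list, type = "class") {
  # Element-wise mean over the sub-models (dimnames of the first matrix are kept)
  pooled <- Reduce(`+`, prob_list) / length(prob_list)
  pooled <- pooled / rowSums(pooled)   # renormalise each row to sum to 1
  if (type == "prob") return(pooled)
  classes <- colnames(pooled)
  factor(classes[max.col(pooled, ties.method = "first")], levels = classes)
}

cls_names <- list(NULL, c("Active", "Inactive"))
p1 <- matrix(c(0.9, 0.1, 0.4, 0.6), ncol = 2, byrow = TRUE, dimnames = cls_names)
p2 <- matrix(c(0.7, 0.3, 0.2, 0.8), ncol = 2, byrow = TRUE, dimnames = cls_names)
p3 <- matrix(c(0.8, 0.2, 0.6, 0.4), ncol = 2, byrow = TRUE, dimnames = cls_names)

probs <- pool_probs(list(p1, p2, p3), type = "prob")
cls <- pool_probs(list(p1, p2, p3), type = "class")
```

Because the sub-models return probabilities rather than hard class labels, the same aggregate function can serve both class and probability predictions.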
If a parallel backend is registered, the foreach package is used to train the models in parallel.
bag produces an object of class bag with elements
fits | a list with two sub-objects: the
control | a mirror of the arguments passed into
call | the call
B | the number of bagging iterations
dims | the dimensions of the training set
Max Kuhn
## A simple example of bagging conditional inference regression trees:
data(BloodBrain)

## treebag <- bag(bbbDescr, logBBB, B = 10,
##                bagControl = bagControl(fit = ctreeBag$fit,
##                                        predict = ctreeBag$pred,
##                                        aggregate = ctreeBag$aggregate))

## An example of pooling posterior probabilities to generate class predictions
data(mdrr)

## remove some zero variance predictors and linear dependencies
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .95)]

## basicLDA <- train(mdrrDescr, mdrrClass, "lda")

## bagLDA2 <- train(mdrrDescr, mdrrClass,
##                  "bag",
##                  B = 10,
##                  bagControl = bagControl(fit = ldaBag$fit,
##                                          predict = ldaBag$pred,
##                                          aggregate = ldaBag$aggregate),
##                  tuneGrid = data.frame(vars = c((1:10)*10 , ncol(mdrrDescr))))