projpred: cv_varsel – R documentation

cv_varsel

Cross-validated variable selection (varsel)

Description

Perform cross-validation for the projective variable selection for a generalized linear model or for generalized linear and additive multilevel models.

Usage

cv_varsel(object, ...)

## Default S3 method:
cv_varsel(object, ...)

## S3 method for class 'refmodel'
cv_varsel(
  object,
  method = NULL,
  cv_method = NULL,
  ndraws = NULL,
  nclusters = NULL,
  ndraws_pred = NULL,
  nclusters_pred = NULL,
  cv_search = TRUE,
  nterms_max = NULL,
  intercept = NULL,
  penalty = NULL,
  verbose = TRUE,
  nloo = NULL,
  K = NULL,
  lambda_min_ratio = 1e-05,
  nlambda = 150,
  thresh = 1e-06,
  regul = 1e-04,
  validate_search = TRUE,
  seed = NULL,
  search_terms = NULL,
  ...
)

Arguments

`object`	Same as in varsel.
`...`	Additional arguments passed to `get_refmodel`.
`method`	Same as in varsel.
`cv_method`	The cross-validation method, either 'LOO' or 'kfold'. Default is 'LOO'.
`ndraws`	Number of posterior draws used for selection. Ignored if nclusters is provided or if method='L1'.
`nclusters`	Number of clusters used for selection. Default is 1. Ignored if method='L1' (L1 search always uses one cluster).
`ndraws_pred`	Number of samples used for prediction (after selection). Ignored if nclusters_pred is given.
`nclusters_pred`	Number of clusters used for prediction (after selection). Default is 5.
`cv_search`	Same as in varsel.
`nterms_max`	Same as in varsel.
`intercept`	Same as in varsel.
`penalty`	Same as in varsel.
`verbose`	Whether to print progress information during the validation. Default is TRUE.
`nloo`	Number of observations used to compute the LOO validation (anything between 1 and the total number of observations). Smaller values lead to faster computation but higher uncertainty (larger error bars) in the accuracy estimation. The default is to use all observations; for faster experimentation, this can be set to a small value such as 100. Only applicable if `cv_method = 'LOO'`.
`K`	Number of folds in the K-fold cross validation. Default is 5 for genuine reference models and 10 for datafits (that is, for penalized maximum likelihood estimation).
`lambda_min_ratio`	Same as in varsel.
`nlambda`	Same as in varsel.
`thresh`	Same as in varsel.
`regul`	Amount of regularization in the projection. Regularization is usually unnecessary, but for some models the projection can be ill-behaved, and a small amount of regularization avoids numerical problems.
`validate_search`	Whether to also cross-validate the selection process, that is, whether to perform the selection separately for each fold. Default is TRUE. We strongly recommend not setting this to FALSE, because doing so is known to bias the accuracy estimates for the selected submodels. However, setting this to FALSE can sometimes be useful: comparing the results with those obtained when this parameter is TRUE gives an idea of how strongly the feature selection is (over)fitted to the data (the difference corresponds to the search degrees of freedom, or the effective number of parameters introduced by the selection process).
`seed`	Random seed used in the subsampling LOO. By default uses a fixed seed.
`search_terms`	User-defined list of terms to consider for selection.
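
To make the roles of `K` and `nloo` more concrete, the following base-R sketch shows one simple way observations could be assigned to folds or subsampled for LOO. It is an illustration only: the uniform random assignment and the names `fold_id`, `fold_sizes` and `loo_idx` are assumptions made for this sketch, and projpred's internal procedure may choose observations differently.

```r
# Illustration only: NOT projpred's internal code.
set.seed(1)   # fixed seed so the split is reproducible
n <- 30       # number of observations (same size as in the Examples)
K <- 5        # number of folds, as controlled by the `K` argument

# K-fold: every observation is assigned to exactly one of the K folds.
fold_id <- sample(rep(seq_len(K), length.out = n))
fold_sizes <- table(fold_id)

# Subsampled LOO: only `nloo` of the n observations are left out in turn.
# Fewer observations means less computation but wider error bars.
nloo <- 10
loo_idx <- sort(sample.int(n, nloo))

print(fold_sizes)
print(loo_idx)
```

With `validate_search = TRUE`, the search would be repeated once per fold (or per left-out observation), which is why smaller `K` or `nloo` values reduce the run time.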

Value

An object of type vsel that contains information about the feature selection. The fields are not meant to be accessed directly by the user but via the helper functions (see the vignettes, or type ?projpred to see the main functions in the package).

Examples

if (requireNamespace('rstanarm', quietly=TRUE)) {
  ### Usage with stanreg objects
  n <- 30
  d <- 5
  x <- matrix(rnorm(n*d), nrow=n)
  y <- x[,1] + 0.5*rnorm(n)
  data <- data.frame(x,y)
  fit <- rstanarm::stan_glm(y ~ X1 + X2 + X3 + X4 + X5, gaussian(),
     data=data, chains=2, iter=500)
  cvs <- cv_varsel(fit)
  plot(cvs)
}
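
As a possible variation on the example above, the sketch below runs a faster, subsampled LOO by setting `nloo` and `nterms_max`, then queries the result through a helper function instead of accessing its fields. The specific values (`nloo = 10`, `nterms_max = 3`, the seed, `refresh = 0`) and the names `dat` and `cvs_sub` are assumptions chosen for illustration, not recommended settings.

```r
# Simulated data, as in the example above (seed added for reproducibility).
set.seed(1)
n <- 30
d <- 5
x <- matrix(rnorm(n * d), nrow = n)
y <- x[, 1] + 0.5 * rnorm(n)
dat <- data.frame(x, y)   # columns X1, ..., X5 and y

if (requireNamespace('rstanarm', quietly = TRUE) &&
    requireNamespace('projpred', quietly = TRUE)) {
  fit <- rstanarm::stan_glm(y ~ X1 + X2 + X3 + X4 + X5, gaussian(),
                            data = dat, chains = 2, iter = 500, refresh = 0)
  # Subsampled LOO over 10 observations, searching up to 3 terms.
  cvs_sub <- projpred::cv_varsel(fit, cv_method = 'LOO', nloo = 10,
                                 nterms_max = 3)
  plot(cvs_sub)
  # Helper function rather than direct field access.
  print(projpred::suggest_size(cvs_sub))
}
```

Because only a subset of observations is used, the resulting accuracy estimates are expected to be noisier than those from the full LOO run.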

projpred

Projection Predictive Feature Selection

v2.0.2

GPL-3

Authors

Juho Piironen [aut], Markus Paasiniemi [aut], Alejandro Catalina [cre, aut], Aki Vehtari [aut], Jonah Gabry [ctb], Marco Colombo [ctb], Paul-Christian Bürkner [ctb]

Initial release