Group V-Fold Cross-Validation
Group V-fold cross-validation creates splits of the data based on a grouping variable, where each value of that variable may be associated with more than one row. The function can create as many splits as there are unique values of the grouping variable, or it can create a smaller set of splits in which more than one value is left out at a time.
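The idea of holding out whole groups can be sketched in base R. This is a minimal illustration, not the package's implementation: `group_folds_sketch()` and the `toy` data are hypothetical names introduced here, and the way groups are dealt into folds (shuffle, then assign round-robin) is an assumption made for the example.

```r
# Hypothetical helper (not part of rsample): assign whole groups to folds.
group_folds_sketch <- function(data, group, v = NULL) {
  ids <- unique(data[[group]])
  # Assumption: with v = NULL, every unique group value gets its own fold.
  if (is.null(v)) v <- length(ids)
  stopifnot(v <= length(ids))

  # Shuffle the group values, then deal them into v folds round-robin.
  shuffled <- ids[sample.int(length(ids))]
  fold_of_id <- rep_len(seq_len(v), length(ids))
  names(fold_of_id) <- as.character(shuffled)

  # Every row inherits the fold of its group, so a group is never split.
  row_fold <- unname(fold_of_id[as.character(data[[group]])])

  lapply(seq_len(v), function(k) {
    list(
      analysis   = data[row_fold != k, , drop = FALSE],
      assessment = data[row_fold == k, , drop = FALSE]
    )
  })
}

set.seed(1)
toy <- data.frame(id = rep(letters[1:6], times = c(1, 2, 3, 1, 2, 3)))
toy$x <- seq_len(nrow(toy))

folds <- group_folds_sketch(toy, "id", v = 3)

# Count group values that appear on both sides of each split.
overlap <- vapply(
  folds,
  function(f) length(intersect(f$analysis$id, f$assessment$id)),
  integer(1)
)
```

In this sketch `overlap` is zero for every fold, because rows are assigned to folds through their group value rather than individually. With six groups and `v = 3`, each fold holds out two groups.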
group_vfold_cv(data, group = NULL, v = NULL, ...)
data | A data frame.
group | Either a single character value or a variable name that corresponds to a variable in the data frame.
v | The number of partitions of the data set. If left NULL, v is set to the number of unique values in the group.
... | Not currently used.
A tibble with classes group_vfold_cv, rset, tbl_df, tbl, and data.frame. The results include a column for the data split objects and an identification variable.
library(rsample)
library(purrr)

set.seed(3527)
test_data <- data.frame(id = sort(sample(1:20, size = 80, replace = TRUE)))
test_data$dat <- runif(nrow(test_data))

# v is not given, so each resample holds out a single id
set.seed(5144)
split_by_id <- group_vfold_cv(test_data, group = "id")

# return the id value(s) found in a split's assessment set
get_id_left_out <- function(x) unique(assessment(x)$id)

table(map_int(split_by_id$splits, get_id_left_out))

# v = 7 resamples, so several ids are held out together
set.seed(5144)
split_by_some_id <- group_vfold_cv(test_data, group = "id", v = 7)
held_out <- map(split_by_some_id$splits, get_id_left_out)
table(unlist(held_out))

# number held out per resample:
map_int(held_out, length)