Permutation sampling
A permutation sample is the same size as the original data set and is made
by permuting/shuffling one or more columns. This results in analysis
samples where some columns are in their original order and some columns
are permuted to a random order. Unlike other sampling functions in
rsample, there is no assessment set, and calling assessment() on a
permutation split will throw an error.
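As a rough illustration of what one permutation sample looks like, the following base-R sketch shuffles a single column of mtcars by hand. It is only an assumption-level picture of the idea, not rsample's internal implementation; the object name permuted is made up for this example.

```r
# Base-R sketch of a single permutation sample (not rsample's internals).
set.seed(123)

permuted <- mtcars
# Shuffle only the mpg column; every other column keeps its original order.
permuted$mpg <- sample(permuted$mpg)

# The permuted sample has the same number of rows as the original data.
nrow(permuted) == nrow(mtcars)
# A column that was not permuted is unchanged.
identical(permuted$hp, mtcars$hp)
# The permuted column holds the same values, only in a different order.
identical(sort(permuted$mpg), sort(mtcars$mpg))
```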
permutations(data, permute = NULL, times = 25, apparent = FALSE, ...)
data | A data frame.
permute | One or more columns to shuffle. This argument supports tidyselect selectors.
times | The number of permutation samples.
apparent | A logical. Should an extra resample be added where the analysis set is the original data set?
... | Not currently used.
The argument apparent enables the option of an additional "resample"
where the analysis data set is the same as the original data set.
Permutation-based resampling can be especially helpful for computing a
statistic under the null hypothesis (e.g., a t-statistic). This forms
the basis of a permutation test, which computes a test statistic under
all possible permutations of the data. When enumerating every
permutation is impractical, a random set of permutations, such as the
one produced here, approximates that null distribution.
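The idea above can be sketched as a Monte Carlo permutation test. This is a minimal sketch, not a recommended analysis: the choice of times = 1000, the two-sided comparison, and the names observed, perms, null_stats, and p_value are assumptions made for illustration. It uses random permutations rather than all possible ones.

```r
library(rsample)
library(purrr)

set.seed(2024)

# Observed t-statistic on the original, unpermuted data.
observed <- t.test(hp ~ vs, data = mtcars)$statistic

# Shuffle hp 1000 times; each analysis set breaks any link between hp and vs.
perms <- permutations(mtcars, hp, times = 1000)

# Recompute the statistic on every permuted analysis set to build
# an approximate null distribution.
null_stats <- map_dbl(perms$splits, function(split) {
  t.test(hp ~ vs, data = analysis(split))$statistic
})

# Two-sided Monte Carlo p-value: the share of permuted statistics that are
# at least as extreme as the observed one.
p_value <- mean(abs(null_stats) >= abs(observed))
p_value
```

A small p_value here would suggest that the observed difference in hp between the vs groups is unlikely under random relabelling.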
A tibble with classes permutations, rset, tbl_df, tbl, and data.frame.
The results include a column for the data split objects and a column
called id that has a character string with the resample identifier.
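library(rsample)
library(purrr)

# Two permutation samples with the mpg column shuffled
permutations(mtcars, mpg, times = 2)

# The same, plus an extra "apparent" resample holding the original data
permutations(mtcars, mpg, times = 2, apparent = TRUE)

# Shuffle every column whose name starts with "c" (cyl and carb)
resample1 <- permutations(mtcars, starts_with("c"), times = 1)
resample1$splits[[1]] %>% analysis()

# t-statistic of hp by vs for each permuted sample and the apparent sample
resample2 <- permutations(mtcars, hp, times = 10, apparent = TRUE)
map_dbl(resample2$splits, function(x) {
  t.test(hp ~ vs, data = analysis(x))$statistic
})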