
permutations

Permutation sampling


Description

A permutation sample is the same size as the original data set and is made by permuting/shuffling one or more columns. This results in analysis samples where some columns are in their original order and some columns are permuted to a random order. Unlike other sampling functions in rsample, there is no assessment set and calling assessment() on a permutation split will throw an error.
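The behaviour described above can be sketched as follows. This is a minimal illustration that assumes the rsample package is installed; the seed and the choice of mtcars and mpg are arbitrary. It shuffles one column, checks that an unselected column keeps its original order, and captures the error raised by assessment().

```r
library(rsample)

set.seed(123)  # arbitrary seed, only for reproducibility
perm <- permutations(mtcars, mpg, times = 1)
shuffled <- analysis(perm$splits[[1]])

# The permutation sample has as many rows as the original data
n_rows <- nrow(shuffled)

# The permuted column holds the same values (typically in a different order)
same_values <- identical(sort(shuffled$mpg), sort(mtcars$mpg))

# A column that was not selected keeps its original order
hp_untouched <- identical(shuffled$hp, mtcars$hp)

# There is no assessment set, so assessment() signals an error
assess_result <- tryCatch(
  assessment(perm$splits[[1]]),
  error = function(e) "error"
)
```

Wrapping the call in tryCatch() is only a convenience for the sketch; in interactive use the error message itself is usually more informative.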

Usage

permutations(data, permute = NULL, times = 25, apparent = FALSE, ...)

Arguments

data

A data frame.

permute

One or more columns to shuffle. This argument supports tidyselect selectors. Multiple expressions can be combined with c(). Variable names can be used as if they were positions in the data frame, so expressions like x:y can be used to select a range of variables. See ?tidyselect::language for more details.
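A short sketch of two selector styles, assuming rsample is installed. The helper changed_cols() is hypothetical (it is not part of rsample) and simply reports which columns differ from the original mtcars data after permutation.

```r
library(rsample)

set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper (not part of rsample): names of the columns whose
# contents differ from the original mtcars data
changed_cols <- function(split) {
  same <- mapply(identical, analysis(split), mtcars)
  names(same)[!same]
}

# Several columns combined with c()
by_name <- permutations(mtcars, c(mpg, hp), times = 1)
changed_by_name <- changed_cols(by_name$splits[[1]])

# A range of adjacent columns, disp through drat
by_range <- permutations(mtcars, disp:drat, times = 1)
changed_by_range <- changed_cols(by_range$splits[[1]])
```

Only the selected columns should appear in changed_by_name and changed_by_range; every other column is left in its original order.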

times

The number of permutation samples.

apparent

A logical. Should an extra resample be added where the analysis set is the original data set?

...

Not currently used.

Details

Setting apparent = TRUE adds one extra "resample" whose analysis set is the original, unpermuted data set. Permutation-based resampling is especially helpful for computing the distribution of a statistic (e.g. a t-statistic) under the null hypothesis. This forms the basis of a permutation test: an exact test computes the test statistic under all possible permutations of the data, while a set of random permutations, as generated by this function, approximates that null distribution.
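One way the permutation-test idea might look in practice is sketched below, assuming rsample is installed. The helper t_stat(), the seed, the number of permutations, and the p-value formula (a common Monte Carlo convention that counts the observed statistic among the permuted ones) are illustrative choices, not something this help page prescribes.

```r
library(rsample)

set.seed(2024)  # arbitrary seed, only for reproducibility

# Illustrative helper (not part of rsample): t-statistic for hp by vs
t_stat <- function(d) unname(t.test(hp ~ vs, data = d)$statistic)

# Statistic on the original, unpermuted data
observed <- t_stat(mtcars)

# Shuffling hp breaks any association between hp and vs,
# so these statistics approximate the null distribution
perms <- permutations(mtcars, hp, times = 500)
null_stats <- vapply(
  perms$splits,
  function(s) t_stat(analysis(s)),
  numeric(1)
)

# Two-sided Monte Carlo p-value; the +1 terms count the observed statistic
p_value <- (1 + sum(abs(null_stats) >= abs(observed))) /
  (length(null_stats) + 1)
```

More permutations give a finer-grained p-value at the cost of run time; 500 is used here only to keep the sketch quick.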

Value

A tibble with classes permutations, rset, tbl_df, tbl, and data.frame. The results include a column for the data split objects and a column called id that has a character string with the resample identifier.
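The structure of the returned object can be inspected directly. The sketch below assumes rsample is installed and uses an arbitrary column and number of samples.

```r
library(rsample)

res <- permutations(mtcars, mpg, times = 3)

# Class vector of the result
res_classes <- class(res)

# Column names of the returned tibble
res_columns <- names(res)

# One row per permutation sample
n_samples <- nrow(res)

# The id column holds the resample identifiers as character strings
id_is_character <- is.character(res$id)
```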

Examples

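library(rsample)
library(purrr)

# Two permutation samples with the mpg column shuffled
permutations(mtcars, mpg, times = 2)
# As above, plus an extra "apparent" resample holding the original data
permutations(mtcars, mpg, times = 2, apparent = TRUE)

# Shuffle every column whose name starts with "c" and inspect the result
resample1 <- permutations(mtcars, starts_with("c"), times = 1)
resample1$splits[[1]] %>% analysis()

# t-statistic of hp by vs for 10 permuted samples and the original data
resample2 <- permutations(mtcars, hp, times = 10, apparent = TRUE)
map_dbl(resample2$splits, function(x) {
  t.test(hp ~ vs, data = analysis(x))$statistic
})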

rsample

General Resampling Infrastructure

v0.1.0
MIT + file LICENSE
Authors
Julia Silge [aut, cre] (<https://orcid.org/0000-0002-3671-836X>), Fanny Chow [aut], Max Kuhn [aut], Hadley Wickham [aut], RStudio [cph]
Initial release
