dplyr: ranking – R documentation

ranking

Windowed rank functions.

Description

Six variations on ranking functions, mimicking the ranking functions described in SQL:2003. They are currently implemented using the built-in rank() function and are provided mainly as a convenience when converting between R and SQL. All ranking functions map the smallest inputs to the smallest outputs; use desc() to reverse the direction.
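As a small illustration of reversing the direction, the sketch below ranks the same vector in both directions with min_rank(). It assumes dplyr is installed; the variable names are made up for the example.

```r
library(dplyr)

x <- c(5, 1, 3, 2, 2)

# Default direction: the smallest value gets rank 1.
ascending <- min_rank(x)
ascending   # 5 1 4 2 2

# Wrapping the input in desc() flips the direction: the largest value gets rank 1.
descending <- min_rank(desc(x))
descending  # 1 5 2 3 3
```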

Usage

row_number(x)

ntile(x = row_number(), n)

min_rank(x)

dense_rank(x)

percent_rank(x)

cume_dist(x)

Arguments

`x`	a vector of values to rank. Missing values are left as is. If you want to treat them as the smallest or largest values, replace them with -Inf or Inf, respectively, before ranking.
`n`	number of groups to split up into.
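The handling of missing values described for `x` could look like the following sketch. It assumes dplyr is installed and a numeric input (so that -Inf and Inf are valid replacements); using base replace() for the substitution is just one possible choice.

```r
library(dplyr)

x <- c(5, 1, 3, 2, 2, NA)

# By default the missing value is left as is.
as_is <- min_rank(x)
as_is    # 5 1 4 2 2 NA

# Treat the missing value as the smallest value: replace it with -Inf.
na_lowest <- min_rank(replace(x, is.na(x), -Inf))
na_lowest   # 6 2 5 3 3 1

# Treat the missing value as the largest value: replace it with Inf.
na_highest <- min_rank(replace(x, is.na(x), Inf))
na_highest  # 5 1 4 2 2 6
```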

Details

row_number(): equivalent to rank(ties.method = "first")
min_rank(): equivalent to rank(ties.method = "min")
dense_rank(): like min_rank(), but with no gaps between ranks
percent_rank(): a number between 0 and 1 computed by rescaling min_rank to [0, 1]
cume_dist(): a cumulative distribution function. The proportion of all values less than or equal to the current value.
ntile(): a rough rank, which breaks the input vector into n buckets. The sizes of the buckets may differ by up to one; larger buckets have lower rank.
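The definitions above can be approximated in base R alone, which may help when checking results by hand. This is a sketch under stated assumptions, not necessarily how dplyr implements the functions: the `*_sketch` names and `ntile_sketch()` are hypothetical, and `na.last = "keep"` is assumed in order to mirror the "missing values are left as is" behaviour.

```r
x <- c(5, 1, 3, 2, 2, NA)
n_obs <- sum(!is.na(x))  # number of non-missing values

# Ties broken by order of appearance.
row_number_sketch <- rank(x, ties.method = "first", na.last = "keep")
# Ties share the smallest rank, leaving gaps afterwards.
min_rank_sketch <- rank(x, ties.method = "min", na.last = "keep")
# Position among the sorted distinct values, so there are no gaps.
dense_rank_sketch <- match(x, sort(unique(x)))
# min_rank rescaled to [0, 1].
percent_rank_sketch <- (min_rank_sketch - 1) / (n_obs - 1)
# Proportion of values less than or equal to each value.
cume_dist_sketch <- rank(x, ties.method = "max", na.last = "keep") / n_obs

# Hypothetical helper: split into n buckets whose sizes differ by at most one,
# giving the extra elements to the lowest-ranked buckets.
ntile_sketch <- function(x, n) {
  r <- rank(x, ties.method = "first", na.last = "keep")
  len <- sum(!is.na(x))
  sizes <- rep(len %/% n, n) + (seq_len(n) <= len %% n)
  bucket_of_position <- rep(seq_len(n), times = sizes)
  bucket_of_position[r]  # NA ranks stay NA
}

row_number_sketch    # 5 1 4 2 3 NA
min_rank_sketch      # 5 1 4 2 2 NA
dense_rank_sketch    # 4 1 3 2 2 NA
percent_rank_sketch  # 1.00 0.00 0.75 0.25 0.25 NA
cume_dist_sketch     # 1.0 0.2 0.8 0.6 0.6 NA
ntile_sketch(1:8, 3) # 1 1 1 2 2 2 3 3
```

In the last call the eight values are split into buckets of sizes 3, 3 and 2, so the two larger buckets receive the lower ranks.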

Examples

library(dplyr)

x <- c(5, 1, 3, 2, 2, NA)
row_number(x)
min_rank(x)
dense_rank(x)
percent_rank(x)
cume_dist(x)

ntile(x, 2)
ntile(1:8, 3)

# row_number can be used with single table verbs without specifying x
# (for data frames and databases that support windowing)
mutate(mtcars, row_number() == 1L)
mtcars %>% filter(between(row_number(), 1, 10))
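Because these are window functions, they are often combined with group_by() so that ranks restart within each group. The following is a minimal sketch assuming dplyr is installed; the `scores` data frame and its columns are invented for illustration.

```r
library(dplyr)

# Hypothetical toy data, invented for this example.
scores <- data.frame(
  team   = c("a", "a", "a", "b", "b"),
  points = c(10, 30, 30, 5, 7)
)

ranked <- scores %>%
  group_by(team) %>%
  mutate(
    place    = min_rank(desc(points)),  # highest score is place 1; ties share a place
    position = row_number()             # no x given: numbers the rows within each group
  ) %>%
  ungroup()

ranked$place     # 3 1 1 2 1
ranked$position  # 1 2 3 1 2

# Keep the top-scoring rows of each team (ties included).
winners <- filter(ranked, place == 1)
```

Using min_rank() here keeps tied rows together; row_number(desc(points)) would instead pick exactly one row per team.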

dplyr

A Grammar of Data Manipulation

v1.0.6

MIT + file LICENSE

Authors

Hadley Wickham [aut, cre] (<https://orcid.org/0000-0003-4757-117X>), Romain François [aut] (<https://orcid.org/0000-0002-2444-4226>), Lionel Henry [aut], Kirill Müller [aut] (<https://orcid.org/0000-0002-1416-3412>), RStudio [cph, fnd]