
stri_order

Ordering Permutation


Description

This function finds a permutation which rearranges the strings in a given character vector into the ascending or descending locale-dependent lexicographic order.

Usage

stri_order(str, decreasing = FALSE, na_last = TRUE, ..., opts_collator = NULL)

Arguments

str

a character vector

decreasing

a single logical value; should the sort order be nondecreasing (FALSE, default) or nonincreasing (TRUE)?

na_last

a single logical value; controls the treatment of NAs in str. If TRUE, then missing values in str are put at the end; if FALSE, they are put at the beginning; if NA, then they are removed from the output.

...

additional settings for opts_collator

opts_collator

a named list of ICU Collator options (see stri_opts_collator), or NULL to use the default collation options
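
To illustrate the na_last argument described above, here is a minimal sketch. The sample vector x and the variable names are made up for this example; only stri_order and its documented arguments come from stringi.

```r
library(stringi)

# Hypothetical sample data with one missing value
x <- c("b", NA, "a")

# Default (na_last = TRUE): the NA position comes last
o_last <- stri_order(x)
print(o_last)    # 3 1 2

# na_last = FALSE: the NA position comes first
o_first <- stri_order(x, na_last = FALSE)
print(o_first)   # 2 3 1

# na_last = NA: the NA position is dropped from the permutation
o_drop <- stri_order(x, na_last = NA)
print(o_drop)    # 3 1
```

Note that with na_last = NA the result is shorter than str, so it can no longer be used as a full permutation of the input.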

Details

For more information on ICU's Collator and how to tune it up in stringi, refer to stri_opts_collator.

As usual in stringi, non-character inputs are coerced to strings. See the Examples section for the somewhat non-intuitive behavior of lexicographic sorting on numeric inputs.

This function uses a stable sort algorithm (STL's stable_sort), which performs up to N*log^2(N) element comparisons, where N is the length of str.
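
What stability means in practice can be sketched as follows: elements that compare as equal keep their original relative order. The sample vectors are invented for this illustration, and the second call assumes that the strength collator option is forwarded through ... to stri_opts_collator, as the Arguments section suggests.

```r
library(stringi)

# Exact duplicates: equal strings appear in their original order
dup <- c("b", "a", "b", "a")
o_dup <- stri_order(dup, locale = "en_US")
print(o_dup)    # 2 4 1 3

# At primary collation strength, case differences are ignored,
# so "A"/"a" and "b"/"B" are ties that retain their input order
mixed <- c("b", "A", "a", "B")
o_mixed <- stri_order(mixed, locale = "en_US", strength = 1)
print(o_mixed)  # 2 3 1 4
```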

For ordering with regard to multiple criteria (such as sorting data frames by more than one column), see stri_rank.
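
One possible way to combine stri_rank with base R's order for a two-column sort is sketched below. The data frame df and its columns are hypothetical; the idea is that the integer ranks stand in for the locale-aware ordering of the string column, so order can break ties with further columns.

```r
library(stringi)

# Hypothetical data frame: sort by name (locale-aware), then by score
df <- data.frame(
  name  = c("b", "a", "b", "a"),
  score = c(2, 9, 1, 3),
  stringsAsFactors = FALSE
)

# stri_rank turns the strings into integer ranks (ties share a rank),
# which base order() can then combine with the numeric column
o <- order(stri_rank(df$name, locale = "en_US"), df$score)
print(o)        # 4 2 3 1
print(df[o, ])
```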

Value

The function yields an integer vector that gives the sort order.
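
As a small sketch of how the returned permutation is typically used, indexing the input with it gives the sorted vector. The vector fruits is made up for this example; stri_sort is the companion sorting function in stringi.

```r
library(stringi)

# Hypothetical input without missing values
fruits <- c("pear", "apple", "fig")

# The permutation lists, for each output position, the index in the input
perm <- stri_order(fruits, locale = "en_US")
print(perm)           # 2 3 1

# Indexing with the permutation yields the sorted strings
sorted <- fruits[perm]
print(sorted)         # "apple" "fig" "pear"

# For input without NAs this matches stri_sort
same <- identical(sorted, stri_sort(fruits, locale = "en_US"))
print(same)           # TRUE
```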

References

Collation - ICU User Guide, http://userguide.icu-project.org/collation

See Also

Examples

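# In Slovak, 'ch' is a digraph that collates after 'h'; in Polish it is just 'c' followed by 'h'
stri_order(c('hladny', 'chladny'), locale='pl_PL')
stri_order(c('hladny', 'chladny'), locale='sk_SK')

# Numeric input is coerced to strings, so "100" sorts before "2" by default
stri_order(c(1, 100, 2, 101, 11, 10))
# numeric=TRUE makes the collator compare runs of digits by their numeric value
stri_order(c(1, 100, 2, 101, 11, 10), numeric=TRUE)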

stringi

Character String Processing Facilities

v1.6.1
License: file LICENSE
Authors: Marek Gagolewski [aut, cre, cph] (<https://orcid.org/0000-0003-0637-6028>), Bartek Tartanus [ctb], and others (stringi source code); IBM, Unicode, Inc. and others (ICU4C source code, Unicode Character Database)
Initial release: 2021-05-05
