about_stringi

THE String Processing Package


Description

stringi is THE R package for fast, correct, consistent, and convenient string/text manipulation. It gives predictable results on every platform, in each locale, and under any native character encoding.

Keywords: R, text processing, character strings, internationalization, localization, ICU, ICU4C, i18n, l10n, Unicode.

License: The BSD-3-clause license for the package code, the ICU license for the accompanying ICU4C distribution, and the UCD license for the Unicode Character Database. See the COPYRIGHTS and LICENSE files for more details.

Details

Manual pages on general topics:

  • about_encoding – character encoding issues, including information on encoding management in stringi, as well as on encoding detection and conversion.

  • about_locale – locale issues, including locale management and specification in stringi, and the list of locale-sensitive operations. In particular, see stri_opts_collator for a description of the string collation algorithm, which is used for string comparing, ordering, ranking, sorting, case-folding, and searching.

  • about_arguments – information on how stringi treats its functions' arguments.
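The three topics above can be illustrated with a minimal sketch, assuming the stringi package is installed. The sample strings and variable names are made up for illustration; the functions used (stri_length, stri_numbytes, stri_sort, stri_cmp_equiv, stri_opts_collator, stri_dup) are part of stringi, but the respective manual pages remain the authoritative reference.

```r
library(stringi)

# Encoding: length is counted in code points, not in bytes.
x <- "\u0105\u0107"            # two Polish letters, stored as UTF-8
n_chars <- stri_length(x)      # 2 code points
n_bytes <- stri_numbytes(x)    # 4 bytes, as each letter takes 2 bytes in UTF-8

# Locale: collation depends on the locale passed to the collator.
# In Slovak, "ch" is treated as a separate letter that follows "h",
# so the two orderings are expected to differ.
words <- c("hladny", "chladny")
sorted_pl <- stri_sort(words, locale = "pl_PL")
sorted_sk <- stri_sort(words, locale = "sk_SK")

# Collator options: at strength 2, case differences are ignored.
same_ignoring_case <- stri_cmp_equiv(
    "stringi", "STRINGI",
    opts_collator = stri_opts_collator(strength = 2)
)

# Arguments: functions are vectorized, missing values propagate,
# and non-character inputs are coerced to character vectors.
dup <- stri_dup(c("a", "b"), 1:2)   # "a" "bb"
len_na <- stri_length(NA)           # NA
len_num <- stri_length(123)         # 3, the length of "123"
```

Passing the locale and collator options explicitly, rather than relying on the session default, keeps the results of such calls reproducible across machines.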

Facilities available

Refer to the following:

Note that each man page provides many further links to other interesting facilities and topics.

Author(s)

Marek Gagolewski, with contributions from Bartek Tartanus and many others. ICU4C was developed by IBM, Unicode, Inc., and others.

References

stringi Package homepage, https://stringi.gagolewski.com/

ICU – International Components for Unicode, http://site.icu-project.org/

The Unicode Consortium, https://home.unicode.org/

UTF-8, a transformation format of ISO 10646 – RFC 3629, https://tools.ietf.org/html/rfc3629

See Also


stringi

Character String Processing Facilities

Version: 1.6.1

License: file LICENSE

Authors: Marek Gagolewski [aut, cre, cph] (<https://orcid.org/0000-0003-0637-6028>), Bartek Tartanus [ctb], and others (stringi source code); IBM, Unicode, Inc. and others (ICU4C source code, Unicode Character Database)

Initial release: 2021-05-05
