
about_search_fixed

Locale-Insensitive Fixed Pattern Matching in stringi


Description

The string searching facilities described here locate a specific sequence of bytes in a string. The search engine's settings can be tuned (for example, to perform a case-insensitive search) via a call to the stri_opts_fixed function.
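As a brief illustration of the description above, the following sketch runs a fixed-pattern search with default settings and then with case-insensitive matching switched on through stri_opts_fixed. The sample strings and variable names are made up for this example.

```r
library(stringi)

# Hypothetical sample data, used only for illustration
x <- c("Spam, spam, SPAM", "eggs")

# Default settings: an exact, case-sensitive match on "spam"
found <- stri_detect_fixed(x, "spam")
n_exact <- stri_count_fixed(x, "spam")

# Case-insensitive search, configured via stri_opts_fixed()
n_nocase <- stri_count_fixed(
  x, "spam",
  opts_fixed = stri_opts_fixed(case_insensitive = TRUE)
)

print(found)
print(n_exact)
print(n_nocase)
```

With default settings only the lowercase occurrence is counted; with case_insensitive = TRUE all three variants in the first string match.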

Byte Compare

The fast Knuth-Morris-Pratt search algorithm, with a worst-case time complexity of O(n+p) (where n == length(str) and p == length(pattern)), is implemented (with some tweaks for very short search patterns).
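To make the O(n+p) bound more concrete, here is a minimal base-R sketch of Knuth-Morris-Pratt matching over raw bytes. It is a textbook version written for illustration only: the function name kmp_find is hypothetical, and the code is not stringi's actual implementation, which lives in compiled code and includes tweaks not shown here.

```r
# Illustrative Knuth-Morris-Pratt search over raw bytes.
# NOT stringi's implementation; kmp_find is a hypothetical helper.
kmp_find <- function(text, pattern) {
  txt <- charToRaw(text)
  pat <- charToRaw(pattern)
  n <- length(txt)
  m <- length(pat)
  if (m == 0L || m > n) return(NA_integer_)

  # fail[i]: length of the longest proper prefix of pat[1:i]
  # that is also a suffix of pat[1:i]
  fail <- integer(m)
  k <- 0L
  for (i in seq_len(m)[-1]) {
    while (k > 0L && pat[k + 1L] != pat[i]) k <- fail[k]
    if (pat[k + 1L] == pat[i]) k <- k + 1L
    fail[i] <- k
  }

  # Scan the text once; k is the number of pattern bytes matched so far
  k <- 0L
  for (i in seq_len(n)) {
    while (k > 0L && pat[k + 1L] != txt[i]) k <- fail[k]
    if (pat[k + 1L] == txt[i]) k <- k + 1L
    if (k == m) return(i - m + 1L)  # 1-based byte offset of the first match
  }
  NA_integer_
}

pos_cad <- kmp_find("abracadabra", "cad")
pos_aab <- kmp_find("aaaab", "aab")
pos_none <- kmp_find("abc", "x")
print(c(pos_cad, pos_aab, pos_none))
```

The failure table is built in O(p) and the text is scanned once in O(n), because the text index never moves backwards. Note that this sketch reports byte offsets, which coincide with character positions only for ASCII input.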

Be aware that, for natural language processing, fixed pattern searching might not be what you actually need. This is because a byte-wise match will not give correct results in cases of:

  1. accented letters;

  2. conjoined letters;

  3. ignorable punctuation;

  4. ignorable case.
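The cases listed above can be reproduced with a short sketch. The sample strings are chosen for illustration, and the Unicode normalization step and the collator-based call are suggested workarounds that this page does not itself prescribe.

```r
library(stringi)

# Accented letter: one precomposed code point vs. base letter + combining mark
precomposed <- "\u0105"    # LATIN SMALL LETTER A WITH OGONEK
decomposed  <- "a\u0328"   # "a" followed by COMBINING OGONEK

# Canonically equivalent, but the byte sequences differ
accent_raw <- stri_detect_fixed(precomposed, decomposed)

# Possible workaround: normalize both sides to NFC before searching
accent_nfc <- stri_detect_fixed(
  stri_trans_nfc(precomposed),
  stri_trans_nfc(decomposed)
)

# Conjoined letters: the "fi" ligature (U+FB01) is not the bytes of "fi"
ligature_raw <- stri_detect_fixed("\ufb01", "fi")

# Case: different bytes, so no match with default settings
case_raw <- stri_detect_fixed("STRINGI", "stringi")

print(c(accent_raw, accent_nfc, ligature_raw, case_raw))

# A collator-based search is an alternative when linguistic matching matters;
# strength = 2 is assumed here to make the comparison ignore case differences
print(stri_detect_coll("STRINGI", "stringi", strength = 2))
```

Normalization helps only with canonically equivalent sequences. Ligatures, ignorable punctuation and case are better handled by a collator-based search, at the cost of speed.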

Note that the conversion of input data to Unicode is done as usual.

See Also

Other search_fixed: about_search, stri_opts_fixed()


stringi: Character String Processing Facilities

Version: 1.6.1
License: file LICENSE
Authors: Marek Gagolewski [aut, cre, cph] (<https://orcid.org/0000-0003-0637-6028>), Bartek Tartanus [ctb], and others (stringi source code); IBM, Unicode, Inc. and others (ICU4C source code, Unicode Character Database)
Initial release: 2021-05-05
