Extract a Substring From or Replace a Substring In a Character Vector
stri_sub extracts substrings at the given code point-based index ranges. Its replacement version, stri_sub<-, substitutes (in-place) parts of a string with the given replacement strings. stri_sub_replace is a variant suited to the magrittr pipe operator; it returns a modified copy of the input vector.
For extracting/replacing multiple substrings from/within each string, see stri_sub_all.
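The three entry points described above can be compared side by side. The following is a minimal sketch, assuming the stringi package is installed; the variable names are illustrative only.

```r
library(stringi)

s <- "hello world"

# Extraction: returns the substring, s is untouched
first_word <- stri_sub(s, 1, 5)

# Replacement function: modifies its target variable in place
modified <- s
stri_sub(modified, 1, 5) <- "HELLO"

# Pipe-friendly variant: returns a modified copy, s is untouched
copy <- stri_sub_replace(s, 1, 5, replacement = "HELLO")

print(first_word)  # "hello"
print(modified)    # "HELLO world"
print(copy)        # "HELLO world"
print(s)           # "hello world"
```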
stri_sub(str, from = 1L, to = -1L, length)
stri_sub(str, from = 1L, to = -1L, length, omit_na = FALSE) <- value
stri_sub_replace(..., replacement, value = replacement)
str | a character vector
from | an integer vector giving the start indexes, or a two-column matrix of type cbind(from, to)
to | an integer vector giving the end indexes; mutually exclusive with length and with from being a matrix
length | an integer vector giving the substring lengths; mutually exclusive with to and with from being a matrix
omit_na | a single logical value; indicates whether missing values in any of the indexes or in value leave the corresponding input string unchanged [replacement function only]
value | a character vector defining the replacement strings [replacement function only]
... | arguments to be passed to stri_sub<-
replacement | alias of value
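The effect of omit_na is easiest to see when an index is missing, for example when a pattern search finds no match. This is a small sketch based on my understanding of the replacement function; the input vector is made up for illustration.

```r
library(stringi)

x <- c("a1", "b")
m <- stri_locate_first_regex(x, "[0-9]")  # second row is NA (no digit in "b")

# Default (omit_na = FALSE): a missing index makes the whole string NA
y <- x
stri_sub(y, m) <- "#"

# omit_na = TRUE: strings with missing indexes are left unchanged
z <- x
stri_sub(z, m, omit_na = TRUE) <- "#"

print(y)  # "a#" NA
print(z)  # "a#" "b"
```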
Vectorized over str, [value], from and (to or length). Parameters to and length are mutually exclusive.
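As a brief illustration of the vectorization, shorter arguments are recycled against longer ones. The strings below are arbitrary examples.

```r
library(stringi)

# One string, several start positions: the string is recycled
a <- stri_sub("abcdef", from = 1:3, to = 4)

# Several strings, one start position, several lengths
b <- stri_sub(c("abcdef", "uvwxyz"), from = 2, length = 1:2)

print(a)  # "abcd" "bcd" "cd"
print(b)  # "b" "vw"
```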
Indexes are 1-based, i.e., the start of a string is at index 1. For negative indexes in from or to, counting starts at the end of the string. For instance, index -1 denotes the last code point in the string. Non-positive length gives an empty string.
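The indexing rules above can be checked on a short string. A minimal sketch; only a zero length is shown for the non-positive case.

```r
library(stringi)

s <- "stringi"  # 7 code points

last_char  <- stri_sub(s, -1)             # from = -1, to defaults to -1
last_three <- stri_sub(s, -3, -1)         # third-from-last through last
inner      <- stri_sub(s, 2, -2)          # mix of positive and negative indexes
empty      <- stri_sub(s, 1, length = 0)  # zero length

print(last_char)   # "i"
print(last_three)  # "ngi"
print(inner)       # "tring"
print(empty)       # ""
```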
Argument from gives the start of a substring to extract. Argument to defines the last index of a substring, inclusive. Alternatively, its length may be provided.
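Because to is inclusive, a range given by to and the same range given by length should select the same code points. A short sketch with an arbitrary string:

```r
library(stringi)

s <- "stringi"

by_to     <- stri_sub(s, from = 3, to = 5)      # code points 3, 4, 5
by_length <- stri_sub(s, from = 3, length = 3)  # three code points from 3

print(by_to)      # "rin"
print(by_length)  # "rin"
```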
If from is a two-column matrix, then these two columns are used as from and to, respectively, and anything passed explicitly as from or to is ignored. Such types of index matrices are generated by stri_locate_first and stri_locate_last. If extraction based on stri_locate_all is needed, see stri_sub_all.
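To illustrate the matrix form, the output of a stri_locate_first_* call can be passed directly as from. This is a sketch with made-up input; strings without a match yield NA.

```r
library(stringi)

x <- c("abc 123 def", "no digits", "45")

# Two-column (start, end) matrix; the row for "no digits" is NA, NA
m <- stri_locate_first_regex(x, "[0-9]+")

digits <- stri_sub(x, m)
print(digits)  # "123" NA "45"
```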
In stri_sub, out-of-bound indexes are silently corrected. If from > to, then an empty string is returned. In stri_sub<-, some configurations of indexes may work as substring 'injection' at the front, at the back, or in the middle.
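The extraction rules for out-of-bound and reversed ranges can be sketched as follows. The example covers stri_sub only; it does not attempt to enumerate the index configurations that act as injection in stri_sub<-.

```r
library(stringi)

s <- "abc"

clipped   <- stri_sub(s, 2, 100)  # end index beyond the string is corrected
reversed  <- stri_sub(s, 3, 2)    # from > to
past_end  <- stri_sub(s, 5, 7)    # range lies entirely beyond the string

print(clipped)   # "bc"
print(reversed)  # ""
print(past_end)  # ""
```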
If both to and length are provided, length has priority over to.
Note that for some Unicode strings, the extracted substrings might not be well-formed, especially if input strings are not NFC-normalized (see stri_trans_nfc) or include byte order marks, bidirectional text marks, and so on. Handle with care.
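One way the caveat above can show up is with combining characters: an accented letter written as a base letter plus a combining mark occupies two code points, so a one-code-point substring drops the accent. A small sketch, using a single decomposed character as the example:

```r
library(stringi)

decomposed <- "e\u0301"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed   <- stri_trans_nfc(decomposed)  # single code point U+00E9

n_decomposed <- stri_length(decomposed)
n_composed   <- stri_length(composed)

# Index 1 selects only the base letter in the decomposed form
part_decomposed <- stri_sub(decomposed, 1, 1)
part_composed   <- stri_sub(composed, 1, 1)

print(n_decomposed)            # 2
print(n_composed)              # 1
print(part_decomposed == "e")  # TRUE: the accent was cut off
print(part_composed == "\u00e9")  # TRUE
```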
stri_sub and stri_sub_replace return a character vector. stri_sub<- changes the str object in-place.
Other indexing: stri_locate_all_boundaries(), stri_locate_all(), stri_sub_all()
library(stringi)

s <- 'Lorem ipsum dolor sit amet, consectetur adipisicing elit.'
stri_sub(s, from=1:3*6, to=21)
stri_sub(s, from=c(1,7,13), length=5)
stri_sub(s, from=1, length=1:3)
stri_sub(s, -17, -7)
stri_sub(s, -5, length=4)
(stri_sub(s, 1, 5) <- 'stringi')
(stri_sub(s, -6, length=5) <- '.')
(stri_sub(s, 1, 1:3) <- 1:2)

x <- c('12 3456 789', 'abc', '', NA, '667')
stri_sub(x, stri_locate_first_regex(x, '[0-9]+')) # see stri_extract_first
stri_sub(x, stri_locate_last_regex(x, '[0-9]+'))  # see stri_extract_last
stri_sub_replace(x, stri_locate_first_regex(x, '[0-9]+'),
    omit_na=TRUE, replacement='***') # see stri_replace_first
stri_sub_replace(x, stri_locate_last_regex(x, '[0-9]+'),
    omit_na=TRUE, replacement='***') # see stri_replace_last
stri_sub(x, stri_locate_first_regex(x, '[0-9]+'), omit_na=TRUE) <- '***'
print(x)

## Not run:
x %>% stri_sub_replace(1, 5, replacement='new_substring')