Positions of possibly degenerated motifs within sequences
words.pos searches for all occurrences of the motif pattern within the sequence text and returns their positions. This function is based on regexp, thus allowing for complex motif searches. The main difference with gregexpr is that non-disjoint (overlapping) matches are reported here.
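The overlapping-match behaviour described above can be illustrated with a minimal base R sketch. This is not the seqinr source: the helper name overlapping_pos is hypothetical, and the restart-after-each-hit strategy is only an assumption about one way to obtain such positions. It calls regexpr repeatedly, resuming the search one character after the start of the previous hit instead of after its end.

```r
# Hypothetical helper (not part of seqinr): report the start positions of
# all matches of `pattern` in `text`, including overlapping ones.
overlapping_pos <- function(pattern, text, perl = TRUE, ...) {
  positions <- integer(0)
  offset <- 0L                      # number of characters already skipped
  n <- nchar(text)
  while (offset < n) {
    # Search only in the part of the text that starts after the last hit
    rest <- substring(text, offset + 1L)
    hit <- as.integer(regexpr(pattern, rest, perl = perl, ...))[1]
    if (hit == -1L) break           # no further match
    positions <- c(positions, offset + hit)
    # Resume one character after the START of this match, not after its
    # end, so that overlapping matches are also found
    offset <- offset + hit
  }
  positions
}

overlapping_pos("toto", "totototo")      # 1 3 5
overlapping_pos("[ct][ag]", "tatagaga")  # 1 3
```

Because each search runs on a suffix of the text, anchors such as ^ apply to that suffix rather than to the original string, so this sketch is only suitable for unanchored motifs.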
words.pos(pattern, text, ignore.case = FALSE, perl = TRUE, fixed = FALSE, useBytes = TRUE, ...)
pattern | character string containing a regular expression (or character string for
text | a character vector where matches are sought.
ignore.case | if
perl | logical. Should perl-compatible regexps be used if available? Has priority over
fixed | logical. If
useBytes | logical. If
... | arguments passed to
Default parameter values have been tuned for speed when working with biological sequences.
a vector of positions for which the motif pattern was found in the sequence text.
J.R. Lobry
citation("seqinr")
myseq <- "tatagaga"
words.pos("t", myseq) # Should be 1 3
words.pos("tag", myseq) # Should be 3
words.pos("ga", myseq) # Should be 5 7
# How to specify an ambiguous base? Look for YpR motifs by
words.pos("[ct][ag]", myseq) # Should be 1 3
#
# Show the difference with gregexpr:
#
words.pos("toto", "totototo") # 1 3 5 (three overlapping matches)
unlist(gregexpr("toto", "totototo")) # 1 5 (two disjoint matches)