
amb

Expansion of IUPAC nucleotide symbols


Description

This function returns the list of nucleotides matching a given IUPAC nucleotide symbol, for instance c("c", "g") for "s".

Usage

amb(base, forceToLower = TRUE, checkBase = TRUE,
IUPAC = s2c("acgturymkswbdhvn"), u2t = TRUE)

Arguments

base

an IUPAC symbol for a nucleotide as a single character

forceToLower

if TRUE the base is forced to lower case

checkBase

if TRUE the character is checked to belong to the allowed IUPAC symbol list

IUPAC

the list of allowed IUPAC symbols

u2t

if TRUE, "u" for uracil in RNA is changed into "t" for thymine in DNA
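
As a rough illustration of how forceToLower and checkBase interact, here is a minimal sketch, assuming the seqinr package is installed. The helper name fails is hypothetical and only wraps try(); the sketch assumes that a failed check raises an error and prints whether it did, rather than stating the result.

```r
library(seqinr)

# With the default forceToLower = TRUE, an upper case symbol is folded to
# lower case first, so "S" and "s" expand identically.
identical(amb("S"), amb("s"))

# Hypothetical helper: TRUE when the wrapped call stops with an error.
fails <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

# The default IUPAC list holds lower case symbols only, so without case
# folding "S" is not in the list that checkBase = TRUE tests against.
fails(amb("S", forceToLower = FALSE))

# "x" is not an IUPAC nucleotide symbol either.
fails(amb("x"))
```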

Details

Non-ambiguous bases are returned unchanged (except for "u" when u2t is TRUE).
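
A short sketch of this behaviour, assuming the seqinr package is installed and attached. The values in the comments follow from the Description and Details of this page.

```r
library(seqinr)

# An ambiguous symbol expands to the bases it stands for.
amb("s")               # "c" "g"

# Non-ambiguous bases come back unchanged ...
amb("a")               # "a"

# ... except "u", which becomes "t" under the default u2t = TRUE.
amb("u")               # "t"
amb("u", u2t = FALSE)  # "u"
```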

Value

When base is missing, the list of allowed IUPAC symbols is returned; otherwise, a character vector with the expanded symbols is returned.

Author(s)

J.R. Lobry

References

The nomenclature for incompletely specified bases in nucleic acid sequences at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC341218/

citation("seqinr")

See Also

See bma for the reverse operation. Use tolower to change upper case letters into lower case letters.
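
To illustrate the relationship between the two functions, the following sketch expands a symbol with amb and collapses the result again with bma. It assumes the seqinr package is installed and that bma is called with its default arguments.

```r
library(seqinr)

# Expand the ambiguous symbol "s" into its constituent bases ...
expanded <- amb("s")   # "c" "g"

# ... then collapse that set of bases back into a single IUPAC symbol.
collapsed <- bma(expanded)
collapsed
```

Note that the round trip is not expected to be exact for "u": with the default u2t = TRUE, amb("u") yields "t", so the original symbol cannot be recovered from the expansion.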

Examples

#
# The list of IUPAC symbols:
#

amb()

#
# And their expansion:
#

sapply(amb(), amb)

seqinr

Biological Sequences Retrieval and Analysis

Version: 4.2-16
License: GPL (>= 2)
Authors: Delphine Charif [aut], Olivier Clerc [ctb], Carolin Frank [ctb], Jean R. Lobry [aut, cph], Anamaria Necşulea [ctb], Leonor Palmeira [ctb], Simon Penel [cre], Guy Perrière [ctb]
Initial release: 2022-05-19
