Read FASTA Formatted Sequences
Read aligned or un-aligned sequences from a FASTA format file.
read.fasta(file, rm.dup = TRUE, to.upper = FALSE, to.dash=TRUE)
file | input sequence file.
rm.dup | logical, if TRUE duplicate sequences (with the same names/ids) will be removed.
to.upper | logical, if TRUE residues are forced to uppercase.
to.dash | logical, if TRUE ‘.’ gap characters are converted to ‘-’ gap characters.
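As a rough illustration of how the three logical arguments interact, the sketch below writes a small FASTA file to a temporary location and reads it back twice. The file contents and the sequence names (seqA, seqB, seqC) are invented for this example, and it assumes the bio3d package is installed.

```r
library(bio3d)

# Hypothetical toy alignment: seqA appears twice, and seqB uses '.' as a gap.
fa <- tempfile(fileext = ".fa")
writeLines(c(">seqA", "acdef",
             ">seqB", "ac.ef",
             ">seqC", "ac-ef",
             ">seqA", "aaaaa"), fa)

# Drop the repeated seqA record, force uppercase, and convert '.' to '-'.
aln <- read.fasta(fa, rm.dup = TRUE, to.upper = TRUE, to.dash = TRUE)

# One row per retained sequence, one column per alignment position.
dim(aln$ali)
aln$ali

# Keep everything as written in the file, for comparison.
raw <- read.fasta(fa, rm.dup = FALSE, to.upper = FALSE, to.dash = FALSE)
dim(raw$ali)
raw$ali
```

Comparing `aln$ali` with `raw$ali` shows what each flag changed: the number of rows reflects rm.dup, the letter case reflects to.upper, and the gap symbol in seqB reflects to.dash.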
A list with three components:
ali | an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.
ids | sequence names as identifiers.
call | the matched call.
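Because ali is an ordinary character matrix, base R indexing and summaries apply to it directly. The sketch below uses a small hand-built matrix as a stand-in for the ali component (the sequence names and residues are invented), so it runs without any input file or package.

```r
# Hand-built stand-in for the 'ali' component: rows are sequences,
# columns are alignment positions (names and residues are invented).
ali <- rbind(seqA = c("A", "C", "D", "-", "F"),
             seqB = c("A", "C", "-", "-", "F"),
             seqC = c("A", "G", "D", "E", "F"))

# Fraction of gap characters at each alignment position.
gap.frac <- colMeans(ali == "-")

# Positions (columns) that contain no gaps in any sequence.
nogap.cols <- which(gap.frac == 0)

# Collapse each row back into a single sequence string.
seqs <- apply(ali, 1, paste, collapse = "")

gap.frac
nogap.cols
seqs
```

The same expressions should work on the matrix returned by read.fasta, for example by substituting `aln$ali` for `ali`.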
For a description of FASTA format see: https://www.ncbi.nlm.nih.gov/BLAST/blastcgihelp.shtml. When reading alignment files, the dash ‘-’ is interpreted as the gap character.
Barry Grant
Grant, B.J. et al. (2006) Bioinformatics 22, 2695–2696.
library(bio3d)

# Read alignment
aln <- read.fasta(system.file("examples/hivp_xray.fa", package="bio3d"))

# Print alignment overview
aln

# Sequence names/ids
head( aln$id )

# Alignment positions 33 to 39
head( aln$ali[,33:39] )

# Sequence d2a4f_b
aa123( aln$ali["d2a4f_b",] )

# Write out positions 30 to 45 only
#aln$ali=aln$ali[,30:45]
#write.fasta(aln, file="eg2.fa")