Manipulate indexed fasta files.
Use FaFile()
to create a reference to an indexed fasta
file. The reference remains open across calls to methods, avoiding
costly index re-loading.
FaFileList()
provides a convenient way of managing a list of
FaFile
instances.
## Constructors FaFile(file, index=sprintf("%s.fai", file), gzindex=sprintf("%s.gzi", file)) FaFileList(...) ## Opening / closing ## S3 method for class 'FaFile' open(con, ...) ## S3 method for class 'FaFile' close(con, ...) ## accessors; also path(), index() ## S4 method for signature 'FaFile' gzindex(object, asNA=TRUE) ## S4 method for signature 'FaFileList' gzindex(object, asNA=TRUE) ## S4 method for signature 'FaFile' isOpen(con, rw="") ## actions ## S4 method for signature 'FaFile' indexFa(file, ...) ## S4 method for signature 'FaFile' scanFaIndex(file, ...) ## S4 method for signature 'FaFileList' scanFaIndex(file, ..., as=c("GRangesList", "GRanges")) ## S4 method for signature 'FaFile' seqinfo(x) ## S4 method for signature 'FaFile' countFa(file, ...) ## S4 method for signature 'FaFile,GRanges' scanFa(file, param, ..., as=c("DNAStringSet", "RNAStringSet", "AAStringSet")) ## S4 method for signature 'FaFile,IntegerRangesList' scanFa(file, param, ..., as=c("DNAStringSet", "RNAStringSet", "AAStringSet")) ## S4 method for signature 'FaFile,missing' scanFa(file, param, ..., as=c("DNAStringSet", "RNAStringSet", "AAStringSet")) ## S4 method for signature 'FaFile' getSeq(x, param, ...) ## S4 method for signature 'FaFileList' getSeq(x, param, ...)
con, object, x |
An instance of |
file, index, gzindex |
A character(1) vector of the fasta or fasta index
or bgzip index file path (for |
asNA |
logical indicating if missing output should be NA or character() |
param |
An optional |
... |
Additional arguments.
|
rw |
Mode of file; ignored. |
as |
A character(1) vector indicating the type of object to return.
|
Objects are created by calls of the form FaFile()
.
The FaFile
class inherits fields from the
RsamtoolsFile
class.
FaFileList
inherits methods from
RsamtoolsFileList
and SimpleList
.
Opening / closing:
Opens the (local or remote) path
and
index
files. Returns a FaFile
instance.
Closes the FaFile
con
; returning
(invisibly) the updated FaFile
. The instance may be
re-opened with open.FaFile
.
Accessors:
Returns a character(1) vector of the fasta path name.
Returns a character(1) vector of fasta index name (minus the '.fai' extension).
Methods:
Visit the path in path(file)
and create an
index file (with the extension ‘.fai’).
Read the sequence names and and widths of
recorded in an indexed fasta file, returning the information as a
GRanges
object.
Consult the index file for defined sequences
(seqlevels()
) and lengths (seqlengths()
).
Return the number of records in the fasta file.
Return the sequences indicated by param
as a
DNAStringSet
instance. seqnames(param)
selects the sequences to return; start(param)
and
end{param}
define the (1-based) region of the sequence to
return. Values of end(param)
greater than the width of the
sequence cause an error; use seqlengths(FaFile(file))
to
discover sequence widths. When param
is missing, all
records are selected. When length(param)==0
no records are
selected.
Returns the sequences indicated by param
from
the indexed fasta file(s) of file
.
For the FaFile
method, the return type is a
DNAStringSet
. The getSeq,FaFile
and
scanFa,FaFile,GRanges
methods differ in that getSeq
will reverse complement sequences selected from the minus strand.
For the FaFileList
method, the param
argument must
be a GRangesList
of the same length as file
,
creating a one-to-one mapping between the ith element of
file
and the ith element of param
; the return type
is a SimpleList
of DNAStringSet
instances, with
elements of the list in the same order as the input elements.
Compactly display the object.
Martin Morgan
fl <- system.file("extdata", "ce2dict1.fa", package="Rsamtools", mustWork=TRUE) fa <- open(FaFile(fl)) # open countFa(fa) (idx <- scanFaIndex(fa)) (dna <- scanFa(fa, param=idx[1:2])) ranges(idx) <- narrow(ranges(idx), -10) # last 10 nucleotides (dna <- scanFa(fa, param=idx[1:2]))
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.