Manipulate tabix-indexed tab-delimited files.
Use TabixFile() to create a reference to a Tabix file (and its index). Once opened, the reference remains open across calls to methods, avoiding costly index re-loading.
TabixFileList() provides a convenient way of managing a list of TabixFile instances.
## Constructors

TabixFile(file, index = paste(file, "tbi", sep="."), ...,
    yieldSize=NA_integer_)
TabixFileList(...)

## Opening / closing

## S3 method for class 'TabixFile'
open(con, ...)
## S3 method for class 'TabixFile'
close(con, ...)

## accessors; also path(), index(), yieldSize()

## S4 method for signature 'TabixFile'
isOpen(con, rw="")

## actions

## S4 method for signature 'TabixFile'
seqnamesTabix(file, ...)
## S4 method for signature 'TabixFile'
headerTabix(file, ...)
## S4 method for signature 'TabixFile,GRanges'
scanTabix(file, ..., param)
## S4 method for signature 'TabixFile,IntegerRangesList'
scanTabix(file, ..., param)
## S4 method for signature 'TabixFile,missing'
scanTabix(file, ..., param)
## S4 method for signature 'character,ANY'
scanTabix(file, ..., param)
## S4 method for signature 'character,missing'
scanTabix(file, ..., param)

countTabix(file, ...)
con |
An instance of |
file |
For TabixFile(), a character(1) vector giving the tabix file
path; can be remote (http://, ftp://). For |
index |
A character(1) vector of the tabix file index. |
yieldSize |
Number of records to yield each time the file is read
from using |
param |
An instance of GRanges or IntegerRangesList, used to select which records to scan. |
... |
Additional arguments. For |
rw |
character() indicating mode of file; not used for |
Objects are created by calls of the form TabixFile().
The TabixFile class inherits fields from the RsamtoolsFile class.
TabixFileList inherits methods from RsamtoolsFileList and SimpleList.
Opening / closing:
open.TabixFile: Opens the (local or remote) path and index. Returns a TabixFile instance. yieldSize determines the number of records parsed during each call to scanTabix; NA indicates that all records are to be parsed.
close.TabixFile: Closes the TabixFile con, returning (invisibly) the updated TabixFile. The instance may be re-opened with open.TabixFile.
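The open / close life cycle described above can be sketched as follows. This is a minimal illustration, assuming the Rsamtools package is installed and reusing the example.gtf.gz file shipped with it; the variable names (was_open, now_open, closed_again) are chosen here for illustration only.

```r
library(Rsamtools)

## Example file shipped with Rsamtools; its .tbi index sits next to it
fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)

tbx <- TabixFile(fl)            # create the reference; nothing is opened yet
was_open <- isOpen(tbx)

tbx <- open(tbx)                # load the index once
now_open <- isOpen(tbx)

## Several calls can now reuse the open reference
sn <- seqnamesTabix(tbx)
n <- countTabix(tbx)

tbx <- close(tbx)               # release the file and index
closed_again <- !isOpen(tbx)
```

Calling close() on a reference that is not open may signal an error, so it can be worth guarding the call with isOpen() in longer scripts.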
Accessors:
path: Returns a character(1) vector of the tabix path name.
index: Returns a character(1) vector of the tabix index name.
yieldSize, yieldSize<-: Return or set an integer(1) vector indicating yield size.
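As a brief sketch of the accessors, assuming the Rsamtools package is installed and again using its bundled example.gtf.gz file; the variable names p, i, before and after are illustrative.

```r
library(Rsamtools)

fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)
tbx <- TabixFile(fl, yieldSize=100L)

p <- path(tbx)                  # character(1): path of the tabix file
i <- index(tbx)                 # character(1): path of the index (default: <file>.tbi)

before <- yieldSize(tbx)        # value given at construction
yieldSize(tbx) <- 50L           # change the number of records yielded per read
after <- yieldSize(tbx)
```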
Methods:
seqnamesTabix: Visit the path in path(file), returning the sequence names present in the file.
headerTabix: Visit the path in path(file), returning the sequence names, column indices used to sort the file, the number of lines skipped while indexing, the comment character used while indexing, and the header lines (preceded by the comment character, at the start of the file).
countTabix: Return the number of records in each range of param, or the count of all records in the file (when param is missing).
scanTabix: For signature(file="TabixFile"), visit the path in path(file), returning the result of scanTabix applied to the specified path. For signature(file="character"), call the corresponding method after coercing file to TabixFile.
indexTabix: This method operates on file paths, rather than TabixFile objects, to index tab-separated files. See indexTabix.
show: Compactly display the object.
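The following sketch ties indexTabix to the inspection methods listed above: it compresses and indexes a tab-delimited file, then looks at the result with seqnamesTabix and headerTabix. It assumes the Rsamtools package is installed and uses its bundled ex1.sam file; bgzip() and indexTabix() are Rsamtools functions documented on the indexTabix page, and the temporary destination path is an arbitrary choice.

```r
library(Rsamtools)

## A plain-text, tab-delimited input shipped with Rsamtools
from <- system.file("extdata", "ex1.sam", package="Rsamtools",
                    mustWork=TRUE)

## tabix needs bgzip-compressed input; write it to a temporary file
to <- tempfile()
zipped <- bgzip(from, to)

## Build the index, telling tabix that the file is in SAM format
idx <- indexTabix(zipped, "sam")

## Refer to the compressed file and its index, then inspect them
tbx <- TabixFile(zipped, idx)
sn <- seqnamesTabix(tbx)        # sequence names found in the index
hdr <- headerTabix(tbx)         # list describing how the file was indexed
str(hdr)
```

Passing the index path explicitly to TabixFile() keeps the example independent of the default index naming.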
Martin Morgan
fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)
tbx <- TabixFile(fl)

param <- GRanges(c("chr1", "chr2"), IRanges(c(1, 1), width=100000))
countTabix(tbx)
countTabix(tbx, param=param)
res <- scanTabix(tbx, param=param)
sapply(res, length)
res[["chr1:1-100000"]][1:2]

## parse to list of data.frame's
dff <- Map(function(elt) {
    read.csv(textConnection(elt), sep="\t", header=FALSE)
}, res)
dff[["chr1:1-100000"]][1:5,1:8]

## parse 100 records at a time
length(scanTabix(tbx)[[1]]) # total number of records
tbx <- open(TabixFile(fl, yieldSize=100))
while(length(res <- scanTabix(tbx)[[1]]))
    cat("records read:", length(res), "\n")
close(tbx)