Encode the overlaps between RNA-seq reads and the transcripts of a gene model
In the context of an RNA-seq experiment, encoding the overlaps between the aligned reads and the transcripts of a given gene model can be used for detecting those overlaps that are compatible with the splicing of the transcript.
The central tool for this is the encodeOverlaps
method for
GRangesList objects, which computes the "overlap
encodings" between a query
and a subject
, both list-like
objects with list elements containing multiple ranges.
Other related utilities are also documented in this man page.
encodeOverlaps(query, subject, hits=NULL, ...) ## S4 method for signature 'GRangesList,GRangesList' encodeOverlaps(query, subject, hits=NULL, flip.query.if.wrong.strand=FALSE) ## Related utilities: flipQuery(x, i) selectEncodingWithCompatibleStrand(ovencA, ovencB, query.strand, subject.strand, hits=NULL) isCompatibleWithSkippedExons(x, max.skipped.exons=NA) extractSteppedExonRanks(x, for.query.right.end=FALSE) extractSpannedExonRanks(x, for.query.right.end=FALSE) extractSkippedExonRanks(x, for.query.right.end=FALSE) extractQueryStartInTranscript(query, subject, hits=NULL, ovenc=NULL, flip.query.if.wrong.strand=FALSE, for.query.right.end=FALSE)
query, subject |
Typically GRangesList objects representing the
the aligned reads and the transcripts of a given gene model, respectively.
If the 2 objects don't have the same length, and if the More generally speaking, |
hits |
An optional Hits object typically obtained from a
previous call to Strictly speaking, Supplying |
... |
Additional arguments for methods. |
flip.query.if.wrong.strand |
See the "OverlapEncodings" vignette located in this package (GenomicAlignments). |
x |
For For |
i |
Subscript specifying the elements in |
ovencA, ovencB, ovenc |
OverlapEncodings objects. |
query.strand, subject.strand |
Vector-like objects containing the strand of the query and subject, respectively. |
max.skipped.exons |
Not supported yet. If |
for.query.right.end |
If |
See ?OverlapEncodings
for a short introduction to "overlap encodings".
The topic of working with overlap encodings is covered in details
in the "OverlapEncodings" vignette located this package
(GenomicAlignments) and accessible with
vignette("OverlapEncodings")
.
For encodeOverlaps
: An OverlapEncodings object.
If hits
is not supplied, this object is parallel to the
longest of query
and subject
, that is, it has the length
of the longest and the i-th encoding in it corresponds to the i-th element
in the longest.
If hits
is supplied, then the returned object is parallel
to it, that is, it has one encoding per hit.
For flipQuery
: TODO
For selectEncodingWithCompatibleStrand
: TODO
For isCompatibleWithSkippedExons
: A logical vector parallel
to x
.
For extractSteppedExonRanks
, extractSpannedExonRanks
, and
extractSkippedExonRanks
: TODO
For extractQueryStartInTranscript
: TODO
Hervé Pagès
The OverlapEncodings class for a brief introduction to "overlap encodings".
The Hits class defined and documented in the S4Vectors package.
The "OverlapEncodings" vignette in this package.
findCompatibleOverlaps
for a specialized version
of findOverlaps
that uses
encodeOverlaps
internally to keep only the hits where
the junctions in the aligned read are compatible with
the splicing of the annotated transcript.
The GRangesList class defined and documented in the GenomicRanges package.
The findOverlaps
generic function defined
in the IRanges package.
## --------------------------------------------------------------------- ## A. BETWEEN 2 IntegerRangesList OBJECTS ## --------------------------------------------------------------------- ## In the context of an RNA-seq experiment, encoding the overlaps ## between 2 GRangesList objects, one containing the reads (the query), ## and one containing the transcripts (the subject), can be used for ## detecting hits between reads and transcripts that are "compatible" ## with the splicing of the transcript. Here we illustrate this with 2 ## IntegerRangesList objects, in order to keep things simple: ## 4 aligned reads in the query: read1 <- IRanges(c(7, 15, 22), c(9, 19, 23)) # 2 junctions read2 <- IRanges(c(5, 15), c(9, 17)) # 1 junction read3 <- IRanges(c(16, 22), c(19, 24)) # 1 junction read4 <- IRanges(c(16, 23), c(19, 24)) # 1 junction query <- IRangesList(read1, read2, read3, read4) ## 1 transcript in the subject: tx <- IRanges(c(1, 4, 15, 22, 38), c(2, 9, 19, 25, 47)) # 5 exons subject <- IRangesList(tx) ## Encode the overlaps: ovenc <- encodeOverlaps(query, subject) ovenc encoding(ovenc) ## --------------------------------------------------------------------- ## B. BETWEEN 2 GRangesList OBJECTS ## --------------------------------------------------------------------- ## With real RNA-seq data, the reads and transcripts will typically be ## stored in GRangesList objects. Please refer to the "OverlapEncodings" ## vignette in this package for realistic examples.
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.