Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

findCompatibleOverlaps-methods

Finding hits between reads and transcripts that are compatible with the splicing of the transcript


Description

In the context of an RNA-seq experiment, findCompatibleOverlaps (or countCompatibleOverlaps) can be used for finding (or counting) hits between reads and transcripts that are compatible with the splicing of the transcript.

Usage

findCompatibleOverlaps(query, subject)
countCompatibleOverlaps(query, subject)

Arguments

query

A GAlignments or GAlignmentPairs object representing the aligned reads.

subject

A GRangesList object representing the transcripts.

Details

findCompatibleOverlaps is a specialized version of findOverlaps that uses encodeOverlaps internally to keep only the hits where the junctions in the aligned read are compatible with the splicing of the annotated transcript.

The topic of working with overlap encodings is covered in details in the "OverlapEncodings" vignette located this package (GenomicAlignments) and accessible with vignette("OverlapEncodings").

Value

A Hits object for findCompatibleOverlaps.

An integer vector parallel to (i.e. same length as) query for countCompatibleOverlaps.

Author(s)

Hervé Pagès

See Also

Examples

## Here we only show a simple example illustrating the use of
## countCompatibleOverlaps() on a very small data set. Please
## refer to the "OverlapEncodings" vignette in the GenomicAlignments
## package for a comprehensive presentation of "overlap
## encodings" and related tools/concepts (e.g. "compatible"
## overlaps, "almost compatible" overlaps etc...), and for more
## examples.

## sm_treated1.bam contains a small subset of treated1.bam, a BAM
## file containing single-end reads from the "Pasilla" experiment
## (RNA-seq, Fly, see the pasilla data package for the details)
## and aligned to reference genome BDGP Release 5 (aka dm3 genome on
## the UCSC Genome Browser):
sm_treated1 <- system.file("extdata", "sm_treated1.bam",
                           package="GenomicAlignments", mustWork=TRUE)

## Load the alignments:
flag0 <- scanBamFlag(isDuplicate=FALSE, isNotPassingQualityControls=FALSE)
param0 <- ScanBamParam(flag=flag0)
gal <- readGAlignments(sm_treated1, use.names=TRUE, param=param0)

## Load the transcripts (IMPORTANT: Like always, the reference genome
## of the transcripts must be *exactly* the same as the reference
## genome used to align the reads):
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
exbytx <- exonsBy(txdb, by="tx", use.names=TRUE)

## Number of "compatible" transcripts per alignment in 'gal':
gal_ncomptx <- countCompatibleOverlaps(gal, exbytx)
mcols(gal)$ncomptx <- gal_ncomptx
table(gal_ncomptx)
mean(gal_ncomptx >= 1)
## --> 33% of the alignments in 'gal' are "compatible" with at least
## 1 transcript in 'exbytx'.

## Keep only alignments compatible with at least 1 transcript in
## 'exbytx':
compgal <- gal[gal_ncomptx >= 1]
head(compgal)

GenomicAlignments

Representation and manipulation of short genomic alignments

v1.26.0
Artistic-2.0
Authors
Hervé Pagès, Valerie Obenchain, Martin Morgan
Initial release

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.