Extract non-overlapping exonic or intronic parts from a TxDb-like object
exonicParts and intronicParts extract the non-overlapping (a.k.a. disjoint) exonic or intronic parts from a TxDb-like object.
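To make "disjoint parts" concrete, here is a minimal sketch on a toy GRanges object rather than a real TxDb. The two overlapping ranges stand in for exons and are made up for illustration. It uses disjoin() from the GenomicRanges/IRanges packages, which is listed among the related functions for this page; it is not meant to reproduce what exonicParts() does internally.

```r
library(GenomicRanges)

## Two hypothetical overlapping exons on the same chromosome.
exons <- GRanges("chr1", IRanges(start=c(1, 5), end=c(10, 15)))

## disjoin() cuts the ranges at every start/end boundary so that the
## resulting parts no longer overlap. With with.revmap=TRUE, each part
## remembers which input ranges it came from.
parts <- disjoin(exons, with.revmap=TRUE)
parts

isDisjoint(exons)  # FALSE: the input ranges overlap
isDisjoint(parts)  # TRUE: the parts do not
```

The middle part (positions 5 to 10) maps back to both input ranges, which is the toy analogue of an exonic part shared by several exons.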
exonicParts(txdb, linked.to.single.gene.only=FALSE)
intronicParts(txdb, linked.to.single.gene.only=FALSE)

## 3 helper functions used internally by exonicParts() and intronicParts():
tidyTranscripts(txdb, drop.geneless=FALSE)
tidyExons(txdb, drop.geneless=FALSE)
tidyIntrons(txdb, drop.geneless=FALSE)
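txdb | A TxDb object, or any TxDb-like object that supports the transcripts() and exonsBy() extractors (e.g. an EnsDb object).
linked.to.single.gene.only | TRUE or FALSE. If FALSE (the default), the disjoint parts are computed from all the exons (or introns) in txdb, including those linked to no gene or to more than one gene, and the gene_id metadata column is a CharacterList. If TRUE, only the parts linked to exactly one gene are returned, the gene_id metadata column is a character vector, and an exonic_part (or intronic_part) metadata column is added.
drop.geneless | TRUE or FALSE. If FALSE (the default), all the transcripts (or exons, or introns) are extracted from txdb. If TRUE, only those linked to a gene are extracted. Note that drop.geneless also affects the order of the returned features: with FALSE they are ordered by tx_id (then by exon_rank for exons); with TRUE they are ordered first by gene_id, then by tx_id (then by exon_rank for exons).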
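The effect of linked.to.single.gene.only=TRUE can be sketched on toy data. The ranges and gene ids below are hypothetical, and the steps (drop gene-less exons, disjoin, keep parts linked to exactly one gene) are one plausible way to obtain such parts, not necessarily how exonicParts() is implemented.

```r
library(GenomicRanges)

## Hypothetical exons: two overlapping exons from different genes and
## one exon that is not linked to any gene (gene_id is NA).
exons <- GRanges("chr1",
                 IRanges(start=c(1, 8, 30), end=c(10, 20, 40)),
                 gene_id=c("g1", "g2", NA))

## Assumed step 1: keep only exons linked to a gene, then disjoin them.
linked <- exons[!is.na(exons$gene_id)]
parts <- disjoin(linked, with.revmap=TRUE)

## Collect the gene ids behind each part (a CharacterList, many-to-many).
genes <- unique(extractList(linked$gene_id, parts$revmap))
lengths(genes)

## Assumed step 2: keep only the parts linked to exactly one gene, so
## that gene_id can be stored as a plain character vector.
keep <- lengths(genes) == 1L
single <- parts[keep]
single$gene_id <- unlist(genes[keep], use.names=FALSE)
single$revmap <- NULL
single
```

The part shared by g1 and g2 (positions 8 to 10) is dropped, which mirrors the many-to-one mapping from parts to genes described for linked.to.single.gene.only=TRUE.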
exonicParts returns a disjoint and strictly sorted GRanges object with 1 range per exonic part and with metadata columns tx_id, tx_name, gene_id, exon_id, exon_name, and exon_rank. If linked.to.single.gene.only was set to TRUE, an additional exonic_part metadata column is added that indicates the rank of each exonic part within all the exonic parts linked to the same gene.
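As an illustration of how the exonic_part rank might be used, the sketch below builds a per-gene label for each part. The GRanges object is a hand-made stand-in for the output of exonicParts(txdb, linked.to.single.gene.only=TRUE), and the "gene:E001" label format is an assumption chosen for the example, not something the function produces.

```r
library(GenomicRanges)

## Toy stand-in for exonicParts() output with single-gene linkage:
## gene_id is a character vector, exonic_part is the rank within a gene.
parts <- GRanges("chr1",
                 IRanges(start=c(1, 11, 101), end=c(10, 20, 150)),
                 gene_id=c("A", "A", "B"),
                 exonic_part=c(1L, 2L, 1L))

## Combine gene id and zero-padded rank into a unique label per part
## (hypothetical naming scheme).
labels <- sprintf("%s:E%03d",
                  mcols(parts)$gene_id,
                  mcols(parts)$exonic_part)
names(parts) <- labels
parts
```

Because the rank restarts at 1 for every gene, the pair (gene_id, exonic_part) identifies each part uniquely.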
intronicParts returns a disjoint and strictly sorted GRanges object with 1 range per intronic part and with metadata columns tx_id, tx_name, and gene_id. If linked.to.single.gene.only was set to TRUE, an additional intronic_part metadata column is added that indicates the rank of each intronic part within all the intronic parts linked to the same gene.
tidyTranscripts returns a GRanges object with 1 range per transcript and with metadata columns tx_id, tx_name, and gene_id.
tidyExons returns a GRanges object with 1 range per exon and with metadata columns tx_id, tx_name, gene_id, exon_id, exon_name, and exon_rank.
tidyIntrons returns a GRanges object with 1 range per intron and with metadata columns tx_id, tx_name, and gene_id.
exonicParts is a replacement for disjointExons with the following differences/improvements:

- Argument linked.to.single.gene.only in exonicParts replaces argument aggregateGenes in disjointExons, but has the opposite meaning, i.e. exonicParts(txdb, linked.to.single.gene.only=TRUE) returns the same exonic parts as disjointExons(txdb, aggregateGenes=FALSE).
- Unlike disjointExons(txdb, aggregateGenes=TRUE), exonicParts(txdb, linked.to.single.gene.only=FALSE) does NOT discard exonic parts that are not linked to a gene.
- exonicParts is almost 2x more efficient than disjointExons.
- exonicParts works out-of-the-box on any TxDb-like object that supports the transcripts() and exonsBy() extractors (e.g. on an EnsDb object).
Hervé Pagès
- disjoin in the IRanges package.
- transcripts, transcriptsBy, and transcriptsByOverlaps, for extracting genomic feature locations from a TxDb-like object.
- transcriptLengths for extracting the transcript lengths (and other metrics) from a TxDb object.
- extractTranscriptSeqs for extracting transcript (or CDS) sequences from chromosome sequences.
- coverageByTranscript for computing coverage by transcript (or CDS) of a set of ranges.
- The TxDb class.
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

## ---------------------------------------------------------------------
## exonicParts()
## ---------------------------------------------------------------------

exonic_parts1 <- exonicParts(txdb)
exonic_parts1

## Mapping from exonic parts to genes is many-to-many:
gene_id1 <- mcols(exonic_parts1)$gene_id
gene_id1  # CharacterList object
table(lengths(gene_id1))
## The number of known genes a Human exonic part can be linked to
## varies from 0 to 22!

exonic_parts2 <- exonicParts(txdb, linked.to.single.gene.only=TRUE)
exonic_parts2

## Mapping from exonic parts to genes now is many-to-one:
gene_id2 <- mcols(exonic_parts2)$gene_id
gene_id2[1:20]  # character vector

## Select exonic parts for a given gene:
exonic_parts2[gene_id2 %in% "643837"]

## Sanity checks:
stopifnot(isDisjoint(exonic_parts1), isStrictlySorted(exonic_parts1))
stopifnot(isDisjoint(exonic_parts2), isStrictlySorted(exonic_parts2))
stopifnot(all(exonic_parts2 %within% reduce(exonic_parts1)))
stopifnot(identical(
    lengths(gene_id1) == 1L,
    exonic_parts1 %within% exonic_parts2
))

## ---------------------------------------------------------------------
## intronicParts()
## ---------------------------------------------------------------------

intronic_parts1 <- intronicParts(txdb)
intronic_parts1

## Mapping from intronic parts to genes is many-to-many:
mcols(intronic_parts1)$gene_id
table(lengths(mcols(intronic_parts1)$gene_id))
## A Human intronic part can be linked to 0 to 22 known genes!

intronic_parts2 <- intronicParts(txdb, linked.to.single.gene.only=TRUE)
intronic_parts2

## Mapping from intronic parts to genes now is many-to-one:
class(mcols(intronic_parts2)$gene_id)  # character vector

## Sanity checks:
stopifnot(isDisjoint(intronic_parts1), isStrictlySorted(intronic_parts1))
stopifnot(isDisjoint(intronic_parts2), isStrictlySorted(intronic_parts2))
stopifnot(all(intronic_parts2 %within% reduce(intronic_parts1)))
stopifnot(identical(
    lengths(mcols(intronic_parts1)$gene_id) == 1L,
    intronic_parts1 %within% intronic_parts2
))

## ---------------------------------------------------------------------
## Helper functions
## ---------------------------------------------------------------------

tidyTranscripts(txdb)                      # Ordered by 'tx_id'.
tidyTranscripts(txdb, drop.geneless=TRUE)  # Ordered first by 'gene_id',
                                           # then by 'tx_id'.

tidyExons(txdb)                            # Ordered first by 'tx_id',
                                           # then by 'exon_rank'.
tidyExons(txdb, drop.geneless=TRUE)        # Ordered first by 'gene_id',
                                           # then by 'tx_id',
                                           # then by 'exon_rank'.

tidyIntrons(txdb)                          # Ordered by 'tx_id'.
tidyIntrons(txdb, drop.geneless=TRUE)      # Ordered first by 'gene_id',
                                           # then by 'tx_id'.