Integration into the AnnotationDbi framework
Several of the methods available for AnnotationDbi objects are also implemented for EnsDb objects. This makes it possible to extract data from EnsDb objects in much the same way as from objects inheriting from the base annotation package class AnnotationDb.
In addition to the standard usage, the select and mapIds methods for EnsDb objects also support the filter framework of the ensembldb package and thus allow more fine-grained queries to retrieve data.
## S4 method for signature 'EnsDb'
columns(x)

## S4 method for signature 'EnsDb'
keys(x, keytype, filter,...)

## S4 method for signature 'EnsDb'
keytypes(x)

## S4 method for signature 'EnsDb'
mapIds(x, keys, column, keytype, ..., multiVals)

## S4 method for signature 'EnsDb'
select(x, keys, columns, keytype, ...)
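(In alphabetic order)

column | For mapIds: the column from which the mapped values are returned.
columns | For select: the columns to retrieve from the database.
keys | The keys/IDs for which data should be retrieved from the database. This can be either a character vector of keys/IDs, a single filter object extending AnnotationFilter or a list of such objects.
keytype | For mapIds and select: the type (column) of the provided keys. It does not have to be specified if keys is a filter object or a list of filter objects. For keys: the column from which the keys should be returned.
filter | For keys: a filter restricting which keys are returned from the database.
multiVals | What should mapIds do when more than one value could be returned for a key, e.g. "list" to return all values as a named list.
x | The EnsDb object.
... | Not used.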
See the method descriptions below.
columns

List all the columns that can be retrieved by the mapIds and select methods. Note that these column names differ from the ones supported by the genes, transcripts etc. methods, which can be listed with the listColumns method.

Returns a character vector of supported column names.
keys

Retrieves all keys from the column specified with keytype. By default (if keytype is not provided) it returns all gene IDs. Note that keytype="TXNAME" will return transcript IDs, since no transcript names are available in the database.

Returns a character vector of IDs.
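The default behaviour of keys described above can be checked with a short sketch. It assumes that the EnsDb.Hsapiens.v86 annotation package (the one used in the examples of this page) is installed. The variable names are illustrative only.

```r
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Without keytype, keys() is documented to return all gene IDs, so both
## calls should yield the same set of identifiers.
default_keys <- keys(edb)
gene_ids <- keys(edb, keytype = "GENEID")
setequal(default_keys, gene_ids)

## keytype = "TXNAME" is documented to return transcript IDs for databases
## without transcript names. Compare against the "TXID" keys to see whether
## that holds for the database at hand.
tx_names <- keys(edb, keytype = "TXNAME")
tx_ids <- keys(edb, keytype = "TXID")
setequal(tx_names, tx_ids)
```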
keytypes

List all supported key types (column names).

Returns a character vector of key types.
mapIds

Retrieve the mapped IDs for a set of keys that are of a particular keytype. Argument keys can be either a character vector of keys/IDs, a single filter object extending AnnotationFilter or a list of such objects. If filter objects are used, the argument keytype does not have to be specified. Importantly, however, if the filtering system is used, the ordering of the results might not match the ordering of the keys.

The method usually returns a named character vector or, depending on the argument multiVals, a named list, with names corresponding to the keys (the same ordering is only guaranteed if keys is a character vector).
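The effect of multiVals and of the key ordering can be illustrated with a minimal sketch, again assuming the EnsDb.Hsapiens.v86 package is installed. That the first call returns a single value per key relies on the default of multiVals being "first", which is an assumption taken from the AnnotationDbi mapIds generic.

```r
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

query <- c("BCL2L11", "BCL2")

## Character keys: one value per key, names in the order of the keys.
first_ids <- mapIds(edb, keys = query, column = "TXID",
                    keytype = "GENENAME")
names(first_ids)

## multiVals = "list": all transcript IDs per gene as a named list.
all_ids <- mapIds(edb, keys = query, column = "TXID",
                  keytype = "GENENAME", multiVals = "list")
lengths(all_ids)

## Filter as keys: keytype is not needed, but the order of the result is
## not guaranteed to follow the order of the values in the filter.
filtered <- mapIds(edb, keys = GeneNameFilter(query), column = "TXID",
                   multiVals = "list")
names(filtered)
```

If the order of the result matters, passing the keys as a character vector is the safer choice.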
select

Retrieve the data as a data.frame based on the keys, columns and keytype arguments. If a key has multiple matches, one row is returned for each match. Argument keys can be either a character vector of keys/IDs, a single filter object extending AnnotationFilter or a list of such objects. If filter objects are used, the argument keytype does not have to be specified.
Note that values from a column "TXNAME" will be the same as for a column "TXID", since internally no database column "tx_name" is present and the column is thus mapped to "tx_id".
Returns a data.frame with the column names corresponding to the argument columns and rows with all data matching the criteria specified with keys.
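The mapping of "TXNAME" to "TXID" can be inspected with a small sketch, assuming the EnsDb.Hsapiens.v86 package is installed. The two queries are run separately and their values compared. The variable names are illustrative only.

```r
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Transcript "names" for BCL2.
by_name <- select(edb, keys = "BCL2", keytype = "GENENAME",
                  columns = c("GENENAME", "TXNAME"))

## Transcript IDs for BCL2.
by_id <- select(edb, keys = "BCL2", keytype = "GENENAME",
                columns = c("GENENAME", "TXID"))

head(by_name)
head(by_id)

## For a database without a "tx_name" column both sets should be the same.
setequal(by_name$TXNAME, by_id$TXID)
```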
The use of select without filters or keys and without restricting to specific columns is strongly discouraged, because the SQL query that joins all of the tables is very expensive, especially if protein annotation data is available.
Johannes Rainer
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## List all supported keytypes.
keytypes(edb)

## List all supported columns for the select and mapIds methods.
columns(edb)

## List /real/ database column names.
listColumns(edb)

## Retrieve all keys corresponding to transcript ids.
txids <- keys(edb, keytype = "TXID")
length(txids)
head(txids)

## Retrieve all keys corresponding to gene names of genes encoded on chromosome X
gids <- keys(edb, keytype = "GENENAME", filter = SeqNameFilter("X"))
length(gids)
head(gids)

## Get a mapping of the genes BCL2 and BCL2L11 to all of their
## transcript ids and return the result as list
maps <- mapIds(edb, keys = c("BCL2", "BCL2L11"), column = "TXID",
               keytype = "GENENAME", multiVals = "list")
maps

## Perform the same query using a combination of a GeneNameFilter and a
## TxBiotypeFilter to just retrieve protein coding transcripts for these
## two genes.
mapIds(edb, keys = list(GeneNameFilter(c("BCL2", "BCL2L11")),
                        TxBiotypeFilter("protein_coding")),
       column = "TXID", multiVals = "list")

## select:
## Retrieve all transcript and gene related information for the above example.
select(edb, keys = list(GeneNameFilter(c("BCL2", "BCL2L11")),
                        TxBiotypeFilter("protein_coding")),
       columns = c("GENEID", "GENENAME", "TXID", "TXBIOTYPE", "TXSEQSTART",
                   "TXSEQEND", "SEQNAME", "SEQSTRAND"))

## Get all data for genes encoded on chromosome Y
Y <- select(edb, keys = "Y", keytype = "SEQNAME")
head(Y)
nrow(Y)

## Get selected columns for all lincRNAs encoded on chromosome Y. Here we use
## a filter expression to define what data to retrieve.
Y <- select(edb, keys = ~ seq_name == "Y" & gene_biotype == "lincRNA",
            columns = c("GENEID", "GENEBIOTYPE", "TXID", "GENENAME"))
head(Y)
nrow(Y)