Counts per Million or Reads per Kilobase per Million
Compute counts per million (CPM) or reads per kilobase per million (RPKM).
## S3 method for class 'DGEList'
cpm(y, normalized.lib.sizes = TRUE, log = FALSE, prior.count = 2, ...)
## S3 method for class 'SummarizedExperiment'
cpm(y, normalized.lib.sizes = TRUE, log = FALSE, prior.count = 2, ...)
## S3 method for class 'DGEGLM'
cpm(y, log = FALSE, shrunk = TRUE, ...)
## Default S3 method:
cpm(y, lib.size = NULL, offset=NULL, log = FALSE, prior.count = 2, ...)

## S3 method for class 'DGEList'
rpkm(y, gene.length = NULL, normalized.lib.sizes = TRUE, log = FALSE, prior.count = 2, ...)
## S3 method for class 'SummarizedExperiment'
rpkm(y, gene.length = NULL, normalized.lib.sizes = TRUE, log = FALSE, prior.count = 2, ...)
## S3 method for class 'DGEGLM'
rpkm(y, gene.length, log = FALSE, shrunk = TRUE, ...)
## Default S3 method:
rpkm(y, gene.length, lib.size = NULL, offset=NULL, log = FALSE, prior.count = 2, ...)

## S3 method for class 'DGEList'
cpmByGroup(y, group = NULL, dispersion = NULL, ...)
## S3 method for class 'SummarizedExperiment'
cpmByGroup(y, group = NULL, dispersion = NULL, ...)
## Default S3 method:
cpmByGroup(y, group = NULL, dispersion = 0.05, offset = NULL, weights = NULL, log = FALSE, prior.count = 2, ...)

## S3 method for class 'DGEList'
rpkmByGroup(y, group = NULL, gene.length = NULL, dispersion = NULL, ...)
## S3 method for class 'SummarizedExperiment'
rpkmByGroup(y, group = NULL, gene.length = NULL, dispersion = NULL, ...)
## Default S3 method:
rpkmByGroup(y, group = NULL, gene.length, dispersion = 0.05, offset = NULL, weights = NULL, log = FALSE, prior.count = 2, ...)
y | a matrix-like object containing counts. Can be a numeric matrix, a DGEList object, a SummarizedExperiment object, or a DGEGLM or DGELRT fitted model object.
normalized.lib.sizes | logical, use normalized library sizes?
lib.size | library size, defaults to colSums(y).
offset | numeric matrix of same size as y.
log | logical, if TRUE then log2 values are returned.
prior.count | average count to be added to each observation to avoid taking log of zero. Used only if log=TRUE.
shrunk | logical, if TRUE then the fitted values reflect the prior.count used in the original model fit; if FALSE they are computed with prior.count=0.
gene.length | vector of length nrow(y) giving gene lengths in bases.
group | factor giving group membership for columns of y.
dispersion | numeric vector of negative binomial dispersions.
weights | numeric vector or matrix of non-negative quantitative weights. Can be a vector of length equal to the number of libraries, or a matrix of the same size as y.
... | other arguments are not used.
CPM or RPKM values are useful descriptive measures for the expression level of a gene.
By default, the normalized library sizes are used in the computation for DGEList
objects but simple column sums for matrices.
If log-values are computed, then a small count, given by prior.count
but scaled to be proportional to the library size, is added to y
to avoid taking the log of zero.
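The computation described above can be sketched in base R. This is a minimal illustration, not edgeR's own code: `cpm_sketch` is a hypothetical helper, and the step that increases each library size by twice the scaled prior count is an assumption about how the adjustment is applied.

```r
# Hypothetical helper written for illustration; not part of edgeR.
cpm_sketch <- function(counts, lib.size = colSums(counts),
                       log = FALSE, prior.count = 2) {
  counts <- as.matrix(counts)
  if (!log) {
    # Divide each column by its library size, then scale to one million.
    return(t(t(counts) / lib.size) * 1e6)
  }
  # Scale the prior count so that it is proportional to each library size.
  prior.scaled <- prior.count * lib.size / mean(lib.size)
  # Assumption: library sizes grow by twice the scaled prior count.
  lib.adj <- lib.size + 2 * prior.scaled
  log2(t((t(counts) + prior.scaled) / lib.adj) * 1e6)
}

counts <- matrix(c(0, 10, 90, 5, 45, 150), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("A", "B")))
cpm_plain <- cpm_sketch(counts)             # each column sums to one million
cpm_log   <- cpm_sketch(counts, log = TRUE) # finite even for the zero count
```

Because the prior count is added before the logarithm is taken, the zero count for gene1 in library A yields a finite log2 value rather than -Inf.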
The rpkm methods for DGEList, DGEGLM or DGELRT objects will try to find the gene lengths in a column of y$genes called Length or length.
Failing that, they will look for any column name containing "length" in any capitalization.
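The lookup order described above might be sketched as follows. `find_length_column` is a hypothetical helper, not an edgeR function, and taking the first match when several columns qualify is an assumption made for the illustration.

```r
# Hypothetical helper illustrating the lookup order; not edgeR's own code.
find_length_column <- function(genes) {
  nm <- colnames(genes)
  # First preference: a column called exactly "Length" or "length".
  exact <- match(c("Length", "length"), nm)
  exact <- exact[!is.na(exact)]
  if (length(exact) > 0) return(nm[exact[1]])
  # Fallback: any column name containing "length", ignoring case.
  partial <- grep("length", nm, ignore.case = TRUE)
  if (length(partial) > 0) return(nm[partial[1]])
  # No suitable column was found.
  NULL
}

genes <- data.frame(Symbol = c("a", "b"), GeneLength = c(1000, 2000))
found <- find_length_column(genes)
```

In this example there is no exact match, so the case-insensitive fallback selects the GeneLength column.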
The cpm and rpkm methods for DGEGLM and DGELRT fitted model objects return fitted CPM or RPKM values.
If shrunk=TRUE, then the CPM or RPKM values will reflect the prior.count input to the original linear model fit.
If shrunk=FALSE, then the CPM or RPKM values will be computed with prior.count=0.
Note that the latter could result in taking the log of near-zero values if log=TRUE.
cpmByGroup and rpkmByGroup compute group average values on the unlogged scale.
A numeric matrix of CPM or RPKM values, on the log2 scale if log=TRUE.
cpm and rpkm produce matrices of the same size as y.
If y was a data object, then observed values are returned.
If y was a fitted model object, then fitted values are returned.
cpmByGroup and rpkmByGroup produce matrices with a column for each level of group.
aveLogCPM(y), rowMeans(cpm(y,log=TRUE)) and log2(rowMeans(cpm(y))) all give slightly different results.
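One reason the last two expressions differ is that the mean of the logs is not the log of the mean. The base R sketch below uses a small made-up matrix of CPM values (`cpm_values` is illustrative data, not output from edgeR) and does not attempt to reproduce aveLogCPM.

```r
# Illustrative CPM values for two genes in two libraries (made-up data).
cpm_values <- matrix(c(10, 1000, 50, 60), nrow = 2, byrow = TRUE,
                     dimnames = list(c("gene1", "gene2"), c("A", "B")))

# Average after taking logs, versus log of the average.
mean_of_logs <- rowMeans(log2(cpm_values))
log_of_means <- log2(rowMeans(cpm_values))

# The gap is large when libraries disagree (gene1), small otherwise (gene2).
gap <- log_of_means - mean_of_logs
```

The mean of the logs is never larger than the log of the mean, and the gap widens as the values for a gene become more variable across libraries.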
Davis McCarthy, Gordon Smyth, Yunshun Chen, Aaron Lun
y <- matrix(rnbinom(20,size=1,mu=10),5,4)
cpm(y)
d <- DGEList(counts=y, lib.size=1001:1004)
cpm(d)
cpm(d,log=TRUE)
d$genes <- data.frame(Length=c(1000,2000,500,1500,3000))
rpkm(d)
cpmByGroup(d, group=c(1,1,2,2))
rpkmByGroup(d, group=c(1,1,2,2))