estimateCommonDisp

Estimate Common Negative Binomial Dispersion by Conditional Maximum Likelihood


Description

Maximizes the negative binomial conditional common likelihood to estimate a common dispersion value across all genes.
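To make the meaning of "dispersion" concrete, here is a minimal base R sketch. It assumes the usual negative binomial parameterization in which the variance is mu + phi * mu^2, with phi the dispersion and `size = 1/phi` in `rnbinom`. The variable names are illustrative only and are not part of edgeR.

```r
# Illustrative values: mean count 20, dispersion 0.2 (so size = 5)
set.seed(42)
mu  <- 20
phi <- 0.2

# Simulate counts from a negative binomial with that mean and dispersion
x <- rnbinom(1e5, mu = mu, size = 1 / phi)

# Variance implied by the model: mu + phi * mu^2
theoretical_var <- mu + phi * mu^2

# Empirical variance of the simulated counts, for comparison
empirical_var <- var(x)

# The square root of the dispersion is the coefficient of variation
# of the underlying expression level (the BCV in edgeR terminology)
bcv <- sqrt(phi)
```

The empirical variance should land close to the theoretical value, which is well above the Poisson variance of 20; that extra variability is what the dispersion measures.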

Usage

## S3 method for class 'DGEList'
estimateCommonDisp(y, tol=1e-06, rowsum.filter=5, verbose=FALSE, ...)
## Default S3 method:
estimateCommonDisp(y, group=NULL, lib.size=NULL, tol=1e-06, 
          rowsum.filter=5, verbose=FALSE, ...)

Arguments

y

matrix of counts or a DGEList object.

tol

the desired accuracy, passed to optimize.

rowsum.filter

genes with total count (across all samples) below this value will be filtered out before estimating the dispersion.

verbose

logical; if TRUE, the estimated dispersion and BCV (biological coefficient of variation, the square root of the dispersion) are printed to standard output.

group

vector or factor giving the experimental group/condition for each library.

lib.size

numeric vector giving the total count (sequence depth) for each library.

...

other arguments that are not currently used.

Details

Implements the conditional maximum likelihood (CML) method proposed by Robinson and Smyth (2008) for estimating a common dispersion parameter. The method is accurate and nearly unbiased, even for small counts and small numbers of replicates.

The CML method involves computing a matrix of quantile-quantile normalized counts, called pseudo-counts. The pseudo-counts are adjusted in such a way that the library sizes are equal for all samples, while preserving differences between groups and variability within each group. The pseudo-counts are included in the output of the function, but are intended mainly for internal edgeR use.
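As a rough illustration of the conditional likelihood idea (not edgeR's actual implementation), the sketch below assumes a single group with equal library sizes, which is the situation the pseudo-counts are meant to approximate. Under that assumption, conditioning on each gene's total count z removes the mean, and the log-likelihood depends on the dispersion phi alone. Up to terms that do not involve phi, it is sum_i lgamma(y_i + 1/phi) + lgamma(n/phi) - lgamma(z + n/phi) - n * lgamma(1/phi), where n is the number of replicates. The function name `cond_loglik` and the search interval are hypothetical choices made for this example.

```r
# Conditional log-likelihood of a common dispersion `phi`, summed over genes.
# Assumes `y` is a genes x replicates count matrix from ONE group with
# equal library sizes; terms not involving phi are dropped.
cond_loglik <- function(phi, y) {
  r <- 1 / phi        # negative binomial "size" parameter
  n <- ncol(y)        # number of replicates
  z <- rowSums(y)     # per-gene totals (the conditioning statistic)
  sum(rowSums(lgamma(y + r)) + lgamma(n * r) - lgamma(z + n * r) - n * lgamma(r))
}

# Simulate 2000 genes x 4 replicates with true dispersion 1/5 = 0.2
set.seed(1)
y <- matrix(rnbinom(2000 * 4, mu = 20, size = 5), nrow = 2000, ncol = 4)

# Maximize over phi with optimize(); the interval is an arbitrary choice
fit <- optimize(cond_loglik, interval = c(1e-4, 10), y = y,
                maximum = TRUE, tol = 1e-6)

phi_hat <- fit$maximum    # common dispersion estimate
bcv_hat <- sqrt(phi_hat)  # corresponding BCV
```

With several groups, the same expression would be evaluated within each group and summed. Handling unequal library sizes is the part this sketch leaves out, and it is what the pseudo-counts address.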

Value

estimateCommonDisp.DGEList adds the following components to the input DGEList object:

common.dispersion

estimate of the common dispersion.

pseudo.counts

numeric matrix of pseudo-counts.

pseudo.lib.size

the common library size to which the pseudo-counts have been adjusted.

AveLogCPM

numeric vector giving log2(AveCPM) for each row of y.

estimateCommonDisp.default returns a numeric scalar of the common dispersion estimate.

Author(s)

Mark Robinson, Davis McCarthy, Gordon Smyth

References

Robinson MD and Smyth GK (2008). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics, 9, 321-332. http://biostatistics.oxfordjournals.org/content/9/2/321

See Also

Examples

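library(edgeR)
# Simulate 250 genes x 4 samples; true dispersion is 1/size = 1/5 = 0.2
y <- matrix(rnbinom(250*4,mu=20,size=5),nrow=250,ncol=4)
dge <- DGEList(counts=y,group=c(1,1,2,2))
# verbose=TRUE prints the estimated dispersion and BCV
dge <- estimateCommonDisp(dge, verbose=TRUE)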

edgeR

Empirical Analysis of Digital Gene Expression Data in R

v3.32.1
GPL (>=2)
Authors
Yunshun Chen, Aaron TL Lun, Davis J McCarthy, Matthew E Ritchie, Belinda Phipson, Yifang Hu, Xiaobei Zhou, Mark D Robinson, Gordon K Smyth
Initial release
2021-01-14
