Competitive Gene Set Test Accounting for Inter-gene Correlation
Test whether a set of genes is highly ranked relative to other genes in terms of differential expression, accounting for inter-gene correlation.
## Default S3 method:
camera(y, index, design, contrast = ncol(design), weights = NULL,
       use.ranks = FALSE, allow.neg.cor=FALSE, inter.gene.cor=0.01,
       trend.var = FALSE, sort = TRUE, ...)
## Default S3 method:
cameraPR(statistic, index, use.ranks = FALSE, inter.gene.cor=0.01, sort = TRUE, ...)
interGeneCorrelation(y, design)
y |
a numeric matrix of log-expression values or log-ratios of expression values, or any data object containing such a matrix.
Rows correspond to probes and columns to samples.
Any type of object that can be processed by |
statistic |
a numeric vector of genewise statistics. If |
index |
an index vector or a list of index vectors. Can be any vector such that |
design |
design matrix. |
contrast |
contrast of the linear model coefficients for which the test is required. Can be an integer specifying a column of |
weights |
numeric matrix of precision weights. Can be a matrix of the same size as |
use.ranks |
do a rank-based test ( |
allow.neg.cor |
should reduced variance inflation factors be allowed for negative correlations? |
inter.gene.cor |
numeric, optional preset value for the inter-gene correlation within tested sets. If |
trend.var |
logical, should an empirical Bayes trend be estimated? See |
sort |
logical, should the results be sorted by p-value? |
... |
other arguments are not currently used |
camera and interGeneCorrelation implement methods proposed by Wu and Smyth (2012). camera performs a competitive test in the sense defined by Goeman and Buhlmann (2007). It tests whether the genes in the set are highly ranked in terms of differential expression relative to genes not in the set. It has similar aims to geneSetTest but accounts for inter-gene correlation. See roast for an analogous self-contained gene set test.
The function can be used for any microarray experiment which can be represented by a linear model. The design matrix for the experiment is specified as for the lmFit function, and the contrast of interest is specified as for the contrasts.fit function. This allows users to focus on differential expression for any coefficient or contrast in a linear model by giving the vector of test statistic values.
camera estimates p-values after adjusting the variance of test statistics by an estimated variance inflation factor. The inflation factor depends on estimated genewise correlation and the number of genes in the gene set.
By default, camera uses a preset correlation specified by inter.gene.cor that is shared by all sets. This usually works best with a small value, say inter.gene.cor=0.01. Alternatively, camera can use interGeneCorrelation to estimate the mean pair-wise correlation within each set of genes.
If inter.gene.cor=NA, then camera will estimate the inter-gene correlation for each set. In this mode, camera gives rigorous error rate control for all sample sizes and all gene sets. However, in this mode, highly co-regulated gene sets that are biologically interpretable may not always be ranked at the top of the list. With inter.gene.cor=0.01, camera will rank biologically interpretable sets more highly. This gives a useful compromise between strict error rate control and interpretable gene set rankings.
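To make the role of set size concrete, here is a minimal base-R sketch of the variance inflation factor as given in Wu and Smyth (2012), VIF = 1 + (m - 1) * rho, where m is the number of genes in the set and rho is the mean pair-wise correlation. The helper name `set_vif` and the set sizes are illustrative assumptions and are not part of limma.

```r
# Variance inflation factor for a set of m genes with mean pair-wise
# correlation rho, following the formula in Wu and Smyth (2012).
# `set_vif` is a hypothetical helper name, not a limma function.
set_vif <- function(m, rho) {
  1 + (m - 1) * rho
}

# Illustrative set sizes, evaluated at the preset correlation 0.01.
set_sizes <- c(5, 20, 100, 500)
vif <- set_vif(set_sizes, rho = 0.01)
data.frame(NGenes = set_sizes, VIF = vif)
```

Even a correlation as small as 0.01 matters for large sets: for 100 genes the factor is about 2, so the variance of the set's mean statistic is roughly twice what it would be if the genes were independent.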
cameraPR is a "pre-ranked" version of camera where the genes are pre-ranked according to a pre-computed statistic.
camera and cameraPR return a data.frame with a row for each set and the following columns:
NGenes |
number of genes in set. |
Correlation |
inter-gene correlation (only included if the |
Direction |
direction of change ( |
PValue |
two-tailed p-value. |
FDR |
Benjamini and Hochberg FDR adjusted p-value. |
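As an illustration of how the FDR column relates to the PValue column, the sketch below applies the Benjamini and Hochberg adjustment with base R's p.adjust to a made-up vector of set-level p-values. The p-values and set names are hypothetical, and whether camera uses p.adjust internally is an assumption here rather than something stated above.

```r
# Hypothetical two-tailed p-values for five gene sets.
p <- c(set1 = 0.001, set2 = 0.02, set3 = 0.03, set4 = 0.5, set5 = 0.8)

# Benjamini and Hochberg adjustment across the tested sets.
fdr <- p.adjust(p, method = "BH")

# Assemble a table resembling the PValue and FDR columns, sorted by p-value.
results <- data.frame(PValue = p, FDR = fdr)
results[order(results$PValue), ]
```

Because the adjustment depends on how many sets are tested together, the FDR for a given set changes if the same set is tested alongside a different collection of sets.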
interGeneCorrelation returns a list with components:
vif |
variance inflation factor. |
correlation |
inter-gene correlation. |
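The two components are linked through the set size. The following base-R sketch shows one naive way to obtain a mean pair-wise correlation from linear model residuals and the inflation factor it implies. It is an illustration under stated assumptions, not limma's implementation of interGeneCorrelation, and the simulated data and variable names are hypothetical.

```r
set.seed(1)
n_genes <- 20
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))

# Simulated log-expression values for one gene set (genes x samples).
y_set <- matrix(rnorm(n_genes * 6), n_genes, 6)

# Residuals after removing the fitted linear model from each gene.
fit <- lm.fit(design, t(y_set))
res <- t(fit$residuals)                  # genes x samples

# Mean of the off-diagonal entries of the gene-by-gene correlation matrix.
cors <- cor(t(res))
mean_cor <- mean(cors[upper.tri(cors)])

# Variance inflation factor implied by that mean correlation.
vif <- 1 + (n_genes - 1) * mean_cor
list(vif = vif, correlation = mean_cor)
```

With only six samples the estimate is noisy, which is one reason a small preset value of inter.gene.cor can give more stable rankings than a per-set estimate.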
The default settings for inter.gene.cor and allow.neg.cor were changed to the current values in limma 3.29.6. Previously, the default was to estimate an inter-gene correlation for each set. To reproduce the earlier default, use allow.neg.cor=TRUE and inter.gene.cor=NA.
Di Wu and Gordon Smyth
Wu, D, and Smyth, GK (2012). Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Research 40, e133. http://nar.oxfordjournals.org/content/40/17/e133
Goeman, JJ, and Buhlmann, P (2007). Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics 23, 980-987.
There is a topic page on 10.GeneSetTests.
y <- matrix(rnorm(1000*6),1000,6)
design <- cbind(Intercept=1,Group=c(0,0,0,1,1,1))

# First set of 20 genes are genuinely differentially expressed
index1 <- 1:20
y[index1,4:6] <- y[index1,4:6]+1

# Second set of 20 genes are not DE
index2 <- 21:40

camera(y, index1, design)
camera(y, index2, design)

camera(y, list(set1=index1,set2=index2), design, inter.gene.cor=NA)
camera(y, list(set1=index1,set2=index2), design, inter.gene.cor=0.01)

# Pre-ranked version
fit <- eBayes(lmFit(y, design))
cameraPR(fit$t[,2], list(set1=index1,set2=index2))