removeBatchEffect

Remove Batch Effect


Description

Remove batch effects from expression data.

Usage

removeBatchEffect(x, batch=NULL, batch2=NULL, covariates=NULL,
                  design=matrix(1,ncol(x),1), ...)

Arguments

x

numeric matrix, or any data object that can be processed by getEAWP containing log-expression values for a series of samples. Rows correspond to probes and columns to samples.

batch

factor or vector indicating batches.

batch2

factor or vector indicating a second series of batches.

covariates

matrix or vector of numeric covariates to be adjusted for.

design

design matrix relating to treatment conditions to be preserved, usually the design matrix with all experimental factors other than the batch effects.

...

other arguments are passed to lmFit.

Details

This function removes unwanted batch effects, such as those associated with hybridization time or other technical variables, before plotting or unsupervised analyses such as PCA, MDS or heatmaps. The design matrix describes comparisons between the samples, for example treatment effects, that should not be removed. In effect, the function fits a linear model to the data that includes both batches and regular treatments, then removes the component due to the batch effects.
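The fit-then-subtract idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not limma's actual code: the simulated data, the `treatment` factor and all variable names are hypothetical, and the sum-to-zero coding of the batch factor is a choice made for this sketch.

```r
set.seed(1)

# Hypothetical layout: 9 samples in 3 batches, with treatment balanced within batch
batch     <- factor(rep(c("A", "B", "C"), each = 3))
treatment <- factor(rep(c("ctl", "trt", "ctl"), times = 3))

# Simulated log-expression values: 10 probes x 9 samples, batch A shifted by 5
y <- matrix(rnorm(10 * 9), 10, 9)
y[, batch == "A"] <- y[, batch == "A"] + 5

# Design matrix for the effects to preserve (intercept and treatment)
design <- model.matrix(~ treatment)

# Batch columns with sum-to-zero coding (an assumption of this sketch);
# the intercept column is dropped because the design already has one
contrasts(batch) <- contr.sum(levels(batch))
X_batch <- model.matrix(~ batch)[, -1, drop = FALSE]

# Least-squares fit per probe, with treatment and batch in the same model
fit <- lm.fit(cbind(design, X_batch), t(y))

# Keep only the batch coefficients (probes x batch columns)
beta_batch <- t(fit$coefficients)[, -(1:ncol(design)), drop = FALSE]

# Subtract the fitted batch component; treatment effects are left in place
y_corrected <- y - beta_batch %*% t(X_batch)
```

Because the treatment columns stay in the model during the fit, refitting `y_corrected` on `design` alone returns the same treatment coefficients as the full model.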

In most applications, only the first batch argument will be needed. This case covers the situation where the data has been collected in a series of separate batches.

The batch2 argument is used when there is a second series of batch effects, independent of the first series. For example, batch might correspond to time of data collection while batch2 might correspond to operator or some other change in operating characteristics. If batch2 is included, then the effects of batch and batch2 are assumed to be additive.
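As a hedged illustration of two additive batch series, the sketch below passes a collection day as `batch` and an operator as `batch2`. The data are simulated, and the names `collection_day`, `operator` and `treatment` are hypothetical.

```r
library(limma)

set.seed(2)

# Hypothetical layout: 8 samples, two independent batch series plus a treatment
collection_day <- factor(c("d1", "d1", "d1", "d1", "d2", "d2", "d2", "d2"))
operator       <- factor(c("op1", "op2", "op1", "op2", "op1", "op2", "op1", "op2"))
treatment      <- factor(c("ctl", "ctl", "trt", "trt", "ctl", "ctl", "trt", "trt"))

# Simulated log-expression values with additive day and operator shifts
y <- matrix(rnorm(10 * 8), 10, 8)
y[, collection_day == "d2"] <- y[, collection_day == "d2"] + 3
y[, operator == "op2"]      <- y[, operator == "op2"] - 2

# Treatment is preserved through the design matrix
design <- model.matrix(~ treatment)

# Remove both batch series, assumed to act additively
y_corrected <- removeBatchEffect(y, batch = collection_day,
                                 batch2 = operator, design = design)
```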

The covariates argument allows correction for one or more continuous numeric effects, similar to the analysis of covariance method in statistics. If covariates contains more than one column, then the columns are assumed to have additive effects. Setting covariates to be a design matrix constructed from batch effects and technical effects allows very general batch effects to be accounted for.
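A continuous technical covariate can be handled in the same way. The sketch below is illustrative only: the `rin` vector (a hypothetical RNA quality score) and the simulated data are assumptions, not values from any real experiment.

```r
library(limma)

set.seed(3)

# Hypothetical continuous technical covariate, one value per sample
rin <- c(9.1, 8.4, 7.9, 9.3, 6.8, 7.2, 8.8, 6.5)
treatment <- factor(rep(c("ctl", "trt"), each = 4))

# Simulated log-expression values with a linear dependence on the covariate
y <- matrix(rnorm(10 * 8), 10, 8)
y <- y + matrix(0.5 * (rin - mean(rin)), 10, 8, byrow = TRUE)

# Preserve the treatment effect while adjusting for the covariate
design <- model.matrix(~ treatment)
y_corrected <- removeBatchEffect(y, covariates = rin, design = design)
```

Several covariates can be supplied as the columns of a numeric matrix, in which case their effects are treated as additive.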

The data object x can be of any class for which lmFit works. If x contains weights, then these will be used in estimating the batch effects.

Value

A numeric matrix of log-expression values with batch and covariate effects removed.

Note

This function is not intended to be used prior to linear modelling. For linear modelling, it is better to include the batch factors in the linear model.
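For comparison, here is a minimal sketch of the recommended alternative for linear modelling, in which the batch factor enters the design matrix directly. The simulated data and the `treatment` and `batch` factors are hypothetical.

```r
library(limma)

set.seed(4)

# Hypothetical layout: 9 samples in 3 batches, with treatment balanced within batch
batch     <- factor(rep(c("A", "B", "C"), each = 3))
treatment <- factor(rep(c("ctl", "trt", "ctl"), times = 3))

# Simulated log-expression values with a shift in batch A
y <- matrix(rnorm(10 * 9), 10, 9)
y[, batch == "A"] <- y[, batch == "A"] + 5

# Batch is a term in the model, so it is estimated alongside the treatment effect
design <- model.matrix(~ treatment + batch)
fit <- eBayes(lmFit(y, design))

# Test the treatment effect, adjusted for batch
tab <- topTable(fit, coef = "treatmenttrt")
```

The uncorrected matrix `y` goes into `lmFit` here; the output of `removeBatchEffect` is reserved for plots and unsupervised analyses.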

Author(s)

Gordon Smyth and Carolyn de Graaf

See Also

Examples

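library(limma)
set.seed(1)  # make the simulated data reproducible
# Simulate 10 probes x 9 samples and shift the first three samples (batch A) by 5
y <- matrix(rnorm(10*9),10,9)
y[,1:3] <- y[,1:3] + 5
batch <- c("A","A","A","B","B","B","C","C","C")
# Remove the batch effect; with the default design only the overall mean is preserved
y2 <- removeBatchEffect(y, batch)
# Compare per-sample distributions before and after correction
par(mfrow=c(1,2))
boxplot(as.data.frame(y),main="Original")
boxplot(as.data.frame(y2),main="Batch corrected")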

limma

Linear Models for Microarray Data

v3.46.0
GPL (>=2)
Authors
Gordon Smyth [cre,aut], Yifang Hu [ctb], Matthew Ritchie [ctb], Jeremy Silver [ctb], James Wettenhall [ctb], Davis McCarthy [ctb], Di Wu [ctb], Wei Shi [ctb], Belinda Phipson [ctb], Aaron Lun [ctb], Natalie Thorne [ctb], Alicia Oshlack [ctb], Carolyn de Graaf [ctb], Yunshun Chen [ctb], Mette Langaas [ctb], Egil Ferkingstad [ctb], Marcus Davy [ctb], Francois Pepin [ctb], Dongseok Choi [ctb]
Initial release
2020-10-19
