DESeq2: collapseReplicates – R documentation

collapseReplicates

Collapse technical replicates in a RangedSummarizedExperiment or DESeqDataSet

Description

Collapses the columns in object by summing within levels of a grouping factor groupby. The purpose of this function is to sum up read counts from technical replicates to create an object with a single column of read counts for each sample. Note: by "technical replicates", we mean multiple sequencing runs of the same library, in contrast to "biological replicates", in which multiple libraries are prepared from separate biological units. Optionally renames the columns of the returned object with the levels of the grouping factor. Note: this function is written very simply and can be easily altered to produce other behavior by examining the source code.
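The summing step described above can be sketched in base R on a plain count matrix. This is a minimal illustration under stated assumptions, not the DESeq2 source: the matrix `counts_mat`, the factor `groupby` values, and the helper `sum_by_group` are hypothetical names introduced here, and the counts are made up.

```r
# Hypothetical count matrix: 3 genes (rows) x 4 sequencing runs (columns).
# matrix() fills column by column, so each line below is one run.
counts_mat <- matrix(
  c(10L, 0L, 5L,
    12L, 1L, 4L,
     7L, 3L, 9L,
     8L, 2L, 6L),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("run", 1:4))
)

# Assumption: run1 and run2 are technical replicates of sampleA,
# run3 and run4 are technical replicates of sampleB.
groupby <- factor(c("sampleA", "sampleA", "sampleB", "sampleB"))

# Hypothetical helper: sum columns within each level of the grouping factor.
sum_by_group <- function(mat, groupby) {
  stopifnot(length(groupby) == ncol(mat))
  # Column indices belonging to each level
  idx <- split(seq_along(groupby), groupby)
  # One summed column per level; drop = FALSE keeps single-run groups as matrices
  sapply(idx, function(i) rowSums(mat[, i, drop = FALSE]))
}

collapsed <- sum_by_group(counts_mat, groupby)
collapsed
```

The result has one column per level of `groupby`, and the total count over the whole matrix is unchanged by the collapse.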

Usage

collapseReplicates(object, groupby, run, renameCols = TRUE)

Arguments

`object`	A `RangedSummarizedExperiment` or `DESeqDataSet`
`groupby`	A grouping factor, as long as the number of columns of `object`
`run`	Optional. The names of each unique column in `object`. If provided, a new column `runsCollapsed` will be added to the `colData`, which pastes together the names of `run`
`renameCols`	Whether to rename the columns of the returned object using the levels of the grouping factor

Value

The object with as many columns as levels in groupby. This object has assay/count data summed across the columns that are grouped together, and the colData is subset to the first column of each group in groupby.
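To illustrate how the column metadata is handled, the following base R sketch keeps the first row of a plain data frame for each group and pastes the run names together, mirroring the `runsCollapsed` column described under Arguments. The data frame `col_meta` and every value in it are made up for illustration; a real `colData` is a `DataFrame` rather than a `data.frame`, and the comma separator used here is an assumption.

```r
# Hypothetical per-run metadata, one row per column of the count matrix
col_meta <- data.frame(
  sample    = factor(c("sampleA", "sampleB", "sampleA", "sampleB")),
  condition = c("treated", "control", "treated", "control"),
  run       = paste0("run", 1:4),
  row.names = paste0("run", 1:4),
  stringsAsFactors = FALSE
)

# Row indices belonging to each level of the grouping factor
idx <- split(seq_len(nrow(col_meta)), col_meta$sample)

# Keep the metadata of the first run in each group
first_in_group <- vapply(idx, function(i) i[1], integer(1))
collapsed_meta <- col_meta[first_in_group, ]

# Record which runs were merged into each collapsed column
# (the "," separator is an assumption made for this sketch)
collapsed_meta$runsCollapsed <- vapply(
  idx, function(i) paste(col_meta$run[i], collapse = ","), character(1)
)

# Rename by group level, analogous to renameCols = TRUE
rownames(collapsed_meta) <- names(idx)
collapsed_meta
```

Because only the first run's metadata survives, any column that differs between technical replicates of the same sample (such as a per-run identifier) reflects just that first run after collapsing.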

Examples

dds <- makeExampleDESeqDataSet(m=12)

# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
dds$run <- paste0("run",1:12)

ddsColl <- collapseReplicates(dds, dds$sample, dds$run)

# examine the colData and column names of the collapsed data
colData(ddsColl)
colnames(ddsColl)

# check that the sum of the counts for "sample1" is the same
# as the counts in the "sample1" column in ddsColl
matchFirstLevel <- dds$sample == levels(dds$sample)[1]
stopifnot(all(rowSums(counts(dds[,matchFirstLevel])) == counts(ddsColl[,1])))

DESeq2

Differential gene expression analysis based on the negative binomial distribution

v1.30.1

LGPL (>= 3)

Authors

Michael Love [aut, cre], Constantin Ahlmann-Eltze [ctb], Kwame Forbes [ctb], Simon Anders [aut, ctb], Wolfgang Huber [aut, ctb]
