Unmix samples using loss in a variance stabilized space
Unmixes samples in x according to pure components, using numerical optimization. The components in pure are added on the scale of gene expression (either normalized counts or TPMs). The loss function that compares fitted expression to the samples in x is evaluated in a variance stabilized space. This task is sometimes referred to as "deconvolution", and can be used, for example, to identify contributions from various tissues.
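The idea described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names (vst_nb, unmix_loss, unmix_one), the particular transform 2 * asinh(sqrt(alpha * q)), the choice of optim() with "L-BFGS-B", and the bounds on the weights are all assumptions made for this sketch.

```r
# Hypothetical helpers for illustration only (base R, no packages needed).

# One possible variance-stabilizing transform for negative binomial data
# with a fixed dispersion alpha (an assumption of this sketch).
vst_nb <- function(q, alpha) 2 * asinh(sqrt(alpha * q))

# Loss for a single sample: the pure profiles are combined on the
# expression scale, then compared to the sample after transformation.
unmix_loss <- function(p, sample, pure, alpha, power = 1) {
  fitted <- as.vector(pure %*% p)
  sum(abs(vst_nb(sample, alpha) - vst_nb(fitted, alpha))^power)
}

# Fit non-negative weights for one sample and rescale them to sum to 1.
unmix_one <- function(sample, pure, alpha, power = 1) {
  fit <- optim(par = rep(1, ncol(pure)),
               fn = function(p) unmix_loss(p, sample, pure, alpha, power),
               method = "L-BFGS-B", lower = 0, upper = 100)
  fit$par / sum(fit$par)
}

# Toy data: three genes, three pure components.
pure <- cbind(a = c(10, 0, 0), b = c(0, 0, 10), c = c(0, 10, 0))
# A sample built as 7 parts a, 2 parts b, 1 part c.
sample <- as.vector(pure %*% c(7, 2, 1))

est <- unmix_one(sample, pure, alpha = 0.01, power = 2)
print(round(est, 3))
```

The squared loss (power = 2) is used in the toy call because it is smooth, which tends to suit a gradient-based optimizer; the L1 loss (power = 1) can be tried the same way.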
Note: some groups have found that the mixing contributions may be more accurate if genes with very low expression across x and pure are first removed. We have not explored this fully.
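One way such a filter could look is sketched below. The threshold min_expr and the rule "keep a gene if its maximum across x and pure reaches the threshold" are assumptions chosen for illustration; the note above does not specify a filtering rule.

```r
# Toy matrices: rows are genes, columns are samples or pure components.
x <- cbind(sample1 = c(70, 10, 20, 0), sample2 = c(20, 45, 30, 1))
pure <- cbind(a = c(10, 0, 0, 0), b = c(0, 0, 10, 1), c = c(0, 10, 0, 0))

# Hypothetical threshold: a gene is kept if it reaches this expression
# level in at least one column of x or pure.
min_expr <- 5
keep <- apply(cbind(x, pure), 1, max) >= min_expr

# Apply the same filter to both matrices so their rows stay aligned.
x_filt <- x[keep, , drop = FALSE]
pure_filt <- pure[keep, , drop = FALSE]
```

The filtered matrices can then be passed on in place of x and pure.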
Note: if the pbapply package is installed, a progress bar will be displayed while mixing components are fit.
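A common pattern for this kind of optional dependency is sketched below. It is only an illustration of how a progress bar can be made conditional on pbapply being installed; the variable names are hypothetical and the package's internal logic may differ.

```r
# Use pbapply::pblapply when the package is available, otherwise fall
# back to base lapply (same interface, no progress bar).
apply_fun <- if (requireNamespace("pbapply", quietly = TRUE)) {
  pbapply::pblapply
} else {
  lapply
}

# Stand-in for the per-sample fitting step.
res <- apply_fun(1:3, function(i) i^2)
```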
unmix(x, pure, alpha, shift, power = 1, format = "matrix", quiet = FALSE)
x | normalized counts or TPMs of the samples to be unmixed
pure | normalized counts or TPMs of the "pure" samples
alpha | for normalized counts, the dispersion of the data when a negative binomial model is fit. This can be found by examining the asymptotic value of
shift | for TPMs, the shift which approximately stabilizes the variance of log shifted TPMs. Can be assessed with
power | either 1 (for L1) or 2 (for squared) loss function. Default is 1.
format |
quiet | suppress progress bar. Default is FALSE, show progress bar if pbapply is installed.
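To make the shift and power arguments concrete, the sketch below evaluates a loss on log shifted values for both settings of power. The helper names and the exact form log(q + shift) are assumptions based on the phrase "log shifted TPMs" above, and the toy numbers are invented.

```r
# Hypothetical helpers: shifted-log transform and the resulting loss.
vst_shifted_log <- function(q, shift) log(q + shift)

vst_loss <- function(observed, fitted, shift, power = 1) {
  d <- vst_shifted_log(observed, shift) - vst_shifted_log(fitted, shift)
  sum(abs(d)^power)
}

observed <- c(100, 10, 0)  # toy TPMs for three genes
fitted <- c(90, 12, 1)     # toy fitted TPMs

l1 <- vst_loss(observed, fitted, shift = 0.5, power = 1)  # L1 loss
l2 <- vst_loss(observed, fitted, shift = 0.5, power = 2)  # squared loss
```

The shift keeps the transform defined at zero expression; the L1 loss is less dominated by a few poorly fitted genes than the squared loss.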
a matrix, the mixture components for each sample in x (rows). The "pure" samples make up the columns, and so each row sums to 1. If colnames existed on the input matrices they will be propagated to the output matrix. If format="list", then a list, containing as elements:
(1) the matrix of mixture components,
(2) the correlations in the variance stabilized space of the fitted samples to the samples in x, and
(3) the fitted samples as a matrix with the same dimension as x.
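The three elements described above can be mimicked with toy numbers, as in the sketch below. Everything here is illustrative: the weights are invented rather than fitted, the transform vst_nb is an assumed variance-stabilizing function, and building the fitted samples from unnormalized weights is an assumption of this sketch rather than documented behavior.

```r
# Assumed variance-stabilizing transform for a fixed dispersion alpha.
vst_nb <- function(q, alpha) 2 * asinh(sqrt(alpha * q))
alpha <- 0.01

pure <- cbind(a = c(10, 0, 0), b = c(0, 0, 10), c = c(0, 10, 0))
x <- cbind(sample1 = c(70, 10, 20), sample2 = c(20, 45, 30))

# Invented weights standing in for the optimizer's output
# (rows are samples, columns are pure components).
weights <- rbind(sample1 = c(7, 2, 1), sample2 = c(2, 3, 5))
colnames(weights) <- colnames(pure)

# (1) mixture components: each row rescaled to sum to 1
mix <- weights / rowSums(weights)

# (3) fitted samples on the expression scale, same dimension as x
fitted <- pure %*% t(weights)

# (2) per-sample correlation between x and fitted after transformation
vst_cor <- sapply(seq_len(ncol(x)), function(i) {
  cor(vst_nb(x[, i], alpha), vst_nb(fitted[, i], alpha))
})
```

A correlation well below 1 for a sample would suggest that the pure components do not describe that sample well.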
# some artificial data
cts <- matrix(c(80,50,1,100,
                1,1,60,100,
                0,50,60,100), ncol=4, byrow=TRUE)

# make a DESeqDataSet
dds <- DESeqDataSetFromMatrix(cts,
  data.frame(row.names=seq_len(ncol(cts))), ~1)
colnames(dds) <- paste0("sample",1:4)

# note! here you would instead use
# estimateSizeFactors() to do actual normalization
sizeFactors(dds) <- rep(1, ncol(dds))

norm.cts <- counts(dds, normalized=TRUE)

# 'pure' should also have normalized counts...
pure <- matrix(c(10,0,0,
                 0,0,10,
                 0,10,0), ncol=3, byrow=TRUE)
colnames(pure) <- letters[1:3]

# for real data, you need to find alpha after fitting estimateDispersions()
mix <- unmix(norm.cts, pure, alpha=0.01)