Fit Negative Binomial Generalized Linear Models to Multiple Response Vectors: Low Level Functions
Fit the same log-link negative binomial or Poisson generalized linear model (GLM) to each row of a matrix of counts.
mglmOneGroup(y, dispersion = 0, offset = 0, weights = NULL, coef.start = NULL, maxit = 50, tol = 1e-10, verbose = FALSE)
mglmOneWay(y, design = NULL, group = NULL, dispersion = 0, offset = 0, weights = NULL, coef.start = NULL, maxit = 50, tol = 1e-10)
mglmLevenberg(y, design, dispersion = 0, offset = 0, weights = NULL, coef.start = NULL, start.method = "null", maxit = 200, tol = 1e-06)
designAsFactor(design)
y |
numeric matrix containing the negative binomial counts. Rows for genes and columns for libraries. |
design |
numeric matrix giving the design matrix of the GLM.
Assumed to be full column rank.
This is a required argument for |
group |
factor giving group membership for oneway layout.
If both |
dispersion |
numeric scalar or vector giving the dispersion parameter for each GLM. Can be a scalar giving one value for all genes, or a vector of length equal to the number of genes giving genewise dispersions. |
offset |
numeric vector or matrix giving the offset that is to be included in the log-linear model predictor. Can be a scalar, a vector of length equal to the number of libraries, or a matrix of the same size as |
weights |
numeric vector or matrix of non-negative quantitative weights.
Can be a vector of length equal to the number of libraries, or a matrix of the same size as |
coef.start |
numeric matrix of starting values for the linear model coefficients.
Number of rows should agree with |
start.method |
method used to generate starting values when |
tol |
numeric scalar giving the convergence tolerance. For |
maxit |
integer giving the maximum number of iterations for the Fisher scoring algorithm. The iteration will be stopped when this limit is reached even if the convergence criterion hasn't been satisfied. |
verbose |
logical. If |
These functions are low-level work-horses used by higher-level functions in the edgeR package, especially by glmFit.
Most users will not need to call these functions directly.
The functions mglmOneGroup, mglmOneWay and mglmLevenberg all fit a negative binomial GLM to each row of y.
The row-wise GLMs all have the same design matrix but possibly different dispersions, offsets and weights.
These functions are all low-level in that they operate on atomic objects (numeric matrices and vectors).
mglmOneGroup fits an intercept-only model to each response vector.
In other words, it treats all the libraries as belonging to one group.
It implements Fisher scoring with a score-statistic stopping criterion for each gene.
Excellent starting values are available for the null model, so this function seldom has any problems with convergence.
It is used by other edgeR functions to compute the overall abundance for each gene.
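The Fisher scoring iteration for an intercept-only model can be sketched in base R for a single response vector. This is a minimal illustration under stated assumptions, not the edgeR implementation: the helper name fit_one_group is hypothetical, weights are ignored, the starting value is taken to be the Poisson estimate, and the stopping rule is one plausible reading of a score-statistic criterion.

```r
# Hypothetical helper, not an edgeR function: intercept-only NB fit for one
# response vector with known dispersion, using Fisher scoring.
fit_one_group <- function(y, offset, dispersion = 0, maxit = 50, tol = 1e-10) {
  if (sum(y) == 0) return(-Inf)            # all-zero counts: the MLE is -Inf
  # Start from the Poisson MLE, log(total count / total exposure)
  beta <- log(sum(y) / sum(exp(offset)))
  for (iter in seq_len(maxit)) {
    mu    <- exp(beta + offset)            # fitted means on the count scale
    denom <- 1 + dispersion * mu
    score <- sum((y - mu) / denom)         # derivative of the log-likelihood
    info  <- sum(mu / denom)               # expected (Fisher) information
    if (score^2 / info < tol) break        # score-statistic stopping rule
    beta <- beta + score / info            # Fisher scoring update
  }
  beta
}

# Toy data (made up for illustration): four libraries, one gene
y      <- c(5, 12, 8, 20)
offset <- log(c(100, 200, 150, 300))
beta_pois <- fit_one_group(y, offset, dispersion = 0)
beta_nb   <- fit_one_group(y, offset, dispersion = 0.1)
```

With dispersion = 0 the starting value already solves the score equation, so the loop exits on the first pass; with a positive dispersion a few updates are typically needed.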
mglmOneWay fits a oneway layout to each response vector.
It treats the libraries as belonging to a number of groups and calls mglmOneGroup for each group.
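The group-by-group structure of a oneway fit can be sketched in base R. To keep the example short it assumes the Poisson case (dispersion = 0), where each per-group intercept has a closed form; with a positive dispersion an iterative one-group fit would take its place. The function name one_way_poisson and the toy counts are hypothetical.

```r
# Toy counts (made up): 3 genes in rows, 4 libraries in columns
y <- matrix(c(5, 7, 20, 24,
              0, 1,  3,  2,
              9, 8, 10, 12), nrow = 3, byrow = TRUE)
group  <- factor(c("a", "a", "b", "b"))
offset <- log(colSums(y))

# Hypothetical helper: oneway layout fitted one group at a time.
# Poisson case only, where the per-group MLE is
# log(total count in group / total exposure in group).
one_way_poisson <- function(y, group, offset) {
  expo  <- exp(offset)
  coefs <- matrix(NA_real_, nrow(y), nlevels(group),
                  dimnames = list(NULL, levels(group)))
  for (g in levels(group)) {
    j <- group == g
    coefs[, g] <- log(rowSums(y[, j, drop = FALSE]) / sum(expo[j]))
  }
  # Each library gets its group's coefficient plus its own offset
  eta <- coefs[, as.integer(group), drop = FALSE] +
    matrix(offset, nrow(y), ncol(y), byrow = TRUE)
  list(coefficients = coefs, fitted.values = exp(eta))
}

fit <- one_way_poisson(y, group, offset)
```

The result has one coefficient column per group and a fitted-value matrix with the same dimensions as the counts.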
mglmLevenberg fits an arbitrary log-linear model to each response vector.
It implements a Levenberg-Marquardt modification of the GLM scoring algorithm to prevent divergence.
The main computation is implemented in C++.
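The idea of damping the scoring step can be sketched in base R for a single response vector. This is only an illustration of the general Levenberg-Marquardt technique, not a transcription of the C++ code: the function name lm_nb_fit, the least-squares starting values, the damping schedule (factor of 10, cap of 1e10) and the log-likelihood based acceptance rule are all assumptions of the sketch, and weights are ignored.

```r
# Hypothetical helper, not edgeR code: damped Fisher scoring for one count
# vector with known dispersion and a log link.
lm_nb_fit <- function(y, X, offset, dispersion, maxit = 200, tol = 1e-6) {
  # Log-likelihood up to terms that do not depend on the coefficients
  loglik <- function(beta) {
    mu <- exp(drop(X %*% beta) + offset)
    if (dispersion > 0) {
      sum(y * log(mu) - (y + 1 / dispersion) * log1p(dispersion * mu))
    } else {
      sum(y * log(mu) - mu)
    }
  }
  # Starting values by least squares on log counts (assumption of this sketch)
  beta   <- qr.solve(X, log(y + 0.5) - offset)
  ll     <- loglik(beta)
  lambda <- 1e-4
  fail   <- FALSE
  for (iter in seq_len(maxit)) {
    mu    <- exp(drop(X %*% beta) + offset)
    denom <- 1 + dispersion * mu
    score <- drop(crossprod(X, (y - mu) / denom))
    info  <- crossprod(X, X * (mu / denom))
    repeat {
      # Inflate the diagonal of the information matrix by the damping factor
      damped   <- info + lambda * diag(diag(info), nrow = ncol(X))
      beta_new <- beta + drop(solve(damped, score))
      ll_new   <- loglik(beta_new)
      if (is.finite(ll_new) && ll_new >= ll) break   # step accepted
      lambda <- lambda * 10                           # rejected: damp harder
      if (lambda > 1e10) { fail <- TRUE; break }      # damping limit exceeded
    }
    if (fail) break
    gain   <- ll_new - ll
    beta   <- beta_new
    ll     <- ll_new
    lambda <- lambda / 10                  # relax damping after a good step
    if (gain < tol) break
  }
  list(coefficients = beta, iter = iter, fail = fail)
}

# Toy data (made up): six libraries, intercept plus one two-level factor
y      <- c(3, 5, 12, 15, 4, 9)
X      <- cbind(1, c(0, 0, 1, 1, 0, 1))
offset <- log(c(100, 120, 90, 110, 105, 95))
fit    <- lm_nb_fit(y, X, offset, dispersion = 0.1)
```

Because a step is accepted only when the log-likelihood does not decrease, the iteration cannot diverge; when the damping limit is hit without finding such a step, the sketch stops and reports the failure, in the same spirit as the fail component described below.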
All these functions treat the dispersion parameter of the negative binomial distribution as a known input.
designAsFactor is used to convert a general design matrix into a oneway layout if that is possible.
It determines how many distinct row values the design matrix is capable of computing and returns a factor with a level for each possible distinct value.
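The idea of collapsing a design matrix to a factor can be sketched in base R by giving libraries with identical design rows the same level. The helper design_rows_as_factor below is a hypothetical stand-in written for illustration; it is not the edgeR implementation and may label the levels differently.

```r
# Hypothetical helper: one factor level per distinct row of the design matrix
design_rows_as_factor <- function(design) {
  design <- as.matrix(design)
  key <- apply(design, 1, paste, collapse = "_")   # one string key per row
  factor(match(key, unique(key)))                  # levels in order of appearance
}

f1 <- factor(c(1, 1, 2, 2))
f2 <- factor(c(1, 2, 1, 2))

# Two distinct rows and two columns: equivalent to a oneway layout
X1 <- model.matrix(~ f1)
g1 <- design_rows_as_factor(X1)

# Four distinct rows but only three columns: not a oneway layout
X2 <- model.matrix(~ f1 + f2)
g2 <- design_rows_as_factor(X2)
```

For a full column rank design, the model is equivalent to a oneway layout exactly when the number of distinct rows equals the number of columns, so comparing nlevels(g) with ncol(design) is a natural check.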
mglmOneGroup produces a numeric vector of coefficients.
mglmOneWay produces a list with the following components:
coefficients |
matrix of estimated coefficients for the linear models. Rows correspond to rows of |
fitted.values |
matrix of fitted values. Of same dimensions as |
mglmLevenberg produces a list with the following components:
coefficients |
matrix of estimated coefficients for the linear models. |
fitted.values |
matrix of fitted values. |
deviance |
numeric vector of residual deviances. |
iter |
number of iterations used. |
fail |
logical vector indicating genes for which the maximum damping was exceeded before convergence was achieved. |
designAsFactor returns a factor of length equal to nrow(design).
Gordon Smyth, Yunshun Chen, Davis McCarthy, Aaron Lun. C++ code by Aaron Lun.
McCarthy, DJ, Chen, Y, Smyth, GK (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research 40, 4288-4297. https://doi.org/10.1093/nar/gks042
library(edgeR)

y <- matrix(rnbinom(1000, mu = 10, size = 2), ncol = 4)
lib.size <- colSums(y)
dispersion <- 0.1

## Compute intercept for each row
beta <- mglmOneGroup(y, dispersion = dispersion, offset = log(lib.size))

## Unlogged intercepts add to one:
sum(exp(beta))

## Fit the NB GLM to the counts with a given design matrix
f1 <- factor(c(1,1,2,2))
f2 <- factor(c(1,2,1,2))
X <- model.matrix(~ f1 + f2)
fit <- mglmLevenberg(y, X, dispersion = dispersion, offset = log(lib.size))
head(fit$coefficients)