Plot of regularized linear discriminant functions for microarray data
Plot regularized linear discriminant functions for classifying samples based on expression data.
plotRLDF(y, design = NULL, z = NULL, nprobes = 100, plot = TRUE, labels.y = NULL, labels.z = NULL, pch.y = NULL, pch.z = NULL, col.y = "black", col.z = "black", show.dimensions = c(1,2), ndim = max(show.dimensions), var.prior = NULL, df.prior = NULL, trend = FALSE, robust = FALSE, ...)
y |
the training dataset. Can be any data object which can be coerced to a matrix, such as |
design |
design matrix defining the training groups to be distinguished. The first column is assumed to represent the intercept.
Defaults to |
z |
the dataset to be classified. Can be any data object which can be coerced to a matrix, such as |
nprobes |
number of probes to be used for the calculations. The probes will be selected by moderated F statistic. |
plot |
logical, should a plot be created? |
labels.y |
character vector of sample names or labels in |
labels.z |
character vector of sample names or labels in |
pch.y |
plotting symbol or symbols for |
pch.z |
plotting symbol or symbols for |
col.y |
colors for the plotting |
col.z |
colors for the plotting |
show.dimensions |
integer vector of length two indicating which two discriminant functions to plot. Functions are in decreasing order of discriminatory power. |
ndim |
number of discriminant functions to compute |
var.prior |
prior variances, for regularizing the within-group covariance matrix. By default is estimated by |
df.prior |
prior degrees of freedom for regularizing the within-group covariance matrix. By default is estimated by |
trend |
logical, should a trend be estimated for |
robust |
logical, should |
... |
any other arguments are passed to |
The function builds discriminant functions from the training data (y) and applies them to the test data (z).
The method is a variation on classical linear discriminant functions (LDFs), in that the within-group covariance matrix is regularized to ensure that it is invertible, with eigenvalues bounded away from zero.
The within-group covariance matrix is squeezed towards a diagonal matrix with empirical Bayes posterior variances as diagonal elements.
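The regularization described above can be illustrated with a small base R sketch. This is not the code used inside plotRLDF: the toy data, the fixed values of df.prior and var.prior, and the weighted-average form of the shrinkage are all assumptions made for illustration. It shows one plausible way in which a singular within-group covariance matrix becomes invertible once it is squeezed towards a diagonal target.

```r
set.seed(1)
# Hypothetical toy data: 5 probes (rows) by 6 samples (columns), two groups of 3
y <- matrix(rnorm(5 * 6), 5, 6)
group <- factor(c(1, 1, 1, 2, 2, 2))

# Residuals after removing each group's mean, probe by probe
group.means <- t(apply(y, 1, function(x) tapply(x, group, mean)))  # 5 x 2
resid <- y - group.means[, as.integer(group)]
df.residual <- ncol(y) - nlevels(group)

# Classical pooled within-group covariance between probes (5 x 5).
# It has rank at most df.residual = 4, so it is singular here.
W <- tcrossprod(resid) / df.residual

# Assumed prior values, fixed by hand for illustration only
df.prior <- 4
var.prior <- 0.5

# Posterior variances: weighted average of prior and sample variances
s2.post <- (df.prior * var.prior + df.residual * diag(W)) / (df.prior + df.residual)

# Squeeze W towards a diagonal matrix (assumed weighted-average form).
# The diagonal of W.reg equals s2.post and the off-diagonals are shrunk.
W.reg <- (df.residual * W + df.prior * diag(var.prior, nrow(W))) /
  (df.prior + df.residual)

# Smallest eigenvalues before and after regularization
min(eigen(W, symmetric = TRUE)$values)
min(eigen(W.reg, symmetric = TRUE)$values)
```

In this sketch the smallest eigenvalue of W.reg cannot fall below df.prior * var.prior / (df.prior + df.residual), which is what "bounded away from zero" means in practice.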
The calculations are based on a filtered list of probes.
The nprobes probes with largest moderated F statistics are used to discriminate.
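The filtering step can be sketched in base R as follows. As a simplification, this sketch ranks probes by the ordinary one-way ANOVA F statistic rather than the moderated F statistic that the function uses, and the simulated data and variable names are hypothetical.

```r
set.seed(2)
# Hypothetical data: 200 probes by 6 samples, two groups of 3.
# The first 10 probes are shifted in the second group.
y <- matrix(rnorm(200 * 6, sd = 0.3), 200, 6)
y[1:10, 4:6] <- y[1:10, 4:6] + 2
group <- factor(c(1, 1, 1, 2, 2, 2))
nprobes <- 10

# Ordinary (not moderated) F statistic for each probe
n.g <- as.vector(table(group))
group.means <- t(apply(y, 1, function(x) tapply(x, group, mean)))  # 200 x 2
grand.mean <- rowMeans(y)
between <- rowSums(sweep((group.means - grand.mean)^2, 2, n.g, "*")) /
  (nlevels(group) - 1)
within <- rowSums((y - group.means[, as.integer(group)])^2) /
  (ncol(y) - nlevels(group))
Fstat <- between / within

# Keep the nprobes probes with the largest F statistics
top <- order(Fstat, decreasing = TRUE)[seq_len(nprobes)]
y.top <- y[top, , drop = FALSE]
```

With only a few residual degrees of freedom, ordinary F statistics are noisy, which is one motivation for using moderated statistics in the actual function.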
The ndim argument allows all required LDFs to be computed even though only two are plotted.
If plot=TRUE a plot is created on the current graphics device.
A list containing the following components is (invisibly) returned:
training |
numeric matrix with |
predicting |
numeric matrix with |
top |
integer vector of length |
metagenes |
numeric matrix with |
singular.values |
singular values showing the predictive power of each discriminant function. |
rank |
maximum number of discriminant functions with singular.values greater than zero. |
var.prior |
numeric vector of prior variances. |
df.prior |
numeric vector of prior degrees of freedom. |
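Because the list is returned invisibly, it has to be assigned to a variable before it can be inspected. The following is a minimal sketch, assuming the limma package is installed (the block does nothing otherwise). The simulated data are hypothetical, and only the component names listed above are relied upon.

```r
# Assumes limma is installed; otherwise 'out' stays NULL
out <- NULL
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(3)
  # Hypothetical data: 200 probes by 6 samples, two groups of 3
  y <- matrix(rnorm(200 * 6, sd = 0.3), 200, 6)
  rownames(y) <- paste("Gene", 1:200)
  y[1:20, 4:6] <- y[1:20, 4:6] + 2
  z <- matrix(rnorm(200 * 6, sd = 0.3), 200, 6)
  rownames(z) <- paste("Gene", 1:200)
  z[1:20, 4:6] <- z[1:20, 4:6] + 2
  design <- cbind(Grp1 = 1, Grp2vs1 = c(0, 0, 0, 1, 1, 1))

  # plot = FALSE skips the graphics; the list is still returned
  out <- limma::plotRLDF(y, design, z = z, plot = FALSE)

  names(out)               # component names
  str(out$training)        # discriminant scores for the training samples
  str(out$predicting)      # discriminant scores for the samples in z
  head(out$top)            # indices of the first few selected probes
  out$singular.values
}
```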
The default values for df.prior and var.prior were changed in limma 3.27.10.
Previously these were preset values.
Now the default is to estimate them using squeezeVar.
Gordon Smyth, Di Wu and Yifang Hu
lda in package MASS
# Simulate gene expression data for 1000 probes and 6 microarrays.
# Samples are in two groups
# First 50 probes are differentially expressed in second group
sd <- 0.3*sqrt(4/rchisq(1000,df=4))
y <- matrix(rnorm(1000*6,sd=sd),1000,6)
rownames(y) <- paste("Gene",1:1000)
y[1:50,4:6] <- y[1:50,4:6] + 2

z <- matrix(rnorm(1000*6,sd=sd),1000,6)
rownames(z) <- paste("Gene",1:1000)
z[1:50,4:6] <- z[1:50,4:6] + 1.8
z[1:50,1:3] <- z[1:50,1:3] - 0.2

design <- cbind(Grp1=1,Grp2vs1=c(0,0,0,1,1,1))
options(digits=3)

# Samples 1-6 are training set, samples a-f are test set:
plotRLDF(y, design, z=z, col.y="black", col.z="red")
legend("top", pch=16, col=c("black","red"), legend=c("Training","Predicted"))