Linear Model Diagnostics for Constrained Ordination
These functions extract influence statistics and some other linear
model statistics directly from a constrained ordination result object
from cca, rda, capscale or dbrda. The constraints are linear model
functions, and these support functions return the same results as the
corresponding linear models (lm), so you can use their documentation.
The main functions for normal usage are leverage values (hatvalues),
standardized residuals (rstandard), studentized or leave-one-out
residuals (rstudent), and Cook's distance (cooks.distance). In
addition, vcov returns the variance-covariance matrix of coefficients,
and its diagonal values are the variances of coefficients. Other
functions are mainly support functions for these, but they can be used
directly.
## S3 method for class 'cca'
hatvalues(model, ...)

## S3 method for class 'cca'
rstandard(model, type = c("response", "canoco"), ...)

## S3 method for class 'cca'
rstudent(model, type = c("response", "canoco"), ...)

## S3 method for class 'cca'
cooks.distance(model, type = c("response", "canoco"), ...)

## S3 method for class 'cca'
sigma(object, type = c("response", "canoco"), ...)

## S3 method for class 'cca'
vcov(object, type = "canoco", ...)

## S3 method for class 'cca'
SSD(object, type = "canoco", ...)

## S3 method for class 'cca'
qr(x, ...)

## S3 method for class 'cca'
df.residual(object, ...)
model, object, x | A constrained ordination result object. |
type | Type of statistics used for extracting raw residuals and residual standard deviation (sigma): "response" or "canoco". |
... | Other arguments to functions (ignored). |
The vegan algorithm for constrained ordination uses a linear model
(or a weighted linear model in cca) to find the fitted values of
dependent community data, and constrained ordination is based on this
fitted response (Legendre & Legendre 2012). The hatvalues give the
leverage values of these constraints, and the leverage is independent
of the response data. Other influence statistics (rstandard, rstudent,
cooks.distance) are based on leverage, and on the raw residuals and
residual standard deviation (sigma). With type = "response" the raw
residuals are given by the unconstrained component of the constrained
ordination, and influence statistics are a matrix with dimensions
no. of observations times no. of species. For cca the statistics are
the same as obtained from the lm model using Chi-square standardized
species data (see decostand) as dependent variable, and row sums of
community data as weights, and for rda the lm model uses non-modified
community data and no weights.
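The unweighted case described above can be checked with a short
sketch. It assumes the varespec and varechem data sets shipped with
vegan and picks one species (Cladstel) for the lm fit; the formulas
for standardized residuals and Cook's distance are the usual lm
definitions written out by hand, not code taken from vegan.

```r
library(vegan)
data(varespec, varechem)

## Unweighted case: rda against lm for a single species
mod <- rda(varespec ~ Al + P + K, varechem)
lmod <- lm(varespec$Cladstel ~ Al + P + K, data = varechem)

## Leverage depends only on the constraints, not on the response
h <- unname(hatvalues(lmod))
lev_diff <- range(unname(hatvalues(mod)) - h)

## Standardized residuals and Cook's distance written out for lm
e <- unname(residuals(lmod))
s <- sigma(lmod)                  # residual standard deviation
p <- lmod$rank                    # no. of coefficients incl. intercept
r <- e / (s * sqrt(1 - h))        # definition used by rstandard(lmod)
D <- r^2 * h / (p * (1 - h))      # definition used by cooks.distance(lmod)

## Differences from the ordination-based statistics; these are
## expected to be at the level of numerical noise
rs_diff <- range(r - unname(rstandard(mod)[, "Cladstel"]))
cd_diff <- range(D - unname(cooks.distance(mod)[, "Cladstel"]))
lev_diff; rs_diff; cd_diff
```

For cca the same comparison needs Chi-square standardized species
data and row sums as weights in lm.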
The algorithm in the CANOCO software constrains the results during
iteration by performing a linear regression of weighted averages (WA)
scores on constraints and taking the fitted values of this regression
as linear combination (LC) scores (ter Braak 1984). The WA scores are
found directly from species scores, but LC scores are linear
combinations of constraints in the regression. With type = "canoco"
the raw residuals are the differences of WA and LC scores, and the
residual standard deviation (sigma) is taken to be the axis sum of
squared WA scores minus one. These quantities have no relationship to
the residual component of ordination; rather, they are methodological
artefacts of an algorithm that is not used in vegan. The result is a
matrix with dimensions no. of observations times no. of constrained
axes.
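The difference between the two types is easiest to see in the
dimensions of the results. The following is a minimal sketch, assuming
the varespec and varechem data sets shipped with vegan and a model
with three constraints; it only inspects dimensions and does not
reproduce the CANOCO iteration.

```r
library(vegan)
data(varespec, varechem)
mod <- cca(varespec ~ Al + P + K, varechem)

## type = "response": one column for each species
r_resp <- rstandard(mod, type = "response")
## type = "canoco": one column for each constrained axis
r_can <- rstandard(mod, type = "canoco")

dim(varespec)   # observations times species in the data
dim(r_resp)
dim(r_can)
```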
Function vcov returns the matrix of variances and covariances of
regression coefficients. The diagonal values of this matrix are the
variances, and their square roots give the standard errors of
regression coefficients. The function is based on SSD, which extracts
the sum of squares and crossproducts of residuals. The residuals are
defined as in the influence measures: with each type they have the
same properties and limitations, and they define the dimensions of the
result matrix.
Function as.mlm casts an ordination object to a multiple linear model
of class "mlm" (see lm), and similar statistics can be derived from
that modified object as with this set of functions. However, there are
some problems in the R implementation of the further analysis of
multiple linear model objects. When the results differ, the current
set of functions is more likely to be correct. The use of as.mlm
objects should be avoided.
Jari Oksanen
Legendre, P. and Legendre, L. (2012) Numerical Ecology. 3rd English ed. Elsevier.
ter Braak, C.J.F. (1984–): CANOCO – a FORTRAN program for canonical community ordination by [partial] [detrended] [canonical] correspondence analysis, principal components analysis and redundancy analysis. TNO Inst. of Applied Computer Sci., Stat. Dept. Wageningen, The Netherlands.
Corresponding lm methods and as.mlm.cca. Function ordiresids provides
lattice graphics for residuals.
data(varespec, varechem)
mod <- cca(varespec ~ Al + P + K, varechem)
## leverage
hatvalues(mod)
plot(hatvalues(mod), type = "h")
## ordination plot with leverages: points with high leverage have
## similar LC and WA scores
plot(mod, type = "n")
ordispider(mod)       # segment from LC to WA scores
points(mod, dis="si", cex=5*hatvalues(mod), pch=21, bg=2) # WA scores
text(mod, dis="bp", col=4)

## deviation and influence
head(rstandard(mod))
head(cooks.distance(mod))

## Influence measures from lm
y <- decostand(varespec, "chi.square") # needed in cca
y1 <- with(y, Cladstel)         # take one species for lm
lmod1 <- lm(y1 ~ Al + P + K, varechem, weights = rowSums(varespec))
## numerically identical within 2e-15
range(cooks.distance(lmod1) - cooks.distance(mod)[, "Cladstel"])

## t-values of regression coefficients based on type = "canoco"
## residuals
coef(mod)
coef(mod)/sqrt(diag(vcov(mod, type = "canoco")))