Compute trace of a large matrix by sample means
Some matrices are too large to be represented as a matrix, even as a sparse matrix. Nevertheless, it can be possible to compute the matrix vector product fairly easy, and this is utilized to estimate the trace of the matrix.
mctrace(mat, N, tol = 0.001, maxsamples = Inf, trname = "", init)
mat |
square matrix, Matrix, function or list of factors. |
N |
integer. if |
tol |
numeric. Tolerance. |
maxsamples |
numeric. Maximum number of samples in the expectation estimation. |
trname |
character. Arbitrary name used in progress reports. |
init |
numeric. Initial guess for the trace. |
For any matrix A, the trace equals the sum of the diagonal elements,
or the sum of the eigenvalues. However, if the size of the matrix is very
large, we may not have a matrix representation, so the diagonal is not
immediately available. In that case we can use the formula tr(A) = E(x'Ax) where x is a random vector with zero
expectation and Var(x) = I. We estimate the expectation with sample
means. mctrace
draws x in {-1,1}, and
evaluates mat
on these vectors.
If mat
is a function, it must be able to take a matrix of column
vectors as input. Since x'Ax = (Ax,x) is evaluated,
where (,) is the Euclidean inner product, the function
mat
can perform this inner product itself. In that case the function
should have an attribute attr(mat,'IP') <- TRUE
to signal this.
If mat
is a list of factors, the matrix for which to estimate the
trace, is the projection matrix which projects out the factors. I.e. how
many dimensions are left when the factors have been projected out. Thus, it
is possible to estimate the degrees of freedom in an OLS where factors are
projected out.
The tolerance tol
is a relative tolerance. The iteration terminates
when the normalized standard deviation of the sample mean (s.d. divided by
absolute value of the current sample mean) goes below tol
. Specify a
negative tol
to use the absolute standard deviation. The tolerance
can also change during the iterations; you can specify
tol=function(curest) {...}
and return a tolerance based on the
current estimate of the trace (i.e. the current sample mean).
An estimate of the trace of the matrix represented by mat
is
returned.
A <- matrix(rnorm(25),5) fun <- function(x) A %*% x sum(diag(A)) sum(eigen(A,only.values=TRUE)$values) # mctrace is not really useful for small problems. mctrace(fun,ncol(A),tol=0.05) # try a larger problem (3000x3000): f1 <- factor(sample(1500,3000,replace=TRUE)) f2 <- factor(sample(1500,3000,replace=TRUE)) fl <- list(f1,f2) mctrace(fl,tol=-5) # exact: length(f1) - nlevels(f1) - nlevels(f2) + nlevels(compfactor(fl))
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.