Internal functions for discretized model matrix handling
Routines for computing with discretized model matrices as described in Wood et al. (2017) and Li and Wood (2019).
XWXd(X,w,k,ks,ts,dt,v,qc,nthreads=1,drop=NULL,ar.stop=-1,ar.row=-1,ar.w=-1,
     lt=NULL,rt=NULL)
XWyd(X,w,y,k,ks,ts,dt,v,qc,drop=NULL,ar.stop=-1,ar.row=-1,ar.w=-1,lt=NULL)
Xbd(X,beta,k,ks,ts,dt,v,qc,drop=NULL,lt=NULL)
diagXVXd(X,V,k,ks,ts,dt,v,qc,drop=NULL,nthreads=1,lt=NULL,rt=NULL)
X
  A list of the matrices containing the unique rows of model matrices for terms of a full model matrix, or the model matrices of the terms margins. If the term subsetting arguments lt and rt are non-NULL then an "lpip" attribute is required: see Details. The elements of X may be sparse matrices of class "dgCMatrix", in which case the list requires attributes "r" and "off" defining reverse indices (see Details).

w
  An n-vector of weights.

y
  n-vector of data.

beta
  coefficient vector.

k
  A matrix whose columns are index n-vectors each selecting the rows of an X[[i]] required to create the full matrix.

ks
  The ith term has index vectors ks[i,1]:(ks[i,2]-1). The corresponding full model matrices are summed over.

ts
  The element of X at which each model term starts.

dt
  How many elements of X contribute to each term (see the inspection sketch following this argument list).

v
  v[[i]] is the Householder vector for the ith term, used if qc[i]>0.

qc
  if qc[i]>0 then the ith term has a constraint, imposed using the Householder vector v[[i]].

nthreads
  number of threads to use.

drop
  list of columns of the model matrix/parameters to drop.

ar.stop
  Negative to ignore. Otherwise sum rows (ar.stop[i-1]+1):ar.stop[i] of the rows selected by ar.row and weighted by ar.w to get row i of the result. Used to implement AR models in bam.

ar.row
  extract these rows...

ar.w
  weight by these weights, and sum up according to ar.stop.

lt
  use only columns of X corresponding to these model matrix terms (for the left hand X in X'WX); NULL to use all terms.

rt
  as lt, but for the right hand X in X'WX.

V
  Coefficient covariance matrix.
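The interplay of ts, dt and the elements of X is easiest to see on a model setup object. The following is a minimal sketch, not part of the original documentation, assuming G constructed exactly as in the Examples section below:

  ## Sketch (assumption: G as built in the Examples below). For each term, ts
  ## gives the first element of G$Xd involved and dt how many consecutive
  ## elements (tensor product marginals) it uses.
  for (i in seq_along(G$ts)) {
    ix <- G$ts[i]:(G$ts[i] + G$dt[i] - 1)  ## elements of G$Xd making up term i
    cat("term", i, "uses Xd elements", paste(ix, collapse = ","),
        "with", paste(sapply(G$Xd[ix], ncol), collapse = ","), "columns\n")
  }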
These functions are really intended to be internal, but are exported so that they can be used in the initialization code of families without problem. They are primarily used by bam to implement the methods given in the references. XWXd produces X'WX, XWyd produces X'Wy, Xbd produces Xb and diagXVXd produces the diagonal of XVX'.
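To make these products concrete, here is a small sketch (not part of the original examples; it assumes the objects G, w and p from the Examples section below). It reconstructs the full model matrix implied by the discrete representation by applying Xbd to an identity matrix, and checks XWXd against an ordinary dense crossproduct. Forming the full matrix defeats the purpose of the discrete methods, of course, and is used here only as a check.

  ## Sketch, assuming G, w and p from the Examples below. Column j of the
  ## implied full model matrix is X e_j, so applying Xbd to an identity
  ## matrix reconstructs the matrix explicitly.
  Xf <- Xbd(G$Xd, diag(p), G$kd, G$ks, G$ts, G$dt, G$v, G$qc)
  A.dense <- crossprod(Xf, w * Xf)          ## X'WX the slow, dense way
  A.disc  <- XWXd(G$Xd, w, G$kd, G$ks, G$ts, G$dt, G$v, G$qc)
  range(A.dense - A.disc)                   ## should be numerically negligible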
The "lpip"
attribute of X
is a list of the coefficient indices for each term. Required if subsetting via lt
and rt
.
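The exact construction of "lpip" is not spelled out here. As a purely hypothetical sketch (the term ordering below is an assumption, not taken from mgcv code), one might build it as one coefficient index vector per term, with the parametric coefficients first, followed by the smooths in the order of G$smooth, for G as in the Examples below:

  ## Hypothetical sketch only: one index vector per term, assuming the
  ## intercept is the single parametric coefficient and the smooth terms
  ## follow in the order of G$smooth.
  lpip <- c(list(1),
            lapply(G$smooth, function(s) s$first.para:s$last.para))
  attr(G$Xd, "lpip") <- lpip  ## needed before using lt/rt (per the Details above)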
X can be a list of sparse matrices of class "dgCMatrix", in which case reverse indices are needed, mapping stored matrix rows to rows in the full matrix (that is, the reverse of k, which maps full matrix rows to the stored unique matrix rows). r has the same dimension as k, while off is a list with as many elements as k has columns. r and off are supplied as attributes of X. For simplicity let r and off denote a single column and element corresponding to each other: then r[off[j]:(off[j+1]-1)] contains the rows of the full matrix corresponding to row j of the stored matrix. The reverse indices are essential for efficient computation with sparse matrices. See the example code for how to create them efficiently from the forward index matrix, k.
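A quick check of the reverse-index convention, as a sketch assuming r and off have been built exactly as in the Examples below (where off holds zero-based offsets):

  ## Sketch (assumption: r, off and G as built in the Examples below). Pick a
  ## column i of the index matrix and a stored row j known to occur, then
  ## check that every full-matrix row listed for j indeed indexes stored row j.
  i <- 1
  j <- G$kd[1, i]                            ## a stored row known to occur
  rows <- r[(off[[i]][j] + 1):off[[i]][j + 1], i]
  stopifnot(all(G$kd[rows, i] == j))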
Simon N. Wood simon.wood@r-project.org
Wood, S.N., Li, Z., Shaddick, G. & Augustin N.H. (2017) Generalized additive models for gigadata: modelling the UK black smoke network daily data. Journal of the American Statistical Association. 112(519):1199-1210 doi: 10.1080/01621459.2016.1195744
Li, Z & S.N. Wood (2019) Faster model matrix crossproducts for large generalized linear models with discretized covariates. Statistics and Computing. doi: 10.1007/s11222-019-09864-2
library(mgcv);library(Matrix)
## simulate some data creating a marginal matrix sequence...
set.seed(0);n <- 4000
dat <- gamSim(1,n=n,dist="normal",scale=2)
dat$x4 <- runif(n)
dat$y <- dat$y + 3*exp(dat$x4*15-5)/(1+exp(dat$x4*15-5))
dat$fac <- factor(sample(1:20,n,replace=TRUE))
G <- gam(y ~ te(x0,x2,k=5,bs="bs",m=1)+s(x1)+s(x4)+s(x3,fac,bs="fs"),
         fit=FALSE,data=dat,discrete=TRUE)
p <- ncol(G$X)

## create a sparse version...
Xs <- list(); r <- G$kd*0; off <- list()
for (i in 1:length(G$Xd)) Xs[[i]] <- as(G$Xd[[i]],"dgCMatrix")
for (j in 1:nrow(G$ks)) { ## create the reverse indices...
  nr <- nrow(Xs[[j]]) ## make sure we always tab to final stored row
  for (i in G$ks[j,1]:(G$ks[j,2]-1)) {
    r[,i] <- (1:length(G$kd[,i]))[order(G$kd[,i])]
    off[[i]] <- cumsum(c(1,tabulate(G$kd[,i],nbins=nr)))-1
  }
}
attr(Xs,"off") <- off;attr(Xs,"r") <- r

par(mfrow=c(2,3))

beta <- runif(p)
Xb0 <- Xbd(G$Xd,beta,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
Xb1 <- Xbd(Xs,beta,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(Xb0-Xb1);plot(Xb0,Xb1,pch=".")

bb <- cbind(beta,beta+runif(p)*.3)
Xb0 <- Xbd(G$Xd,bb,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
Xb1 <- Xbd(Xs,bb,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(Xb0-Xb1);plot(Xb0,Xb1,pch=".")

w <- runif(n)
XWy0 <- XWyd(G$Xd,w,y=dat$y,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
XWy1 <- XWyd(Xs,w,y=dat$y,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(XWy1-XWy0);plot(XWy1,XWy0,pch=".")

yy <- cbind(dat$y,dat$y+runif(n)-.5)
XWy0 <- XWyd(G$Xd,w,y=yy,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
XWy1 <- XWyd(Xs,w,y=yy,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(XWy1-XWy0);plot(XWy1,XWy0,pch=".")

A <- XWXd(G$Xd,w,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
B <- XWXd(Xs,w,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(A-B);plot(A,B,pch=".")

V <- crossprod(matrix(runif(p*p),p,p))
ii <- c(20:30,100:200)
jj <- c(50:90,150:160)
V[ii,jj] <- 0;V[jj,ii] <- 0
d1 <- diagXVXd(G$Xd,V,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
Vs <- as(V,"dgCMatrix")
d2 <- diagXVXd(Xs,Vs,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(d1-d2);plot(d1,d2,pch=".")
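## A further sketch, not part of the original example code: dense cross-checks
## reusing the objects created above. Xbd applied to an identity matrix
## reconstructs the full model matrix implied by the discrete representation
## (only sensible at this small size, and used purely as a check).
Xf <- Xbd(G$Xd,diag(p),G$kd,G$ks,G$ts,G$dt,G$v,G$qc)
range(crossprod(Xf, w*dat$y) -
      XWyd(G$Xd,w,y=dat$y,G$kd,G$ks,G$ts,G$dt,G$v,G$qc)) ## X'Wy, should be ~ 0
range(rowSums((Xf %*% V) * Xf) - d1)                     ## diag(XVX'), should be ~ 0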