Simulated data set for testing sparse-precision code
This is used in tests/test-adjacency-long.R
data("adjlg")
Includes an adjacency matrix adjlgMat
. and a data frame adjlg
with 5474 observations on the following 8 variables.
ID
a factor with levels 1
to 1000
months
a numeric vector
GENDER
a character vector
AGE
a numeric vector
X1
a numeric vector
X2
a numeric vector
month
a numeric vector
BUY
a numeric vector
The simulation code shown below is derived from an example produced by Jeroen van den Ochtend. Following a change incorporated in spaMM version 3.8.0, that implied stricter checks of the input matrix, it appeared that the precision matrix generated by this example had inappropriate (repeated) dimnames. This example was then updated to reproduce past fitting results with a correctly formatted matrix. Note that changing the names of an adjacency matrix (as below) is generally unwise as it generally changes the statistical model because these names are matched whenever possible to levels of the grouping factor in the data.
The code was also modified to compensate for changes in R's default random number generator.
data(adjlg) ## See further usage in tests/test-adjacency-long.R ## Not run: # as produced by: library(data.table) ## Included data produced using version 1.10.4.3 library(igraph) ## Included data produced using version 1.2.1 rsample <- function(N=100, ## size of implied adjacency matrix month_max=10,seed) { if (is.integer(seed)) set.seed(seed) dt <- data.table(ID=factor(1:N)) dt$months <- sample(1:month_max,N,replace=T) ## # of liens for each level of ID dt$GENDER <- sample(c("MALE","FEMALE"),N,replace=TRUE) dt$AGE <- sample(18:99,N,replace=T) dt$X1 <- sample(1000:9900,N,replace=T) dt$X2 <- runif(N) dt <- dt[, c(.SD, month=data.table(seq(from=1, to=months, by = 1))), by = ID] dt[,BUY := 0] dt[month.V1==months,BUY := sample(c(0,1),1),by=ID] setnames(dt,"month.V1","month") #### create adjacency matrix Network <- data.table(OUT=sample(dt$ID,N*month_max*4/10)) Network$IN <- sample(dt$ID,N*month_max*4/10) Network <- Network[IN != OUT] Network <- unique(Network) g <- graph.data.frame(Network,directed=F) g <- add_vertices(g,sum(!unique(dt$ID) %in% V(g)), name=unique(dt[!dt$ID %in% V(g),list(ID)])) # => improper names Network <- as_adjacency_matrix(g,sparse = TRUE,type="both") colnames(Network) <- rownames(Network) <- seq(nrow(Network)) # post-v3.8.0 names return(list(data=dt,adjMatrix=Network)) } RNGkind("Mersenne-Twister", "Inversion", "Rounding" ) set.seed(123) adjlg_sam <- rsample(N=1000,seed=NULL) RNGkind("Mersenne-Twister", "Inversion", "Rejection" ) # adjlg <- as.data.frame(adjlg_sam$data) adjlgMat <- adjlg_sam$adjMatrix ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.