Create correlation matrices or data matrices with a particular measurement and structural model
Structural Equation Models decompose correlation or covariance matrices into a measurement (factor) model and a structural (regression) model. sim.structure creates data sets with known measurement and structural properties. Population or sample correlation matrices with known properties are generated. Optionally, raw data are produced.
It is also possible to specify a measurement model for a set of x variables separately from a set of y variables. They are then combined into one model with the correlation structure between the two sets.
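As an illustration of combining separate x and y measurement models, the sketch below stacks fx and fy into one block loading matrix so that a single Phi can describe the correlations among all latent variables. The helper combine_loadings is hypothetical and is not part of the package; the block layout is an assumption about how the two sets are joined, and the internals of sim.structure may differ.

```r
# Hypothetical helper (not a psych function): place fx in the upper-left
# block and fy in the lower-right block of one loading matrix.
combine_loadings <- function(fx, fy) {
  fx <- as.matrix(fx)
  fy <- as.matrix(fy)
  f <- matrix(0, nrow = nrow(fx) + nrow(fy), ncol = ncol(fx) + ncol(fy))
  f[seq_len(nrow(fx)), seq_len(ncol(fx))] <- fx
  f[nrow(fx) + seq_len(nrow(fy)), ncol(fx) + seq_len(ncol(fy))] <- fy
  f
}

# Loadings taken from the examples on this page: 5 x variables on 2 factors,
# 3 y variables on 1 factor.
fx <- matrix(c(.9, .8, .6, rep(0, 4), .6, .8, -.7), ncol = 2)
fy <- matrix(c(.6, .5, .4), ncol = 1)

f <- combine_loadings(fx, fy)
dim(f)  # 8 variables by 3 latent variables, matching a 3 x 3 Phi
```

With this layout, the off-diagonal elements of Phi that link x factors to y factors carry the correlation structure between the two sets.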
Finally, the general case is, given a population correlation matrix, to generate data that will reproduce (with sampling variability) that correlation matrix. See sim.correlation.
sim.structure(fx=NULL, Phi=NULL, fy=NULL, f=NULL, n=0, uniq=NULL, raw=TRUE,
    items=FALSE, low=-2, high=2, d=NULL, cat=5, mu=0)
sim.structural(fx=NULL, Phi=NULL, fy=NULL, f=NULL, n=0, uniq=NULL, raw=TRUE,
    items=FALSE, low=-2, high=2, d=NULL, cat=5, mu=0)   #deprecated
simCor(R, n=1000, data=FALSE, scale=TRUE,
    skew=c("none","log","lognormal","sqrt","abs"), vars=NULL, latent=FALSE, quant=NULL)
sim.correlation(R, n=1000, data=FALSE, scale=TRUE,
    skew=c("none","log","lognormal","sqrt","abs"), vars=NULL, latent=FALSE, quant=NULL)
fx | The measurement model for x
Phi | The structure matrix of the latent variables
fy | The measurement model for y
f | The measurement model
n | Number of cases to simulate. If n=0, the population matrix is returned.
uniq | The uniquenesses if creating a covariance matrix
raw | If raw=TRUE, raw data are returned as well for n > 0.
items | TRUE if simulating items, FALSE if simulating scales
low | Restrict the item difficulties to range from low to high
high | Restrict the item difficulties to range from low to high
d | A vector of item difficulties; if NULL, they will range uniformly from low to high
cat | Number of categories when creating binary (2) or polytomous items
mu | A vector of means, defaults to 0
R | The correlation matrix to reproduce
data | If TRUE, return the raw data, otherwise return the sample correlation matrix.
scale | Standardize the simulated data?
skew | Defaults to none (the multivariate normal case). Alternatives take the log, the square root, or the absolute value of latent or observed data.
vars | Apply the skewing or cuts to just these variables. If NULL, to all the variables.
latent | Should the skewing transforms be applied to the latent variables, or the observed variables?
quant | Either a single number or a vector of length nvar. The data will be dichotomized at quant.
Given the measurement model f (or fx and fy) and the structure model Phi, the implied model is f %*% Phi %*% t(f), that is, f Phi f'. Reliability is f %*% t(f); the reliability of each test is the item's communality, or just the diagonal of the model.
If creating a correlation matrix (uniq=NULL), then the diagonal is set to 1; otherwise the diagonal is diag(model) + uniq and the resulting structure is a covariance matrix.
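The algebra described above can be sketched in base R. This is a minimal illustration, not the package's own code; the loadings, the factor correlation of .5, and the uniquenesses of .5 are made-up values chosen only for the example.

```r
# Illustrative values (assumptions, not from the package): six variables,
# two correlated factors, simple structure.
f <- matrix(c(.9, .8, .7, 0, 0, 0,
              0, 0, 0, .8, .7, .6), ncol = 2)
Phi <- matrix(c(1, .5, .5, 1), ncol = 2)

# Implied model: f Phi f'
model <- f %*% Phi %*% t(f)

# Reliability of each variable: its communality, the diagonal of f f'.
# With simple structure (one loading per variable) this equals diag(model).
reliability <- diag(f %*% t(f))

# Correlation-matrix case (uniq = NULL): set the diagonal to 1.
R <- model
diag(R) <- 1

# Covariance-matrix case: add the uniquenesses to the model diagonal.
uniq <- rep(.5, nrow(f))
C <- model
diag(C) <- diag(model) + uniq

round(R, 2)
round(C, 2)
```

The off-diagonal elements are the same in both results; only the diagonal differs between the correlation and covariance versions.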
A special case of a structural model is the one factor model, such as parallel tests, tau equivalent tests, and congeneric tests. These may be created by letting the structure matrix = 1 and then defining a vector of factor loadings. Alternatively, sim.congeneric will do the same.
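A one factor model can be written out by hand as a quick check on the idea. The sketch below uses base R only and made-up loadings; it is an illustration of the algebra rather than the package's implementation.

```r
# Congeneric case: one factor, unequal loadings (illustrative values).
loadings <- c(.9, .8, .7, .6)
f <- matrix(loadings, ncol = 1)
Phi <- matrix(1, nrow = 1, ncol = 1)  # structure matrix = 1

congeneric_R <- f %*% Phi %*% t(f)
diag(congeneric_R) <- 1  # correlation matrix

# Parallel tests: the same construction with equal loadings.
f_parallel <- matrix(rep(.7, 4), ncol = 1)
parallel_R <- f_parallel %*% Phi %*% t(f_parallel)
diag(parallel_R) <- 1

round(congeneric_R, 2)
round(parallel_R, 2)
```

In the congeneric matrix each correlation is the product of the two loadings, so the correlations differ across pairs; in the parallel case every off-diagonal correlation is identical.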
The general case is to use simCor (aka sim.correlation), which will create data sampled from a specified correlation matrix for a particular sample size. By default it just returns the sample correlation matrix; with data=TRUE, it returns the sample data instead. It uses an eigenvalue decomposition of the original matrix times a matrix of random normal deviates (code adapted from the mvrnorm function of Brian Ripley's MASS package). The resulting scores may be transformed using a number of transforms (see the skew option) or made into dichotomous variables (see the quant option) for all or a select set (vars option) of the variables.
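The eigenvalue approach can be sketched in a few lines of base R. The function sample_from_R below is a hypothetical stand-in written for illustration, not the code used by simCor; the cut at 0 in the last step is likewise an assumed example of dichotomizing selected variables.

```r
# Hypothetical helper (not a psych function): draw n cases whose population
# correlation matrix is R, using an eigenvalue decomposition of R.
sample_from_R <- function(R, n) {
  e <- eigen(R, symmetric = TRUE)
  ev <- pmax(e$values, 0)                     # guard against tiny negative values
  z <- matrix(rnorm(n * ncol(R)), nrow = n)   # independent normal deviates
  z %*% diag(sqrt(ev), nrow = length(ev)) %*% t(e$vectors)
}

# Population matrix: a one factor model with illustrative loadings.
loadings <- c(.9, .8, .7, .6)
R <- loadings %*% t(loadings)
diag(R) <- 1

set.seed(42)
x <- sample_from_R(R, n = 10000)
round(cor(x), 2)   # close to R, up to sampling variability

# Assumed example of dichotomizing just the first two variables at 0.
x_cut <- x
x_cut[, 1:2] <- (x[, 1:2] > 0) * 1
```

Because the deviates have identity covariance, multiplying by the square roots of the eigenvalues and the transposed eigenvectors gives scores whose population covariance is R.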
model | The implied population correlation or covariance matrix
reliability | The population reliability values
r | The sample correlation or covariance matrix
observed | If raw=TRUE, a sample data matrix
William Revelle
Revelle, W. (in preparation) An Introduction to Psychometric Theory with applications in R. Springer. at https://personality-project.org/r/book/
make.hierarchical for another structural model and make.congeneric for the one factor case. structure.list for making symbolic structures.
#First, create a sem like model with a factor model of x and ys with correlation Phi
fx <- matrix(c(.9,.8,.6,rep(0,4),.6,.8,-.7),ncol=2)
fy <- matrix(c(.6,.5,.4),ncol=1)
rownames(fx) <- c("V","Q","A","nach","Anx")
rownames(fy) <- c("gpa","Pre","MA")
Phi <- matrix(c(1,0,.7,.0,1,.7,.7,.7,1),ncol=3)
#now create this structure
gre.gpa <- sim.structural(fx,Phi,fy)
print(gre.gpa,2)
#correct for attenuation to see structure
#the raw correlations are below the diagonal, the adjusted above
round(correct.cor(gre.gpa$model,gre.gpa$reliability),2)

#These are the population values,
# we can also create a correlation matrix sampled from this population
GRE.GPA <- sim.structural(fx,Phi,fy,n=250,raw=FALSE)
lowerMat(GRE.GPA$r)

#or we can show data sampled from such a population
GRE.GPA <- sim.structural(fx,Phi,fy,n=250,raw=TRUE)
lowerCor(GRE.GPA$observed)

congeneric <- sim.structure(f=c(.9,.8,.7,.6)) # a congeneric model
congeneric

#now take this correlation matrix as a population value and create samples from it
example.congeneric <- sim.correlation(congeneric$model,n=200) #create a sample matrix
lowerMat(example.congeneric) #show the correlation matrix

#or create another sample and show the data
example.congeneric.data <- simCor(congeneric$model,n=200,data=TRUE)
describe(example.congeneric.data)
lowerCor(example.congeneric.data)

example.skewed <- simCor(congeneric$model,n=200,vars=c(1,2),data=TRUE,skew="log")
describe(example.skewed)