Local Structural Equation Models (LSEM)
Local structural equation models (LSEM) are structural equation models (SEM)
which are evaluated for each value of a pre-defined moderator variable
(Hildebrandt, Wilhelm, & Robitzsch, 2009; Hildebrandt, Luedtke, Robitzsch,
Sommer & Wilhelm, 2016).
As in nonparametric regression models, observations near a focal point (at
which the model is evaluated) receive higher weights, while distant observations
receive lower weights. The LSEM can be specified using lavaan syntax.
It is also possible to specify a discretized version of LSEM in which
values of the moderator are grouped and a multiple group SEM is specified.
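The local weighting idea can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper name local_weights_sketch is hypothetical, and the bandwidth rule bw = h * sd(moderator) * N^(-1/5) is an assumption about how the bandwidth factor h is turned into a bandwidth.

```r
# Gaussian kernel weights around focal points (illustrative sketch, base R only).
# Assumption: bw = h * sd(moderator) * N^(-1/5); lsem.estimate may differ in detail.
local_weights_sketch <- function(mod, grid, h=1.1, bw=NULL, eps=1e-8) {
    N <- length(mod)
    if (is.null(bw)) {
        bw <- h * stats::sd(mod) * N^(-1/5)
    }
    # rows: observations, columns: focal points
    w <- sapply(grid, function(x0) {
        # kernel scaled so that an observation exactly at the focal point has weight 1
        stats::dnorm((mod - x0) / bw) / stats::dnorm(0)
    })
    w[w < eps] <- eps   # floor for very small weights
    colnames(w) <- paste0("focal_", grid)
    w
}

set.seed(1)
age <- stats::runif(200, min=4, max=23)   # simulated moderator values
w <- local_weights_sketch(age, grid=c(8, 12, 16), h=2)
# sum of weights at each focal point
colSums(w)
```

Each column of w holds the weights that a model evaluated at the corresponding focal point would use; observations close to the focal point dominate the local fit.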
The LSEM can be tested by employing a permutation test; see lsem.permutationTest.
The function lsem.MGM.stepfunctions outputs step functions
for a multiple group model evaluated at a grid of focal points of the
moderator, specified in moderator.grid.
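What a step function over moderator groups looks like can be sketched in base R. The breaks mirror a four-group setup, the parameter estimates in est are made-up values for illustration only, and the code does not call lsem.MGM.stepfunctions itself.

```r
# Evaluate a piecewise-constant (step) function at a grid of focal points.
breaks <- seq(3.5, 23.5, length.out=5)   # boundaries of 4 moderator groups
est <- c(0.61, 0.66, 0.72, 0.70)         # hypothetical group-wise estimates
grid <- seq(4, 23, 1)                    # focal points

# index of the group that contains each focal point
group <- findInterval(grid, breaks, rightmost.closed=TRUE)
step_values <- est[group]

data.frame(moderator=grid, group=group, value=step_values)
```

Every focal point inherits the estimate of the group it falls into, so the resulting function is constant within groups and jumps at the breaks.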
The argument pseudo_weights provides an ad hoc solution to estimate
an LSEM for any model which can be fitted in lavaan.
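One plausible reading of the pseudo weighting idea is sketched below in base R: local weights are rescaled to a target sample size, rounded to integers, and used as replication counts for the rows of the data. The helper pseudo_weight_sketch and the scaling by the sum of weights are assumptions made for illustration; the package may scale the weights differently.

```r
# Turn local weights into integer replication counts (illustrative sketch).
pseudo_weight_sketch <- function(w, target) {
    # assumption: scale weights so that they sum to about the target sample size
    k <- round(w * target / sum(w))
    # repeat each row index k times; rows with k = 0 drop out
    rep(seq_along(w), times=k)
}

w <- c(1.0, 0.8, 0.4, 0.1, 0.01)        # hypothetical local weights of 5 cases
idx <- pseudo_weight_sketch(w, target=50)
table(idx)                              # how often each case is replicated
```

The replicated data set dat[idx, ] could then be passed to any model that lavaan can fit without weights, which is why the approach is ad hoc: replicated rows are treated as independent observations.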
It is also possible to constrain some of the parameters along the values
of the moderator in a joint estimation approach (est_joint=TRUE). Names of
parameters that are assumed to be invariant can be specified in par_invariant.
In addition, linear or quadratic constraints can be imposed on
parameters (par_linear or par_quadratic).
Statistical inference in case of joint estimation (but also for separate estimation)
can be conducted via bootstrap using the function lsem.bootstrap.
Bootstrap at the level of a cluster identifier is allowed (argument cluster).
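Resampling at the cluster level can be sketched in base R by building a replication design in which all members of a drawn cluster receive the same count. The helper cluster_repl_design is hypothetical, and whether the cluster argument of lsem.bootstrap proceeds exactly like this is an assumption.

```r
# Build a cluster-level bootstrap replication design (illustrative sketch).
cluster_repl_design <- function(cluster, R, seed=1) {
    set.seed(seed)   # note: this changes the global RNG state
    ids <- unique(cluster)
    design <- matrix(0, nrow=length(cluster), ncol=R)
    for (rr in seq_len(R)) {
        # draw whole clusters with replacement
        drawn <- ids[sample.int(length(ids), replace=TRUE)]
        counts <- table(factor(drawn, levels=ids))
        # every case gets the number of times its cluster was drawn
        design[, rr] <- as.numeric(counts[match(cluster, ids)])
    }
    design
}

school <- rep(1:10, each=5)              # hypothetical cluster identifier
rd <- cluster_repl_design(school, R=20)
head(rd[, 1:5])
```

A matrix of this kind has the same shape as the repl_design argument (cases in rows, replications in columns); with R bootstrap replications, repl_factor=1/R would be the matching replication factor.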
lsem.estimate(data, moderator, moderator.grid, lavmodel, type="LSEM", h=1.1, bw=NULL,
    residualize=TRUE, fit_measures=c("rmsea", "cfi", "tli", "gfi", "srmr"),
    standardized=FALSE, standardized_type="std.all", lavaan_fct="sem",
    sufficient_statistics=FALSE, use_lavaan_survey=FALSE, pseudo_weights=0,
    sampling_weights=NULL, est_joint=FALSE, par_invariant=NULL, par_linear=NULL,
    par_quadratic=NULL, partable_joint=NULL, se=NULL, kernel="gaussian",
    eps=1e-08, verbose=TRUE, ...)

## S3 method for class 'lsem'
summary(object, file=NULL, digits=3, ...)

## S3 method for class 'lsem'
plot(x, parindex=NULL, ask=TRUE, ci=TRUE, lintrend=TRUE, parsummary=TRUE,
    ylim=NULL, xlab=NULL, ylab=NULL, main=NULL, digits=3, ...)

lsem.MGM.stepfunctions( object, moderator.grid )

# compute local weights
lsem_local_weights(data.mod, moderator.grid, h, sampling_weights=NULL, bw=NULL,
    kernel="gaussian")

lsem.bootstrap(object, R=100, verbose=TRUE, cluster=NULL, seed=1, repl_design=NULL,
    repl_factor=NULL)
data | Data frame
moderator | Variable name of the moderator
moderator.grid | Focal points at which the LSEM should be evaluated. If
lavmodel | Specified SEM in lavaan.
type | Type of estimated model. The default is
h | Bandwidth factor
bw | Optional bandwidth parameter if
residualize | Logical indicating whether a residualization should be applied.
fit_measures | Vector with names of fit measures following the labels in lavaan
standardized | Optional logical indicating whether standardized solution should be included as parameters in the output using the
standardized_type | Type of standardization if
lavaan_fct | String whether
sufficient_statistics | Logical indicating whether sufficient statistics of weighted means and covariances should be used for model fitting. This option must be used if the data contain missing values. Note that this approach is only valid for missing completely at random (MCAR) data. The option can only be used for continuous data.
use_lavaan_survey | Logical indicating whether estimation should be conducted with the lavaan.survey package.
pseudo_weights | Integer defining a target sample size. Local weights are multiplied by a factor which is rounded to integers. This approach is referred to as a pseudo weighting approach. For example, using
sampling_weights | Optional vector of sampling weights
est_joint | Logical indicating whether LSEM should be estimated in a joint estimation approach. This option only works with continuous data and sufficient statistics.
par_invariant | Vector of invariant parameters
par_linear | Vector of parameters with linear function
par_quadratic | Vector of parameters with quadratic function
partable_joint | User-defined parameter table if joint estimation is used (
se | Type of standard error used in
kernel | Type of kernel function. Can be
eps | Minimum number for weights
verbose | Optional logical printing information about computation progress.
object | Object of class
file | A file name in which the summary output will be written.
digits | Number of digits.
x | Object of class
parindex | Vector of indices for parameters in plot function.
ask | A logical which asks for changing the graphic for each parameter.
ci | Logical indicating whether confidence intervals should be plotted.
lintrend | Logical indicating whether a linear trend should be plotted.
parsummary | Logical indicating whether a parameter summary should be displayed.
ylim | Plot parameter
xlab | Plot parameter
ylab | Plot parameter
main | Plot parameter
... | Further arguments to be passed to
data.mod | Observed values of the moderator
R | Number of bootstrap samples
cluster | Optional variable name for bootstrap at the level of a cluster identifier
seed | Random seed used in bootstrap. Note that the seed is only defined locally in this function; it does not affect the seed in the global R environment.
repl_design | Optional matrix containing replication weights for computation of standard errors. Note that sampling weights have to be already included in
repl_factor | Replication factor in variance formula for statistical inference, e.g., 0.05 in PISA.
List with the following entries
parameters | Data frame with all parameters estimated at focal points of moderator. Bias-corrected estimates under bootstrap can be found in the column
weights | Data frame with weights at each focal point
parameters_summary | Summary table for estimated parameters
parametersM | Estimated parameters in matrix form. Parameters are in columns and values of the grid of the moderator are in rows.
bw | Used bandwidth
h | Used bandwidth factor
N | Sample size
moderator.density | Estimated frequencies and effective sample size for moderator at focal points
moderator.stat | Descriptive statistics for moderator
moderator | Variable name of moderator
moderator.grid | Used grid of focal points for moderator
moderator.grouped | Data frame with information about grouping of moderator if
residualized.intercepts | Estimated intercept functions used for residualization.
lavmodel | Used lavaan model
data | Used data frame, possibly residualized if
model_parameters | Model parameters in LSEM
parameters_boot | Parameter values in each bootstrap sample (for
fitstats_joint_boot | Fit statistics in each bootstrap sample (for
Alexander Robitzsch, Oliver Luedtke, Andrea Hildebrandt
Hildebrandt, A., Luedtke, O., Robitzsch, A., Sommer, C., & Wilhelm, O. (2016). Exploring factor model parameters across continuous variables with local structural equation models. Multivariate Behavioral Research, 51(2-3), 257-278. doi: 10.1080/00273171.2016.1142856
Hildebrandt, A., Wilhelm, O., & Robitzsch, A. (2009). Complementary and competing factor analytic approaches for the investigation of measurement invariance. Review of Psychology, 16, 87-102.
## Not run:
#############################################################################
# EXAMPLE 1: data.lsem01 | Age differentiation
#############################################################################

data(data.lsem01, package="sirt")
dat <- data.lsem01

# specify lavaan model
lavmodel <- "
        F=~ v1+v2+v3+v4+v5
        F ~~ 1*F"

# define grid of moderator variable age
moderator.grid <- seq(4,23,1)

#********************************
#*** Model 1: estimate LSEM with bandwidth factor h=2
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod1)
plot(mod1, parindex=1:5)

# perform permutation test for Model 1
# (the number of permutations B is set to a low value of 10
#  for illustrative purposes only)
pmod1 <- sirt::lsem.permutationTest( mod1, B=10 )
summary(pmod1)
plot(pmod1, type="global")

#** estimate Model 1 based on pseudo weights
mod1b <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE, pseudo_weights=50 )
summary(mod1b)

#** estimation with sampling weights

# generate random sampling weights
set.seed(987)
weights <- stats::runif(nrow(dat), min=.4, max=3 )
mod1c <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, sampling_weights=weights)
summary(mod1c)

#********************************
#*** Model 2: estimate multiple group model with 4 age groups

# define breaks for age groups
moderator.grid <- seq( 3.5, 23.5, len=5) # 4 groups
# estimate model
mod2 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, type="MGM", std.lv=TRUE)
summary(mod2)

# output step functions
smod2 <- sirt::lsem.MGM.stepfunctions( object=mod2, moderator.grid=seq(4,23,1) )
str(smod2)

#********************************
#*** Model 3: define standardized loadings as derived variables

# specify lavaan model
lavmodel <- "
        F=~ a1*v1+a2*v2+a3*v3+a4*v4
        v1 ~~ s1*v1
        v2 ~~ s2*v2
        v3 ~~ s3*v3
        v4 ~~ s4*v4
        F ~~ 1*F
        # standardized loadings
        l1 :=a1 / sqrt(a1^2 + s1 )
        l2 :=a2 / sqrt(a2^2 + s2 )
        l3 :=a3 / sqrt(a3^2 + s3 )
        l4 :=a4 / sqrt(a4^2 + s4 )
        "
# estimate model
mod3 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod3)
plot(mod3)

#********************************
#*** Model 4: estimate LSEM and automatically include standardized solutions

lavmodel <- "
        F=~ 1*v1+v2+v3+v4
        F ~~ F"
mod4 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE)
summary(mod4)
# permutation test (use only few permutations for testing purposes)
pmod1 <- sirt::lsem.permutationTest( mod4, B=3 )

#**** compute LSEM local weights
wgt <- sirt::lsem_local_weights(data.mod=dat$age, moderator.grid=moderator.grid,
               h=2)$weights
str(wgt)

#********************************
#*** Model 5: invariance parameter constraints and other constraints

lavmodel <- "
        F=~ 1*v1+v2+v3+v4
        F ~~ F"
moderator.grid <- seq(4,23,4)

#- estimate model without constraints
mod5a <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE)
summary(mod5a)
# extract parameter names
mod5a$model_parameters

#- invariance constraints on a loading and a residual variance
par_invariant <- c("F=~v2","v2~~v2")
mod5b <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE, par_invariant=par_invariant)
summary(mod5b)

#- bootstrap for statistical inference
bmod5b <- sirt::lsem.bootstrap(mod5b, R=100)
# inspect parameter values and standard errors
bmod5b$parameters

#- user-defined replication design
R <- 100    # bootstrap samples
N <- nrow(dat)
repl_design <- matrix(0, nrow=N, ncol=R)
for (rr in 1:R){
    indices <- sort( sample(1:N, replace=TRUE) )
    repl_design[,rr] <- sapply(1:N, FUN=function(ii){ sum(indices==ii) } )
}
head(repl_design)
bmod5b1 <- sirt::lsem.bootstrap(mod5a, repl_design=repl_design, repl_factor=1/R)

#- compare model mod5b with joint estimation without constraints
mod5c <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE, est_joint=TRUE)
summary(mod5c)

#- linear and quadratic functions
par_invariant <- c("F=~v1","v2~~v2")
par_linear <- c("v1~~v1")
par_quadratic <- c("v4~~v4")
mod5d <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, par_invariant=par_invariant,
               par_linear=par_linear, par_quadratic=par_quadratic)
summary(mod5d)

#- user-defined constraints: step functions for parameters

# inspect parameter table (from lavaan) of fitted model
pj <- mod5d$partable_joint
#* modify parameter table for user-defined constraints
# define step function for F=~v1 which is constant on intervals 1:4 and 5:7
# remaining rows of the parameter table (assumed: rows with con==0)
pj1 <- pj[ pj$con==0, ]
pj2 <- pj[ pj$con==1, ]
pj2[ c(5,6), "lhs" ] <- "p1g5"
pj2 <- pj2[ -4, ]
partable_joint <- rbind(pj1, pj2)
# estimate model with constraints
mod5e <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE, estimator="ML",
               partable_joint=partable_joint)
summary(mod5e)

#############################################################################
# EXAMPLE 2: data.lsem01 | FIML with missing data
#############################################################################

data(data.lsem01, package="sirt")
dat <- data.lsem01
# induce artificial missing values
set.seed(98)
dat[ stats::runif(nrow(dat)) < .5, c("v1")] <- NA
dat[ stats::runif(nrow(dat)) < .25, c("v2")] <- NA

# specify lavaan model
lavmodel1 <- "
        F=~ v1+v2+v3+v4+v5
        F ~~ 1*F"

# define grid of moderator variable age
moderator.grid <- seq(4,23,2)

#*** estimate LSEM with FIML
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel1, h=2, std.lv=TRUE, estimator="ML", missing="fiml")
summary(mod1)

#############################################################################
# EXAMPLE 3: data.lsem01 | WLSMV estimation
#############################################################################

data(data.lsem01, package="sirt")
dat <- data.lsem01

# create artificial dichotomous data
for (vv in 2:6){
    dat[,vv] <- 1*(dat[,vv] > mean(dat[,vv]))
}

# specify lavaan model
lavmodel1 <- "
        F=~ v1+v2+v3+v4+v5
        F ~~ 1*F
        v1 | t1
        v2 | t1
        v3 | t1
        v4 | t1
        v5 | t1
        "

# define grid of moderator variable age
moderator.grid <- seq(4,23,2)

#*** local WLSMV estimation
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel1, h=2, std.lv=TRUE, estimator="DWLS",
               ordered=paste0("v",1:5), residualize=FALSE, pseudo_weights=10000,
               parameterization="THETA" )
summary(mod1)

## End(Not run)