Estimating the Generalized DINA (GDINA) Model
This function implements the generalized DINA model for dichotomous
attributes (GDINA; de la Torre, 2011) and polytomous attributes
(pGDINA; Chen & de la Torre, 2013, 2018).
Multiple group estimation is also possible using the gdina function.
The function also allows for the estimation of a higher order GDINA model
(de la Torre & Douglas, 2004).
Polytomous item responses are treated by specifying a sequential
GDINA model (Ma & de la Torre, 2016; Tutz, 1997).
The simultaneous modeling of skills and misconceptions (bugs) can
also be handled within the GDINA framework (see Kuo, Chen & de la Torre, 2018;
see argument rule).
The estimation can also be conducted by imposing monotonicity
constraints (Hong, Chang, & Tsai, 2016) using the argument mono.constr.
Moreover, the regularization methods SCAD, lasso, ridge, SCAD-L2 and
truncated L_1 penalty (TLP) for item parameters
can be employed (Xu & Shang, 2018).
Normally distributed priors can be specified for item parameters (item intercepts and item slopes). Note that (for convenience) the prior specification holds simultaneously for all items.
gdina(data, q.matrix, skillclasses=NULL, conv.crit=0.0001, dev.crit=.1, maxit=1000,
    linkfct="identity", Mj=NULL, group=NULL, invariance=TRUE, method=NULL,
    delta.init=NULL, delta.fixed=NULL, delta.designmatrix=NULL,
    delta.basispar.lower=NULL, delta.basispar.upper=NULL, delta.basispar.init=NULL,
    zeroprob.skillclasses=NULL, attr.prob.init=NULL, reduced.skillspace=NULL,
    reduced.skillspace.method=2, HOGDINA=-1, Z.skillspace=NULL,
    weights=rep(1, nrow(data)), rule="GDINA", bugs=NULL,
    regular_lam=0, regular_type="none", regular_alpha=NA, regular_tau=NA,
    regular_weights=NULL, mono.constr=FALSE, prior_intercepts=NULL, prior_slopes=NULL,
    progress=TRUE, progress.item=FALSE, mstep_iter=10, mstep_conv=1E-4,
    increment.factor=1.01, fac.oldxsi=0, max.increment=.3, avoid.zeroprobs=FALSE,
    seed=0, save.devmin=TRUE, calc.se=TRUE, se_version=1, PEM=TRUE, PEM_itermax=maxit,
    cd=FALSE, cd_steps=1, mono_maxiter=10, freq_weights=FALSE, optimizer="CDM", ...)

## S3 method for class 'gdina'
summary(object, digits=4, file=NULL, ...)

## S3 method for class 'gdina'
plot(x, ask=FALSE, ...)

## S3 method for class 'gdina'
print(x, ...)
data |
A required N \times J data matrix
containing integer responses, 0, 1, ..., K. Polytomous
item responses are treated by the sequential GDINA model.
|
q.matrix |
A required integer J \times K matrix indicating which attributes are required (1) or not required (0) to master each item in the case of dichotomous attributes, or containing integer levels in the case of polytomous attributes. For polytomous item responses, the Q-matrix must also include the item name and item category; see Example 11. |
skillclasses |
An optional matrix for determining the skill space. The argument can be used if a user wants fewer than 2^K skill classes. |
conv.crit |
Convergence criterion for maximum absolute change in item parameters |
dev.crit |
Convergence criterion for maximum absolute change in deviance |
maxit |
Maximum number of iterations |
linkfct |
A string which indicates the link function for the GDINA model.
Options are |
Mj |
A list of design matrices and labels for each item.
The definition of |
group |
A vector of group identifiers for multiple group
estimation. Default is |
invariance |
Logical indicating whether invariance of item parameters
is assumed for multiple group models. If a subset of items should
be treated as noninvariant, then |
method |
Estimation method for item parameters (see de la Torre, 2011).
The default |
delta.init |
List with initial δ parameters |
delta.fixed |
List with fixed δ parameters.
For free estimated parameters |
delta.designmatrix |
A design matrix for restrictions on delta. See Example 4. |
delta.basispar.lower |
Lower bounds for delta basis parameters. |
delta.basispar.upper |
Upper bounds for delta basis parameters. |
delta.basispar.init |
An optional vector of starting values for the basis parameters of delta.
This argument only applies when using a designmatrix for delta,
i.e. |
zeroprob.skillclasses |
An optional vector of integers which indicates which skill classes should have zero probability. Default is NULL (no skill classes with zero probability). |
attr.prob.init |
Initial probabilities of skill distribution. |
reduced.skillspace |
A logical which indicates if the latent class skill space dimension
should be reduced (see Xu & von Davier, 2008). The default is |
reduced.skillspace.method |
Computation method for skill space reduction
in case of |
HOGDINA |
Values of -1, 0 or 1 indicating if a higher order GDINA model (see Details) should be estimated. The default value of -1 corresponds to the case that no higher order factor is assumed to exist. A value of 0 corresponds to independent attributes. A value of 1 assumes the existence of a higher order factor. |
Z.skillspace |
A user specified design matrix for the skill space reduction as described in Xu and von Davier (2008). See Example 6 for an application. |
weights |
An optional vector of sample weights. |
rule |
A string or a vector of itemwise condensation rules. Allowed entries are
|
bugs |
Character vector indicating which columns in the Q-matrix
refer to bugs (misconceptions). This is only available if some |
regular_lam |
Regularization parameter λ |
regular_type |
Type of regularization. Can be |
regular_alpha |
Regularization parameter α (applicable for elastic net or SCAD-L2). |
regular_tau |
Regularization parameter τ for truncated L_1 penalty. |
regular_weights |
Optional list of item parameter weights used for penalties in regularized estimation (see Example 13) |
mono.constr |
Logical indicating whether monotonicity constraints should be fulfilled in estimation (implemented by the increasing penalty method; see Nash, 2014, p. 156). |
prior_intercepts |
Vector with mean and standard deviation for prior of random intercepts (applies to all items) |
prior_slopes |
Vector with mean and standard deviation for prior of random slopes (applies to all items and all parameters) |
progress |
An optional logical indicating whether the function should print the progress of iteration in the estimation process. |
progress.item |
An optional logical indicating whether item wise progress should be displayed |
mstep_iter |
Number of iterations in M-step if |
mstep_conv |
Convergence criterion in M-step if |
increment.factor |
A factor larger than 1 (say 1.1) to control maximum increments in item parameters. This parameter can be used in case of nonconvergence. |
fac.oldxsi |
A convergence acceleration factor between 0 and 1 which defines the weight of previously estimated values in current parameter updates. |
max.increment |
Maximum size of change in increments in M steps
of EM algorithm when |
avoid.zeroprobs |
An optional logical indicating whether zero probabilities should
be avoided when estimating item parameters. Especially if
not all skill classes are used, it is recommended to switch
the argument to |
seed |
Simulation seed for initial parameters. A value of zero corresponds
to deterministic starting values, an integer value different from
zero to random initial values with |
save.devmin |
An optional logical indicating whether intermediate
estimates should be saved corresponding to minimal deviance.
Setting the argument to |
calc.se |
Optional logical indicating whether standard errors should be calculated. |
se_version |
Integer for calculation method of standard errors.
|
PEM |
Logical indicating whether the P-EM acceleration should be applied (Berlinet & Roland, 2012). |
PEM_itermax |
Number of iterations in which the P-EM method should be applied. |
cd |
Logical indicating whether coordinate descent algorithm should be used. |
cd_steps |
Number of steps for each parameter in coordinate descent algorithm |
mono_maxiter |
Maximum number of iterations for fulfilling the monotonicity constraint |
freq_weights |
Logical indicating whether frequency weights should
be used. Default is |
optimizer |
String indicating which optimizer should be used in
M-step estimation in case of |
object |
A required object of class |
digits |
Number of digits after decimal separator to display. |
file |
Optional file name for a file in which |
x |
A required object of class |
ask |
A logical indicating whether every separate item should
be displayed in |
... |
Optional parameters to be passed to or from other methods; these are ignored. |
The estimation is based on an EM algorithm as described in de la Torre (2011).
Item parameters are contained in the delta vector, which is a list where
the jth entry corresponds to the item parameters of the jth item.
The following description refers to the case of dichotomous attributes. For using polytomous attributes see Chen and de la Torre (2013) and Example 7 for a definition of the Q-matrix. In this case, Q_{jk}=l means that the jth item requires the mastery of (at least) level l of attribute k.
Assume that two skills α_1 and α_2 are required for mastering item j. Then the GDINA model can be written as
g[ P(X_{nj}=1 | α_n) ] = δ_{j0} + δ_{j1} α_{n1} + δ_{j2} α_{n2} + δ_{j12} α_{n1} α_{n2}
which is a two-way GDINA model (the rule="GDINA2" specification) with a
link function g (which can be the identity, logit or logarithmic link).
If the specification ACDM is chosen, then δ_{j12}=0.
The DINA model (rule="DINA") assumes δ_{j1}=δ_{j2}=0.
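As a rough illustration of the equation above, the following base R sketch evaluates the identity-link response function for one item requiring two attributes. The delta values and the helper gdina_prob are hypothetical and chosen only for illustration; they are not part of the CDM package.

```r
# Hypothetical delta parameters for one item (identity link); illustration only
delta <- c(d0 = 0.10, d1 = 0.20, d2 = 0.15, d12 = 0.45)

# All four attribute patterns (alpha1, alpha2): (0,0), (1,0), (0,1), (1,1)
alpha <- expand.grid(alpha1 = c(0, 1), alpha2 = c(0, 1))

# Hypothetical helper: P(X=1 | alpha) under the two-way GDINA model, identity link
gdina_prob <- function(delta, alpha) {
  delta[["d0"]] + delta[["d1"]] * alpha$alpha1 + delta[["d2"]] * alpha$alpha2 +
    delta[["d12"]] * alpha$alpha1 * alpha$alpha2
}

p_gdina <- gdina_prob(delta, alpha)

# ACDM: the interaction delta_{j12} is set to zero
p_acdm <- gdina_prob(replace(delta, "d12", 0), alpha)

# DINA: the main effects delta_{j1} and delta_{j2} are set to zero
p_dina <- gdina_prob(c(d0 = 0.10, d1 = 0, d2 = 0, d12 = 0.80), alpha)

cbind(alpha, p_gdina, p_acdm, p_dina)
```

Under the DINA restriction only the pattern with both attributes mastered differs from the baseline probability, whereas the ACDM restriction makes the attribute effects purely additive.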
For the reduced RUM model (rule="RRUM"), the item response model is
P(X_{nj}=1 | α_n) = π_j^\ast \cdot r_{j1}^{1-α_{n1}} \cdot r_{j2}^{1-α_{n2}}
Taking logarithms of both sides shows that
this model is equivalent to an additive model (rule="ACDM") with
a logarithmic link function (linkfct="log").
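The equivalence can be checked numerically. The sketch below, in base R with hypothetical values for π^\ast and the r parameters, rewrites the reduced RUM probabilities as an additive model on the log scale: the intercept is log π^\ast + log r_1 + log r_2 and the slope of attribute k is -log r_k.

```r
# Hypothetical reduced RUM parameters for one item with two attributes
pi_star <- 0.9
r <- c(0.4, 0.6)

# Attribute patterns (0,0), (1,0), (0,1), (1,1)
alpha <- as.matrix(expand.grid(alpha1 = c(0, 1), alpha2 = c(0, 1)))

# Reduced RUM form of the item response probabilities
p_rrum <- pi_star * r[1]^(1 - alpha[, 1]) * r[2]^(1 - alpha[, 2])

# Same probabilities written as an additive model with a log link
delta0 <- log(pi_star) + sum(log(r))   # intercept on the log scale
delta_main <- -log(r)                  # main effects on the log scale
p_acdm_log <- as.vector(exp(delta0 + alpha %*% delta_main))

max(abs(p_rrum - p_acdm_log))  # numerically zero
```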
If a reduced skillspace (reduced.skillspace=TRUE) is employed, then the
logarithm of the probability distribution of the attributes is modeled as a
log-linear model:
\log P[ (α_{n1}, α_{n2}, …, α_{nK}) ] = γ_0 + ∑_k γ_k α_{nk} + ∑_{k<l} γ_{kl} α_{nk} α_{nl}
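A minimal base R sketch of this log-linear representation for K=3 attributes follows. The γ values are hypothetical, and the design matrix Z is built by hand here; it only mimics the structure of the equation and is not claimed to match the internal Z.skillspace object of gdina.

```r
K <- 3

# All 2^K attribute patterns
alpha <- as.matrix(expand.grid(rep(list(c(0, 1)), K)))
colnames(alpha) <- paste0("A", 1:K)

# Design matrix: intercept, main effects, all two-way interactions
pairs <- utils::combn(K, 2)
inter <- apply(pairs, 2, function(p) alpha[, p[1]] * alpha[, p[2]])
Z <- cbind(1, alpha, inter)

# Hypothetical gamma parameters (intercept, 3 main effects, 3 interactions)
gamma <- c(0, -0.2, 0.1, 0.3, 0.5, 0.4, 0)

# Skill class probabilities; normalizing plays the role of gamma_0
logp <- as.vector(Z %*% gamma)
p <- exp(logp) / sum(exp(logp))
round(p, 3)
```

With K attributes this uses 1 + K + K(K-1)/2 parameters instead of 2^K - 1 free class probabilities, which is the point of the reduction.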
If a higher order DINA model is assumed (HOGDINA=1), then a higher order
factor θ_n for the attributes is assumed:
P( α_{nk}=1 | θ_n ) = Φ( a_k θ_n + b_k )
For HOGDINA=0, all attributes α_{nk} are assumed to be
independent of each other:
P[ (α_{n1}, α_{n2}, …, α_{nK}) ] = ∏_k P( α_{nk} )
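Both skill space specifications can be sketched in a few lines of base R. The values of a_k, b_k and the marginal mastery probabilities below are hypothetical, and p_attr is an illustrative helper rather than a CDM function.

```r
# Higher order model: P(alpha_k = 1 | theta) = pnorm(a_k * theta + b_k)
a <- c(1.0, 0.8, 1.2)    # hypothetical slopes a_k
b <- c(0.0, -0.5, 0.5)   # hypothetical intercepts b_k
p_attr <- function(theta) stats::pnorm(a * theta + b)
p_attr(0)    # mastery probabilities for a person with theta = 0
p_attr(1)    # higher theta increases all mastery probabilities (a_k > 0)

# Independent attributes: the joint distribution is the product of the marginals
marg <- c(0.6, 0.4, 0.7)  # hypothetical P(alpha_k = 1)
alpha <- as.matrix(expand.grid(rep(list(c(0, 1)), 3)))
joint <- apply(alpha, 1, function(x) prod(marg^x * (1 - marg)^(1 - x)))
cbind(alpha, joint)
```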
Note that the noncompensatory reduced RUM (NC-RRUM) according
to Rupp and Templin (2008) is the GDINA model with the arguments
rule="ACDM" and linkfct="log". NC-RRUM can also be
obtained by choosing rule="RRUM".
The compensatory RUM (C-RRUM) can be obtained by using the arguments
rule="ACDM" and linkfct="logit".
The cognitive diagnosis model for identifying
skills and misconceptions (SISM; Kuo, Chen & de la Torre, 2018) can be
estimated with rule="SISM" (see Example 12).
The gdina function internally parameterizes the GDINA model as
g[ P(X_{nj}=1 | α_n) ] = \boldsymbol{M}_j(α_n) \boldsymbol{δ}_j
with item-specific design matrices \boldsymbol{M}_j(α_n) and item parameters
\boldsymbol{δ}_j. Only those attributes are modelled which correspond
to non-zero entries in the Q-matrix. Because the Q-matrix (in q.matrix)
and the design matrices (in Mj; see Example 3) can be
specified by the user, several
cognitive diagnosis models can be estimated. Therefore, some additional extensions
of the DINA model can also be estimated using the gdina function.
These models include the DINA model with multiple strategies
(Huo & de la Torre, 2014).
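To make the design matrix parameterization concrete, here is a small base R sketch for an item with two required attributes. It builds a saturated design matrix with stats::model.matrix and hypothetical delta values; this only illustrates the algebra and is not how gdina constructs its Mj argument internally (there, each item entry is a list holding a matrix and its labels, as in Example 3).

```r
# Attribute patterns (0,0), (1,0), (0,1), (1,1) for the two required attributes
alpha <- expand.grid(alpha1 = c(0, 1), alpha2 = c(0, 1))

# Saturated design matrix: intercept, two main effects, interaction
Mj <- stats::model.matrix(~ alpha1 * alpha2, data = alpha)
Mj

# Hypothetical item parameters (identity link)
delta_j <- c(0.10, 0.20, 0.15, 0.45)
prob <- as.vector(Mj %*% delta_j)
prob

# Dropping a column imposes a restriction; here the main effect of the
# second attribute is removed, leaving intercept, alpha1 and the interaction
Mj_restricted <- Mj[, -3]
Mj_restricted
```

Restrictions of this kind (dropped or merged columns) are what make it possible to express several cognitive diagnosis models through user-specified design matrices.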
An object of class gdina with the following entries
coef |
Data frame of item parameters |
delta |
List with basis item parameters |
se.delta |
Standard errors of basis item parameters |
probitem |
Data frame with model implied conditional item probabilities
P(X_i=1 | \bold{α}). These probabilities are displayed
in |
itemfit.rmsea |
The RMSEA item fit index (see |
mean.rmsea |
Mean of RMSEA item fit indexes. |
loglike |
Log-likelihood |
deviance |
Deviance |
G |
Number of groups |
N |
Sample size |
AIC |
AIC |
BIC |
BIC |
CAIC |
CAIC |
Npars |
Total number of parameters |
Nipar |
Number of item parameters |
Nskillpar |
Number of parameters for skill class distribution |
Nskillclasses |
Number of skill classes |
varmat.delta |
Covariance matrix of δ item parameters |
posterior |
Individual posterior distribution |
like |
Individual likelihood |
data |
Original data |
q.matrix |
Used Q-matrix |
pattern |
Individual patterns, individual MLE and MAP classifications and their corresponding probabilities |
attribute.patt |
Probabilities of skill classes |
skill.patt |
Marginal skill probabilities |
subj.pattern |
Individual subject pattern |
attribute.patt.splitted |
Split attribute pattern |
pjk |
Array of item response probabilities |
Mj |
Design matrix M_j in GDINA algorithm (see de la Torre, 2011) |
Aj |
Design matrix A_j in GDINA algorithm (see de la Torre, 2011) |
rule |
Used condensation rules |
linkfct |
Used link function |
delta.designmatrix |
Designmatrix for item parameters |
reduced.skillspace |
A logical indicating whether skillspace reduction was performed |
Z.skillspace |
Design matrix for skillspace reduction |
beta |
Parameters δ for skill class representation |
covbeta |
Standard errors of δ parameters |
iter |
Number of iterations |
rrum.params |
Parameters in the parametrization of the reduced RUM model
if |
group.stat |
Group statistics (sample sizes, group labels) |
HOGDINA |
The used value of |
mono.constr |
Monotonicity constraint |
regularization |
Logical indicating whether regularization is used |
regular_lam |
Regularization parameter |
numb_bound_mono |
Number of items with parameters at boundary of monotonicity constraints |
numb_regular_pars |
Number of regularized item parameters |
delta_regularized |
List indicating which item parameters are regularized |
cd_algorithm |
Logical indicating whether coordinate descent algorithm is used |
cd_steps |
Number of steps for each parameter in coordinate descent algorithm |
seed |
Used simulation seed |
a.attr |
Attribute parameters a_k in case of |
b.attr |
Attribute parameters b_k in case of |
attr.rf |
Attribute response functions. This matrix contains all a_k and b_k parameters |
converged |
Logical indicating whether convergence was achieved. |
control |
Optimization parameters used in estimation |
partable |
Parameter table for |
polychor |
Group-wise matrices with polychoric correlations |
sequential |
Logical indicating whether a sequential GDINA model is applied for polytomous item responses |
... |
Further values |
The function din does not allow for multiple group estimation.
Use the gdina function instead and specify rule="DINA" as an argument.
Standard error calculation in analyses which use sample weights or
a design matrix for delta parameters (delta.designmatrix!=NULL) is not yet
correctly implemented. Please use replication methods instead.
Berlinet, A. F., & Roland, C. (2012). Acceleration of the EM algorithm: P-EM versus epsilon algorithm. Computational Statistics & Data Analysis, 56(12), 4122-4137.
Chen, J., & de la Torre, J. (2013). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement, 37, 419-437.
Chen, J., & de la Torre, J. (2018). Introducing the general polytomous diagnosis modeling framework. Frontiers in Psychology | Quantitative Psychology and Measurement, 9(1474).
de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69, 333-353.
de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76, 179-199.
Hong, C. Y., Chang, Y. W., & Tsai, R. C. (2016). Estimation of generalized DINA model with order restrictions. Journal of Classification, 33(3), 460-484.
Huo, Y., & de la Torre, J. (2014). Estimating a cognitive diagnostic model for multiple strategies via the EM algorithm. Applied Psychological Measurement, 38, 464-485.
Kuo, B.-C., Chen, C.-H., & de la Torre, J. (2018). A cognitive diagnosis model for identifying coexisting skills and misconceptions. Applied Psychological Measurement, 42(3), 179-191.
Ma, W., & de la Torre, J. (2016). A sequential cognitive diagnosis model for polytomous responses. British Journal of Mathematical and Statistical Psychology, 69(3), 253-275.
Nash, J. C. (2014). Nonlinear parameter optimization using R tools. West Sussex: Wiley.
Rupp, A. A., & Templin, J. (2008). Unique characteristics of diagnostic classification models: A comprehensive review of the current state-of-the-art. Measurement: Interdisciplinary Research and Perspectives, 6, 219-262.
Shen, X., Pan, W., & Zhu, Y. (2012). Likelihood-based selection and sharp parameter estimation. Journal of the American Statistical Association, 107, 223-232.
Tutz, G. (1997). Sequential models for ordered responses. In W. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 139-152). New York: Springer.
Xu, G., & Shang, Z. (2018). Identifying latent structures in restricted latent class models. Journal of the American Statistical Association, 113(523), 1284-1295.
Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS.
Zeng, L., & Xie, J. (2014). Group variable selection via SCAD-L_2. Statistics, 48, 49-66.
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Annals of Statistics, 38, 894-942.
See also the din function (for DINA and DINO estimation).
For assessment of model fit see modelfit.cor.din and anova.gdina.
See itemfit.sx2 for item fit statistics.
See sim.gdina for simulating the GDINA model.
See gdina.wald for a Wald test for testing the DINA and ACDM
rules at the item-level.
See gdina.dif for assessing differential item functioning.
See discrim.index for computing discrimination indices.
See the GDINA::GDINA function in the
GDINA package for similar functionality.
#############################################################################
# EXAMPLE 1: Simulated DINA data | different condensation rules
#############################################################################

data(sim.dina, package="CDM")
data(sim.qmatrix, package="CDM")

dat <- sim.dina
Q <- sim.qmatrix

#***
# Model 1: estimation of the GDINA model (identity link)
mod1 <- CDM::gdina( data=dat, q.matrix=Q)
summary(mod1)
plot(mod1) # apply plot function

## Not run:
# Model 1a: estimate model with different simulation seed
mod1a <- CDM::gdina( data=dat, q.matrix=Q, seed=9089)
summary(mod1a)

# Model 1b: estimate model with some fixed delta parameters
delta.fixed <- as.list( rep(NA,9) ) # List for parameters of 9 items
delta.fixed[[2]] <- c( 0, .15, .15, .45 )
delta.fixed[[6]] <- c( .25, .25 )
mod1b <- CDM::gdina( data=dat, q.matrix=Q, delta.fixed=delta.fixed)
summary(mod1b)

# Model 1c: fix all delta parameters to previously fitted model
mod1c <- CDM::gdina( data=dat, q.matrix=Q, delta.fixed=mod1$delta)
summary(mod1c)

# Model 1d: estimate GDINA model with GDINA package
mod1d <- GDINA::GDINA( dat=dat, Q=Q, model="GDINA" )
summary(mod1d)
# extract item parameters
GDINA::itemparm(mod1d)
GDINA::itemparm(mod1d, what="delta")
# compare likelihood
logLik(mod1)
logLik(mod1d)

#***
# Model 2: estimation of the DINA model with gdina function
mod2 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINA")
summary(mod2)
plot(mod2)

#***
# Model 2b: compare results with din function
mod2b <- CDM::din( data=dat, q.matrix=Q, rule="DINA")
summary(mod2b)

#***
# Model 3: estimation of the DINO model with gdina function
mod3 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINO")
summary(mod3)

#***
# Model 4: DINA model with logit link
mod4 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINA", linkfct="logit" )
summary(mod4)

#***
# Model 5: DINA model log link
mod5 <- CDM::gdina( data=dat, q.matrix=Q, rule="DINA", linkfct="log")
summary(mod5)

#***
# Model 6: RRUM model
mod6 <- CDM::gdina( data=dat, q.matrix=Q, rule="RRUM")
summary(mod6)

#***
# Model 7: Higher order GDINA model
mod7 <- CDM::gdina( data=dat, q.matrix=Q, HOGDINA=1)
summary(mod7)

#***
# Model 8: GDINA model with independent attributes
mod8 <- CDM::gdina( data=dat, q.matrix=Q, HOGDINA=0)
summary(mod8)

#***
# Model 9: Estimating the GDINA model with monotonicity constraints
mod9 <- CDM::gdina( data=dat, q.matrix=Q, rule="GDINA",
            mono.constr=TRUE, linkfct="logit")
summary(mod9)

#***
# Model 10: Estimating the ACDM model with SCAD penalty and regularization
#           parameter of .05
mod10 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM",
            linkfct="logit", regular_type="scad", regular_lam=.05 )
summary(mod10)

#***
# Model 11: Estimation of GDINA model with prior distributions

# N(0,10^2) prior for item intercepts
prior_intercepts <- c(0,10)
# N(0,1^2) prior for item slopes
prior_slopes <- c(0,1)
# estimate model
mod11 <- CDM::gdina( data=dat, q.matrix=Q, rule="GDINA",
            prior_intercepts=prior_intercepts, prior_slopes=prior_slopes)
summary(mod11)

#############################################################################
# EXAMPLE 2: Simulated DINO data
#    additive cognitive diagnosis model with different link functions
#############################################################################

data(sim.dino, package="CDM")
data(sim.qmatrix, package="CDM")
dat <- sim.dino
Q <- sim.qmatrix

#***
# Model 1: additive cognitive diagnosis model (ACDM; identity link)
mod1 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM")
summary(mod1)

#***
# Model 2: ACDM logit link
mod2 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", linkfct="logit")
summary(mod2)

#***
# Model 3: ACDM log link
mod3 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", linkfct="log")
summary(mod3)

#***
# Model 4: Different condensation rules per item
I <- 9      # number of items
rule <- rep( "GDINA", I )
rule[1] <- "DINO"   # 1st item: DINO model
rule[7] <- "GDINA2" # 7th item: GDINA model with first- and second-order interactions
rule[8] <- "ACDM"   # 8th item: additive CDM
rule[9] <- "DINA"   # 9th item: DINA model
mod4 <- CDM::gdina( data=dat, q.matrix=Q, rule=rule )
summary(mod4)

#############################################################################
# EXAMPLE 3: Model with user-specified design matrices
#############################################################################

data(sim.dino, package="CDM")
data(sim.qmatrix, package="CDM")
dat <- sim.dino
Q <- sim.qmatrix

# do a preliminary analysis and modify obtained design matrices
mod0 <- CDM::gdina( data=dat, q.matrix=Q, maxit=1)

# extract default design matrices
Mj <- mod0$Mj
Mj.user <- Mj   # these user defined design matrices are modified.

#~~~  For the second item, the following model should hold
#     X1 ~ V2 + V2*V3
mj <- Mj[[2]][[1]]
mj.lab <- Mj[[2]][[2]]
mj <- mj[,-3]
mj.lab <- mj.lab[-3]
Mj.user[[2]] <- list( mj, mj.lab )
#    [[1]]
#        [,1] [,2] [,3]
#    [1,]    1    0    0
#    [2,]    1    1    0
#    [3,]    1    0    0
#    [4,]    1    1    1
#    [[2]]
#    [1] "0"   "1"   "1-2"

#~~~  For the eighth item an equality constraint should hold
#     X8 ~ a*V2 + a*V3 + V2*V3
mj <- Mj[[8]][[1]]
mj.lab <- Mj[[8]][[2]]
mj[,2] <- mj[,2] + mj[,3]
mj <- mj[,-3]
mj.lab <- c("0", "1=2", "1-2" )
Mj.user[[8]] <- list( mj, mj.lab )
Mj.user[[8]]
  ##   [[1]]
  ##        [,1] [,2] [,3]
  ##   [1,]    1    0    0
  ##   [2,]    1    1    0
  ##   [3,]    1    1    0
  ##   [4,]    1    2    1
  ##
  ##   [[2]]
  ##   [1] "0"   "1=2" "1-2"

mod <- CDM::gdina( data=dat, q.matrix=Q, Mj=Mj.user, maxit=200 )
summary(mod)

#############################################################################
# EXAMPLE 4: Design matrix for delta parameters
#############################################################################

data(sim.dino, package="CDM")
data(sim.qmatrix, package="CDM")
dat <- sim.dino     # assign data explicitly so the example runs on its own
Q <- sim.qmatrix

#~~~ estimate an initial model
mod0 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM", maxit=1)
# extract coefficients
c0 <- mod0$coef
I <- 9  # number of items
delta.designmatrix <- matrix( 0, nrow=nrow(c0), ncol=nrow(c0) )
diag( delta.designmatrix) <- 1
# set intercept of item 1 and item 3 equal to each other
delta.designmatrix[ 7, 1 ] <- 1 ; delta.designmatrix[,7] <- 0
# set loading of V1 of item1 and item 3 equal
delta.designmatrix[ 8, 2 ] <- 1 ; delta.designmatrix[,8] <- 0
# exclude original parameters with indices 7 and 8
delta.designmatrix <- delta.designmatrix[, -c(7:8) ]

#***
# Model 1: ACDM with designmatrix
mod1 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM",
            delta.designmatrix=delta.designmatrix )
summary(mod1)

#***
# Model 2: Same model, but with logit link instead of identity link function
mod2 <- CDM::gdina( data=dat, q.matrix=Q, rule="ACDM",
            delta.designmatrix=delta.designmatrix, linkfct="logit")
summary(mod2)

#############################################################################
# EXAMPLE 5: Multiple group estimation
#############################################################################

# simulate data
set.seed(9279)
N1 <- 200 ; N2 <- 100   # group sizes
I <- 10                 # number of items
q.matrix <- matrix(0,I,2)   # create Q-matrix
q.matrix[1:7,1] <- 1 ; q.matrix[ 5:10,2] <- 1
# simulate first group
dat1 <- CDM::sim.din(N1, q.matrix=q.matrix, mean=c(0,0) )$dat
# simulate second group
dat2 <- CDM::sim.din(N2, q.matrix=q.matrix, mean=c(-.3, -.7) )$dat
# merge data
dat <- rbind( dat1, dat2 )
# group indicator
group <- c( rep(1,N1), rep(2,N2) )

# estimate GDINA model with multiple groups assuming invariant item parameters
mod1 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group)
summary(mod1)

# estimate DINA model with multiple groups assuming invariant item parameters
mod2 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group, rule="DINA")
summary(mod2)

# estimate GDINA model with noninvariant item parameters
mod3 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group, invariance=FALSE)
summary(mod3)

# estimate GDINA model with some invariant item parameters (I001, I006, I008)
mod4 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group,
            invariance=c("I001", "I006","I008") )

#--- model comparison
IRT.compareModels(mod1,mod2,mod3,mod4)

# estimate GDINA model with non-invariant item parameters except for the
# items I001, I006, I008
mod5 <- CDM::gdina( data=dat, q.matrix=q.matrix, group=group,
            invariance=setdiff( colnames(dat), c("I001", "I006","I008") ) )

#############################################################################
# EXAMPLE 6: User specified reduced skill space
#############################################################################

#   Some correlations between attributes should be set to zero.
q.matrix <- expand.grid( c(0,1), c(0,1), c(0,1), c(0,1) )
colnames(q.matrix) <- paste("Attr", 1:4, sep="")
q.matrix <- q.matrix[ -1, ]
Sigma <- matrix( .5, nrow=4, ncol=4 )
diag(Sigma) <- 1
Sigma[3,2] <- Sigma[2,3] <- 0 # set correlation of attribute A2 and A3 to zero
dat <- CDM::sim.din( N=1000, q.matrix=q.matrix, Sigma=Sigma)$dat

#~~~ Step 1: initial estimation
mod1a <- CDM::gdina( data=dat, q.matrix=q.matrix, maxit=1, rule="DINA")
# estimate also "full" model
mod1 <- CDM::gdina( data=dat, q.matrix=q.matrix, rule="DINA")

#~~~ Step 2: modify designmatrix for reduced skillspace
Z.skillspace <- data.frame( mod1a$Z.skillspace )
# set correlations of A2/A3 and A2/A4 to zero
vars <- c("A2_A3","A2_A4")
for (vv in vars){ Z.skillspace[,vv] <- NULL }

#~~~ Step 3: estimate model with reduced skillspace
mod2 <- CDM::gdina( data=dat, q.matrix=q.matrix,
            Z.skillspace=Z.skillspace, rule="DINA")

#~~~ eliminate all covariances
Z.skillspace <- data.frame( mod1$Z.skillspace )
colnames(Z.skillspace)
Z.skillspace <- Z.skillspace[, -grep( "_", colnames(Z.skillspace),fixed=TRUE)]
colnames(Z.skillspace)

mod3 <- CDM::gdina( data=dat, q.matrix=q.matrix,
            Z.skillspace=Z.skillspace, rule="DINA")
summary(mod1)
summary(mod2)
summary(mod3)

#############################################################################
# EXAMPLE 7: Polytomous GDINA model (Chen & de la Torre, 2013)
#############################################################################

data(data.pgdina, package="CDM")

dat <- data.pgdina$dat
q.matrix <- data.pgdina$q.matrix

# pGDINA model with "DINA rule"
mod1 <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA")
summary(mod1)
# no reduced skill space
mod1a <- CDM::gdina( dat, q.matrix=q.matrix, rule="DINA",reduced.skillspace=FALSE)
summary(mod1a)

# pGDINA model with "GDINA rule"
mod2 <- CDM::gdina( dat, q.matrix=q.matrix, rule="GDINA")
summary(mod2)

#############################################################################
# EXAMPLE 8: Fraction subtraction data: DINA and HO-DINA model
#############################################################################

data(fraction.subtraction.data, package="CDM")
data(fraction.subtraction.qmatrix, package="CDM")

dat <- fraction.subtraction.data
Q <- fraction.subtraction.qmatrix

# Model 1: DINA model
mod1 <- CDM::gdina( dat, q.matrix=Q, rule="DINA")
summary(mod1)

# Model 2: HO-DINA model
mod2 <- CDM::gdina( dat, q.matrix=Q, HOGDINA=1, rule="DINA")
summary(mod2)

#############################################################################
# EXAMPLE 9: Skill space approximation data.jang
#############################################################################

data(data.jang, package="CDM")

data <- data.jang$data
q.matrix <- data.jang$q.matrix

#*** Model 1: Reduced RUM model
mod1 <- CDM::gdina( data, q.matrix, rule="RRUM", conv.crit=.001, maxit=500 )

#*** Model 2: Reduced RUM model with skill space approximation
# use 300 instead of 2^9=512 skill classes
skillspace <- CDM::skillspace.approximation( L=300, K=ncol(q.matrix) )
mod2 <- CDM::gdina( data, q.matrix, rule="RRUM", conv.crit=.001,
            skillclasses=skillspace )
  ##   > logLik(mod1)
  ##   'log Lik.' -30318.08 (df=153)
  ##   > logLik(mod2)
  ##   'log Lik.' -30326.52 (df=153)

#############################################################################
# EXAMPLE 10: CDM with a linear hierarchy
#############################################################################
# This model is equivalent to a unidimensional IRT model with an ordered
# ordinal latent trait and is actually a probabilistic Guttman model.
set.seed(789)

# define 3 competency levels (written with c() instead of scan() so that
# the example also runs non-interactively)
alpha <- c( 0,0,0,  1,0,0,  1,1,0,  1,1,1 )

# define skill class distribution
K <- 3
skillspace <- alpha <- matrix( alpha, K + 1, K, byrow=TRUE )
alpha <- alpha[ rep( 1:4, c(300,300,200,200) ), ]
# P(000)=P(100)=.3, P(110)=P(111)=.2
# define Q-matrix
Q <- c( 1,0,0,  1,1,0,  1,1,1 )

Q <- matrix( Q, nrow=K, ncol=K, byrow=TRUE )
Q <- Q[ rep(1:K, each=4 ), ]
colnames(skillspace) <- colnames(Q) <- paste0("A",1:K)
I <- nrow(Q)

# define guessing and slipping parameters
guess <- stats::runif( I, 0, .3 )
slip <- stats::runif( I, 0, .2 )
# simulate data
dat <- CDM::sim.din( q.matrix=Q, alpha=alpha, slip=slip, guess=guess )$dat

#*** Model 1: DINA model with linear hierarchy
mod1 <- CDM::din( dat, q.matrix=Q, rule="DINA", skillclasses=skillspace )
summary(mod1)

#*** Model 2: pGDINA model with 3 levels
#    The multidimensional CDM with a linear hierarchy is a unidimensional
#    polytomous GDINA model.
Q2 <- matrix( rowSums(Q), nrow=I, ncol=1 )
mod2 <- CDM::gdina( dat, q.matrix=Q2, rule="DINA" )
summary(mod2)

#*** Model 3: estimate probabilistic Guttman model in sirt
#    Proctor, C. H. (1970). A probabilistic formulation and statistical
#    analysis for Guttman scaling. Psychometrika, 35, 73-78.
library(sirt)
mod3 <- sirt::prob.guttman( dat, itemlevel=Q2[,1] )
summary(mod3)
# -> The three models result in nearly equivalent fit.

#############################################################################
# EXAMPLE 11: Sequential GDINA model (Ma & de la Torre, 2016)
#############################################################################

data(data.cdm04, package="CDM")

#** attach dataset
dat <- data.cdm04$data    # polytomous item responses
q.matrix1 <- data.cdm04$q.matrix1
q.matrix2 <- data.cdm04$q.matrix2

#-- DINA model with first Q-matrix
mod1 <- CDM::gdina( dat, q.matrix=q.matrix1, rule="DINA")
summary(mod1)
#-- DINA model with second Q-matrix
mod2 <- CDM::gdina( dat, q.matrix=q.matrix2, rule="DINA")
#-- GDINA model
mod3 <- CDM::gdina( dat, q.matrix=q.matrix2, rule="GDINA")

#** model comparison
IRT.compareModels(mod1,mod2,mod3)

#############################################################################
# EXAMPLE 12: Simultaneous modeling of skills and misconceptions (Kuo et al., 2018)
#############################################################################

data(data.cdm08, package="CDM")
dat <- data.cdm08$data
q.matrix <- data.cdm08$q.matrix

#*** estimate model
mod <- CDM::gdina( dat, q.matrix, rule="SISM", bugs=colnames(q.matrix)[5:7] )
summary(mod)

#############################################################################
# EXAMPLE 13: Regularized estimation in GDINA model data.dtmr
#############################################################################

data(data.dtmr, package="CDM")
dat <- data.dtmr$data
q.matrix <- data.dtmr$q.matrix

#***** LASSO regularization with lambda parameter of .02
mod1 <- CDM::gdina(dat, q.matrix=q.matrix, rule="GDINA", regular_lam=.02,
            regular_type="lasso")
summary(mod1)
mod1$delta_regularized

#***** using starting values from previous estimation
delta.init <- mod1$delta
attr.prob.init <- mod1$attr.prob
mod2 <- CDM::gdina(dat, q.matrix=q.matrix, rule="GDINA", regular_lam=.02,
            regular_type="lasso", delta.init=delta.init,
            attr.prob.init=attr.prob.init)
summary(mod2)

#***** final estimation fixing regularized estimates to zero and estimate all other
#***** item parameters unregularized
regular_weights <- mod2$delta_regularized
delta.init <- mod2$delta
attr.prob.init <- mod2$attr.prob
mod3 <- CDM::gdina(dat, q.matrix=q.matrix, rule="GDINA", regular_lam=1E5,
            regular_type="lasso", delta.init=delta.init,
            attr.prob.init=attr.prob.init, regular_weights=regular_weights)
summary(mod3)

## End(Not run)