Hierarchical Rater Model Based on Signal Detection Theory (HRM-SDT)
This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).
rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
    est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE,
    est.sigma=TRUE, skillspace="normal", tau.item.fixed=NULL, a.item.fixed=NULL,
    d.min=0.5, d.max=100, d.start=3, c.start=NULL, tau.start=NULL, sd.start=1,
    d.prior=c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100),
    link_item="GPCM", max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1,
    globconv=.001, maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )

## S3 method for class 'rm.sdt'
summary(object, file=NULL, ...)

## S3 method for class 'rm.sdt'
plot(x, ask=TRUE, ...)

## S3 method for class 'rm.sdt'
anova(object,...)

## S3 method for class 'rm.sdt'
logLik(object,...)

## S3 method for class 'rm.sdt'
IRT.factor.scores(object, type="EAP", ...)

## S3 method for class 'rm.sdt'
IRT.irfprob(object,...)

## S3 method for class 'rm.sdt'
IRT.likelihood(object,...)

## S3 method for class 'rm.sdt'
IRT.posterior(object,...)

## S3 method for class 'rm.sdt'
IRT.modelfit(object,...)

## S3 method for class 'IRT.modelfit.rm.sdt'
summary(object,...)
dat |
Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination. |
pid |
Person identifier. |
rater |
Rater identifier. |
Qmatrix |
An optional Q-matrix. If this matrix is not provided, then by default the ordinary scoring of categories (from 0 to the maximum score of K) is used. |
theta.k |
A grid of theta values for the ability distribution. |
est.a.item |
Should item parameters a_i be estimated? |
est.c.rater |
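Type of estimation for item-rater parameters c_{irk} in the signal detection model. Options are "n" (no estimation), "e" (all parameters set equal to each other), "i" (item-wise estimation), "r" (rater-wise estimation) and "a" (item- and rater-specific estimation). |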
est.d.rater |
Type of estimation of d parameters. Options are the same as in est.c.rater. |
est.mean |
Optional logical indicating whether the mean of the trait distribution should be estimated. |
est.sigma |
Optional logical indicating whether the standard deviation of the trait distribution should be estimated. |
skillspace |
Specified θ distribution type. It can be "normal" or "discrete". |
tau.item.fixed |
Optional matrix with three columns specifying fixed τ parameters. The first two columns denote item and category indices, the third the fixed value. See Example 3. |
a.item.fixed |
Optional matrix with two columns specifying fixed a parameters. First column: Item index. Second column: Fixed a parameter. |
d.min |
Minimal d parameter to be estimated |
d.max |
Maximal d parameter to be estimated |
d.start |
Starting value(s) of d parameters |
c.start |
Starting values of c parameters |
tau.start |
Starting values of τ parameters |
sd.start |
Starting value for trait standard deviation |
d.prior |
Normal prior N(M,S^2) for d parameters |
c.prior |
Normal prior for c parameters. The prior mean for parameter c_{irk} is defined as M \cdot ( k - 0.5) where M is c.prior[1]. |
tau.prior |
Normal prior for τ parameters |
a.prior |
Normal prior for a parameters |
link_item |
Type of item response function for latent responses. Can be "GRM" for the graded response model or "GPCM" for the generalized partial credit model. |
max.increment |
Maximum increment of item parameters during estimation |
numdiff.parm |
Numerical differentiation step width |
maxdevchange |
Maximum relative deviance change as a convergence criterion |
globconv |
Maximum parameter change |
maxiter |
Maximum number of iterations |
msteps |
Maximum number of iterations during an M step |
mstepconv |
Convergence criterion in an M step |
optimizer |
Choice of optimization function in the M-step for item parameters. The default is "nlminb". |
object |
Object of class rm.sdt |
file |
Optional file name in which summary should be written. |
x |
Object of class rm.sdt |
ask |
Optional logical indicating whether a new plot should be asked for. |
type |
Factor score estimation method. Up to now, only type="EAP" is supported. |
... |
Further arguments to be passed |
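As a rough illustration of the input layout described above, the following sketch builds a small data frame with one row per person-rater combination, plus matrices in the shape described for tau.item.fixed and a.item.fixed. All object names and values (ratings, item1, item2, the rater codes, the fixed values) are invented for illustration and are not taken from the package data.

```r
# Hypothetical ratings: 3 persons, each scored by 2 raters on 2 items (scores 0..2)
ratings <- data.frame(
    idstud = rep(1:3, each = 2),
    rater  = rep(c(801, 802), times = 3),
    item1  = c(0, 1, 2, 2, 1, 1),
    item2  = c(1, 1, 2, 1, 0, 0)
)

# Every row should be a unique person-rater combination
n_dup <- sum(duplicated(ratings[, c("idstud", "rater")]))

# Fixed tau parameters: item index, category index, fixed value
tau_fixed <- cbind(1, 1:2, c(-0.5, 0.5))
# Fixed a parameters: item index, fixed value
a_fixed <- cbind(1, 1)

# These pieces would then be passed along the following lines
# (requires the sirt package; not run here):
# mod <- sirt::rm.sdt(ratings[, c("item1", "item2")], pid = ratings$idstud,
#                     rater = ratings$rater, tau.item.fixed = tau_fixed,
#                     a.item.fixed = a_fixed)
```

With so few observations an actual model fit would not be meaningful; the sketch only shows the expected shapes of the inputs.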
The specification of the model follows DeCarlo et al. (2011).
The second level models the ideal rating (latent response) η=0,...,K of person p on item i. The option link_item='GPCM' follows the generalized partial credit model
P( η_{pi}=η | θ_p ) \propto exp( a_{i} q_{i η} θ_p - τ_{i η} ).
The option link_item='GRM' employs the graded response model
P( η_{pi}=η | θ_p )= Ψ( τ_{i,η + 1} - a_i θ_p ) - Ψ( τ_{i,η} - a_i θ_p ).
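The two link functions for the latent responses can be sketched in base R as follows. This is a minimal illustration, not the package's internal code. It assumes ordinary scoring q_{iη}=η, τ_{i0}=0 for the GPCM, a logistic Ψ with τ_{i0}=-∞ and τ_{i,K+1}=+∞ for the GRM, and made-up parameter values. The helper names prob_gpcm and prob_grm are hypothetical.

```r
# Latent response probabilities for one item with categories eta = 0, ..., K

prob_gpcm <- function(theta, a, tau) {
    K <- length(tau)
    q <- 0:K                                   # assumed ordinary scoring
    # unnormalized log-probabilities a*q*theta - tau, one row per theta value
    logits <- outer(theta, a * q) -
        matrix(c(0, tau), length(theta), K + 1, byrow = TRUE)
    p <- exp(logits - apply(logits, 1, max))   # stabilize before normalizing
    p / rowSums(p)
}

prob_grm <- function(theta, a, tau) {
    bounds <- c(-Inf, tau, Inf)                # tau_0, tau_1..tau_K, tau_{K+1}
    cum <- plogis(outer(-a * theta, bounds, "+"))   # Psi(tau - a*theta)
    cum[, -1, drop = FALSE] - cum[, -ncol(cum), drop = FALSE]
}

theta.k <- seq(-9, 9, length.out = 30)         # grid as in the default theta.k
p_gpcm <- prob_gpcm(theta.k, a = 1, tau = c(-1, 0.5, 2))
p_grm  <- prob_grm(theta.k, a = 1, tau = c(-1, 0.5, 2))
```

Each result has one row per grid point and one column per latent category, and every row sums to one.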
At the first level, the ratings X_{pir} for person p on item i and rater r are modeled as a signal detection model
P( X_{pir} ≤ k | η_{pi} )= G( c_{irk} - d_{ir} η_{pi} )
where G is the logistic distribution function and the categories are k=1,…, K+1. Because 1 - G(x)=G(-x), the item response model can be equivalently written as
P( X_{pir} > k | η_{pi} )= G( d_{ir} η_{pi} - c_{irk})
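The first-level model can be illustrated by turning the cumulative probabilities G(c_{irk} - d_{ir} η_{pi}) into category probabilities for a single item-rater pair. The sketch below is not the package's implementation. It assumes that the last threshold is +∞ so that the cumulative probability of the highest category is one, and it uses made-up values d=3 and c_k=3(k - 0.5). The helper name sdt_probs is hypothetical.

```r
# Rating probabilities P(X = k | eta) for one item-rater pair, k = 1, ..., K+1
sdt_probs <- function(eta, c_thr, d) {
    # cumulative probabilities P(X <= k | eta) = G(c_k - d * eta);
    # the appended Inf is the assumed threshold of the highest category
    cum <- plogis(c(c_thr, Inf) - d * eta)
    diff(c(0, cum))
}

K <- 3
d <- 3                          # hypothetical rater precision
c_thr <- d * (1:K - 0.5)        # hypothetical thresholds 1.5, 4.5, 7.5

# columns: latent response eta = 0..K; rows: observed rating k = 1..K+1
P <- sapply(0:K, function(eta) sdt_probs(eta, c_thr, d))

# the complementary form G(d*eta - c_k) equals 1 - P(X <= k | eta)
eta <- 2
lhs <- 1 - plogis(c_thr - d * eta)
rhs <- plogis(d * eta - c_thr)
```

With these values the most likely observed rating for latent response η is k=η+1, and a larger d concentrates more probability on that category.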
The thresholds c_{irk} can be further restricted to c_{irk}=c_{k} (est.c.rater='e'), c_{irk}=c_{ik} (est.c.rater='i') or c_{irk}=c_{rk} (est.c.rater='r'). The same holds for rater precision parameters d_{ir}.
A list with the following entries:
deviance |
Deviance |
ic |
Information criteria and number of parameters |
item |
Data frame with item parameters. The columns
|
rater |
Data frame with rater parameters.
Transformed c parameters
( |
person |
Data frame with person parameters: EAP and corresponding standard errors |
EAP.rel |
EAP reliability |
mu |
Mean of the trait distribution |
sigma |
Standard deviation of the trait distribution |
tau.item |
Item parameters τ_{ik} |
se.tau.item |
Standard error of item parameters τ_{ik} |
a.item |
Item slopes a_i |
se.a.item |
Standard error of item slopes a_i |
c.rater |
Rater parameters c_{irk} |
se.c.rater |
Standard error of rater severity parameter c_{irk} |
d.rater |
Rater slope parameter d_{ir} |
se.d.rater |
Standard error of rater slope parameter d_{ir} |
f.yi.qk |
Individual likelihood |
f.qk.yi |
Individual posterior distribution |
probs |
Item probabilities at grid |
prob.item |
Probabilities P( η_i=η | θ ) of latent item responses evaluated at theta grid θ_p. |
n.ik |
Expected counts |
pi.k |
Estimated trait distribution P(θ_p). |
maxK |
Maximum number of categories |
procdata |
Processed data |
iter |
Number of iterations |
... |
Further values |
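The relation between the individual likelihood, the individual posterior and the EAP estimates can be sketched generically on a theta grid. This is a textbook-style computation under stated assumptions, not the package's internal code: the likelihood matrix lik is a simulated placeholder (persons in rows, grid points in columns, which is an assumed layout), and the trait distribution is a discretized standard normal.

```r
# Generic EAP computation on a theta grid
theta.k <- seq(-9, 9, length.out = 30)
prior <- dnorm(theta.k, mean = 0, sd = 1)
prior <- prior / sum(prior)          # discretized normal trait distribution

# placeholder likelihood: a normal kernel around three hypothetical locations
centers <- c(-1, 0, 1.5)
lik <- t(sapply(centers, function(m) dnorm(theta.k, mean = m, sd = 0.7)))

# posterior is proportional to likelihood times prior, normalized per person
post <- lik * matrix(prior, nrow(lik), length(prior), byrow = TRUE)
post <- post / rowSums(post)

# EAP = posterior mean; its standard error = posterior standard deviation
EAP    <- as.vector(post %*% theta.k)
SE.EAP <- sqrt(as.vector(post %*% theta.k^2) - EAP^2)
```

In this sketch the EAP values are pulled towards the prior mean of zero relative to the kernel centers, which is the usual shrinkage of posterior means.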
DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.
DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.
DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.
The facets rater model can be estimated with rm.facets.
#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################

data(data.ratings1)
dat <- data.ratings1

## Not run: 
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", d.start=100, est.d.rater="n" )
summary(mod1)

# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="n", est.d.rater="n",
            est.a.item=TRUE, d.start=100)
summary(mod2)

# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)

# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)

#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)

# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)

#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################

data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)

# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
            pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)

# Model 2: Modelling of one item by using a discrete skill space and
#          fixed item parameters

# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3, 100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )

# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
            tau.item.fixed=tau.item.fixed, a.item.fixed=a.item.fixed, pid=dat$idstud,
            est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)

## End(Not run)