Analysis of educational testing data and roll call data with IRT models, via Markov chain Monte Carlo methods
Analysis of roll call data via the spatial voting model; this is equivalent to fitting a two-parameter item-response model to educational testing data. Models are fit via Markov chain Monte Carlo (MCMC).
ideal(object, codes = object$codes, dropList = list(codes = "notInLegis", lop = 0), d = 1, maxiter = 10000, thin = 100, burnin = 5000, impute = FALSE, normalize = FALSE, meanzero = normalize, priors = NULL, startvals = "eigen", store.item = FALSE, file = NULL, verbose=FALSE, use.voter=NULL)
object |
an object of class |
codes |
a |
dropList |
a |
d |
numeric, (small) positive integer (default = 1), dimensionality of the ability space (or "policy space" in the rollcall context). |
maxiter |
numeric, positive integer, multiple of |
thin |
numeric, positive integer, thinning interval used for recording MCMC iterations. |
burnin |
number of MCMC iterations to run before recording. The
iteration numbered |
impute |
|
normalize |
|
meanzero |
to be deprecated/ignored; use |
priors |
a
None of the components should contain |
startvals |
either a string naming a method for generating start
values, valid options are |
store.item |
|
file |
string, file to write MCMC output. Default is
|
verbose |
logical, default
is |
use.voter |
A vector of logicals of length |
The function fits a d+1 parameter item-response model to the roll call data object, so in one dimension the model reduces to the two-parameter item-response model popular in educational testing. See References.
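For reference, the response model implied by the data augmentation scheme described under Implementation below can be written with the same symbols used there (x_i for ideal points, β_j for discrimination parameters, α_j for difficulty parameters). Here Φ is assumed to denote the standard normal distribution function, and x_i and β_j are d-vectors, giving d+1 parameters per item.

```latex
\Pr(y_{ij} = 1 \mid x_i, \beta_j, \alpha_j) = \Phi(x_i' \beta_j - \alpha_j),
\qquad i = 1, \ldots, n, \quad j = 1, \ldots, m
```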
Identification: The model parameters are not identified without the user supplying some restrictions on the model parameters; i.e., translations, rotations and re-scalings of the ideal points are observationally equivalent, via offsetting transformations of the item parameters. It is the user's responsibility to impose these identifying restrictions if desired. The following brief discussion provides some guidance.
For one-dimensional models (i.e., d=1), a simple route to identification is the normalize option, which imposes the restriction that the means of the posterior densities of the ideal points (ability parameters) have mean zero and standard deviation one across legislators (test-takers). This normalization supplies local identification (that is, identification up to a 180 degree rotation of the recovered dimension). Near-degenerate “spike” priors (priors with arbitrarily large precisions) or the constrain.legis option on any two legislators' ideal points ensure global identification in one dimension.
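The normalization can be illustrated on simulated draws. This is a minimal sketch, not the package's internal code: the matrix draws is a made-up stand-in for stored MCMC output (iterations in rows, legislators in columns), the use of sd (with its n-1 divisor) is an assumption, and the offsetting transformation of the item parameters is omitted.

```r
set.seed(1)
n_iter <- 200   # hypothetical number of stored iterations
n_leg  <- 10    # hypothetical number of legislators

# simulated draws on an arbitrary location and scale
true_x <- seq(-2, 2, length.out = n_leg) * 3 + 5
draws  <- matrix(rnorm(n_iter * n_leg,
                       mean = rep(true_x, each = n_iter), sd = 0.3),
                 nrow = n_iter, ncol = n_leg)

# posterior means, one per legislator
xbar <- colMeans(draws)

# affine map giving the posterior means mean 0 and sd 1 across legislators
shift  <- mean(xbar)
spread <- sd(xbar)
draws_norm <- (draws - shift) / spread
xbar_norm  <- colMeans(draws_norm)

# a sign flip satisfies the same restriction: identification is only local
xbar_flip <- -xbar_norm
```

Both xbar_norm and xbar_flip have mean zero and unit standard deviation, which is why the normalization alone cannot pin down the direction of the recovered dimension.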
Identification in higher dimensions can be obtained by supplying fixed values for d+1 legislators' ideal points, provided the supplied fixed points span a d-dimensional space (e.g., three supplied ideal points form a triangle in d=2 dimensions), via the constrain.legis option. In this case the function defaults to vague normal priors on the unconstrained ideal points, but at each iteration the sampled ideal points are transformed back into the space of identified parameters, applying the linear transformation that maps the d+1 fixed ideal points from their sampled values to their fixed values. Alternatively, one can impose restrictions on the item parameters via constrain.items. See the examples in the documentation for constrain.legis and constrain.items.
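The transformation that maps d+1 sampled points onto their fixed values can be sketched as a small linear solve. This is an illustration under assumptions, not the package's implementation: the sampled values are simulated, the indices and target coordinates are hypothetical, and the corresponding adjustment to the item parameters is left out.

```r
set.seed(2)
d <- 2
n <- 8
x_sampled <- matrix(rnorm(n * d), n, d)          # one iteration's sampled ideal points
fixed_idx <- c(1, 2, 3)                          # hypothetical constrained legislators
x_fixed   <- rbind(c(-1, 0), c(1, 0), c(0, 1))   # target values forming a triangle

# solve [1, S] A = T for the (d+1) x d matrix A (intercept row plus linear part)
S <- cbind(1, x_sampled[fixed_idx, , drop = FALSE])
A <- solve(S, x_fixed)

# apply the same map to every legislator's sampled ideal point
x_identified <- cbind(1, x_sampled) %*% A
```

The solve step requires the d+1 sampled points to span the d-dimensional space; if they were collinear in two dimensions, S would be singular, which mirrors the requirement that the fixed points form a triangle.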
Another route to identification is via post-processing. That is, the user can run ideal without any identification constraints and then use the function postProcess to map the MCMC output from the space of unidentified parameters into the subspace of identified parameters. Running the unidentified model does not pose any formal/technical problem in a Bayesian analysis: the fact that the posterior density may have multiple modes doesn't imply that the posterior is improper or that it can't be explored via MCMC methods. See the example in the documentation for the postProcess function.
When the normalize option is set to TRUE, an unidentified model is run, and the ideal object is post-processed with the normalize option, and then returned to the user (but again, note that the normalize option is only implemented for unidimensional models).
Start values. Start values can be supplied by the user, or generated by the function itself.
The default method, corresponding to startvals="eigen", first forms an n-by-n correlation matrix from the double-centered roll call matrix (subtracting row means and column means, and adding in the grand mean), and then extracts the first d principal components (eigenvectors), scaling the eigenvectors by the square root of their corresponding eigenvalues. If the user is imposing constraints on ideal points (via constrain.legis), these constraints are applied to the corresponding elements of the start values generated from the eigen-decomposition. Then, to generate start values for the rollcall/item parameters, a series of binomial glms are estimated (with a probit link), one for each rollcall/item, j = 1, …, m. The votes on the j-th rollcall/item are binary responses (presumed to be conditionally independent given each legislator's latent preference), and the (constrained or unconstrained) start values for legislators are used as predictors. The estimated coefficients from these probit models are used as start values for the item discrimination and difficulty parameters (with the intercepts from the probit GLMs multiplied by -1 so as to make those coefficients difficulty parameters).
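The steps of the eigen method can be sketched in base R on simulated votes. This follows the description above rather than the package's source, so details such as missing-data handling and constraints are assumptions left out; all object names (y, x_start, b_start) are illustrative.

```r
set.seed(3)
n <- 40; m <- 30; d <- 1
x_true <- rnorm(n)
beta   <- rnorm(m, 0, 2)
alpha  <- rnorm(m, 0, 0.5)

# simulated binary votes: legislators in rows, roll calls in columns
prob <- pnorm(outer(x_true, beta) - rep(alpha, each = n))
y    <- matrix(rbinom(n * m, 1, prob), n, m)

# double-centre: subtract row means and column means, add back the grand mean
yc <- y - rowMeans(y) - rep(colMeans(y), each = n) + mean(y)

# n-by-n correlation matrix across legislators, then its leading eigenvectors
R <- cor(t(yc))
e <- eigen(R, symmetric = TRUE)
x_start <- sweep(e$vectors[, 1:d, drop = FALSE], 2, sqrt(e$values[1:d]), "*")

# one probit GLM per roll call, using the legislator start values as predictors
b_start <- t(sapply(seq_len(m), function(j) {
  fit <- suppressWarnings(
    glm(y[, j] ~ x_start, family = binomial(link = "probit")))
  cf <- coef(fit)
  c(cf[-1], -cf[1])   # discrimination(s), then difficulty = -intercept
}))
```

Here x_start is n by d and b_start is m by d+1, matching the dimensions expected for user-supplied start values.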
The default eigen method generates extremely good start values for low-dimensional models fit to recent U.S. congresses, where high rates of party line voting result in excellent fits from low dimensional models. The eigen method may be computationally expensive or lead to memory errors for rollcall objects with large numbers of legislators.
The random method generates start values via iid sampling from a N(0,1) density, via rnorm, imposing any constraints that may have been supplied via constrain.legis, and then uses the probit method described above to get start values for the rollcall/item parameters.
If startvals is a list, it must contain the named components x and/or b, or named components that (uniquely) begin with the letters x and/or b. The component x must be a vector or a matrix of dimensions equal to the number of individuals (legislators) by d. If supplied, startvals$b must be a matrix with dimension number of items (votes) by d+1. The x and b components cannot contain NA. If x is not supplied when startvals is a list, then start values are generated using the default eigen method described above, and start values for the rollcall/item parameters are regenerated using the probit method, ignoring any user-supplied values in startvals$b. That is, user-supplied values in startvals$b are only used when accompanied by a valid set of start values for the ideal points in startvals$x.
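As a rough illustration of these requirements, the sketch below builds a start-value list and checks its shape. The sizes are made up, and check_startvals is a hypothetical helper written for this example; it is not a function in the package.

```r
n <- 5; m <- 4; d <- 1            # hypothetical numbers of legislators, items, dimensions

startvals <- list(
  x = matrix(c(-1, -0.5, 0, 0.5, 1), nrow = n, ncol = d),  # n by d
  b = matrix(0.1, nrow = m, ncol = d + 1)                   # m by (d + 1)
)

# hypothetical helper: TRUE when x is supplied with the stated dimensions
# and no NA, and b (if present) is an m by (d + 1) matrix with no NA
check_startvals <- function(sv, n, m, d) {
  ok_x <- !is.null(sv$x) &&
    all(dim(as.matrix(sv$x)) == c(n, d)) && !anyNA(sv$x)
  ok_b <- is.null(sv$b) ||
    (is.matrix(sv$b) && all(dim(sv$b) == c(m, d + 1)) && !anyNA(sv$b))
  ok_x && ok_b
}

valid      <- check_startvals(startvals, n, m, d)
valid_no_x <- check_startvals(list(b = startvals$b), n, m, d)
```

The second call returns FALSE because x is missing, which corresponds to the case where user-supplied values in b would be ignored.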
Implementation via Data Augmentation. The MCMC algorithm for this problem consists of a Gibbs sampler for the ideal points (latent traits) and item parameters, conditional on latent data y^*, generated via a data augmentation (DA) step. That is, following Albert (1992) and Albert and Chib (1993), if y_{ij} = 1 we sample from the truncated normal density
y_{ij}^* \sim N(x_i' β_j - α_j, 1)\mathcal{I}(y_{ij}^* ≥ 0)
and for y_{ij}=0 we sample
y_{ij}^* \sim N(x_i' β_j - α_j, 1)\mathcal{I}(y_{ij}^* < 0)
where \mathcal{I} is an indicator function evaluating to one if its argument is true and zero otherwise. Given the latent y^*, the conditional distributions for x and (β,α) are extremely simple to sample from; see the references for details.
This data-augmented Gibbs sampling strategy is easily implemented, but can sometimes require many thousands of samples in order to generate tolerable explorations of the posterior densities of the latent traits, particularly for legislators with short and/or extreme voting histories (the equivalent in the educational testing setting is a test-taker who gets almost every item right or wrong).
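The data augmentation step can be sketched with inverse-CDF sampling from the truncated normal. This is an illustrative approach, not necessarily the sampler the package uses, and it can lose accuracy far in the tails; the function name draw_ystar and the parameter values are hypothetical.

```r
set.seed(4)

# draw from N(mu, 1) truncated to [0, Inf) when y == 1, or to (-Inf, 0) when y == 0
draw_ystar <- function(y, mu) {
  p0 <- pnorm(0, mean = mu, sd = 1)                 # probability mass below zero
  u  <- runif(length(y))
  p  <- ifelse(y == 1, p0 + u * (1 - p0), u * p0)   # uniform on the allowed CDF range
  qnorm(p, mean = mu, sd = 1)
}

x     <- c(-1.2, 0.3, 0.8)    # hypothetical ideal points (d = 1)
beta  <- 1.5                  # hypothetical discrimination parameter
alpha <- 0.2                  # hypothetical difficulty parameter
y     <- c(0, 1, 1)           # observed votes on one roll call

mu    <- x * beta - alpha     # mean of the latent variable, x_i' beta_j - alpha_j
ystar <- draw_ystar(y, mu)
```

Each element of ystar falls on the side of zero dictated by the observed vote, so conditioning on ystar turns the updates for x and (β, α) into ordinary normal linear regression steps.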
a list of class ideal with named components
n |
|
m |
|
d |
|
x |
a three-dimensional |
beta |
a three-dimensional |
xbar |
a |
betabar |
a |
args |
calling arguments, evaluated in the frame calling |
call |
an object of class |
Simon Jackman simon.jackman@sydney.edu.au, with help from Christina Maimone and Alex Tahk.
Albert, James. 1992. Bayesian Estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics. 17:251-269.
Albert, James H. and Siddhartha Chib. 1993. Bayesian Analysis of Binary and Polychotomous Response Data. Journal of the American Statistical Association. 88:669-679.
Clinton, Joshua, Simon Jackman and Douglas Rivers. 2004. The Statistical Analysis of Roll Call Data. American Political Science Review. 98:335-370.
Jackman, Simon. 2009. Bayesian Analysis for the Social Sciences. Wiley: Hoboken, New Jersey.
Jessee, Stephen. 2016. (How) Can We Estimate the Ideology of Citizens and Political Elites on the Same Scale? American Journal of Political Science.
Patz, Richard J. and Brian W. Junker. 1999. A Straightforward Approach to Markov Chain Monte Carlo Methods for Item Response Models. Journal of Educational and Behavioral Statistics. 24:146-178.
Rivers, Douglas. 2003. “Identification of Multidimensional Item-Response Models.” Typescript. Department of Political Science, Stanford University.
van Dyk, David A and Xiao-Li Meng. 2001. The art of data augmentation (with discussion). Journal of Computational and Graphical Statistics. 10(1):1-111.
rollcall, summary.ideal, plot.ideal, predict.ideal.
tracex for graphical display of MCMC iterative history.
idealToMCMC converts the MCMC iterates in an ideal object to a form that can be used by the coda library.
constrain.items and constrain.legis for implementing identifying restrictions.
postProcess for imposing identifying restrictions ex post.
## Not run:
## long run, many iterations
data(s109)
n <- dim(s109$legis.data)[1]
x0 <- rep(0,n)
x0[s109$legis.data$party=="D"] <- -1
x0[s109$legis.data$party=="R"] <- 1
id1 <- ideal(s109, d=1, startvals=list(x=x0), normalize=TRUE, store.item=TRUE, maxiter=260E3, burnin=10E3, thin=100)
## End(Not run)