Markov Chain Monte Carlo for Ordinal Data Factor Analysis Model
This function generates a sample from the posterior distribution of an ordinal data factor analysis model. Normal priors are assumed on the factor loadings and factor scores while improper uniform priors are assumed on the cutpoints. The user supplies data and parameters for the prior distributions, and a sample from the posterior distribution is returned as an mcmc object, which can be subsequently analyzed with functions provided in the coda package.
MCMCordfactanal(
  x,
  factors,
  lambda.constraints = list(),
  data = parent.frame(),
  burnin = 1000,
  mcmc = 20000,
  thin = 1,
  tune = NA,
  verbose = 0,
  seed = NA,
  lambda.start = NA,
  l0 = 0,
  L0 = 0,
  store.lambda = TRUE,
  store.scores = FALSE,
  drop.constantvars = TRUE,
  ...
)
x |
Either a formula or a numeric matrix containing the manifest variables. |
factors |
The number of factors to be fitted. |
lambda.constraints |
List of lists specifying possible equality or
simple inequality constraints on the factor loadings. A typical entry in the
list has one of three forms: |
data |
A data frame. |
burnin |
The number of burn-in iterations for the sampler. |
mcmc |
The number of iterations for the sampler. |
thin |
The thinning interval used in the simulation. The number of iterations must be divisible by this value. |
tune |
The tuning parameter for the Metropolis-Hastings sampling. Can be either a scalar or a k-vector. Must be strictly positive. |
verbose |
A switch which determines whether or not the progress of the
sampler is printed to the screen. If |
seed |
The seed for the random number generator. If NA, the Mersenne
Twister generator is used with default seed 12345; if an integer is passed
it is used to seed the Mersenne twister. The user can also pass a list of
length two to use the L'Ecuyer random number generator, which is suitable
for parallel computation. The first element of the list is the L'Ecuyer
seed, which is a vector of length six or NA (if NA a default seed of
|
lambda.start |
Starting values for the factor loading matrix Lambda. If
|
l0 |
The means of the independent Normal prior on the factor loadings.
Can be either a scalar or a matrix with the same dimensions as
|
L0 |
The precisions (inverse variances) of the independent Normal prior
on the factor loadings. Can be either a scalar or a matrix with the same
dimensions as |
store.lambda |
A switch that determines whether or not to store the factor loadings for posterior analysis. By default, the factor loadings are all stored. |
store.scores |
A switch that determines whether or not to store the factor scores for posterior analysis. NOTE: This takes an enormous amount of memory, so should only be used if the chain is thinned heavily, or for applications with a small number of observations. By default, the factor scores are not stored. |
drop.constantvars |
A switch that determines whether or not manifest variables that have no variation should be deleted before fitting the model. Default = TRUE. |
... |
further arguments to be passed |
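As a rough illustration of how burnin, mcmc, thin, l0, and L0 relate to one another, the following base R sketch checks the divisibility requirement and expands scalar priors into matrices. The dimensions k and d and the names n_stored, l0_mat, L0_mat, and prior_sd are hypothetical, and the sketch assumes that one draw is retained every thin iterations after burn-in.

```r
# Hypothetical sampler settings (usage defaults, except for thin)
burnin <- 1000
mcmc   <- 20000
thin   <- 10

# The number of iterations must be divisible by the thinning interval
stopifnot(mcmc %% thin == 0)

# Assumption: one draw is kept every `thin` iterations after burn-in
n_stored <- mcmc / thin

# Hypothetical dimensions: k manifest variables, d columns in Lambda
k <- 4
d <- 2

# A scalar prior mean or precision can be written out as a k x d matrix
l0_mat <- matrix(0,   nrow = k, ncol = d)
L0_mat <- matrix(0.5, nrow = k, ncol = d)

# Precisions are inverse variances, so the implied prior SD is 1 / sqrt(L0)
prior_sd <- 1 / sqrt(L0_mat)
```

Note that the default L0 = 0 corresponds to an infinite prior variance, so the last line is only meaningful for strictly positive precisions.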
The model takes the following form:
Let i=1,…,n index observations and j=1,…,k index response variables within an observation. The typical observed variable x_{ij} is ordinal with a total of C_j categories. The distribution of X is governed by an n \times k matrix of latent variables X^* and a series of cutpoints γ. X^* is assumed to be generated according to:
x^*_i = Λ φ_i + ε_i
ε_i \sim \mathcal{N}(0,I)
where x^*_i is the k-vector of latent variables specific to observation i, Λ is the k \times d matrix of factor loadings, and φ_i is the d-vector of latent factor scores. It is assumed that the first element of φ_i is equal to 1 for all i.
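To make the data-generating process concrete, here is a minimal base R sketch that simulates ordinal data from it. The loadings, the cutpoints, and the sample size are all hypothetical values chosen for illustration; the non-constant factor scores are drawn from a standard normal, and for simplicity the same two cutpoints are assumed for every variable.

```r
set.seed(1)

n <- 200  # observations
k <- 4    # manifest variables
d <- 2    # columns of Lambda (constant plus one factor)

# Hypothetical loadings: the first column multiplies the constant 1
Lambda <- cbind(c(0.5, 0.0, -0.5, 1.0),
                c(1.0, 0.8,  0.6, 1.2))

# Factor scores: first element fixed at 1, the rest standard normal
phi <- cbind(1, matrix(rnorm(n * (d - 1)), nrow = n))

# Latent responses x*_i = Lambda phi_i + eps_i, with eps_i ~ N(0, I)
eps    <- matrix(rnorm(n * k), nrow = n)
x_star <- phi %*% t(Lambda) + eps

# Hypothetical interior cutpoints (first one at zero), giving 3 categories
cutpoints <- c(0, 1)

# Category = 1 + number of cutpoints at or below the latent value
x <- apply(x_star, 2, function(col) findInterval(col, cutpoints) + 1)
```

The resulting x is an n by k integer matrix with entries in 1, 2, 3, which is the kind of ordinal input the model describes.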
The probability that the jth variable in observation i takes the value c is:
π_{ijc} = Φ(γ_{jc} - Λ'_jφ_i) - Φ(γ_{j(c-1)} - Λ'_jφ_i)
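The category probabilities can be evaluated directly with pnorm. The sketch below is illustrative only: the function name category_probs and all numeric values are hypothetical, and it assumes the usual boundary conventions that the lowest cutpoint is -Inf and the highest is Inf.

```r
# Probability of each category for one variable j and one observation i
category_probs <- function(lambda_j, phi_i, cutpoints) {
  eta <- sum(lambda_j * phi_i)      # Lambda'_j phi_i
  g   <- c(-Inf, cutpoints, Inf)    # assumed boundary conventions
  diff(pnorm(g - eta))              # Phi(g_c - eta) - Phi(g_{c-1} - eta)
}

lambda_j  <- c(0.5, 1.0)   # hypothetical loadings for variable j
phi_i     <- c(1, -0.3)    # first element is the constant 1
cutpoints <- c(0, 1.2)     # two interior cutpoints, hence 3 categories

p <- category_probs(lambda_j, phi_i, cutpoints)
```

Because the differences telescope between Φ(-Inf) = 0 and Φ(Inf) = 1, the returned probabilities sum to one for any loadings and scores.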
The implementation used here assumes independent conjugate priors for each element of Λ and each φ_i. More specifically we assume:
Λ_{ij} \sim \mathcal{N}(l_{0_{ij}}, L_{0_{ij}}^{-1}), i=1,…,k, j=1,…,d
φ_{i(2:d)} \sim \mathcal{N}(0, I), i=1,…,n
The standard two-parameter item response theory model with probit link is a special case of the model sketched above.
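A small numeric check may help show why the item response model is a special case. The sketch assumes a binary item (two categories) with its single interior cutpoint at zero, and a common parameterization of the two-parameter probit model, Φ(βθ - α); the names alpha, beta, and theta and their values are hypothetical.

```r
alpha <- 0.4   # hypothetical item difficulty
beta  <- 1.3   # hypothetical item discrimination
theta <- 0.7   # hypothetical subject ability

# Two-parameter probit IRT probability of a positive response
p_irt <- pnorm(beta * theta - alpha)

# Same quantity from the factor model with two categories and cutpoint 0:
# the first loading absorbs the (negated) difficulty, the second the
# discrimination, and phi_i = (1, theta)
lambda_j <- c(-alpha, beta)
phi_i    <- c(1, theta)
eta      <- sum(lambda_j * phi_i)
p_factor <- pnorm(Inf - eta) - pnorm(0 - eta)

all.equal(p_irt, p_factor)
```

Under this sign convention the first loading equals minus the difficulty; other parameterizations differ only in sign.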
MCMCordfactanal simulates from the posterior distribution using a Metropolis-Hastings within Gibbs sampling algorithm. The algorithm employed is based on work by Cowles (1996). Note that the first element of φ_i is a 1. As a result, the first column of Λ can be interpreted as item difficulty parameters. Further, the first element γ_1 is normalized to zero, and thus not returned in the mcmc object. The simulation proper is done in compiled C++ code to maximize efficiency. Please consult the coda documentation for a comprehensive list of functions that can be used to analyze the posterior sample.
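As a hedged illustration of post-processing, the sketch below summarizes a matrix of simulated draws that merely stands in for sampler output. The column names are hypothetical and are not the names the function actually returns, and the coda calls run only if that package is installed.

```r
set.seed(2)

# Stand-in for sampler output: 500 draws of two hypothetical parameters
draws <- cbind(Lambda1.1 = rnorm(500, 0.5, 0.1),
               Lambda1.2 = rnorm(500, 1.0, 0.2))

# Base R summaries work on any matrix of draws
post_mean <- colMeans(draws)
post_ci   <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

# With coda installed, the same draws can be wrapped as an mcmc object
if (requireNamespace("coda", quietly = TRUE)) {
  chain <- coda::mcmc(draws, start = 1001, thin = 10)
  print(summary(chain))
  print(coda::effectiveSize(chain))
}
```

The start and thin values passed to coda::mcmc only label the iterations; they do not discard or subsample the draws already in the matrix.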
As is the case with all measurement models, make sure that you have plenty of free memory, especially when storing the scores.
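A back-of-the-envelope calculation gives a feel for the memory involved in storing scores. All sizes here are hypothetical, and the sketch assumes that each stored value is an 8-byte double, that mcmc / thin draws are kept, and that the constant first element of φ_i is not counted; actual usage may be higher because of overhead and intermediate copies.

```r
n    <- 1000     # hypothetical number of observations
d    <- 3        # hypothetical columns of Lambda (constant plus 2 factors)
mcmc <- 500000   # iterations after burn-in
thin <- 200      # thinning interval

# Assumed number of stored draws
n_draws <- mcmc / thin

# Assumption: one double (8 bytes) per stored score, excluding the
# constant first element of phi_i
n_values <- n * (d - 1) * n_draws
mem_mb   <- n_values * 8 / 1024^2

round(mem_mb, 1)
```

Without thinning (thin = 1) the same settings would need 200 times as much space, which is why heavy thinning is recommended when scores are stored.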
An mcmc object that contains the posterior sample. This object can be summarized by functions provided by the coda package.
Shawn Treier and Simon Jackman. 2008. “Democracy as a Latent Variable.” American Journal of Political Science. 52: 201-217.
Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park. 2011. “MCMCpack: Markov Chain Monte Carlo in R.” Journal of Statistical Software. 42(9): 1-21. https://www.jstatsoft.org/v42/i09/.
M. K. Cowles. 1996. “Accelerating Monte Carlo Markov Chain Convergence for Cumulative-link Generalized Linear Models.” Statistics and Computing. 6: 101-110.
Valen E. Johnson and James H. Albert. 1999. “Ordinal Data Modeling.” Springer: New York.
Daniel Pemstein, Kevin M. Quinn, and Andrew D. Martin. 2007. Scythe Statistical Library 1.0. http://scythe.lsa.umich.edu.
Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2006. “Output Analysis and Diagnostics for MCMC (CODA).” R News. 6(1): 7-11. https://CRAN.R-project.org/doc/Rnews/Rnews_2006-1.pdf.
## Not run:
data(painters)
new.painters <- painters[, 1:4]

# Quartiles of each of the four score variables
cuts <- apply(new.painters, 2, quantile, c(.25, .50, .75))

# Recode each variable into four ordered categories (100 < 200 < 300 < 400);
# the codes exceed every raw score, so recoded values are not matched again
for (i in 1:4) {
  new.painters[new.painters[, i] < cuts[1, i], i] <- 100
  new.painters[new.painters[, i] < cuts[2, i], i] <- 200
  new.painters[new.painters[, i] < cuts[3, i], i] <- 300
  new.painters[new.painters[, i] < 100, i] <- 400
}

posterior <- MCMCordfactanal(~Composition + Drawing + Colour + Expression,
                             data = new.painters, factors = 1,
                             lambda.constraints = list(Drawing = list(2, "+")),
                             burnin = 5000, mcmc = 500000, thin = 200,
                             verbose = 500, L0 = 0.5, store.lambda = TRUE,
                             store.scores = TRUE, tune = 1.2)
plot(posterior)
summary(posterior)
## End(Not run)