Fitting multivariate responses
This function extends the fitme
function to fit a joint model for different responses (following possibly different response families) sharing some random-effects, including a new type of random effect defined to exhibit correlations across different responses (see mv
).
The extension of spaMM to multivariate-response models is under advanced development but a few features available for analysis of univariate response may not yet work (see Details).
fitmv(submodels, data, fixed=NULL, init=list(), lower=list(), upper=list(), control=list(), control.dist = list(), method="ML", init.HLfit=list(), ...)
submodels |
A list of sublists each specifying a model for each univariate response. The names given to each submodel in the main list are currently ignored. The names and syntax of elements within each sublist are those of a |
data |
A data frame containing the variables in the response and the model formulas. |
fixed |
A list of fixed values of the parameters controlling random effects. The syntax is that of the same argument in |
init, lower, upper |
Lists of initial values or bounds. The syntax is that of the same arguments in |
control |
A list of control parameters, with possible elements as described for |
control.dist |
See |
method |
Character: the fitting method to be used, such as |
init.HLfit |
See identically named |
... |
Optional arguments passed to (or operating as if passed to) |
Matching random effects across submodels, and referring to them;
Random effects are recognized as identical across submodels by matching the formula terms. As shown in the Examples, if the two models formulas share the (1|clinic)
term, this term is recognized as a single random effect shared between the two responses. But the (1|clinic)
and (1|clinic2)
terms are recognized as distinct random effects. In that case, the init
argument init=list(lambda=c('1'=1,'2'=0.5))
is shown to refer to these by names 1,2
... where the order is defined as the order of first appearance of the terms across the model formulas in the order of the submodels
list.
Alternatively, the syntax fixed=list(lambda=c('clinic2'=0.5,'clinic'=1))
works: this syntax makes order of input irrelevant but assumes that the user guesses names correctly (these are typically the names that appear in the summary of lambda values from the fit object or, more programmatically, names(<fit object>$lambda.object$print_namesTerms)
). Finally, fixed values of parameters can also be specified through each sub-model, with indices referring to the order of random effects with each model.
The matching of random-effect terms occurs after expansion of multIMRF
terms, if any. This may have subtle consequences if two multIMRF terms differ only by their number of levels, as some of the expanded IMRF terms are then shared.
Capacities and limitations:
Practically all features of models that can be fitted by fitme
should be available: this includes all combinations of GLM response families, residual dispersion models, and all types of random-effect terms, whether autocorrelated or not. Among the arguments handled through the ..., covStruct
, distMatrix
, corrMatrix
should be effective; control.HLfit$LevenbergM
and verbose=c(TRACE=TRUE)
will work but some other controls available in fitme
may not.
The codemulti family-like syntax for multinomial models should not be used, but fitmv
could provide other means to model multinomial responses.
Most post-fit functions work, at least with default arguments. This includes point predict
ion and prediction variances calculations sensu lato, including with newdata
; but also simulate
, spaMM_boot
, confint
, anova
, update_resp
, and update
. Usage of the re.form
argument of some of these functions has not been systematically checked.
Some plotting functions may fail. update.formula
fails (see update_formulas
for details). terms
returns a list, which is not usable by other base R functions. step
is a good example of resulting limitations, as it is currently unable to perform any sensible operation on fitmv
output. spaMM::MSFDR
which rests both on terms
and on step
likewise fails.
multcomp::glht
fails.
A perhaps not entirely satisfying feature is that simulate
by default stacks the results of simulating each model in a single vector.
Everything with newdata
may return results in an inconvenient format.
update_resp(<fit>, newresp = simulate(<fit>, ...),
evaluate = FALSE)$data
may then be particularly useful to reformat simulation results. newdata
with insufficient information for prediction of all responses should generally cause troubles (as it may already in univariate-response models).
Which arguments belong to submodels
?:
Overall, arguments specifying individuals submodels should go into submodels
, while other arguments of fitmv
should be those potentially affecting several submodels (notably, random-effect structures, lower
, and upper
) and fitting controls (such as init
and init.HLfit
). One rarely-used exception is REMLformula
which controls the fitting method but should be specified through the submodels
.
The function proceeds by first preprocessing all submodels independently, before merging the resulting information by matching random effects across submodels. The merging operation includes some checks of consistency across submodels, implying that redundant arguments may be needed across submodels (e.g. specifying twice a non-default rand.family
for a random effect shared by two submodels).
A (single) list of class HLfit
, as returned by other fitting functions in spaMM. The main difference is that it contains a families
element describing the response families, instead of the family
elements of fitted objects for univariate response.
# Data preparation npos <- c(11,16,14,2,6,1,1,4,10,22,7,1,0,0,1,6) ntot <- c(36,20,19,16,17,11,5,6,37,32,19,17,12,10,9,7) treatment <- c(rep(1,8),rep(0,8)) clinic <- c(seq(8),seq(8)) clinics <- data.frame(npos=npos,nneg=ntot-npos,treatment=treatment,clinic=clinic) climv <- data.frame(npos=npos, nneg=ntot-npos, treatment=treatment, clinic=clinic, clinic2=clinic) (fitClinics <- HLfit(cbind(npos,nneg)~treatment+(1|clinic), family=binomial(),data=clinics)) set.seed(123) climv$np2 <- simulate(fitClinics, type="residual") # ### fits # Shared random-effect (mvfit <- fitmv( submodels=list(mod1=list(formula=cbind(npos,nneg)~treatment+(1|clinic),family=binomial()), mod2=list(formula=np2~treatment+(1|clinic), family=poisson(), fixed=list(lambda=c("1"=1)))), data=climv)) # Two univariate-response independent fits because random effect terms are distinct # (note how two lambda values are set; same syntax for 'init' values): (mvfit <- fitmv( submodels=list(mod1=list(formula=cbind(npos,nneg)~treatment+(1|clinic),family=binomial()), mod2=list(formula=np2~treatment+(1|clinic2),family=poisson())), data=climv, fixed=list(lambda=c('1'=1,'2'=0.5)))) # '1': (1|clinic); '2': (1|clinic2) # Specifying fixed (but not init) values in submodels is also possible (maybe not a good idea) # (mvfit <- fitmv( # submodels=list(mod1=list(formula=cbind(npos,nneg)~treatment+(1|clinic), # family=binomial(),fixed=list(lambda=c('1'=1))), # '1': (1|clinic) # mod2=list(formula=np2~treatment+(1|clinic2),family=poisson(), # fixed=list(lambda=c('1'=0.5)))), # '2': (1|clinic2) # data=climv))
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.