spaMM: is_separated – R documentation

is_separated

Checking for (quasi-)separation in binomial-response models.

Description

Separation occurs in binomial response models when a combination of the predictor variables perfectly predicts a level of the response. In such a case the estimates of the coefficients for these variables diverge to (+/-)infinity, and the numerical algorithms typically fail. To anticipate such a problem, the fitting functions in spaMM check for separation by default. The check can be slow, so it is skipped if the “problem size” exceeds a threshold defined by spaMM.options(separation_max=<.>); in that case a message tells users by how much they should increase separation_max to force the check. The exact meaning and default value of separation_max are subject to change without notice, but the default aims to correspond to a separation check time of the order of 1s on the author's computer.
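The divergence described above can be illustrated with base R alone. The following is a minimal sketch, not part of spaMM: the toy data, the cut point 5.5 and the helper `loglik` are all assumptions made for illustration. Because the response is perfectly predicted by `x`, the binomial log-likelihood keeps increasing toward 0 as the slope grows, so no finite maximum likelihood estimate exists.

```r
# Perfectly separated toy data (hypothetical): y is 1 exactly when x > 5.5
x <- 1:10
y <- as.integer(x > 5.5)

# Binomial log-likelihood for the linear predictor slope * (x - 5.5).
# log(p) = plogis(eta, log.p = TRUE) and log(1 - p) = plogis(-eta, log.p = TRUE)
loglik <- function(slope) {
  eta <- slope * (x - 5.5)
  sum(plogis(ifelse(y == 1, eta, -eta), log.p = TRUE))
}

# The log-likelihood rises monotonically with the slope and never peaks
slopes <- c(1, 2, 5, 10, 20)
ll <- sapply(slopes, loglik)
print(data.frame(slope = slopes, loglik = ll))
```

An iterative fitting algorithm applied to such data keeps pushing the coefficient upward until it hits a numerical limit, which is why detecting separation before fitting is useful.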

is_separated is a convenient interface to procedures from the ROI package. Users can call it explicitly, for example to check bootstrap samples (see Example in anova). is_separated.formula is a variant (not yet a formal S3 method) that performs the same check, but using arguments similar to those of fitme(., family=binomial()).
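As a rough sketch of how resampled data might be screened, the code below draws a few nonparametric bootstrap samples and passes each one to is_separated through its documented `x`, `y` interface. This is not the example from the anova help page: the data, the number of replicates and the resampling scheme are assumptions chosen for illustration. The spaMM call is guarded so that the code still runs when the package is not installed, and a working separation solver is assumed to be available.

```r
set.seed(1)
# Hypothetical binary-response data
d <- data.frame(success = rbinom(20, size = 1, prob = 0.8), x = 1:20)

# Nonparametric bootstrap: resample row indices with replacement
n_boot <- 5
boot_rows <- replicate(n_boot, sample(nrow(d), replace = TRUE), simplify = FALSE)

if (requireNamespace("spaMM", quietly = TRUE)) {
  separated <- vapply(boot_rows, function(rows) {
    b <- d[rows, ]
    X <- model.matrix(~ x, data = b)   # design matrix for fixed effects
    # isTRUE() reduces the result to a single TRUE/FALSE flag
    isTRUE(spaMM::is_separated(X, b$success, verbose = FALSE))
  }, logical(1))
  print(separated)
}
```

Samples flagged TRUE could then be discarded or handled separately before any model is refitted to them.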

Usage

is_separated(x, y, verbose = TRUE, solver=spaMM.getOption("sep_solver"))
is_separated.formula(formula, ..., separation_max=spaMM.getOption("separation_max"),
                     solver=spaMM.getOption("sep_solver"))

Arguments

`x`	Design matrix for fixed effects.
`y`	Numeric response vector.
`formula`	A model formula.
`...`	`data` and possibly other arguments of a `fitme` call. `family` is ignored if present.
`separation_max`	numeric: a non-default value allows for easier local control of this spaMM option.
`solver`	character: name of linear programming solver used to assess separation; passed to `ROI_solve`'s `solver` argument. One can select other solvers if the corresponding ROI plugin is installed.
`verbose`	Whether to print some messages or not.

Value

Returns a logical value; TRUE means there is (quasi-)separation.

References

The method accessible by solver="glpk" implements algorithms described by

Konis, K. 2007. Linear Programming Algorithms for Detecting Separated Data in Binary Logistic Regression Models. DPhil Thesis, Univ. Oxford. https://ora.ox.ac.uk/objects/uuid:8f9ee0d0-d78e-4101-9ab4-f9cbceed2a2a.

Examples

set.seed(123)
d <- data.frame(success = rbinom(10, size = 1, prob = 0.9), x = 1:10)
is_separated.formula(formula= success~x, data=d) # FALSE
is_separated.formula(formula= success~I(success^2), data=d) # TRUE
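The first check above can presumably also be run through the matrix interface shown in Usage. The sketch below assumes that a design matrix built with `model.matrix` for the same formula is an appropriate `x` argument; the call is guarded so that the code runs even when spaMM is not installed.

```r
set.seed(123)
d <- data.frame(success = rbinom(10, size = 1, prob = 0.9), x = 1:10)

# Design matrix for the fixed effects of success ~ x (intercept and x)
X <- model.matrix(~ x, data = d)

if (requireNamespace("spaMM", quietly = TRUE)) {
  # Same data as the formula-based call, passed as x (matrix) and y (vector)
  spaMM::is_separated(X, d$success, verbose = FALSE)
}
```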

spaMM

Mixed-Effect Models, with or without Spatial Random Effects

v3.10.0

CeCILL-2

Authors

François Rousset [aut, cre, cph] (<https://orcid.org/0000-0003-4670-0371>), Jean-Baptiste Ferdy [aut, cph], Alexandre Courtiol [aut] (<https://orcid.org/0000-0003-0637-2959>), GSL authors [ctb] (src/gsl_bessel.*)

Initial release

2022-02-06