Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

is_separated

Checking for (quasi-)separation in binomial-response model.


Description

Separation occurs in binomial response models when a combination of the predictor variables perfectly predict a level of the response. In such a case the estimates of the coefficients for these variables diverge to (+/-)infinity, and the numerical algorithms typically fail. To anticipate such a problem, the fitting functions in spaMM try to check for separation by default. The check may take much time, and is skipped if the “problem size” exceeds a threshold defined by spaMM.options(separation_max=<.>), in which case a message will tell users by how much they should increase separation_max to force the check (its exact meaning and default value are subject to changes without notice but the default value aims to correspond to a separation check time of the order of 1s on the author's computer).

is_separated is a convenient interface to procedures from the ROI package, which can be called explicitly by the user to check bootstrap samples (see Example in anova). is_separated.formula is a variant (not yet a formal S3 method) that performs the same check, but using arguments similar to those of fitme(., family=binomial()).

Usage

is_separated(x, y, verbose = TRUE, solver=spaMM.getOption("sep_solver"))
is_separated.formula(formula, ..., separation_max=spaMM.getOption("separation_max"),
                     solver=spaMM.getOption("sep_solver"))

Arguments

x

Design matrix for fixed effects.

y

Numeric response vector

formula

A model formula

...

data and possibly other arguments of a fitme call. family is ignored if present.

separation_max

numeric: non-default value allow for easier local control of this spaMM option.

solver

character: name of linear programming solver used to assess separation; passed to ROI_solve's solver argument. One can select other solvers if the corresponding ROI plugin is installed.

verbose

Whether to print some messages or not.

Value

Returns a boolean; TRUE means there is (quasi-)separation.

References

The method accessible by solver="glpk" implements algorithms described by

Konis, K. 2007. Linear Programming Algorithms for Detecting Separated Data in Binary Logistic Regression Models. DPhil Thesis, Univ. Oxford. https://ora.ox.ac.uk/objects/uuid:8f9ee0d0-d78e-4101-9ab4-f9cbceed2a2a.

See Also

See also the 'safeBinaryRegression' and 'detectseparation' package.

Examples

set.seed(123)
d <- data.frame(success = rbinom(10, size = 1, prob = 0.9), x = 1:10)
is_separated.formula(formula= success~x, data=d) # FALSE
is_separated.formula(formula= success~I(success^2), data=d) # TRUE

spaMM

Mixed-Effect Models, with or without Spatial Random Effects

v3.10.0
CeCILL-2
Authors
François Rousset [aut, cre, cph] (<https://orcid.org/0000-0003-4670-0371>), Jean-Baptiste Ferdy [aut, cph], Alexandre Courtiol [aut] (<https://orcid.org/0000-0003-0637-2959>), GSL authors [ctb] (src/gsl_bessel.*)
Initial release
2022-02-06

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.