Support functions for model extensions
This documents the methods that ref_grid calls. A user or package developer may add emmeans support for a model class by writing recover_data and emm_basis methods for that class. (Users who need a quick way to obtain results for a model that is not supported may be better served by the qdrg function.)
recover_data(object, ...)

## S3 method for class 'call'
recover_data(object, trms, na.action, data = NULL, params = "pi", frame, ...)

emm_basis(object, trms, xlev, grid, ...)

.recover_data(object, ...)

.emm_basis(object, trms, xlev, grid, ...)

.emm_register(classes, pkgname)
object |
An object of the same class as is supported by a new method. |
... |
Additional parameters that may be supported by the method. |
trms |
The |
na.action |
Integer vector of indices of observations to ignore; or
|
data |
Data frame. Usually, this is |
params |
Character vector giving the names of any variables in the model
formula that are not predictors. For example, a spline model may involve
a local variable |
frame |
Optional |
xlev |
Named list of factor levels (excluding ones coerced to factors in the model formula) |
grid |
A |
classes |
Character names of one or more classes to be registered.
The package must contain the functions |
pkgname |
Character name of package providing the methods (usually
should be the second argument of |
The recover_data method must return a data.frame containing all the variables that appear as predictors in the model, and attributes "call", "terms", "predictors", and "responses". (recover_data.call will provide these attributes.)
The emm_basis method should return a list with the following elements:
X: The matrix of linear functions over grid, having the same number of rows as grid and a number of columns equal to the length of bhat.
bhat: The vector of regression coefficients for fixed effects. This should include any NAs that result from rank deficiencies.
nbasis: A matrix whose columns form a basis for non-estimable functions of beta, or a 1x1 matrix of NA if there is no rank deficiency.
V: The estimated covariance matrix of bhat.
dffun: A function of (k, dfargs) that returns the degrees of freedom associated with sum(k * bhat).
dfargs: A list containing additional arguments needed for dffun.
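As a rough illustration of the list described above, here is a minimal sketch of an emm_basis method. The class name "mymod" is hypothetical, and the sketch assumes the fitted object carries lm-like components (contrasts, df.residual) and has working coef() and vcov() methods; it also assumes a full-rank fit. The element names follow the emmeans convention (X, bhat, nbasis, V, dffun, dfargs). The method is called directly here, on a stand-in object built from lm, only to show that the pieces fit together.

```r
# Sketch only: "mymod" is a hypothetical class with lm-like components.
emm_basis.mymod <- function(object, trms, xlev, grid, ...) {
  # Build the model frame and design matrix over the reference grid
  m <- model.frame(trms, grid, na.action = na.pass, xlev = xlev)
  X <- model.matrix(trms, m, contrasts.arg = object$contrasts)
  bhat <- coef(object)                      # regression coefficients
  V <- vcov(object)                         # covariance matrix of bhat
  nbasis <- matrix(NA)                      # 1x1 NA: assumes no rank deficiency
  dfargs <- list(df = object$df.residual)   # extra information for dffun
  dffun <- function(k, dfargs) dfargs$df    # same d.f. for every linear function
  list(X = X, bhat = bhat, nbasis = nbasis, V = V,
       dffun = dffun, dfargs = dfargs)
}

# Stand-in object: an lm fit relabelled as "mymod" for demonstration
fit <- lm(len ~ supp + dose, data = ToothGrowth)
class(fit) <- c("mymod", class(fit))
trms <- delete.response(terms(fit))
grid <- expand.grid(supp = c("OJ", "VC"), dose = c(0.5, 1, 2))
basis <- emm_basis.mymod(fit, trms, xlev = fit$xlevels, grid = grid)

# Estimates over the grid are X %*% bhat
est <- as.numeric(basis$X %*% basis$bhat)
```

A model with possible rank deficiency would need a real nbasis matrix rather than the 1x1 NA used in this sketch.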
.recover_data and .emm_basis are hidden exported versions of recover_data and emm_basis, respectively. They run in emmeans's namespace, thus providing access to all existing methods.
To create a reference grid, the ref_grid function needs to reconstruct the data used in fitting the model, and then obtain a matrix of linear functions of the regression coefficients for a given grid of predictor values. These tasks are performed by calls to recover_data and emm_basis respectively. A vignette giving details and examples is available via vignette("xtending", "emmeans").
To extend emmeans's support to additional model types, one need only write S3 methods for these two functions. The existing methods serve as helpful guidance for writing new ones. Most of the work for recover_data can be done by its method for class "call", providing the terms component and na.action data as additional arguments. Writing an emm_basis method is more involved, but the existing methods (e.g., emmeans:::emm_basis.lm) can serve as models.
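The delegation to the "call" method described above might look like the following sketch. The class name "mymod" is hypothetical, and the sketch assumes the fitted object stores its call in object$call and its NA handling in object$na.action, and that terms() works on it; objects that store these elsewhere would need the corresponding changes.

```r
# Sketch only: "mymod" is a hypothetical class that stores $call and $na.action.
recover_data.mymod <- function(object, ...) {
  fcall <- object$call
  # Dispatches to the method for class "call", which recovers the data and
  # attaches the "call", "terms", "predictors", and "responses" attributes
  emmeans::recover_data(fcall, delete.response(terms(object)),
                        object$na.action, ...)
}
```

Nothing from emmeans is evaluated until the method is actually called, so defining it does not by itself require emmeans to be installed.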
Certain recover_data and emm_basis methods are exported from emmeans. (To find out which ones, run methods("recover_data").) If your object is based on another model-fitting object, it may be that all that is needed is to call one of these exported methods and perhaps make modifications to the results. Contact the developer if you need additional methods exported.
If the model has a multivariate response, bhat needs to be “flattened” into a single vector, and X and V must be constructed consistently.
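One consistent way to do this flattening can be sketched with a base R multivariate lm fit. This is an illustration under stated assumptions, not necessarily what any particular emmeans method does: the coefficient matrix is flattened column by column (all coefficients for the first response, then the second), X is expanded with a Kronecker product so its columns line up with that ordering, and V is taken from vcov(), whose ordering for "mlm" objects matches.

```r
# Multivariate fit: two responses, three coefficients each
fit <- lm(cbind(mpg, disp) ~ wt + hp, data = mtcars)

B <- coef(fit)                 # 3 x 2 coefficient matrix
k <- ncol(B)                   # number of responses
bhat <- as.numeric(B)          # flattened column by column (response-major)

# Design matrix for a small grid of predictor values (one block per response)
grid <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
X0 <- model.matrix(delete.response(terms(fit)), grid)
X <- kronecker(diag(k), X0)    # block-diagonal: columns follow bhat's order

V <- vcov(fit)                 # 6 x 6, ordered by response, then coefficient

# Predictions: grid rows for the first response, then for the second
pred <- as.numeric(X %*% bhat)
```

The key point is that the ordering of bhat, the columns of X, and the rows and columns of V all agree; any other ordering works as long as all three follow it.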
In models where a non-full-rank result is possible (often, you can tell by seeing if there is a singular.ok argument in the model-fitting function), summary.emmGrid and its relatives check the estimability of each prediction, using the nonest.basis function in the estimability package.
The models already supported are detailed in the "models" vignette. Some packages may provide additional emmeans support for their own object classes.
If the recover_data method generates information needed by emm_basis, that information may be incorporated by creating a "misc" attribute in the returned recovered data. That information is then passed as the misc argument when ref_grid calls emm_basis.
Some models may need something other than standard linear estimates and standard errors. If so, custom functions may be pointed to via the items misc$estHook, misc$vcovHook and misc$postGridHook. If just the name of the hook function is provided as a character string, then it is retrieved using get.
The estHook function should have arguments (object, do.se, tol, ...) where object is the emmGrid object, do.se is a logical flag for whether to return the standard error, and tol is the tolerance for assessing estimability. It should return a matrix with 3 columns: the estimates, standard errors (NA when do.se==FALSE), and degrees of freedom (NA for asymptotic). The number of rows should match the number of rows of object@linfct. The vcovHook function should have arguments (object, tol, ...) as described above. It should return the covariance matrix for the estimates. Finally, postGridHook, if present, is called at the very end of ref_grid; it takes one argument, the constructed object, and should return a suitably modified emmGrid object.
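The shape of an estHook function can be sketched as follows. The function name my_est_hook is hypothetical, and the sketch assumes the object exposes slots named bhat and V alongside the linfct slot mentioned in the text; it simply reproduces ordinary linear estimates and ignores estimability (tol is accepted but unused). A small stand-in S4 class is used so that the sketch runs without an actual emmGrid object.

```r
library(methods)

# Sketch only: hypothetical hook; slots bhat and V are assumed to exist.
my_est_hook <- function(object, do.se = TRUE, tol = 1e-8, ...) {
  L <- object@linfct
  est <- as.numeric(L %*% object@bhat)            # linear estimates
  se <- if (do.se) {
    sqrt(rowSums((L %*% object@V) * L))           # sqrt of diag(L V L')
  } else {
    rep(NA_real_, nrow(L))
  }
  # Three columns: estimate, standard error, degrees of freedom (NA = asymptotic)
  cbind(estimate = est, SE = se, df = NA_real_)
}

# Stand-in for an emmGrid object, for demonstration only
setClass("mockGrid",
         representation(linfct = "matrix", bhat = "numeric", V = "matrix"))
g <- new("mockGrid", linfct = diag(2), bhat = c(1, 2), V = diag(c(4, 9)))
res <- my_est_hook(g)
```

A real hook would replace the body with whatever nonstandard estimate the model requires, keeping the same three-column return value with one row per row of object@linfct.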
The .emm_register function is provided as a convenience to conditionally register your S3 methods for a model class, recover_data.foo and emm_basis.foo, where foo is the class name. Your package should implement an .onLoad function and call .emm_register if emmeans is installed. See the example.
Without an explicit data argument, recover_data returns the current version of the dataset. If the dataset has changed since the model was fitted, then this will not be the data used to fit the model. It is especially important to know this in simulation studies where the data are randomly generated or permuted, and in cases where several datasets are processed in one step (e.g., using dplyr). In those cases, users should be careful to provide the actual data used to fit the model in the data argument.
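The hazard can be seen with base R alone. The sketch below uses made-up numbers and assumes, roughly, that data recovery amounts to re-evaluating the data argument of the stored model call; it shows that this re-evaluation sees the modified dataset, not the one used for fitting.

```r
# Illustrative data (made up for this sketch)
dat <- data.frame(x = 1:5, y = c(2.1, 3.9, 6.2, 7.8, 10.1))
original <- dat                      # keep a copy of what was actually fitted
fit <- lm(y ~ x, data = dat)

dat$x <- rev(dat$x)                  # the dataset changes after fitting

# Re-evaluating the call's data argument now yields the changed dataset
current <- eval(fit$call$data)
same_as_fitted <- identical(current$x, fit$model$x)   # FALSE

# In such a case, supply the fitted data explicitly, e.g. (assuming emmeans
# is installed):  emmeans::ref_grid(fit, data = original)
```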
## Not run:
#--- If your package provides recover_data and emm_basis methods for class 'mymod',
#--- put something like this in your package code -- say in zzz.R:
.onLoad <- function(libname, pkgname) {
  if (requireNamespace("emmeans", quietly = TRUE))
    emmeans::.emm_register("mymod", pkgname)
}
## End(Not run)