Efficient approximate leave-one-out cross-validation (LOO) using subsampling
loo_subsample(x, ...)

## S3 method for class 'function'
loo_subsample(
  x,
  ...,
  data = NULL,
  draws = NULL,
  observations = 400,
  log_p = NULL,
  log_g = NULL,
  r_eff = NULL,
  save_psis = FALSE,
  cores = getOption("mc.cores", 1),
  loo_approximation = "plpd",
  loo_approximation_draws = NULL,
  estimator = "diff_srs",
  llgrad = NULL,
  llhess = NULL
)
x |
A function. The Methods (by class) section, below, has detailed descriptions of how to specify the inputs. |
data, draws, ... |
For |
observations |
The subsample observations to use. The argument accepts four types of values:
|
log_p, log_g |
Should be supplied only if approximate posterior draws are
used. The default ( |
r_eff |
Vector of relative effective sample size estimates for the
likelihood ( |
save_psis |
Should the |
cores |
The number of cores to use for parallelization. This defaults to
the option
|
loo_approximation |
What type of approximation of the loo_i's should be used?
The default is
As point estimates of \hat{θ}, the posterior expectations of the parameters are used. |
loo_approximation_draws |
The number of posterior draws used when
integrating over the posterior. This is used if |
estimator |
How should
|
llgrad |
The gradient of the log-likelihood. This
is only used when |
llhess |
The hessian of the log-likelihood. This is only used
with |
The loo_subsample() function is an S3 generic, and a method is
currently provided for log-likelihood functions. The implementation works
both for MCMC and for posterior approximations where it is possible to
compute the log density for the approximation.
loo_subsample() returns a named list with class c("psis_loo_ss", "psis_loo", "loo"). This has the same structure as objects returned by loo() but with the additional slot:
loo_subsampling: A list with two vectors, log_p and log_g, of the same length containing the posterior density and the approximation density for the individual draws.
function: A function f() that takes arguments data_i and draws and returns a vector containing the log-likelihood for a single observation i evaluated at each posterior draw. The function should be written such that, for each observation i in 1:N, evaluating
f(data_i = data[i,, drop=FALSE], draws = draws)
results in a vector of length S (size of posterior sample). The log-likelihood function can also have additional arguments but data_i and draws are required.
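As a minimal sketch of such a function, the following uses a logistic regression likelihood on simulated data. The data, the stand-in "posterior" draws, and the name llfun_logistic are all hypothetical and chosen only for illustration; a real analysis would take the draws from a fitted model.

```r
# Simulated data and stand-in posterior draws (hypothetical, for illustration).
set.seed(1)
N <- 50   # number of observations
S <- 200  # number of posterior draws
data <- data.frame(y = rbinom(N, 1, 0.5), x1 = rnorm(N), x2 = rnorm(N))
draws <- matrix(rnorm(S * 2, sd = 0.5), nrow = S,
                dimnames = list(NULL, c("x1", "x2")))

# Log-likelihood of one observation evaluated at every draw.
llfun_logistic <- function(data_i, draws) {
  x_i <- as.matrix(data_i[, c("x1", "x2"), drop = FALSE])  # 1 x 2 predictor row
  eta <- as.vector(draws %*% t(x_i))                       # linear predictor per draw
  dbinom(data_i$y, size = 1, prob = plogis(eta), log = TRUE)
}

# Evaluating the function for observation 1 yields one value per draw.
ll_1 <- llfun_logistic(data_i = data[1, , drop = FALSE], draws = draws)
length(ll_1) == S
```

Keeping drop = FALSE when subsetting matters here: it ensures data_i is still a one-row data frame, so column names remain available inside the function.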
If using the function method then the arguments data and draws must also be specified in the call to loo_subsample():
data: A data frame or matrix containing the data (e.g. observed outcome and predictors) needed to compute the pointwise log-likelihood. For each observation i, the ith row of data will be passed to the data_i argument of the log-likelihood function.
draws: An object containing the posterior draws for any parameters needed to compute the pointwise log-likelihood. Unlike data, which is indexed by observation, for each observation the entire object draws will be passed to the draws argument of the log-likelihood function.
The ... can be used if your log-likelihood function takes additional arguments. These arguments are used like the draws argument in that they are recycled for each observation.
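A call using the function method might look like the sketch below. The simulated data, the stand-in draws, and the function name llfun are hypothetical; the call uses only arguments listed in the usage above and is skipped if the loo package is not installed. Because r_eff is left unspecified, the call may emit a warning or message about it.

```r
# Simulated data and stand-in posterior draws (hypothetical, for illustration).
set.seed(1)
N <- 50
S <- 200
data <- data.frame(y = rbinom(N, 1, 0.5), x1 = rnorm(N), x2 = rnorm(N))
draws <- matrix(rnorm(S * 2, sd = 0.5), nrow = S,
                dimnames = list(NULL, c("x1", "x2")))

# Log-likelihood function taking the required data_i and draws arguments.
llfun <- function(data_i, draws) {
  x_i <- as.matrix(data_i[, c("x1", "x2"), drop = FALSE])
  eta <- as.vector(draws %*% t(x_i))
  dbinom(data_i$y, size = 1, prob = plogis(eta), log = TRUE)
}

# data and draws are passed by name; observations sets the subsample size
# (20 of the 50 simulated observations here).
if (requireNamespace("loo", quietly = TRUE)) {
  res <- loo::loo_subsample(llfun, data = data, draws = draws,
                            observations = 20)
  print(class(res))
}
```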
Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2019). Leave-One-Out Cross-Validation for Large Data. In International Conference on Machine Learning.
Magnusson, M., Riis Andersen, M., Jonasson, J. and Vehtari, A. (2019). Leave-One-Out Cross-Validation for Model Comparison in Large Data.