get_predicted

Compute Model's Predictions


Description

Compute Model's Predictions.

Usage

get_predicted(x, ...)

## S3 method for class 'lm'
get_predicted(
  x,
  data = NULL,
  predict = c("expectation", "link", "prediction", "relation"),
  iterations = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'stanreg'
get_predicted(
  x,
  data = NULL,
  predict = c("expectation", "link", "prediction", "relation"),
  iterations = NULL,
  include_random = TRUE,
  include_smooth = TRUE,
  verbose = TRUE,
  ...
)

Arguments

x

A statistical model (can also be a data.frame, in which case the second argument has to be a model).

...

Other arguments to be passed on, for instance to get_predicted_ci().

data

An optional data frame in which to look for variables with which to predict. If omitted, the data used to fit the model is used.

predict

Can be "link", "expectation" (default), or "prediction". This modulates the scale of the output as well as the type of uncertainty interval. More specifically, "link" gives an output on the link scale (for logistic models, that means the log-odds scale) with a confidence interval (CI). "expectation" (default) also returns confidence intervals, but this time the output is on the response scale (for logistic models, that means probabilities). Finally, "prediction" also gives an output on the response scale, but this time associated with a prediction interval (PI), which is wider than a confidence interval (though it mostly makes sense for linear models). Read more about this in the Details section below. "relation" is also accepted as a (deprecated) alias for "expectation".

iterations

For Bayesian models, this corresponds to the number of posterior draws. If NULL, will return all the draws (one for each iteration of the model). For frequentist models, if not NULL, will generate bootstrapped draws, from which bootstrapped CIs will be computed.
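To make the idea of "bootstrapped draws" concrete, here is a minimal sketch in base R. It uses a generic case-resampling bootstrap and is not necessarily the procedure get_predicted() follows internally; the names n_iterations, draws and boot_ci are illustrative only.

```r
data(mtcars)
set.seed(123)

fit_formula <- mpg ~ cyl + hp
n_iterations <- 200  # illustrative choice, plays the role of `iterations`

# Each column holds the predictions from a model refitted on a resampled data set
draws <- replicate(n_iterations, {
  resampled <- mtcars[sample(nrow(mtcars), replace = TRUE), ]
  predict(lm(fit_formula, data = resampled), newdata = mtcars)
})

# Percentile interval across draws, one row per observation
boot_ci <- t(apply(draws, 1, quantile, probs = c(0.025, 0.975)))

dim(draws)        # observations in rows, iterations in columns
head(boot_ci, 3)
```

The resulting matrix has the same layout described under Value: observations as rows and iterations as columns.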

verbose

Toggle warnings.

include_random

If TRUE (default), include all random effects in the prediction. If FALSE, don't take them into account. Can also be a formula to specify which random effects to condition on when predicting (passed to the re.form argument). If include_random = TRUE and data is provided, make sure to include the random effect variables in data as well.

include_smooth

For Generalized Additive Models (GAMs). If FALSE, will fix the value of the smooth to its average, so that the predictions do not depend on it.

Details

The predict argument jointly modulates two separate concepts, the scale and the uncertainty interval.

Confidence Interval vs. Prediction Interval

  • Linear models - lm(): For linear models, prediction intervals (predict = "prediction") show the range that is likely to contain the value of a new observation, whereas confidence intervals (predict = "expectation" or predict = "link") reflect the uncertainty around the estimated parameters (and give the range of uncertainty of the regression line). In general, prediction intervals (PIs) account for both the uncertainty in the model's parameters and the random variation of the individual values. Thus, prediction intervals are always wider than confidence intervals. Moreover, prediction intervals will not necessarily become narrower as the sample size increases (as they reflect not only the quality of the fit, but also the variability within the data).

  • Generalized linear models - glm(): For binomial models, prediction intervals are somewhat useless (for instance, for a binomial (Bernoulli) model for which the dependent variable is a vector of 1s and 0s, the prediction interval is... [0, 1]).
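The difference between the two interval types for linear models can be checked with base R's predict() method for lm objects, shown here as an analogue rather than as the code get_predicted() runs. The object names (ci, pred_int, ci_width, pi_width) are illustrative.

```r
data(mtcars)
fit <- lm(mpg ~ cyl + hp, data = mtcars)

# Interval around the expected value (the regression line)
ci <- predict(fit, interval = "confidence", level = 0.95)

# Interval for a new observation; this adds the residual variance.
# predict() warns that intervals on the fitted data refer to future
# responses, which is the intended reading here.
pred_int <- suppressWarnings(
  predict(fit, interval = "prediction", level = 0.95)
)

# Width of each interval, per observation
ci_width <- ci[, "upr"] - ci[, "lwr"]
pi_width <- pred_int[, "upr"] - pred_int[, "lwr"]

# Point estimates are identical; only the interval differs
all.equal(ci[, "fit"], pred_int[, "fit"])
all(pi_width > ci_width)
```

Both checks should return TRUE: the point predictions agree, and every prediction interval is wider than the matching confidence interval.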

Link scale vs. Response scale

Having the output on the scale of the response variable is arguably the most convenient way to understand and visualize the relationships. If on the link scale, no transformation is applied and the values are on the scale of the model's linear predictor. For instance, for a logistic model, the response scale corresponds to the predicted probabilities, whereas the link scale gives predictions as log-odds (probabilities on the logit scale).
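As a small illustration of the two scales, the following sketch uses base R's predict() method for glm objects on a logistic model. It is an analogue of predict = "link" and predict = "expectation", not a description of the internals; the model vs ~ mpg is chosen only for the example.

```r
data(mtcars)
fit <- glm(vs ~ mpg, data = mtcars, family = binomial())

# Link scale: log-odds, i.e. the untransformed linear predictor
link_scale <- predict(fit, type = "link")

# Response scale: predicted probabilities
response_scale <- predict(fit, type = "response")

# Applying the inverse logit to the log-odds recovers the probabilities
all.equal(plogis(link_scale), response_scale)

range(link_scale)      # unbounded
range(response_scale)  # stays within (0, 1)
```

The inverse link function (here plogis(), the inverse logit) is all that separates the two outputs.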

Value

The fitted values (i.e. predictions for the response). For Bayesian or bootstrapped models (when iterations is not NULL), this will be a data frame with all iterations as columns (observations are still rows).

See Also

get_predicted_ci

Examples

data(mtcars)
x <- lm(mpg ~ cyl + hp, data = mtcars)
predictions <- get_predicted(x)
predictions

get_predicted(x, predict = "prediction")

# Get CI
as.data.frame(predictions)

# Bootstrapped
as.data.frame(get_predicted(x, iterations = 4))
summary(get_predicted(x, iterations = 4)) # Same as as.data.frame(..., keep_iterations = FALSE)

insight

Easy Access to Model Information for Various Model Objects

v0.14.0
GPL-3
Authors
Daniel Lüdecke [aut, cre] (<https://orcid.org/0000-0002-8895-3206>, @strengejacke), Dominique Makowski [aut, ctb] (<https://orcid.org/0000-0001-5375-9967>, @Dom_Makowski), Indrajeet Patil [aut, ctb] (<https://orcid.org/0000-0003-1995-6531>, @patilindrajeets), Philip Waggoner [aut, ctb] (<https://orcid.org/0000-0002-7825-7573>), Mattan S. Ben-Shachar [aut, ctb] (<https://orcid.org/0000-0002-4287-4801>), Brenton M. Wiernik [aut] (<https://orcid.org/0000-0001-9560-6336>, @bmwiernik)
Initial release