Compute Model's Predictions
Compute Model's Predictions.
get_predicted(x, ...)

## S3 method for class 'lm'
get_predicted(
  x,
  data = NULL,
  predict = c("expectation", "link", "prediction", "relation"),
  iterations = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'stanreg'
get_predicted(
  x,
  data = NULL,
  predict = c("expectation", "link", "prediction", "relation"),
  iterations = NULL,
  include_random = TRUE,
  include_smooth = TRUE,
  verbose = TRUE,
  ...
)
x |
A statistical model (can also be a data.frame, in which case the second argument has to be a model). |
... |
Other argument to be passed for instance to
|
data |
An optional data frame in which to look for variables with which to predict. If omitted, the data used to fit the model is used. |
predict |
Can be |
iterations |
For Bayesian models, this corresponds to the number of
posterior draws. If |
verbose |
Toggle warnings. |
include_random |
If |
include_smooth |
For Generalized Additive Models (GAMs). If |
The predict argument jointly modulates two separate concepts, the scale and the uncertainty interval.
Linear models - lm(): For linear models, prediction intervals (predict = "prediction") show the range that likely contains the value of a new observation, whereas confidence intervals (predict = "expectation" or predict = "link") reflect the uncertainty around the estimated parameters (and give the range of uncertainty of the regression line). In general, prediction intervals (PIs) account for both the uncertainty in the model's parameters and the random variation of the individual values. Thus, prediction intervals are always wider than confidence intervals. Moreover, prediction intervals will not necessarily become narrower as the sample size increases (as they reflect not only the quality of the fit, but also the variability within the data).
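As a minimal sketch of this difference, the example below uses base R's predict() with interval = "confidence" and interval = "prediction" rather than get_predicted() itself. The model matches the one in the examples on this page; the new observations in new_data are hypothetical values chosen only for illustration.

```r
# Fit a linear model on the built-in mtcars data
model <- lm(mpg ~ cyl + hp, data = mtcars)

# Hypothetical new observations (values chosen only for illustration)
new_data <- data.frame(cyl = c(4, 6, 8), hp = c(90, 120, 200))

# Confidence interval: uncertainty around the estimated regression line
conf_int <- predict(model, newdata = new_data, interval = "confidence", level = 0.95)

# Prediction interval: additionally includes the residual variation
# of individual observations around the regression line
pred_int <- predict(model, newdata = new_data, interval = "prediction", level = 0.95)

# Width of each interval, row by row
ci_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pi_width <- pred_int[, "upr"] - pred_int[, "lwr"]
data.frame(new_data, ci_width, pi_width)
```

Both calls return the same point predictions in the fit column; only the lwr and upr bounds differ, with the prediction interval being the wider of the two in every row.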
Generalized linear models - glm(): For binomial models, prediction intervals are of little use: for instance, for a binomial (Bernoulli) model whose dependent variable is a vector of 1s and 0s, the prediction interval is simply [0, 1].
Having the output on the scale of the response variable is arguably the most convenient way to understand and visualize the relationships. On the link scale, no transformation is applied and the values are on the scale of the model's linear predictor. For instance, for a logistic model, the response scale corresponds to the predicted probabilities, whereas the link scale gives predictions as log-odds (probabilities on the logit scale).
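The two scales can be illustrated with a short sketch in base R, using predict() on a glm() object with type = "link" and type = "response" instead of get_predicted(). The model formula (am ~ wt on mtcars) is an assumption made purely for this example.

```r
# Logistic regression: transmission type (am, coded 0/1) predicted from weight (wt)
model <- glm(am ~ wt, data = mtcars, family = binomial())

# Link scale: log-odds, no transformation applied
log_odds <- predict(model, type = "link")

# Response scale: predicted probabilities, bounded between 0 and 1
probabilities <- predict(model, type = "response")

# The inverse logit (plogis) maps the log-odds back to probabilities
back_transformed <- plogis(log_odds)
head(cbind(log_odds, probabilities, back_transformed))
```

The probabilities and back_transformed columns should agree up to numerical precision, which shows that the link scale and the response scale carry the same information in different units.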
The fitted values (i.e. predictions for the response). For Bayesian or bootstrapped models (when iterations != NULL), this will be a data frame with all iterations as columns (observations are still rows).
get_predicted_ci
data(mtcars)
x <- lm(mpg ~ cyl + hp, data = mtcars)

predictions <- get_predicted(x)
predictions

get_predicted(x, predict = "prediction")

# Get CI
as.data.frame(predictions)

# Bootstrapped
as.data.frame(get_predicted(x, iterations = 4))
summary(get_predicted(x, iterations = 4)) # Same as as.data.frame(..., keep_iterations = FALSE)