Heteroscedastic Extended Logistic Regression
This is a wrapper function for clm
(from package
ordinal) to fit (heteroscedastic) extended logistic regression (HXLR)
models (Messner et al. 2013).
hxlr(formula, data, subset, na.action, weights, thresholds, link, control, ...)
formula |
a formula expression of the form |
data |
an optional data frame containing the variables occurring in the formulas. |
subset |
an optional vector specifying a subset of observations to be used for fitting. |
na.action |
a function which indicates what should happen when the data
contain |
weights |
optional case weights in fitting. |
thresholds |
vector of (transformed) thresholds that are used to cut the continuous response into categories. Data frames or matrices with multiple columns are allowed as well. Then each column is used as separate predictor variable for the intercept model. |
link |
link function, i.e., the type of location-scale distribution
assumed for the latent distribution. Default is |
control |
a list of control parameters passed to |
... |
arguments to be used to form the default |
Extended logistic regression (Wilks 2009) extends binary logistic regression to multi-category responses by including the thresholds, that are used to cut a continuous variable into categories, in the regression equation. Heteroscedastic extended logistic regression (Messner et al. 2013) extends this model further and allows to add additional predictor variables that are used to predict the scale of the latent logistic distribution.
An object of class "hxlr"
, i.e., a list with the following elements.
coefficients |
list of CLM coefficients for intercept, location, and scale model. |
fitted.values |
list of fitted location and scale parameters. |
optim |
output from optimization from |
method |
Optimization method used for |
control |
list of control parameters passed to |
start |
starting values of coefficients used in the optimization. |
weights |
case weights used for fitting. |
n |
number of observations. |
nobs |
number of observations with non-zero weights. |
loglik |
log-likelihood. |
vcov |
covariance matrix. |
converged |
logical variable whether optimization has converged or not. |
iterations |
number of iterations in optimization. |
call |
function call. |
scale |
the formula supplied. |
terms |
the |
levels |
list of levels of the factors used in fitting for location and scale respectively. |
thresholds |
the thresholds supplied. |
Messner JW, Mayr GJ, Zeileis A, Wilks DS (2014). Extending Extended Logistic Regression to Effectively Utilize the Ensemble Spread. Monthly Weather Review, 142, 448–456. doi: 10.1175/MWR-D-13-00271.1.
Wilks DS (2009). Extending Logistic Regression to Provide Full-Probability-Distribution MOS Forecasts. Meteorological Applications, 368, 361–368.
data("RainIbk") ## mean and standard deviation of square root transformed ensemble forecasts RainIbk$sqrtensmean <- apply(sqrt(RainIbk[,grep('^rainfc',names(RainIbk))]), 1, mean) RainIbk$sqrtenssd <- apply(sqrt(RainIbk[,grep('^rainfc',names(RainIbk))]), 1, sd) ## climatological deciles q <- unique(quantile(RainIbk$rain, seq(0.1, 0.9, 0.1))) ## fit ordinary extended logistic regression with ensemble mean as ## predictor variable XLR <- hxlr(sqrt(rain) ~ sqrtensmean, data = RainIbk, thresholds = sqrt(q)) ## print XLR ## summary summary(XLR) ## fit ordinary extended logistic regression with ensemble mean ## and standard deviation as predictor variables XLRS <- hxlr(sqrt(rain) ~ sqrtensmean + sqrtenssd, data = RainIbk, thresholds = sqrt(q)) ## fit heteroscedastic extended logistic regression with ensemble ## standard deviation as predictor for the scale HXLR <- hxlr(sqrt(rain) ~ sqrtensmean | sqrtenssd, data = RainIbk, thresholds = sqrt(q)) ## compare AIC of different models AIC(XLR, XLRS, HXLR) ## XLRS and HXLR are nested in XLR -> likelihood-ratio-tests if(require("lmtest")) { lrtest(XLR, XLRS) lrtest(XLR, HXLR) } ## Not run: ################################################################### ## Cross-validation and bootstrapping RPS for different models ## (like in Messner 2013). N <- NROW(RainIbk) ## function that returns model fits fits <- function(data, weights = rep(1, N)) { list( "XLR" = hxlr(sqrt(rain) ~ sqrtensmean, data = data, weights = weights, thresholds = sqrt(q)), "XLR:S" = hxlr(sqrt(rain) ~ sqrtensmean + sqrtenssd, data = data, weights = weights, thresholds = sqrt(q)), "XLR:SM" = hxlr(sqrt(rain) ~ sqrtensmean + I(sqrtensmean*sqrtenssd), data = data, weights = weights, thresholds = sqrt(q)), "HXLR" = hxlr(sqrt(rain) ~ sqrtensmean | sqrtenssd, data = data, weights = weights, thresholds = sqrt(q)), "HXLR:S" = hxlr(sqrt(rain) ~ sqrtensmean + sqrtenssd | sqrtenssd, data = data, weights = weights, thresholds = sqrt(q)) ) } ## cross validation id <- sample(1:10, N, replace = TRUE) obs <- NULL pred <- list(NULL) for(i in 1:10) { ## splitting into test and training data set trainIndex <- which(id != i) testIndex <- which(id == i) ## weights that are used for fitting the models weights <- as.numeric(table(factor(trainIndex, levels = c(1:N)))) ## testdata testdata <- RainIbk[testIndex,] ## observations obs <- c(obs, RainIbk$rain[testIndex]) ## estimation modelfits <- fits(RainIbk, weights) ## Prediction pred2 <- lapply(modelfits, predict, newdata = testdata, type = "cumprob") pred <- mapply(rbind, pred, pred2, SIMPLIFY = FALSE) } names(pred) <- c(names(modelfits)) ## function to compute RPS rps <- function(pred, obs) { OBS <- NULL for(i in 1:N) OBS <- rbind(OBS, rep(0:1, c(obs[i] - 1, length(q) - obs[i] + 1))) apply((OBS-pred)^2, 1, sum) } ## compute rps RPS <- lapply(pred, rps, obs = as.numeric(cut(obs, c(-Inf, q, Inf)))) ## bootstrapping mean rps rpsall <- NULL for(i in 1:250) { index <- sample(length(obs), replace = TRUE) rpsall <- rbind(rpsall, sapply(RPS, function(x) mean(x[index]))) } rpssall <- 1 - rpsall/rpsall[,1] boxplot(rpssall[,-1], ylab = "RPSS", main = "RPSS relative to XLR") abline(h = 0, lty = 2) ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.