Time Series Prediction Comparison Test
Forecast prediction comparison test for two competing forecasts against an observation.
predcomp.test(x, xhat1, xhat2, alternative = c("two.sided", "less", "greater"), lossfun = "losserr", lossfun.args = NULL, test = c("DM", "HG"), ...) losserr(x, xhat, method = c("abserr", "sqerr", "simple", "power", "corrskill", "dtw"), scale = 1, p = 1, dtw.interr = c("abserr", "sqerr", "simple", "power"), ...) exponentialACV(x, y, ...) ## S3 method for class 'predcomp.test' summary(object, ...)
x,xhat1,xhat2,xhat |
numeric vectors giving the verification data and each competing forecast model output (1 and 2). For |
y |
|
object |
list object of class “predcomp.test” as returned by |
alternative |
character string stating which type of hypothesis test to conduct. |
lossfun |
character string naming the loss function to call. The default, |
lossfun.args |
List providing additional arguments to |
test |
character string stating whether to run the Diebold-Mariano type of test or the Hering-Genton modification of it (i.e., use a parametric autocovariance function). |
method,dtw.interr |
character string stating which type of loss (or skill) function to use. In the case of |
scale |
numeric giving a value by which to scale the loss function. In the case of “ |
p |
numeric only used by the “power” loss function. |
... |
For For For For the |
This function performs the analyses described in Gilleland and Roux (2014); although note that while working on another manuscript (Gilleland and Hering, in preparation), a better optimization routine has replaced the one used in said paper, which has been thoroughly tested to yield good size and power under a variety of temporal dependence structures, as well as having far fewer situations where a fit cannot be found. Namely, the Diebold Mariano test for competing forecast performance, the Hering and Genton (2011) modification of this test, as well as the dynamic time warping extension.
The Diebold-Mariano test was proposed in Diebold and Mariano (1995) for obtaining hypothesis tests to compare the forecast accuracy of two competing forecasts against a common verification series. The null hypothesis is that they have the same accuracy. The test statistic is of the form
S = dbar/sqrt(2*pi*se_d(0)/N),
where d is the loss differential, d = e1 - e2 (e1 = loss(x, xhat1) and e2 = loss(x, xhat2)), dbar is its sample mean, and se_d(0) is the standard error for d, which must be estimated, and N is the length of the series investigated. Let V = 2*pi*se_d(0), then V is estimated by
V = sum(gamma(tau)),
where the summation is over tau = -(k - 1) to (k - 1) for temporal lags k, and gamma are the empirical autocovariances.
Hering and Genton (2011) propose a modification that employs fitting a parameteric covariance model in determining the standard error for the test statistic (they also propose a spatial extension, see, e.g., spatMLD
from SpatialVx).
In either case, asymptotic results suggest that S ~ N(0,1), and the hypothesis test is conducted subsequently.
Discrete time warping can be applied (see examples below) in order to obtain a loss function based on location (in time) and intensity errors similar to the spatial version in Gilleland (2013).
The loss functions supplied by losserr
include:
abserr
: Absolute error loss, defined by abs((xhat - x)/scale),
sqerr
: Square error loss, defined by ((xhat - x)/scale)^2,
simple
: Simple loss, defined by (xhat - x)/scale,
power
: Power loss, defined by ((xhat - x)/scale)^p (same as sqerr
if p
=2),
corrskill
: Correlation skill defined by scale * (x - mean(x)) * (xhat - mean(xhat)),
dtw
: Discrete time warp loss defined by: d1 + d2, where d1 is the absolute distance (ignoring direction) of warp movement, and d2 is one of the above loss functions (except for corrskill
) applied to the resulting intensity errors after warping the series.
The exponential
function takes numeric vector arguments x
and y
and estimates the parameters, c(sigma, theta)
, that optimize
y = sigma^2*exp(-3*x/theta)
predcomp.test
returns a list object of class c(“predcomp.test”, “htest”) with components:
call |
the calling string |
method |
character string giving the full name of the method (Diebold-Mariano or Hering-Genton) used. |
fitmodel |
character naming the function used to fit the parametric model to the autocovariances or “none”. |
fitmodel.args |
If fitmodel is used, then this will be a list of any arguments passed in for it. |
loss.function |
character string naming the loss function called. |
statistic |
numeric giving the value of the statistic. |
alternative |
character string naming which type of hypothesis test was used (i.e., two-sided or one of the one-sided possibilities). |
p.value |
numeric giving the p-value for the test. |
data.name |
character vector naming the verification and competing forecast series applied to the test. |
losserr
returns a numeric vector of loss values.
exponentialACV
returns a list object of class “nls” as returned by nls
.
Eric Gilleland
Diebold, F. X. and Mariano, R. S. (1995) Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253–263.
Gilleland, E. (2013) Testing competing precipitation forecasts accurately and efficiently: The spatial prediction comparison test. Mon. Wea. Rev., 141 (1), 340–355, http://dx.doi.org/10.1175/MWR-D-12-00155.1.
Gilleland, E. and Roux, G. (2014) A New Approach to Testing Forecast Predictive Accuracy. Accepted to Meteorol. Appl. Available at: http://onlinelibrary.wiley.com/doi/10.1002/met.1485/abstract
Hering, A. S. and Genton, M. G. (2011) Comparing spatial predictions. Technometrics, 53, (4), 414–425, doi:10.1198/TECH.2011.10136.
print.htest
, nls
, dtw
, acf
z0 <- arima.sim(list(order=c(2,0,0), ar=c(0.8,-0.2)), n=1000) z1 <- c(z0[10:1000], z0[1:9]) + rnorm(1000, sd=0.5) z2 <- arima.sim(list(order=c(3,0,1), ar=c(0.7,0,-0.1), ma=0.1), n=1000) + abs(rnorm(1000, mean=1.25)) test <- predcomp.test(z0, z1, z2) summary(test) test2 <- predcomp.test(z0, z1, z2, test = "HG" ) summary(test2) ## Not run: test2 <- predcomp.test(z0, z1, z2, test = "HG" ) summary(test2) test2.2 <- predcomp.test(z0, z1, z2, alternative="less") summary(test2.2) test3 <- predcomp.test(z0, z1, z2, lossfun.args=list(method="dtw") ) summary(test3) test3.2 <- predcomp.test(z0, z1, z2, alternative="less", lossfun.args=list(method="dtw"), test = "HG" ) summary(test3.2) test4 <- predcomp.test(z0, z1, z2, lossfun.args = list(method="corrskill"), test = "HG" ) summary(test4) test5 <- predcomp.test(z0, z1, z2, lossfun.args = list(method="dtw", dtw.interr="sqerr"), test = "HG" ) summary(test5) test5.2 <- predcomp.test(z0, z1, z2, alternative="less", lossfun.args=list(method="dtw", dtw.interr="sqerr"), test = "HG" ) summary(test5.2) ## End(Not run)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.