Fitting Copulas to Data – Copula Parameter Estimation
Parameter estimation of copulas, i.e., fitting of a copula model to multivariate (possibly “pseudo”) observations.
loglikCopula(param = getTheta(copula), u, copula, error = c("-Inf", "warn-Inf", "let-it-be")) ## Generic [and "rotCopula" method] : fitCopula(copula, data, ...) ## S4 method for signature 'copula' fitCopula(copula, data, method = c("mpl", "ml", "itau", "irho", "itau.mpl"), posDef = is(copula, "ellipCopula"), start = NULL, lower = NULL, upper = NULL, optim.method = optimMeth(copula, method, dim = d), optim.control = list(maxit=1000), estimate.variance = NA, hideWarnings = FALSE, ...) optimMeth(copula, method, dim)
param |
vector of free (see |
u |
n x d-matrix of (pseudo-)observations in
[0,1]^d for computing the copula log-likelihood, where n denotes
the sample size and d the dimension. Consider applying the function
|
data |
as u, an n x d-matrix of data. For
|
copula |
a |
error |
(for
|
method |
a
|
posDef |
a |
start |
a |
lower, upper |
Lower or upper parameter bounds for the
optimization methods |
optim.control |
a |
optim.method |
a character string specify the optimization
method or a The default has been changed (for copula 0.999-16, in
Aug. 2016) from |
dim |
integer, the data and copula dimension, d >= 2. |
estimate.variance |
a |
hideWarnings |
a |
... |
additional arguments passed to |
The only difference between "mpl"
and "ml"
is in the
variance-covariance estimate, not in the parameter
(θ) estimates.
If method "mpl"
in fitCopula()
is used and if
start
is not assigned a value, estimates obtained from method
"itau"
are used as initial values in the optimization. Standard
errors are computed as explained in Genest, Ghoudi and Rivest (1995);
see also Kojadinovic and Yan (2010, Section 3). Their estimation
requires the computation of certain partial derivatives of the (log)
density. These have been implemented for six copula families thus far:
the Clayton, Gumbel-Hougaard, Frank, Plackett, normal and t copula
families. For other families, numerical differentiation based on
grad()
from package numDeriv
is used (and a warning message is displayed).
For normal and t copulas, fitCopula(, method = "mpl")
and
fitCopula(, method = "ml")
maximize the log-likelihood based on
mvtnorm's dmvnorm()
and dmvt()
, respectively.
The latter two functions set the respective densities to zero if the
correlation matrices of the corresponding distributions are not
positive definite. As such, the estimated correlation matrices
will be positive definite.
If methods "itau"
or "irho"
are used in
fitCopula()
, an estimate of the asymptotic variance (if
available for the copula under consideration) will be correctly
computed only if the argument data
consists of pseudo-observations
(see pobs()
).
Consider the t copula with df.fixed=FALSE
(see
ellipCopula()
). In this case, the methods "itau"
and "irho"
cannot be used in fitCopula()
as they cannot
estimate the degrees of freedom parameter df
. For the methods
"mpl"
and "itau.mpl"
the asymptotic variance cannot be
(fully) estimated (yet). For the methods "ml"
and "mpl"
,
when start
is not specified, the starting value for df
is set to copula@df
, typically 4.
To implement the Inference Functions for Margins (IFM) method
(see, e.g., Joe 2005), set method="ml"
and note that
data
need to be parametric pseudo-observations obtained from fitted
parametric marginal distribution functions. The returned large-sample
variance will then underestimate the true variance (as the procedure
cannot take into account the (unknown) estimation error for the margins).
The fitting procedures based on optim()
generate
warnings because invalid parameter values are tried during the
optimization process. When the number of parameters is one and the
parameter space is bounded, using optim.method="Brent"
is
likely to give less warnings. Furthermore, from experience,
optim.method="Nelder-Mead"
is sometimes a more robust
alternative to optim.method="BFGS"
or "L-BFGS-B"
.
loglikCopula()
returns the copula log-likelihood evaluated at the
parameter (vector) param
given the data u
.
The return value of fitCopula()
is an object of class
"fitCopula"
(inheriting from hidden class
"fittedMV"
), containing (among others!) the slots
The parameter estimates.
The large-sample (i.e., asymptotic) variance estimate of the parameter
estimator unless estimate.variance=FALSE
where it is
matrix(numeric(), 0,0)
(to be distinguishable from cases when the
covariance estimates failed partially).
The fitted copula object.
The summary()
method for "fitCopula"
objects
returns an S3 “class” "summary.fitCopula"
, which is simply a list
with components method
, loglik
and convergence
,
all three from the corresponding slots of the
"fitCopula"
objects, and coefficients
(a matrix of estimated coefficients, standard errors, t values and p-values).
Genest, C. (1987). Frank's family of bivariate distributions. Biometrika 74, 549–555.
Genest, C. and Rivest, L.-P. (1993). Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association 88, 1034–1043.
Rousseeuw, P. and Molenberghs, G. (1993). Transformation of nonpositive semidefinite correlation matrices. Communications in Statistics: Theory and Methods 22, 965–984.
Genest, C., Ghoudi, K., and Rivest, L.-P. (1995). A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82, 543–552.
Joe, H. (2005). Asymptotic efficiency of the two-stage estimation method for copula-based models. Journal of Multivariate Analysis 94, 401–419.
Mashal, R. and Zeevi, A. (2002). Beyond Correlation: Extreme Co-movements Between Financial Assets. https://www0.gsb.columbia.edu/faculty/azeevi/PAPERS/BeyondCorrelation.pdf (2016-04-05)
Demarta, S. and McNeil, A. J. (2005). The t copula and related copulas. International Statistical Review 73, 111–129.
Genest, C. and Favre, A.-C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347–368.
Kojadinovic, I. and Yan, J. (2010). Comparison of three semiparametric methods for estimating dependence parameters in copula models. Insurance: Mathematics and Economics 47, 52–63.
For maximum likelihood of (nested) Archimedean copulas, see
emle
, etc.
(Xtras <- copula:::doExtras()) # determine whether examples will be extra (long) n <- if(Xtras) 200 else 64 # sample size ## A Gumbel copula set.seed(7) # for reproducibility gumbel.cop <- gumbelCopula(3, dim=2) x <- rCopula(n, gumbel.cop) # "true" observations (simulated) u <- pobs(x) # pseudo-observations ## Inverting Kendall's tau fit.tau <- fitCopula(gumbelCopula(), u, method="itau") fit.tau confint(fit.tau) # work fine ! confint(fit.tau, level = 0.98) summary(fit.tau) # a bit more, notably "Std. Error"s coef(fit.tau)# named vector coef(fit.tau, SE = TRUE)# matrix ## Inverting Spearman's rho fit.rho <- fitCopula(gumbelCopula(), u, method="irho") summary(fit.rho) ## Maximum pseudo-likelihood fit.mpl <- fitCopula(gumbelCopula(), u, method="mpl") fit.mpl ## Maximum likelihood -- use 'x', not 'u' ! -- fit.ml <- fitCopula(gumbelCopula(), x, method="ml") summary(fit.ml) # now prints a bit more than simple 'fit.ml' ## ... and what's the log likelihood (in two different ways): (ll. <- logLik(fit.ml)) stopifnot(all.equal(as.numeric(ll.), loglikCopula(coef(fit.ml), u=x, copula=gumbel.cop))) ## A Gauss/normal copula ## With multiple/*un*constrained parameters set.seed(6) # for reproducibility normal.cop <- normalCopula(c(0.6, 0.36, 0.6), dim=3, dispstr="un") x <- rCopula(n, normal.cop) # "true" observations (simulated) u <- pobs(x) # pseudo-observations ## Inverting Kendall's tau fit.tau <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="itau") fit.tau ## Inverting Spearman's rho fit.rho <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="irho") fit.rho ## Maximum pseudo-likelihood fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="un"), u, method="mpl") summary(fit.mpl) coef(fit.mpl) # named vector coef(fit.mpl, SE = TRUE) # the matrix, with SE ## Maximum likelihood (use 'x', not 'u' !) fit.ml <- fitCopula(normalCopula(dim=3, dispstr="un"), x, method="ml", traceOpt=TRUE) summary(fit.ml) confint(fit.ml) confint(fit.ml, level = 0.999) # clearly non-0 ## Fix some of the parameters param <- c(.6, .3, NA_real_) attr(param, "fixed") <- c(TRUE, FALSE, FALSE) ncp <- normalCopula(param = param, dim = 3, dispstr = "un") fixedParam(ncp) <- c(TRUE, TRUE, FALSE) ## 'traceOpt = 5': showing every 5-th log likelihood evaluation: summary(Fxf.mpl <- fitCopula(ncp, u, method = "mpl", traceOpt = 5)) Fxf.mpl@copula # reminding of the fixed param. values ## With dispstr = "toep" : normal.cop.toep <- normalCopula(c(0, 0), dim=3, dispstr="toep") ## Inverting Kendall's tau fit.tau <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="itau") fit.tau ## Inverting Spearman's rho fit.rho <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="irho") summary(fit.rho) ## Maximum pseudo-likelihood fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="toep"), u, method="mpl") fit.mpl ## Maximum likelihood (use 'x', not 'u' !) fit.ml <- fitCopula(normalCopula(dim=3, dispstr="toep"), x, method="ml") summary(fit.ml) ## With dispstr = "ar1" normal.cop.ar1 <- normalCopula(c(0), dim=3, dispstr="ar1") ## Inverting Kendall's tau summary(fit.tau <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="itau")) ## Inverting Spearman's rho summary(fit.rho <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="irho")) ## Maximum pseudo-likelihood summary(fit.mpl <- fitCopula(normalCopula(dim=3, dispstr="ar1"), u, method="mpl")) ## Maximum likelihood (use 'x', not 'u' !) fit.ml <- fitCopula(normalCopula(dim=3, dispstr="ar1"), x, method="ml") summary(fit.ml) ## A t copula with variable df (df.fixed=FALSE) (tCop <- tCopula(c(0.2,0.4,0.6), dim=3, dispstr="un", df=5)) set.seed(101) x <- rCopula(n, tCop) # "true" observations (simulated) ## Maximum likelihood (start = (rho[1:3], df)) summary(tc.ml <- fitCopula(tCopula(dim=3, dispstr="un"), x, method="ml", start = c(0,0,0, 10))) ## Maximum pseudo-likelihood (the asymptotic variance cannot be estimated) u <- pobs(x) # pseudo-observations tc.mpl <- fitCopula(tCopula(dim=3, dispstr="un"), u, method="mpl", estimate.variance=FALSE, start= c(0,0,0, 10)) summary(tc.mpl)
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.