Kendall's Tau Statistic
Computes Kendall's Tau, which is a rank-based correlation measure, between two vectors.
kendall.tau(x, y, exact = FALSE, max.n = 3000)
x, y |
Numeric vectors. Must be of equal length.
Ideally their values are continuous and not too discrete.
Let |
exact |
Logical. If |
max.n |
Numeric. If |
Kendall's tau is a measure of dependency in a bivariate distribution.
Loosely, two random variables are concordant if large values
of one random variable are associated with large values of the
other random variable.
Similarly, two random variables are discordant if large values
of one random variable are associated with small values of the
other random variable.
More formally, if (x[i] - x[j])*(y[i] - y[j]) > 0 then that comparison is concordant (i != j).
And if (x[i] - x[j])*(y[i] - y[j]) < 0 then that comparison is discordant (i != j).
Out of choose(N, 2) comparisons, where N = length(x), let c and d be the number of concordant and discordant pairs.
Then Kendall's tau can be estimated by (c-d)/(c+d).
If there are ties then half the ties are deemed concordant and
half discordant so that (c-d)/(c+d+t) is used, where t is the number of tied pairs.
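The counting rule above can be sketched in base R as follows. This is only an illustration: tau_by_counting is a hypothetical helper, not a function of any package, and it assumes that a "tie" is any pair whose product (x[i] - x[j])*(y[i] - y[j]) is exactly zero. It builds full N-by-N matrices, so it is suitable for small inputs only.

```r
# Hypothetical helper for illustration: brute-force pair counting.
tau_by_counting <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  # Sign of (x[i] - x[j]) * (y[i] - y[j]) for every pair (i, j)
  s <- sign(outer(x, x, "-") * outer(y, y, "-"))
  s <- s[upper.tri(s)]        # keep each unordered pair (i < j) once
  n_conc <- sum(s > 0)        # c: pairs with a positive product
  n_disc <- sum(s < 0)        # d: pairs with a negative product
  n_tied <- sum(s == 0)       # t: pairs with a zero product (assumed tie rule)
  c(c = n_conc, d = n_disc, t = n_tied,
    tau = (n_conc - n_disc) / (n_conc + n_disc + n_tied))
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
res <- tau_by_counting(x, y)
res   # 8 pairs with positive product, 2 with negative, none tied, so tau = 0.6
```

With no ties, the result should agree with base R's cor(x, y, method = "kendall").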
The function returns an estimate of Kendall's tau, which lies between -1 and 1.
If length(x) is large then the cost is O(N^2), which is expensive!
Under these circumstances it is not advisable to set exact = TRUE or max.n to a very large number.
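To get a feel for the quadratic cost, the number of pairwise comparisons can be tabulated with choose(N, 2). The values of N below are arbitrary illustrations, not limits taken from the function.

```r
# Number of pairwise comparisons required by an exact computation
N <- c(100, 1000, 5000, 100000)   # illustrative sample sizes
n_pairs <- choose(N, 2)           # N * (N - 1) / 2 comparisons
data.frame(N = N, comparisons = n_pairs)
# For N = 5000 this is 12497500 comparisons
```

A tenfold increase in N therefore means roughly a hundredfold increase in work.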
library(VGAM)  # kendall.tau() and rbinorm() are provided by the VGAM package

N <- 5000; x <- 1:N; y <- runif(N)
true.rho <- -0.8
ymat <- rbinorm(N, cov12 = true.rho)  # Bivariate normal, aka N_2
x <- ymat[, 1]
y <- ymat[, 2]
## Not run: plot(x, y, col = "blue")

kendall.tau(x, y)  # A random sample is taken here
kendall.tau(x, y)  # A random sample is taken here

kendall.tau(x, y, exact = TRUE)  # Costly if length(x) is large
kendall.tau(x, y, max.n = N)     # Same as exact = TRUE

(rhohat <- sin(kendall.tau(x, y) * pi / 2))  # This formula holds for N_2 actually
true.rho  # rhohat should be near this value
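The conversion rho = sin(tau * pi / 2) used in the example can also be checked with base R alone. The sketch below is an independent illustration under stated assumptions: it uses cor(method = "kendall") in place of kendall.tau(), builds the bivariate normal sample by hand, and the seed and sample size are arbitrary choices.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only
n <- 1000                    # illustrative sample size
rho <- -0.8                  # target correlation, as in the example
z1 <- rnorm(n)
z2 <- rnorm(n)
x <- z1
y <- rho * z1 + sqrt(1 - rho^2) * z2      # (x, y) is bivariate normal with correlation rho
tau_hat <- cor(x, y, method = "kendall")  # base R estimate of Kendall's tau
rho_hat <- sin(tau_hat * pi / 2)          # back-transform tau to the correlation scale
c(tau_hat = tau_hat, rho_hat = rho_hat, rho = rho)
```

Here rho_hat should land close to rho, and tau_hat close to (2 / pi) * asin(rho), up to sampling error.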