Kruskal-Wallis Test for the 2 x t Contingency Table
This function uses the Kruskal-Wallis criterion to test the hypothesis of no association between the counts for two responses "A" and "B" across t categories.
contingency2xt(Avec, Bvec, method = c("asymptotic", "simulated", "exact"), dist = FALSE, tab0 = TRUE, Nsim = 1e+06)
Avec |
vector of length t giving the counts A_1,…, A_t for response "A" according to t categories. m = A_1 + … + A_t. |
Bvec |
vector of length t giving the counts B_1,…, B_t for response "B" according to t categories. n = B_1 + … + B_t = N-m. |
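method |
one of "asymptotic", "simulated", or "exact", selecting whether a simulated or exact P-value is computed in addition to the asymptotic one. |
dist |
logical, default FALSE. If TRUE, the simulated or enumerated null distribution is returned. |
tab0 |
logical, default TRUE. |
Nsim |
number of simulations, default 1e+06. |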
For this data scenario the Kruskal-Wallis criterion is
K.star = N(N-1)/(mn) (∑ A_i^2/d_i-m^2/N)
with d_i=A_i+B_i, treating "A" responses as 1 and "B" responses as 2, and using midranks as explained in Lehmann (2006), Chapter 5.3.
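As a minimal sketch of the formula above, the following R code evaluates K.star directly from the two count vectors and cross-checks it against `kruskal.test` from base R applied to the expanded data (responses coded 1 and 2, grouped by category). The helper name `kstar` is hypothetical and not part of the package, the chi-square reference distribution with t-1 degrees of freedom is the usual Kruskal-Wallis approximation assumed here, and every d_i is assumed to be positive.

```r
# Hypothetical helper (not part of kSamples): evaluates K.star from the counts.
kstar <- function(Avec, Bvec) {
  d <- Avec + Bvec              # category totals d_i = A_i + B_i (assumed > 0)
  m <- sum(Avec)                # total number of "A" responses
  n <- sum(Bvec)                # total number of "B" responses
  N <- m + n
  N * (N - 1) / (m * n) * (sum(Avec^2 / d) - m^2 / N)
}

Avec <- c(25, 15, 20)
Bvec <- c(16, 6, 18)
K <- kstar(Avec, Bvec)

# Assumed asymptotic reference: chi-square with t - 1 degrees of freedom.
p_asym <- pchisq(K, df = length(Avec) - 1, lower.tail = FALSE)

# Cross-check: expand the table into individual responses (A = 1, B = 2)
# and let kruskal.test apply midranks and the tie correction.
resp  <- c(rep(1, sum(Avec)), rep(2, sum(Bvec)))
categ <- c(rep(seq_along(Avec), Avec), rep(seq_along(Bvec), Bvec))
kw <- kruskal.test(resp, categ)

print(all.equal(K, unname(kw$statistic)))
```

The agreement with `kruskal.test` reflects that, with only two distinct response values, the tie-corrected rank statistic reduces algebraically to the closed form in the counts.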
For small sample sizes exact null distribution calculations are possible, based on Algorithm C (Chase's sequence) in Knuth (2011), which allows the enumeration of all possible splits of m into counts A_1,…, A_t such that m = A_1 + … + A_t, followed by the calculation of the statistic K.star for each such split. Simulation of A_1,…, A_t uses the probability model (5.35) in Lehmann (2006) to successively generate hypergeometric counts A_1,…, A_t. Both these processes, enumeration and simulation, are done in C.
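The two approaches described above can be sketched in plain R for a small table. This is illustrative only: the package does both in C, and the enumeration here is a brute-force `expand.grid` filter rather than Chase's sequence. The sketch assumes that, under the null hypothesis with fixed category totals d_i and fixed m, the counts follow a multivariate hypergeometric distribution, and that the successive draws take the usual conditional hypergeometric form. The names `kstar`, `sim_counts`, and `tol` are hypothetical.

```r
# Hypothetical helper: K.star for a split A given totals d, m and N.
kstar <- function(A, d, m, N) {
  N * (N - 1) / (m * (N - m)) * (sum(A^2 / d) - m^2 / N)
}

Avec <- c(25, 15, 20)
Bvec <- c(16, 6, 18)
d <- Avec + Bvec
m <- sum(Avec)
N <- sum(d)
K_obs <- kstar(Avec, d, m, N)
tol <- 1e-9   # guards the >= comparison against floating-point ties

# Exact: list every split with 0 <= A_i <= d_i and sum(A) == m.
grid   <- as.matrix(expand.grid(lapply(d, function(di) 0:di)))
splits <- grid[rowSums(grid) == m, , drop = FALSE]
# Assumed null model: multivariate hypergeometric probability of each split.
prob   <- apply(splits, 1, function(A) prod(choose(d, A))) / choose(N, m)
K_all  <- apply(splits, 1, kstar, d = d, m = m, N = N)
p_exact <- sum(prob[K_all >= K_obs - tol])

# Simulation: draw A_1, ..., A_t one at a time, each hypergeometric given
# the "A" responses and the observations that are still unallocated.
sim_counts <- function(d, m) {
  A <- numeric(length(d))
  m_left <- m
  N_left <- sum(d)
  for (i in seq_along(d)) {
    A[i] <- rhyper(1, m_left, N_left - m_left, d[i])
    m_left <- m_left - A[i]
    N_left <- N_left - d[i]
  }
  A
}

set.seed(1)
Nsim  <- 1e4
K_sim <- replicate(Nsim, kstar(sim_counts(d, m), d, m, N))
p_sim <- mean(K_sim >= K_obs - tol)
```

The brute-force grid grows as the product of the (d_i + 1), so it is only practical for small tables.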
A list of class kSamples with components
test.name |
|
t |
number of classification categories |
KW.cont |
vector of length 2 (or 3) giving the observed KW statistic, its asymptotic P-value, and (for method = "simulated" or "exact") the simulated or exact P-value |
null.dist |
simulated or enumerated null distribution of the test statistic. |
method |
the method used. |
Nsim |
the number of simulations. |
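A short sketch of how the documented components might be inspected. It runs only if the kSamples package is installed. Positional access to KW.cont assumes the order given in the component description above (statistic first, then the P-values), and the element names are not assumed.

```r
# Sketch: inspect the documented components of the returned list.
# The call is skipped when kSamples is not installed.
out <- NULL
if (requireNamespace("kSamples", quietly = TRUE)) {
  out <- kSamples::contingency2xt(c(25, 15, 20), c(16, 6, 18),
                                  method = "simulated", dist = TRUE,
                                  tab0 = TRUE, Nsim = 1e3)
  print(out$t)          # number of classification categories
  print(out$KW.cont)    # observed statistic followed by the P-values
  print(out$method)     # the method used
  print(out$Nsim)       # number of simulations
  str(out$null.dist)    # simulated null distribution (requested via dist = TRUE)
}
```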
method = "exact" should only be used with caution. Computation time is proportional to the number of enumerations. In most cases dist = TRUE should not be used, namely whenever the returned distribution object would become too large for R's workspace.
Knuth, D.E. (2011), The Art of Computer Programming, Volume 4A Combinatorial Algorithms Part 1, Addison-Wesley.
Kruskal, W.H. (1952), A Nonparametric Test for the Several Sample Problem, The Annals of Mathematical Statistics, Vol 23, No. 4, 525–540.
Kruskal, W.H. and Wallis, W.A. (1952), Use of Ranks in One-Criterion Variance Analysis, Journal of the American Statistical Association, Vol 47, No. 260, 583–621.
Lehmann, E.L. (2006), Nonparametrics, Statistical Methods Based on Ranks, Revised First Edition, Springer, New York.
contingency2xt(c(25, 15, 20), c(16, 6, 18), method = "exact", dist = FALSE, tab0 = TRUE, Nsim = 1e3)