Compute Hartigans' Dip Test Statistic for Unimodality
Computes Hartigans' dip test statistic for testing unimodality, and additionally the modal interval.
dip(x, full.result = FALSE, min.is.0 = FALSE, debug = FALSE)
x |
numeric; the data. |
full.result |
logical or string; |
min.is.0 |
logical indicating if the minimal value of the
dip statistic Dn can be zero or not. Arguably should be
set to |
debug |
logical; if true, some tracing information is printed (from the C routine). |
depending on full.result
either a number, the dip statistic, or
an object of class "dip"
which is a list
with components
x |
the sorted |
n |
|
dip |
the dip statistic |
lo.hi |
indices into |
xl, xu |
lower and upper end of modal interval |
gcm, lcm |
(last used) indices for greatest convex minorant and the least concave majorant. |
mn, mj |
index vectors of length |
For “full” results of class "dip"
, there are
print
and plot
methods, the latter with
its own manual page.
For n <= 3 where n <- length(x)
, the dip
statistic Dn is always the same minimum value,
1/(2n), i.e., there's no possible dip test.
Note that up to May 2011, from Hartigan's original Fortran code, Dn
was set to zero, when all x
values were identical. However,
this entailed discontinuous behavior, where for arbitrarily close
data x~, Dn(x~) = 1/(2n).
Yong Lu lyongu+@cs.cmu.edu found in Oct 2003 that the code was not giving symmetric results for mirrored data (and was giving results of almost 1, and then found the reason, a misplaced ")" in the original Fortran code. This bug has been corrected for diptest version 0.25-0 (Feb 13, 2004).
Nick Cox (Durham Univ.) said (on March 20, 2008 on the Stata-list):
As it comes from a bimodal husband-wife collaboration, the name
perhaps should be “Hartigan-Hartigan dip test”, but that
does not seem to have caught on. Some of my less statistical
colleagues would sniff out the hegemony of patriarchy there, although
which Hartigan is being overlooked is not clear.
Martin Maechler, as a Swiss, and politician, would say:
Let's find a compromise, and call it “Hartigans' dip test”,
so we only have to adapt orthography (:-).
Martin Maechler maechler@stat.math.ethz.ch, 1994,
based on S (S-PLUS) and C code donated from Dario Ringach
dario@wotan.cns.nyu.edu who had applied f2c
on the
original Fortran code available from Statlib.
In Aug.1993, recreated and improved Hartigans' "P-value" table, which
later became qDiptab
.
P. M. Hartigan (1985)
Computation of the Dip Statistic to Test for Unimodality;
Applied Statistics (JRSS C) 34, 320–325.
Corresponding (buggy!) Fortran code of ‘AS 217’ available from Statlib,
http://lib.stat.cmu.edu/apstat/217
J. A. Hartigan and P. M. Hartigan (1985) The Dip Test of Unimodality; Annals of Statistics 13, 70–84.
data(statfaculty) plot(density(statfaculty)) rug(statfaculty, col="midnight blue"); abline(h=0, col="gray") dip(statfaculty) (dS <- dip(statfaculty, full = TRUE, debug = TRUE)) plot(dS) ## even more output -- + plot showing "global" GCM/LCM: (dS2 <- dip(statfaculty, full = "all", debug = 3)) plot(dS2) data(faithful) fE <- faithful$eruptions plot(density(fE)) rug(fE, col="midnight blue"); abline(h=0, col="gray") dip(fE, debug = 2) ## showing internal work (dE <- dip(fE, full = TRUE)) ## note the print method plot(dE, do.points=FALSE) data(precip) plot(density(precip)) rug(precip, col="midnight blue"); abline(h=0, col="gray") str(dip(precip, full = TRUE, debug = TRUE)) ##----------------- The 'min.is.0' option : --------------------- ##' dip(.) continuity and 'min.is.0' exploration: dd <- function(x, debug=FALSE) { x_ <- x ; x_[1] <- 0.9999999999 * x[1] rbind(dip(x , debug=debug), dip(x_, debug=debug), dip(x , min.is.0=TRUE, debug=debug), dip(x_, min.is.0=TRUE, debug=debug), deparse.level=2) } dd( rep(1, 8) ) # the 3rd one differs ==> min.is.0=TRUE is *dis*continuous dd( 1:7 ) # ditto dd( 1:7, debug=TRUE) ## border-line case .. dd( 1:2, debug=TRUE) ## Demonstrate that 'min.is.0 = TRUE' does not change the typical result: B.sim <- 1000 # or larger D5 <- {set.seed(1); replicate(B.sim, dip(runif(5)))} D5. <- {set.seed(1); replicate(B.sim, dip(runif(5), min.is.0=TRUE))} stopifnot(identical(D5, D5.), all.equal(min(D5), 1/(2*5))) hist(D5, 64); rug(D5) D8 <- {set.seed(7); replicate(B.sim, dip(runif(8)))} D8. <- {set.seed(7); replicate(B.sim, dip(runif(8), min.is.0=TRUE))} stopifnot(identical(D8, D8.))
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.