Bk - Calculating Fowlkes-Mallows Index for two dendrograms
Bk calculates the Fowlkes-Mallows index for a series of k cuts of two dendrograms.
Bk(tree1, tree2, k, warn = dendextend_options("warn"), ...)
tree1 | a dendrogram/hclust/phylo object.
tree2 | a dendrogram/hclust/phylo object.
k | an integer scalar or vector with the desired number of cluster groups. If missing, Bk is calculated for the default k range of 2:(nleaves-1). There is no point in checking k=1 or k=n, since both give Bk=1.
warn | logical (default from dendextend_options("warn") is FALSE). Whether warnings should be issued. It is safer to set this to TRUE, but the default is FALSE to keep the noise down.
... | Ignored (passed to FM_index_R).
From Wikipedia:
The Fowlkes-Mallows index (see references) is an external evaluation method used to determine the similarity between two clusterings (clusters obtained after a clustering algorithm). This measure of similarity can be computed either between two hierarchical clusterings or between a clustering and a benchmark classification. A higher value of the Fowlkes-Mallows index indicates greater similarity between the clusters and the benchmark classifications.
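To make the definition concrete, here is a minimal base-R sketch that computes the index for two flat clusterings given as label vectors. The helper name fm_index_sketch is hypothetical and is not the package's FM_index_R. It assumes the usual pair-counting form, FM = TP / sqrt((TP + FP) * (TP + FN)), evaluated from the contingency table of the two labelings.

```r
# Hypothetical helper (not part of dendextend): Fowlkes-Mallows index
# for two label vectors of equal length, via pair counting.
fm_index_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  n <- length(a)
  tab <- table(a, b)             # contingency table of the two labelings
  tk <- sum(tab^2) - n           # 2 * (pairs placed together in both clusterings)
  pk <- sum(rowSums(tab)^2) - n  # 2 * (pairs placed together in clustering a)
  qk <- sum(colSums(tab)^2) - n  # 2 * (pairs placed together in clustering b)
  tk / sqrt(pk * qk)
}

a <- c(1, 1, 2, 2, 3, 3)
b <- c(1, 1, 1, 2, 2, 2)
fm_index_sketch(a, a)  # identical clusterings give 1
fm_index_sketch(a, b)  # 2 shared pairs out of 3 and 6: 2 / sqrt(18)
```

The factor of 2 in each count cancels in the ratio, so the result equals the pair-based formula.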
A list (of the same length as k) of Fowlkes-Mallows index values between the two dendrograms, one for each value in the scalar/vector k. The name of each list item is the k for which it was calculated.
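The shape of this return value can be illustrated without the package. The following is only a sketch using base R (hclust and cutree from stats); bk_sketch and fm_pairs are hypothetical names, and this is not how dendextend implements Bk. It assumes both trees were built on the same observations in the same order.

```r
# Hypothetical pair-counting FM index for two label vectors (illustration only).
fm_pairs <- function(a, b) {
  n <- length(a)
  tab <- table(a, b)
  (sum(tab^2) - n) / sqrt((sum(rowSums(tab)^2) - n) * (sum(colSums(tab)^2) - n))
}

# Hypothetical stand-in for Bk on hclust objects: one FM value per k,
# returned as a list whose names are the k values.
bk_sketch <- function(hc1, hc2, k) {
  out <- lapply(k, function(ki) {
    fm_pairs(cutree(hc1, k = ki), cutree(hc2, k = ki))
  })
  names(out) <- k
  out
}

d <- dist(iris[, -5])
hc1 <- hclust(d, "complete")
hc2 <- hclust(d, "single")
res <- bk_sketch(hc1, hc2, k = 2:5)
names(res)   # "2" "3" "4" "5"
unlist(res)  # named numeric vector, convenient for plotting against k
```

Flattening with unlist() and reading k back from names() is the same pattern the examples use when plotting the result.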
Fowlkes, E. B.; Mallows, C. L. (1 September 1983). "A Method for Comparing Two Hierarchical Clusterings". Journal of the American Statistical Association 78 (383): 553.
## Not run:
set.seed(23235)
ss <- TRUE # sample(1:150, 10 )
hc1 <- hclust(dist(iris[ss, -5]), "com")
hc2 <- hclust(dist(iris[ss, -5]), "single")
tree1 <- as.dendrogram(hc1)
tree2 <- as.dendrogram(hc2)
# cutree(tree1)

Bk(hc1, hc2, k = 3)
Bk(hc1, hc2, k = 2:10)
Bk(hc1, hc2)

Bk(tree1, tree2, k = 3)
Bk(tree1, tree2, k = 2:5)

system.time(Bk(hc1, hc2, k = 2:5)) # 0.01
system.time(Bk(hc1, hc2)) # 1.28
system.time(Bk(tree1, tree2, k = 2:5)) # 0.24 # after fixes.
system.time(Bk(tree1, tree2, k = 2:10)) # 0.31 # after fixes.
system.time(Bk(tree1, tree2)) # 7.85
Bk(tree1, tree2, k = 99:101)

y <- Bk(hc1, hc2, k = 2:10)
plot(unlist(y) ~ c(2:10), type = "b", ylim = c(0, 1))

# can take a few seconds
y <- Bk(hc1, hc2)
plot(unlist(y) ~ as.numeric(names(y)),
  main = "Bk plot", pch = 20,
  xlab = "k", ylab = "FM Index",
  type = "b", ylim = c(0, 1)
)
# we are still missing some hypothesis testing here.
# for this we'll have the Bk_plot function.
## End(Not run)