cutree for dendrogram (by 1 k value only!)
Cuts a dendrogram tree into several groups by specifying the desired number of clusters k (only a single k value!).
If the dendrogram has no split that yields exactly k clusters, a warning is issued to the user and NA is returned.
cutree_1k.dendrogram( dend, k, dend_heights_per_k = NULL, use_labels_not_values = TRUE, order_clusters_as_data = TRUE, warn = dendextend_options("warn"), ... )
dend |
a dendrogram object |
k |
numeric scalar (not a vector!) with the number of clusters the tree should be cut into. |
dend_heights_per_k |
a named vector that resulted from running.
|
use_labels_not_values |
logical, defaults to TRUE. If the actual labels of the
clusters do not matter and speed is the priority (say, 10 times faster),
use FALSE, which gives the "leaves order" instead of their labels.
This is passed to |
order_clusters_as_data |
logical, defaults to TRUE. There are two ways by which
to order the clusters: 1) By the order of the original data. 2) by the order of the
labels in the dendrogram. In order to be consistent with cutree, this is set
to TRUE.
This is passed to |
warn |
logical (default from dendextend_options("warn") is FALSE). Should the function issue a warning when the desired k is not available? It is safer to set this to TRUE, but the default is FALSE to keep the noise down. |
... |
(not currently in use) |
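Since the function accepts only one k per call, cutting at several values means calling it repeatedly. The following is a minimal sketch of that pattern, assuming the dendextend package is installed and that the branch heights of this small example tree are all distinct; the names `ks` and `clusters` are illustrative only. The heights are computed once with heights_per_k.dendrogram and passed in through dend_heights_per_k so they are not recomputed on every call.

```r
library(dendextend)

# Same five states as in the examples section of this page.
hc <- hclust(dist(USArrests[c(1, 6, 13, 20, 23), ]), "ave")
dend <- as.dendrogram(hc)

# Compute the cut height for every available k once, up front.
dend_ks <- heights_per_k.dendrogram(dend)

# Call cutree_1k.dendrogram once per k, reusing the precomputed heights.
ks <- 2:4 # illustrative choice of k values
clusters <- lapply(ks, function(k) {
  cutree_1k.dendrogram(dend, k = k, dend_heights_per_k = dend_ks)
})
names(clusters) <- ks

clusters
```

Each element of `clusters` is the membership vector for one value of k.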
cutree_1k.dendrogram
returns an integer vector with group
memberships.
If the dendrogram has no split that yields exactly k clusters, a warning is issued to the user and NA is returned.
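The NA case can be sketched with a small constructed tree. This is an illustration under stated assumptions, not part of the package's own examples: it assumes dendextend is installed, and the four one-dimensional points (and the names `x`, `res3`, `res`) are invented so that two pairs merge at the same height, which leaves no height that produces exactly three clusters.

```r
library(dendextend)

# Hypothetical data: "a"/"b" and "c"/"d" both merge at height 3 (a tie),
# so the tree can be cut into 1, 2 or 4 groups, but not 3.
x <- matrix(c(0, 3, 10, 13), ncol = 1,
            dimnames = list(c("a", "b", "c", "d"), NULL))
dend <- as.dendrogram(hclust(dist(x), "single"))

# The names of this vector are the k values that can actually be obtained.
names(heights_per_k.dendrogram(dend))

# k = 3 is not available; warn = FALSE suppresses the warning.
res3 <- cutree_1k.dendrogram(dend, k = 3, warn = FALSE)

# Check for NA before using the result, and fall back to an available k.
res <- if (all(is.na(res3))) {
  cutree_1k.dendrogram(dend, k = 2, warn = FALSE)
} else {
  res3
}
res
```

Checking the result with is.na before using it avoids silently propagating missing cluster memberships.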
Tal Galili
hc <- hclust(dist(USArrests[c(1, 6, 13, 20, 23), ]), "ave")
dend <- as.dendrogram(hc)
cutree(hc, k = 3) # on hclust
cutree_1k.dendrogram(dend, k = 3) # on a dendrogram

labels(dend)

# the default (ordered by original data's order)
cutree_1k.dendrogram(dend, k = 3, order_clusters_as_data = TRUE)

# A different order of labels - order by their order in the tree
cutree_1k.dendrogram(dend, k = 3, order_clusters_as_data = FALSE)

# make it faster
## Not run: 
library(microbenchmark)
# precompute the cut heights once (the function must be called on dend)
dend_ks <- heights_per_k.dendrogram(dend)
microbenchmark(
  cutree_1k.dendrogram = cutree_1k.dendrogram(dend, k = 4),
  cutree_1k.dendrogram_no_labels = cutree_1k.dendrogram(dend,
    k = 4, use_labels_not_values = FALSE
  ),
  cutree_1k.dendrogram_no_labels_per_k = cutree_1k.dendrogram(dend,
    k = 4, use_labels_not_values = FALSE,
    dend_heights_per_k = dend_ks
  )
)
# the last one is the fastest...

## End(Not run)