Hybrid Adaptive Tree Cut for Hierarchical Clustering Dendrograms
Detect clusters in a dendrogram produced by the function hclust.
cutreeHybrid(
    # Input data: basic tree cutting
    dendro, distM,
    # Branch cut criteria and options
    cutHeight = NULL, minClusterSize = 20, deepSplit = 1,
    # Advanced options
    maxCoreScatter = NULL, minGap = NULL,
    maxAbsCoreScatter = NULL, minAbsGap = NULL,
    minSplitHeight = NULL, minAbsSplitHeight = NULL,
    # External (user-supplied) measure of branch split
    externalBranchSplitFnc = NULL, minExternalSplit = NULL,
    externalSplitOptions = list(),
    externalSplitFncNeedsDistance = NULL,
    assumeSimpleExternalSpecification = TRUE,
    # PAM stage options
    pamStage = TRUE, pamRespectsDendro = TRUE,
    useMedoids = FALSE, maxPamDist = cutHeight,
    respectSmallClusters = TRUE,
    # Various options
    verbose = 2, indent = 0)
dendro |
A hierarchical clustering dendrogram such as one returned by hclust. |
distM |
Distance matrix that was used as input to hclust. |
cutHeight |
Maximum joining height that will be considered. It defaults to 99% of the range between the 5th percentile and the maximum of the joining heights on the dendrogram. |
minClusterSize |
Minimum cluster size. |
deepSplit |
Either logical or integer in the range 0 to 4. Provides a rough control over
sensitivity to cluster splitting. Higher values produce more, and smaller, clusters.
A finer control can be achieved via |
maxCoreScatter |
Maximum scatter of the core for a branch to be a cluster, given as the fraction
of |
minGap |
Minimum cluster gap given as the fraction of the difference between |
maxAbsCoreScatter |
Maximum scatter of the core for a branch to be a cluster given as absolute
heights. If given, overrides |
minAbsGap |
Minimum cluster gap given as absolute height difference. If given, overrides
|
minSplitHeight |
Minimum split height given as the fraction of the difference between
|
minAbsSplitHeight |
Minimum split height given as an absolute height.
Branches merging below this height will automatically be merged. If not given (default), will be determined
from |
externalBranchSplitFnc |
Optional function to evaluate split (dissimilarity) between two branches.
Either a single function or a list in which each component is a function (see
|
minExternalSplit |
Thresholds to decide whether two branches should be merged.
It should be a numeric vector of the same length as the number of functions in
|
externalSplitOptions |
Further arguments to function |
externalSplitFncNeedsDistance |
Optional specification of whether the external branch split
functions need the distance matrix as one of their arguments. Either |
assumeSimpleExternalSpecification |
Logical: when |
pamStage |
Logical, only used for method "hybrid". If |
pamRespectsDendro |
Logical, only used for method "hybrid".
If |
useMedoids |
If TRUE, the second stage will use object-to-medoid distance; if FALSE, it will use average object-to-cluster distance. The default (FALSE) is recommended. |
maxPamDist |
Maximum object distance to closest cluster that will result in the object
assigned to that cluster. Defaults to |
respectSmallClusters |
If TRUE, branches that failed to be clusters in stage 1 only because of insufficient size will be assigned together in stage 2. If FALSE, all objects will be assigned individually. |
verbose |
Controls the verbosity of the output. 0 will make the function completely quiet, values up to 4 gradually increase verbosity. |
indent |
Controls indentation of printed messages (see |
The function detects clusters in a hierarchical dendrogram based on the shape of branches on the dendrogram. For details on the method, see http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting.
To make the shape parameters maxCoreScatter and minGap more universal, their values are interpreted relative to cutHeight and the 5th percentile of the merging heights (we arbitrarily chose the 5th percentile rather than the minimum for reasons of stability). Thus, the absolute maximum allowable core scatter is calculated as maxCoreScatter * (cutHeight - refHeight) + refHeight and the absolute minimum allowable gap as minGap * (cutHeight - refHeight), where refHeight is the 5th percentile of the merging heights.
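The conversion from relative to absolute thresholds can be sketched in base R. This is a minimal illustration rather than the package's internal code: the toy data, the values chosen for maxCoreScatter and minGap, and the reading of the default cutHeight as "99% of the range above refHeight" are all assumptions made for the example.

```r
# Toy data and dendrogram (assumed for illustration only)
set.seed(42)
x <- matrix(rnorm(200), ncol = 2)
dendro <- hclust(dist(x), method = "average")

# Reference height: 5th percentile of the merging heights
refHeight <- unname(quantile(dendro$height, 0.05))

# Assumed reading of the default cutHeight: 99% of the range between
# refHeight and the maximum merging height, measured from refHeight
cutHeight <- 0.99 * (max(dendro$height) - refHeight) + refHeight

# Hypothetical relative shape parameters (not the package defaults)
maxCoreScatter <- 0.7
minGap <- 0.2

# Absolute thresholds, following the formulas in the text
absMaxCoreScatter <- maxCoreScatter * (cutHeight - refHeight) + refHeight
absMinGap <- minGap * (cutHeight - refHeight)

print(c(refHeight = refHeight, cutHeight = cutHeight,
        absMaxCoreScatter = absMaxCoreScatter, absMinGap = absMinGap))
```

Because both thresholds scale with cutHeight - refHeight, the same relative settings should behave comparably on dendrograms whose merging heights lie on different scales.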
A list containing the following elements:
labels |
Numerical labels of clusters, with 0 meaning unassigned, label 1 denoting the largest cluster, and so on. |
cores |
Numerical labels indicating cores of found clusters. |
smallLabels |
Numerical labels for branches that failed to be recognized as clusters only because of an insufficient number of objects. |
mergeDiagnostics |
A data.frame with one row per merge in the input dendrogram. The columns give the values of the various merging criteria used by the algorithm. Missing data indicate that at least one of the "branches" merged was actually a singleton (single node) and hence the branch merging was automatic. |
mergeCriteria |
Values of the merging thresholds. Either a copy of the corresponding input thresholds
or values determined by |
branches |
A list detailing the detected branch structure. |
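As a rough usage sketch, the call below builds a dendrogram with hclust on simulated data and inspects the labels element of the result. The simulated data, the choice of average linkage, and the parameter values are assumptions for illustration; the example runs the clustering only if the dynamicTreeCut package (assumed here to be the package providing cutreeHybrid) is installed.

```r
# Two well-separated simulated groups of 60 points each (illustrative data)
set.seed(1)
x <- rbind(matrix(rnorm(120, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(120, mean = 3, sd = 0.3), ncol = 2))

# cutreeHybrid takes both the dendrogram and the distance matrix used to build it
distM <- as.matrix(dist(x))
dendro <- hclust(as.dist(distM), method = "average")

res <- NULL
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
  res <- dynamicTreeCut::cutreeHybrid(
    dendro = dendro, distM = distM,
    minClusterSize = 20, deepSplit = 1,
    verbose = 0)
  # Cluster sizes; label 0 (if present) marks unassigned objects
  print(table(res$labels))
}
```

The other elements of the returned list, such as mergeDiagnostics and mergeCriteria, can be inspected in the same way to see why particular branches were merged or kept separate.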
Peter Langfelder, Peter.Langfelder@gmail.com
Langfelder P, Zhang B, Horvath S, 2007. http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting