Cluster Lattices
Computations on the lattice of all (hard) partitions, or the lattice of all dendrograms, or the meet semilattice of all hierarchies (n-trees) of/on a set of objects: meet, join, and comparisons.
cl_meet(x, y) cl_join(x, y)
x |
an ensemble of partitions or dendrograms or hierarchies, or an R object representing a partition or dendrogram or hierarchy. |
y |
an R object representing a partition or dendrogram or
hierarchy. Ignored if |
For a given finite set of objects X, the set H(X) of all (hard) partitions of X can be partially ordered by defining a partition P to be “finer” than a partition Q, i.e., P ≤ Q, if each class of P is contained in some class of Q. With this partial order, H(X) becomes a bounded lattice, with intersection and union of two elements given by their greatest lower bound (meet) and their least upper bound (join), respectively.
Specifically, the meet of two partitions computed by cl_meet
is
the partition obtained by intersecting the classes of the partitions;
the classes of the join computed by cl_join
are obtained by
joining all elements in the same class in at least one of the
partitions. Obviously, the least and greatest elements of the
partition lattice are the partitions where each object is in a single
class (sometimes referred to as the “splitter” partition) or in
the same class (the “lumper” partition), respectively. Meet
and join of an arbitrary number of partitions can be defined
recursively.
In addition to computing the meet and join, the comparison operations
corresponding to the above partial order as well as min
,
max
, and range
are available at least for R objects
representing partitions inheriting from "cl_partition"
.
The summary methods give the meet and join of the given partitions
(for min
and max
), or a partition ensemble with the meet
and join (for range
).
If the partitions specified by x
and y
are soft
partitions, the corresponding nearest hard partitions are used.
Future versions may optionally provide suitable “soft” (fuzzy)
extensions for computing meets and joins.
The set of all dendrograms on X can be ordered using pointwise inequality of the associated ultrametric dissimilarities: i.e., if D and E are the dendrograms with ultrametrics u and v, respectively, then D ≤ E if u_{ij} ≤ v_{ij} for all pairs (i, j) of objects. This again yields a lattice (of dendrograms). The join of D and E is the dendrogram with ultrametrics given by \max(u_{ij}, v_{ij}) (as this gives an ultrametric); the meet is the dendrogram with the maximal ultrametric dominated by \min(u_{ij}, v_{ij}), and can be obtained by applying single linkage hierarchical clustering to the minima.
The set of all hierarchies on X can be ordered by set-wise inclusion of the classes: i.e., if H and G are two hierarchies, then H ≤ G if all classes of H are also classes of G. This yields a meet semilattice, with meet given by the classes contained in both hierarchies. The join only exists if the union of the classes is a hierarchy.
In each case, a modular semilattice is obtained, which allows for a
natural metrization via least element (semi)lattice move distances,
see Barthélémy, Leclerc and Monjardet (1981). These latticial metrics
are given by the BA/C (partitions), Manhattan (dendrograms), and
symdiff (hierarchies) dissimilarities, respectively (see
cl_dissimilarity
).
For cl_meet
and cl_join
, an object of class
"cl_partition"
or "cl_dendrogram"
with the
class ids or ultrametric dissimilarities of the meet and join of the
partitions or dendrograms, respectively.
J.-P. Barthélémy, B. Leclerc and B. Monjardet (1981). On the use of ordered sets in problems of comparison and consensus of classification. Journal of Classification, 3, 187–224. doi: 10.1007/BF01894188.
## Two simple partitions of 7 objects. A <- as.cl_partition(c(1, 1, 2, 3, 3, 5, 5)) B <- as.cl_partition(c(1, 2, 2, 3, 4, 5, 5)) ## These disagree on objects 1-3, A splits objects 4 and 5 into ## separate classes. Objects 6 and 7 are always in the same class. (A <= B) || (B <= A) ## (Neither partition is finer than the other.) cl_meet(A, B) cl_join(A, B) ## Meeting with the lumper (greatest) or joining with the splitter ## (least) partition does not make a difference: C_lumper <- as.cl_partition(rep(1, n_of_objects(A))) cl_meet(cl_ensemble(A, B, C_lumper)) C_splitter <- as.cl_partition(seq_len(n_of_objects(A))) cl_join(cl_ensemble(A, B, C_splitter)) ## Another way of computing the join: range(A, B, C_splitter)$max
Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.