Trimmed k-means Cluster Analysis
tkmeans searches for k (or fewer) spherical clusters in a data matrix x, while the ceiling(alpha n) most outlying observations are trimmed.
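As a small worked example of the trimming count described above, the following base R lines compute ceiling(alpha n) for an illustrative sample size. The values of n and alpha are assumptions chosen for the example, not defaults taken from the function.

```r
# Illustrative values (assumptions for this example only)
n     <- 1000   # number of observations (rows of x)
alpha <- 0.05   # proportion of observations to be trimmed

# Number of observations discarded as most outlying
n_trim <- ceiling(alpha * n)

# Number of observations that remain to be assigned to clusters
n_kept <- n - n_trim

print(c(trimmed = n_trim, kept = n_kept))
```

Because of the ceiling, any fractional count is rounded up, so a data set with 1001 rows and the same alpha would trim 51 observations rather than 50.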
tkmeans (x, k = 3, alpha = 0.05, nstart = 50, iter.max = 20, equal.weights = FALSE, center = 0, scale = 1, store.x = TRUE, drop.empty.clust = TRUE, trace = 0, warnings = 2, zero.tol = 1e-16)
x |
A matrix or data.frame of dimension |
k |
The number of clusters initially searched for. |
alpha |
The proportion of observations to be trimmed. |
nstart |
The number of random initializations to be performed. |
iter.max |
The maximum number of concentration steps to be performed. The concentration steps stop as soon as two consecutive steps lead to the same data partition. |
equal.weights |
A logical value, specifying whether equal cluster weights ( |
center, scale |
A center and scale vector, each of length |
store.x |
A logical value, specifying whether the data matrix |
drop.empty.clust |
A logical value specifying whether empty clusters are omitted from the resulting object.
(The result structure then no longer contains center and covariance estimates of
empty clusters.
Cluster names are reassigned such that the first |
trace |
Defines the tracing level, which is set to |
warnings |
The warning level (0: no warnings; 1: warnings on unexpected behavior. |
zero.tol |
The zero tolerance used. By default set to 1e-16. |
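The roles of k, alpha and iter.max can be illustrated with a minimal base R sketch of a single run of concentration steps: assign each observation to its nearest center, trim the ceiling(alpha n) observations farthest from their center, recompute the centers, and stop once two consecutive steps yield the same partition. The helper trimmed_kmeans_sketch is hypothetical and is not the implementation used by tkmeans; it uses one random start only (no nstart restarts), ignores cluster weights, and labels trimmed observations 0 as an assumption of the sketch.

```r
# Hypothetical helper for illustration only; NOT the tkmeans implementation.
trimmed_kmeans_sketch <- function(x, k, alpha, iter.max = 20) {
  x <- as.matrix(x)
  n <- nrow(x)
  n_trim <- ceiling(alpha * n)                 # observations to discard
  centers <- x[sample(n, k), , drop = FALSE]   # one random initialization
  cluster <- rep(-1L, n)                       # no partition yet

  for (iter in seq_len(iter.max)) {
    # Squared Euclidean distances of all observations to all centers (n x k)
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    d2 <- matrix(d2, nrow = n)
    nearest <- max.col(-d2, ties.method = "first")
    dmin <- d2[cbind(seq_len(n), nearest)]

    # Label the n_trim observations farthest from their nearest center as 0
    new_cluster <- nearest
    if (n_trim > 0) {
      trimmed <- order(dmin, decreasing = TRUE)[seq_len(n_trim)]
      new_cluster[trimmed] <- 0L
    }

    # Stop when two consecutive steps lead to the same partition
    if (identical(new_cluster, cluster)) break
    cluster <- new_cluster

    # Recompute the center of every non-empty cluster from untrimmed members
    for (j in seq_len(k)) {
      members <- cluster == j
      if (any(members)) centers[j, ] <- colMeans(x[members, , drop = FALSE])
    }
  }
  list(centers = centers, cluster = cluster)
}

# Usage on simulated data: two groups plus scattered noise
set.seed(1)
x <- rbind(matrix(rnorm(180, mean = 0), ncol = 2),
           matrix(rnorm(180, mean = 8), ncol = 2),
           matrix(runif(40, min = -20, max = 30), ncol = 2))
fit <- trimmed_kmeans_sketch(x, k = 2, alpha = 0.1)
print(table(fit$cluster))
```

A single random start can end in a poor local solution, which is why repeating the procedure from several initializations and keeping the best result is the usual practice.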
The function returns an S3 object of type tkmeans, containing the following values:
centers |
A matrix of size |
cluster |
A numerical vector of size |
par |
A list, containing the parameters the algorithm has been called with
( |
k |
The (final) resulting number of clusters.
Some solutions with a smaller number of clusters might be found when using
the option |
obj |
The value of the objective function of the best (returned) solution. |
size |
An integer vector of size k, giving the number of observations in each cluster. |
weights |
A numerical vector of length k, containing the weights of each cluster. |
int |
A list of values internally used by function related to |
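The components listed above can be inspected directly on a fitted object. The sketch below assumes that tkmeans is provided by the tclust package and that the package is installed; the simulated data, the seed and the chosen arguments are illustrative assumptions. The code is guarded so that it does nothing when the package is unavailable.

```r
# Assumption: tkmeans comes from the tclust package, which must be installed.
if (requireNamespace("tclust", quietly = TRUE)) {
  set.seed(42)
  # Illustrative data: two well-separated groups in two dimensions
  x <- rbind(matrix(rnorm(200), ncol = 2),
             matrix(rnorm(200, mean = 6), ncol = 2))

  clus <- tclust::tkmeans(x, k = 2, alpha = 0.05)

  print(clus$k)               # final number of clusters
  print(clus$size)            # observations per cluster
  print(clus$weights)         # cluster weights
  print(clus$obj)             # objective value of the returned solution
  print(table(clus$cluster))  # counts per cluster label
}
```

Tabulating the cluster vector is a quick way to check how many observations each label received.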
Agustin Mayo Iscar, Luis Angel Garcia Escudero, Heinrich Fritz
Cuesta-Albertos, J. A.; Gordaliza, A. and Matrán, C. (1997), "Trimmed k-means: an attempt to robustify quantizers". Annals of Statistics, Vol. 25 (2), 553-576.
library (tclust)  # provides tkmeans and the geyser2 and swissbank data sets

#--- EXAMPLE 1 ------------------------------------------
sig <- diag (2)
cen <- rep (1, 2)
x <- rbind (mvtnorm::rmvnorm (360, cen * 0, sig),
            mvtnorm::rmvnorm (540, cen * 5, sig * 6 - 2),
            mvtnorm::rmvnorm (100, cen * 2.5, sig * 50))

# Two groups and 10% trimming level
clus <- tkmeans (x, k = 2, alpha = 0.1)

plot (clus)
plot (clus, labels = "observation")
plot (clus, labels = "cluster")

#--- EXAMPLE 2 ------------------------------------------
data (geyser2)
clus <- tkmeans (geyser2, k = 3, alpha = 0.03)
plot (clus)

#--- EXAMPLE 3 ------------------------------------------
data (swissbank)

# Two clusters and 8% trimming level
clus <- tkmeans (swissbank, k = 2, alpha = 0.08)

# Pairs plot of the clustering solution
pairs (swissbank, col = clus$cluster + 1)

# Two coordinates
plot (swissbank[, 4], swissbank[, 6], col = clus$cluster + 1,
      xlab = "Distance of the inner frame to lower border",
      ylab = "Length of the diagonal")
plot (clus)

# Three clusters and 0% trimming level
clus <- tkmeans (swissbank, k = 3, alpha = 0.0)

# Pairs plot of the clustering solution
pairs (swissbank, col = clus$cluster + 1)

# Two coordinates
plot (swissbank[, 4], swissbank[, 6], col = clus$cluster + 1,
      xlab = "Distance of the inner frame to lower border",
      ylab = "Length of the diagonal")
plot (clus)