Spectral Clustering
A spectral clustering algorithm. Clustering is performed by embedding the data into the subspace of the eigenvectors of an affinity matrix.
## S4 method for signature 'formula'
specc(x, data = NULL, na.action = na.omit, ...)

## S4 method for signature 'matrix'
specc(x, centers, kernel = "rbfdot", kpar = "automatic",
      nystrom.red = FALSE, nystrom.sample = dim(x)[1]/6,
      iterations = 200, mod.sample = 0.75, na.action = na.omit, ...)

## S4 method for signature 'kernelMatrix'
specc(x, centers, nystrom.red = FALSE, iterations = 200, ...)

## S4 method for signature 'list'
specc(x, centers, kernel = "stringdot",
      kpar = list(length=4, lambda=0.5),
      nystrom.red = FALSE, nystrom.sample = length(x)/6,
      iterations = 200, mod.sample = 0.75, na.action = na.omit, ...)
x |
the matrix of data to be clustered, or a symbolic description of the model to be fit, or a kernel matrix of class kernelMatrix, or a list of character vectors. |
data |
an optional data frame containing the variables in the model. By default the variables are taken from the environment from which ‘specc’ is called. |
centers |
Either the number of clusters or a set of initial cluster centers. If a number is given, a random set of rows of the eigenvector matrix is chosen as the initial centers. |
kernel |
the kernel function used in computing the affinity matrix. This parameter can be set to any function of class kernel which computes a dot product between two vector arguments. kernlab provides the most popular kernel functions, which can be selected by setting the kernel parameter to the corresponding string, for example "rbfdot" (the default for matrix input) or "stringdot" (the default for list input).
The kernel parameter can also be set to a user-defined function of class kernel by passing the function name as an argument. |
kpar |
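a character string or the list of hyper-parameters (kernel parameters).
With the default character string "automatic" (matrix method), the kernel parameter sigma is estimated from the data; see mod.sample.
Hyper-parameters for user-defined kernels can be passed through the kpar parameter as well. |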
nystrom.red |
use the Nyström method to calculate the eigenvectors (default: FALSE). |
nystrom.sample |
number of data points to use for estimating the eigenvalues when using the Nyström method (default: dim(x)[1]/6) |
mod.sample |
proportion of data to use when estimating sigma (default: 0.75) |
iterations |
the maximum number of iterations allowed. |
na.action |
the action to perform on missing values (NA); the default is na.omit. |
... |
additional parameters |
Spectral clustering works by embedding the data points of the partitioning problem into the subspace spanned by the eigenvectors that belong to the k largest eigenvalues of a normalized affinity/kernel matrix. Applying a simple clustering method such as kmeans to the embedded points usually leads to good performance. It can be shown that spectral clustering methods boil down to graph partitioning.
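The embedding step described above can be illustrated with a small base-R sketch. This is not the code used by specc; the function name spectral_sketch, the Gaussian affinity with a fixed sigma, and the synthetic two-blob data are all assumptions made for illustration, following the general recipe of Ng, Jordan and Weiss.

```r
# Minimal base-R sketch of "embed, then kmeans" (illustrative, not kernlab's code).
# spectral_sketch, sigma = 1 and the toy data below are assumptions.
spectral_sketch <- function(x, k, sigma = 1) {
  # Gaussian affinity exp(-sigma * ||xi - xj||^2) with a zero diagonal.
  a <- exp(-sigma * as.matrix(dist(x))^2)
  diag(a) <- 0
  # Symmetric normalisation D^(-1/2) A D^(-1/2).
  d <- 1 / sqrt(rowSums(a))
  l <- a * outer(d, d)
  # Eigenvectors belonging to the k largest eigenvalues
  # (eigen() returns eigenvalues in decreasing order).
  v <- eigen(l, symmetric = TRUE)$vectors[, 1:k, drop = FALSE]
  # Rescale each embedded point to unit length, then cluster with kmeans.
  y <- v / sqrt(rowSums(v^2))
  kmeans(y, centers = k, nstart = 10)$cluster
}

set.seed(1)
# Two well-separated Gaussian blobs of 30 points each (synthetic data).
x <- rbind(matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(60, mean = 3, sd = 0.3), ncol = 2))
cl <- spectral_sketch(x, k = 2)
table(cl)
```

On data this well separated, each blob should end up in its own cluster; the cluster labels themselves are arbitrary.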
The data can be passed to the specc function in a matrix or a data.frame. In addition, specc supports input in the form of a kernel matrix of class kernelMatrix, or as a list of character vectors, in which case a string kernel has to be used.
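As a hedged illustration of the kernel matrix input, the sketch below builds a kernelMatrix with kernlab's rbfdot and kernelMatrix functions and passes it to specc. The synthetic data and the value sigma = 0.5 are arbitrary assumptions, not recommended settings.

```r
library(kernlab)

set.seed(42)
# Synthetic data (assumption): two Gaussian blobs of 20 points each.
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(40, mean = 3, sd = 0.5), ncol = 2))

# Build the kernel matrix explicitly; sigma = 0.5 is an arbitrary choice.
K <- kernelMatrix(rbfdot(sigma = 0.5), x)

# Cluster from the precomputed kernel matrix instead of the raw data.
sc <- specc(K, centers = 2)
size(sc)
```

Precomputing the kernel matrix can be convenient when the same affinities are reused across several runs, for example with different values of centers.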
An S4 object of class specc, which extends the class vector and contains integers indicating the cluster to which each point is allocated. The following slots contain useful information:
centers |
A matrix of cluster centers. |
size |
The number of points in each cluster |
withinss |
The within-cluster sum of squares for each cluster |
kernelf |
The kernel function used |
Alexandros Karatzoglou
alexandros.karatzoglou@ci.tuwien.ac.at
Andrew Y. Ng, Michael I. Jordan, Yair Weiss
On Spectral Clustering: Analysis and an Algorithm
Advances in Neural Information Processing Systems 14 (NIPS 2001)
http://papers.nips.cc/paper/2092-on-spectral-clustering-analysis-and-an-algorithm.pdf
## Cluster the spirals data set.
library(kernlab)
data(spirals)
sc <- specc(spirals, centers = 2)
sc
centers(sc)
size(sc)
withinss(sc)
plot(spirals, col = sc)