parameters: cluster_analysis – R documentation

cluster_analysis

Compute cluster analysis and return group indices

Description

Compute a hierarchical or k-means cluster analysis and return the group assignment for each observation as a vector.

Usage

cluster_analysis(
  x,
  n_clusters = NULL,
  method = c("hclust", "kmeans"),
  distance = c("euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski"),
  agglomeration = c("ward", "ward.D", "ward.D2", "single", "complete", "average",
    "mcquitty", "median", "centroid"),
  iterations = 20,
  algorithm = c("Hartigan-Wong", "Lloyd", "MacQueen"),
  force = TRUE,
  package = c("NbClust", "mclust"),
  verbose = TRUE
)

Arguments

`x`	A data frame.
`n_clusters`	Number of clusters used for the cluster solution. By default (`NULL`), the number of clusters to extract is determined by calling the `n_clusters()` function.
`method`	Method for computing the cluster analysis. By default (`"hclust"`), a hierarchical cluster analysis is computed. Use `"kmeans"` to compute a k-means cluster analysis. You can specify the initial letters only.
`distance`	Distance measure to be used when `method = "hclust"` (for hierarchical clustering). Must be one of `"euclidean"`, `"maximum"`, `"manhattan"`, `"canberra"`, `"binary"` or `"minkowski"`. See `dist`. If `method = "kmeans"`, this argument is ignored.
`agglomeration`	Agglomeration method to be used when `method = "hclust"` (for hierarchical clustering). This should be one of `"ward"`, `"ward.D"`, `"ward.D2"`, `"single"`, `"complete"`, `"average"`, `"mcquitty"`, `"median"` or `"centroid"`. Default is `"ward"` (see `hclust`). If `method = "kmeans"`, this argument is ignored.
`iterations`	Maximum number of iterations allowed. Only applies if `method = "kmeans"`. See `kmeans` for details on this argument.
`algorithm`	Algorithm used for computing the k-means clusters. Only applies if `method = "kmeans"`. May be one of `"Hartigan-Wong"` (default), `"Lloyd"` (used by SPSS), or `"MacQueen"`. See `kmeans` for details on this argument.
`force`	Logical, if `TRUE`, ordered factors (ordinal variables) are converted to numeric values, while character vectors and factors are converted to dummy-variables (numeric 0/1) and are included in the cluster analysis. If `FALSE`, factors and character vectors are removed before computing the cluster analysis.
`package`	Package from which methods are to be called to determine the number of clusters. Can be `"all"` or a vector containing `"NbClust"`, `"mclust"`, `"cluster"` and `"M3C"`.
`verbose`	Toggle warnings and messages.
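The arguments above refer to the base R functions `dist`, `hclust` and `kmeans`. The following is a minimal base-R sketch of how those arguments could map onto these functions. It is not the internal implementation of `cluster_analysis()`; the standardization step, the choice of three clusters and the object names (`groups_hc`, `groups_km`) are assumptions made for illustration.

```r
# Base-R sketch only; not the internals of cluster_analysis().
x <- scale(iris[, 1:4])  # assumption: variables are standardized first

# Hierarchical clustering:
# `distance` corresponds to dist(method = ),
# `agglomeration` corresponds to hclust(method = )
d <- dist(x, method = "euclidean")
hc <- hclust(d, method = "ward.D2")
groups_hc <- cutree(hc, k = 3)  # one integer label per row

# k-means clustering:
# `iterations` corresponds to iter.max, `algorithm` to algorithm
set.seed(123)  # k-means starts from randomly chosen centers
km <- kmeans(x, centers = 3, iter.max = 20, algorithm = "Hartigan-Wong")
groups_km <- km$cluster

# Cross-tabulate the two solutions
table(groups_hc, groups_km)
```

Both approaches return one integer label per row, but the label numbers themselves are arbitrary, so cluster 1 from one method need not match cluster 1 from the other.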

Details

The print() and plot() methods show the (standardized) mean value for each variable within each cluster. Thus, a higher absolute value indicates that a certain variable characteristic is more pronounced within that specific cluster (as compared to other cluster groups with lower absolute mean values).
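To illustrate what "standardized mean value for each variable within each cluster" means, the summary can be approximated in base R. This is a sketch under assumptions (z-standardization via `scale()`, a three-cluster Ward solution, and the hypothetical object name `cluster_means`), not the code behind the package's print() or plot() methods.

```r
# Standardize each variable to mean 0 and standard deviation 1
x <- as.data.frame(scale(iris[, 1:4]))

# Any grouping vector works here; a 3-cluster hierarchical solution is assumed
groups <- cutree(hclust(dist(x), method = "ward.D2"), k = 3)

# Mean of each standardized variable within each cluster
cluster_means <- aggregate(x, by = list(cluster = groups), FUN = mean)
print(round(cluster_means, 2))
```

Because every variable has an overall mean of zero after standardization, a cluster mean far from zero (in either direction) marks a variable that distinguishes that cluster from the rest of the sample.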

Value

The group classification for each observation as a vector. The returned vector includes missing values, so it has the same length as nrow(x).
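One way a result can keep the same length as nrow(x) despite missing data is sketched below in base R. It assumes that rows with missing values are left out of the clustering and receive `NA` as their group; the helper names (`complete`, `fit_groups`) are hypothetical and the package's actual handling may differ.

```r
x <- iris[, 1:4]
x[c(2, 10), 1] <- NA  # introduce two incomplete rows for illustration

# Cluster only the complete rows (assumption about the handling of NAs)
complete <- stats::complete.cases(x)
fit_groups <- cutree(hclust(dist(scale(x[complete, ]))), k = 3)

# Write the labels back into a vector as long as the original data
groups <- rep(NA_integer_, nrow(x))
groups[complete] <- fit_groups

length(groups) == nrow(x)  # labels line up with the rows of x
which(is.na(groups))       # positions of the incomplete rows
```

Keeping the length aligned with the input means the vector can be attached directly as a new column, for example `x$group <- groups`.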

Note

There is also a plot()-method implemented in the see-package.

References

Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2014) cluster: Cluster Analysis Basics and Extensions. R package.

Examples

# Hierarchical clustering of the iris dataset (first four columns), 3 clusters
groups <- cluster_analysis(iris[, 1:4], 3)
groups

# K-means clustering of the iris dataset, auto-detection of the number of clusters
## Not run: 
groups <- cluster_analysis(iris[, 1:4], method = "k")
groups

## End(Not run)

parameters

Processing of Model Parameters

v0.13.0

GPL-3

Authors

Daniel Lüdecke [aut, cre] (<https://orcid.org/0000-0002-8895-3206>, @strengejacke), Dominique Makowski [aut] (<https://orcid.org/0000-0001-5375-9967>), Mattan S. Ben-Shachar [aut] (<https://orcid.org/0000-0002-4287-4801>), Indrajeet Patil [aut] (<https://orcid.org/0000-0003-1995-6531>, @patilindrajeets), Søren Højsgaard [aut], Zen J. Lau [ctb], Vincent Arel-Bundock [ctb] (<https://orcid.org/0000-0003-1995-6531>, @vincentab), Jeffrey Girard [ctb] (<https://orcid.org/0000-0002-7359-3746>, @jeffreymgirard)

Initial release
