EM algorithm for a homoscedastic normal mixture
Description

This function computes maximum likelihood (ML) estimates of the parameters of a p-dimensional K-component normal mixture with a common covariance matrix using the EM algorithm.
Usage

NMixEM(y, K, weight, mean, Sigma, toler = 1e-5, maxiter = 500)

## S3 method for class 'NMixEM'
print(x, ...)
Arguments

y: vector (if p = 1), matrix, or data frame (if p > 1) with the data. Rows correspond to observations, columns correspond to margins.

K: required number of mixture components.

weight: optional numeric vector of initial mixture weights. If not given, all initial weights are equal to 1/K.

mean: optional vector or matrix of initial mixture means. For p = 1 this should be a vector of length K; for p > 1 this should be a K x p matrix with mixture means in rows.

Sigma: number (for p = 1) or p x p matrix giving the initial variance/covariance matrix.

toler: tolerance used to determine convergence.

maxiter: maximum number of iterations of the EM algorithm.

x: an object of class NMixEM.

...: additional arguments passed to the default print method.
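The expected shapes of the optional initial values can be sketched as follows (illustrative only; when these arguments are omitted, NMixEM chooses its own defaults):

```r
## Illustrative shapes of the optional initial values for a mixture
## with p = 4 margins and K = 3 components (not part of mixAK).
K <- 3
p <- 4
w0  <- rep(1 / K, K)                  # length-K weights summing to 1
mu0 <- matrix(0, nrow = K, ncol = p)  # K x p matrix, one mean vector per row
S0  <- diag(p)                        # a single p x p covariance matrix
                                      # (shared by all components: homoscedastic)
```

These would be supplied as `NMixEM(y, K = K, weight = w0, mean = mu0, Sigma = S0)`.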
Value

An object of class NMixEM which has the following components:

K: number of mixture components.

weight: estimated mixture weights.

mean: estimated mixture means.

Sigma: estimated covariance matrix.

loglik: log-likelihood value at the fitted values.

aic: Akaike information criterion, -2*loglik + 2*nu, where loglik stands for the log-likelihood value at the fitted values and nu for the number of free model parameters.

bic: Bayesian (Schwarz) information criterion, -2*loglik + log(n)*nu, where loglik and nu are as above and n is the sample size.

iter: number of iterations of the EM algorithm used to reach the solution.

iter.loglik: values of the log-likelihood at the individual EM iterations.

iter.Qfun: values of the EM objective function at the individual EM iterations.

dim: dimension p.

nobs: number of observations n.
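For reference, the number of free parameters nu entering aic and bic can be reproduced by hand. A sketch of the count for this homoscedastic model (the function below is illustrative, not part of mixAK):

```r
## Free parameters of a homoscedastic p-dimensional K-component
## normal mixture (illustrative helper, not a mixAK function).
nu_homo <- function(K, p) {
  (K - 1) +          # mixture weights (they sum to 1)
  K * p +            # K mean vectors of length p
  p * (p + 1) / 2    # one common symmetric covariance matrix
}

aic_homo <- function(loglik, K, p)    -2 * loglik + 2 * nu_homo(K, p)
bic_homo <- function(loglik, K, p, n) -2 * loglik + log(n) * nu_homo(K, p)
```

For example, with K = 3 components in p = 4 dimensions, nu = 2 + 12 + 10 = 24.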
Author(s)

Arnošt Komárek arnost.komarek[AT]mff.cuni.cz
References

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
Examples

## Not run: 
## Estimates for 3-component mixture in Anderson's iris data
## ==========================================================
data(iris, package = "datasets")
summary(iris)
VARS <- names(iris)[1:4]

fit <- NMixEM(iris[, VARS], K = 3)
print(fit)

apply(subset(iris, Species == "versicolor")[, VARS], 2, mean)
apply(subset(iris, Species == "setosa")[, VARS], 2, mean)
apply(subset(iris, Species == "virginica")[, VARS], 2, mean)

## Estimates of 6-component mixture in Galaxy data
## ==================================================
data(Galaxy, package = "mixAK")
summary(Galaxy)

fit2 <- NMixEM(Galaxy, K = 6)
y <- seq(5, 40, length = 300)
fy <- dMVNmixture(y, weight = fit2$weight, mean = fit2$mean,
                  Sigma = rep(fit2$Sigma, fit2$K))

hist(Galaxy, prob = TRUE, breaks = seq(5, 40, by = 0.5),
     main = "", xlab = "Velocity (km/sec)", col = "sandybrown")
lines(y, fy, col = "darkblue", lwd = 2)
## End(Not run)
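The E and M steps underlying an EM fit of a homoscedastic normal mixture can be sketched in a few lines for the univariate case. The following is a minimal illustration of the iteration, not the mixAK implementation (initial values, stopping rule, and numerical safeguards differ):

```r
## Minimal univariate sketch of EM for a K-component normal mixture
## with a common variance sigma2 (illustrative, not mixAK's code).
em_sketch <- function(y, K, maxiter = 100, toler = 1e-5) {
  n <- length(y)
  w <- rep(1 / K, K)                           # initial weights: 1/K each
  mu <- quantile(y, probs = (1:K) / (K + 1))   # crude initial means
  sigma2 <- var(y)                             # initial common variance
  loglik <- -Inf
  for (iter in 1:maxiter) {
    ## E-step: responsibilities r[i, k] = P(component k | y[i])
    dens <- sapply(1:K, function(k) w[k] * dnorm(y, mu[k], sqrt(sigma2)))
    marg <- rowSums(dens)
    r <- dens / marg
    ## M-step: update weights, means, and the shared variance
    nk <- colSums(r)
    w <- nk / n
    mu <- colSums(r * y) / nk
    sigma2 <- sum(r * outer(y, mu, "-")^2) / n
    ## Observed-data log-likelihood; stop when its increase is small
    newll <- sum(log(marg))
    if (newll - loglik < toler) break
    loglik <- newll
  }
  list(weight = w, mean = mu, sigma2 = sigma2, loglik = newll, iter = iter)
}
```

Each iteration cannot decrease the observed-data log-likelihood, which is why monitoring `iter.loglik` (as NMixEM returns) is a standard convergence diagnostic.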