Shannon Entropy and Mutual Information
Computes Shannon entropy and the mutual information of two variables. The entropy quantifies the expected value of the information contained in a vector. The mutual information is a quantity that measures the mutual dependence of the two random variables.
Entropy(x, y = NULL, base = 2, ...)
MutInf(x, y, base = 2, ...)
x | a vector or a matrix of numerical or categorical type. If only x is supplied it will be interpreted as a contingency table.
y | a vector with the same type and dimension as x. If y is not
base | base of the logarithm to be used, defaults to 2.
... | further arguments are passed to the function
The Shannon entropy equation provides a way to estimate the average minimum number of bits needed to encode a string of symbols, based on the frequency of the symbols.
It is given by the formula $H = -\sum_i p_i \log(p_i)$, where $p_i$ is the probability of character number i showing up in a stream of characters of the given "script".
The entropy ranges from 0 to Inf.
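The formula can be checked by hand with a few lines of base R. This is only a minimal sketch: `shannon_entropy` is a hypothetical helper written for illustration, not part of the documented interface, and it assumes the input is a vector of counts or probabilities.

```r
# Hypothetical helper (illustration only, not the package's implementation):
# Shannon entropy of a vector of counts or probabilities.
shannon_entropy <- function(counts, base = 2) {
  p <- counts / sum(counts)        # normalise to probabilities
  p <- p[p > 0]                    # drop zero cells, treating 0 * log(0) as 0
  -sum(p * log(p, base = base))    # H = -sum(p_i * log(p_i))
}

# Eight equally likely symbols need 3 bits each
shannon_entropy(rep(1/8, 8))

# A certain outcome carries no information
shannon_entropy(c(1, 0, 0))

# Three equally frequent levels, natural logarithm: log(3)
shannon_entropy(table(c("a", "b", "a", "c", "b", "c")), base = exp(1))
```

The lower bound 0 is reached when one symbol has probability 1; for n equally likely symbols the value is log(n), which grows without bound as n increases.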
a numeric value.
Andri Signorell <andri@signorell.net>
Shannon, Claude E. (July/October 1948). A Mathematical Theory of Communication, Bell System Technical Journal 27 (3): 379-423.
Ihara, Shunsuke (1993) Information theory for continuous systems, World Scientific. p. 2. ISBN 978-981-02-0985-8.
package entropy which implements various estimators of entropy
Entropy(as.matrix(rep(1/8, 8)))

# http://r.789695.n4.nabble.com/entropy-package-how-to-compute-mutual-information-td4385339.html
x <- as.factor(c("a","b","a","c","b","c"))
y <- as.factor(c("b","a","a","c","c","b"))

Entropy(table(x), base=exp(1))
Entropy(table(y), base=exp(1))
Entropy(x, y, base=exp(1))

# Mutual information is
Entropy(table(x), base=exp(1)) + Entropy(table(y), base=exp(1)) - Entropy(x, y, base=exp(1))
MutInf(x, y, base=exp(1))

Entropy(table(x)) + Entropy(table(y)) - Entropy(x, y)
MutInf(x, y, base=2)

# http://en.wikipedia.org/wiki/Cluster_labeling
tab <- matrix(c(60,10000,200,500000), nrow=2, byrow=TRUE)
MutInf(tab, base=2)

d.frm <- Untable(as.table(tab))
str(d.frm)
MutInf(d.frm[,1], d.frm[,2])

table(d.frm[,1], d.frm[,2])

MutInf(table(d.frm[,1], d.frm[,2]))

# Ranking mutual information can help to describe clusters
#
#   r.mi <- MutInf(x, grp)
#   attributes(r.mi)$dimnames <- attributes(tab)$dimnames
#
#   # calculating ranks of mutual information
#   r.mi_r <- apply( -r.mi, 2, rank, na.last=TRUE )
#   # show only first 6 ranks
#   r.mi_r6 <- ifelse( r.mi_r < 7, r.mi_r, NA)
#   attributes(r.mi_r6)$dimnames <- attributes(tab)$dimnames
#   r.mi_r6
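The identity used in the examples, mutual information = H(x) + H(y) - H(x, y), can be cross-checked in base R without the package. The sketch below is an illustration under stated assumptions: `shannon_entropy` is a hypothetical helper, and the direct formula, the sum over cells of p(x,y) * log(p(x,y) / (p(x) p(y))), is the standard textbook definition rather than a description of how `MutInf` is implemented.

```r
x <- c("a", "b", "a", "c", "b", "c")
y <- c("b", "a", "a", "c", "c", "b")

# Hypothetical helper (illustration only): entropy of a table of counts
shannon_entropy <- function(counts, base = 2) {
  p <- counts / sum(counts)
  p <- p[p > 0]                    # treat 0 * log(0) as 0
  -sum(p * log(p, base = base))
}

joint <- table(x, y)               # joint contingency table

# Mutual information from marginal and joint entropies (in bits)
mi_from_entropies <- shannon_entropy(table(x)) +
  shannon_entropy(table(y)) -
  shannon_entropy(joint)

# Mutual information from the direct definition (in bits)
pxy   <- joint / sum(joint)                      # joint probabilities
indep <- outer(rowSums(pxy), colSums(pxy))       # p(x) * p(y) under independence
nz    <- pxy > 0                                 # skip empty cells
mi_direct <- sum(pxy[nz] * log2(pxy[nz] / indep[nz]))

all.equal(mi_from_entropies, mi_direct)
```

Both routes should agree up to floating-point error, which is a quick sanity check before comparing against `MutInf(x, y, base=2)`.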