Maximum-likelihood genetic clustering using EM algorithm
This function is still under development and should not be used yet. Contact the authors if interested.
snapclust(x, k, pop.ini = "ward", max.iter = 100, n.start = 10,
  n.start.kmeans = 50, hybrids = FALSE, dim.ini = 100,
  hybrid.coef = NULL, parent.lab = c("A", "B"), ...)
x |
a genind object |
k |
the number of clusters to look for |
pop.ini |
parameter indicating how the initial group membership should
be found. If |
max.iter |
the maximum number of iterations of the EM algorithm |
n.start |
the number of times the EM algorithm is run, each time with different random starting conditions |
n.start.kmeans |
the number of times the K-means algorithm is run to define the starting point of the ML-EM algorithm, each time with different random starting conditions |
hybrids |
a logical indicating if hybrids should be modelled explicitly; this is currently implemented for 2 groups only. |
dim.ini |
the number of PCA axes to retain in the dimension reduction
step for |
hybrid.coef |
a vector of hybridization coefficients, defining the proportion of hybrid gene pool coming from the first parental population; this is symmetrized around 0.5, so that e.g. c(0.25, 0.5) will be converted to c(0.25, 0.5, 0.75) |
parent.lab |
a vector of 2 character strings used to label the two
parental populations; only used if hybrids are detected (see argument
|
... |
further arguments passed on to |
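The symmetrization described for hybrid.coef can be illustrated in base R. This is a minimal sketch of the stated behaviour, not necessarily how snapclust implements it internally; the variable name sym.coef and the rounding step are assumptions made for the illustration.

```r
## Hybridization coefficients as a user might supply them
hybrid.coef <- c(0.25, 0.5)

## Mirror each coefficient around 0.5, then drop duplicates and sort.
## Rounding (an assumption of this sketch) avoids near-duplicates
## caused by floating-point error, e.g. 1 - 0.9 versus 0.1.
sym.coef <- sort(unique(round(c(hybrid.coef, 1 - hybrid.coef), 10)))

print(sym.coef)
## 0.25 0.50 0.75
```

Each resulting value is the proportion of the hybrid gene pool coming from the first parental population, so 0.25 and 0.75 correspond to the two backcross classes and 0.5 to F1.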
The function snapclust returns a list with the following components:
$group
a factor indicating the maximum-likelihood assignment of
individuals to groups; if identified, hybrids are labelled after
hybridization coefficients, e.g. 0.5_A - 0.5_B for F1, 0.75_A - 0.25_B for
backcross F1 / A, etc.
$ll
the log-likelihood of the model
$proba
a matrix of group membership probabilities, with
individuals in rows and groups in columns; each value corresponds to the
probability that a given individual genotype was generated under a given
group, under Hardy-Weinberg hypotheses.
$converged
a logical indicating if the algorithm converged; if
FALSE, it is doubtful that the result is an actual Maximum Likelihood
estimate.
$n.iter
an integer indicating the number of iterations the EM
algorithm was run for.
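To make the meaning of $proba and $group more concrete, the sketch below computes membership probabilities for a single individual under Hardy-Weinberg assumptions, using base R only. It is a toy illustration rather than the code used by snapclust: the allele frequencies, the biallelic diploid loci, and the equal weighting of groups are all assumptions of this example.

```r
## Hypothetical frequencies of one allele at 3 biallelic loci,
## for two groups labelled "A" and "B" (rows = groups, columns = loci)
freq <- rbind(A = c(0.9, 0.8, 0.7),
              B = c(0.1, 0.2, 0.3))

## Genotype of one diploid individual: copies of that allele per locus
geno <- c(2, 2, 1)

## Under Hardy-Weinberg, the number of copies at each locus is
## binomial(2, p); loci are treated as independent, so log-probabilities
## are summed across loci to get one log-likelihood per group
ll <- apply(freq, 1, function(p) {
  sum(dbinom(geno, size = 2, prob = p, log = TRUE))
})

## Membership probabilities, assuming equal weights for the groups;
## subtracting max(ll) keeps exp() numerically stable
proba <- exp(ll - max(ll))
proba <- proba / sum(proba)

## Maximum-likelihood assignment, analogous to one entry of $group
group <- names(which.max(proba))
```

Here the individual carries mostly alleles that are common in group A, so nearly all of the probability mass goes to A. An individual whose largest probability stays well below 1 would be a candidate for closer inspection.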
Thibaut Jombart thibautjombart@gmail.com and Marie-Pauline Beugin
## Not run:
library(adegenet)
data(microbov)

## try function using k-means initialization
grp.ini <- find.clusters(microbov, n.clust = 15, n.pca = 150)

## run EM algo
res <- snapclust(microbov, 15, pop.ini = grp.ini$grp)
names(res)
res$converged
res$n.iter

## plot result
compoplot(res)

## flag potential hybrids: individuals whose highest membership
## probability is below 0.9
to.flag <- apply(res$proba, 1, max) < 0.9
compoplot(res, subset = to.flag, show.lab = TRUE,
          posi = "bottomleft", bg = "white")

## Simulate hybrids F1
zebu <- microbov[pop = "Zebu"]
salers <- microbov[pop = "Salers"]
hyb <- hybridize(zebu, salers, n = 30)
x <- repool(zebu, salers, hyb)

## method without hybrids
res.no.hyb <- snapclust(x, k = 2, hybrids = FALSE)
compoplot(res.no.hyb, col.pal = spectral, n.col = 2)

## method with hybrids
res.hyb <- snapclust(x, k = 2, hybrids = TRUE)
compoplot(res.hyb, col.pal = hybridpal(col.pal = spectral), n.col = 2)

## Simulate hybrids backcross (F1 / parental)
f1.zebu <- hybridize(hyb, zebu, 20, pop = "f1.zebu")
f1.salers <- hybridize(hyb, salers, 25, pop = "f1.salers")
y <- repool(x, f1.zebu, f1.salers)

## method without hybrids
res2.no.hyb <- snapclust(y, k = 2, hybrids = FALSE)
compoplot(res2.no.hyb, col.pal = hybridpal(), n.col = 2)

## method with hybrids F1 only
res2.hyb <- snapclust(y, k = 2, hybrids = TRUE)
compoplot(res2.hyb, col.pal = hybridpal(), n.col = 2)

## method with back-cross
res2.back <- snapclust(y, k = 2, hybrids = TRUE, hybrid.coef = c(.25, .5))
compoplot(res2.back, col.pal = hybridpal(), n.col = 2)

## End(Not run)