adegenet: snapclust – R documentation


snapclust

Maximum-likelihood genetic clustering using EM algorithm

Description

Do not use: this function is still under development. Contact the authors if interested.

Usage

snapclust(
  x,
  k,
  pop.ini = "ward",
  max.iter = 100,
  n.start = 10,
  n.start.kmeans = 50,
  hybrids = FALSE,
  dim.ini = 100,
  hybrid.coef = NULL,
  parent.lab = c("A", "B"),
  ...
)

Arguments

`x`	a genind object
`k`	the number of clusters to look for
`pop.ini`	parameter indicating how the initial group membership should be found. If `NULL`, groups are chosen at random, and the algorithm will be run `n.start` times. If `"kmeans"`, then the function `find.clusters` is used to define initial groups using the K-means algorithm. If `"ward"`, then the function `find.clusters` is used to define initial groups using the Ward algorithm. Alternatively, a factor defining the initial cluster configuration can be provided.
`max.iter`	the maximum number of iterations of the EM algorithm
`n.start`	the number of times the EM algorithm is run, each time with different random starting conditions
`n.start.kmeans`	the number of times the K-means algorithm is run to define the starting point of the ML-EM algorithm, each time with different random starting conditions
`hybrids`	a logical indicating if hybrids should be modelled explicitly; this is currently implemented for 2 groups only.
`dim.ini`	the number of PCA axes to retain in the dimension reduction step for `find.clusters`, if this method is used to define initial group memberships (see argument `pop.ini`).
`hybrid.coef`	a vector of hybridization coefficients, defining the proportion of hybrid gene pool coming from the first parental population; this is symmetrized around 0.5, so that e.g. c(0.25, 0.5) will be converted to c(0.25, 0.5, 0.75)
`parent.lab`	a vector of 2 character strings used to label the two parental populations; only used if hybrids are detected (see argument `hybrids`)
`...`	further arguments passed on to `find.clusters`
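
The symmetrization described for `hybrid.coef` can be illustrated with a few lines of base R. This is only a sketch of the behaviour stated above, not adegenet's internal code: the helper name `symmetrize_coef` is hypothetical, and the rounding step is an assumption added to avoid near-duplicate values from floating-point arithmetic.

```r
# Hypothetical helper (not part of adegenet): mirror each coefficient
# around 0.5, then keep the sorted set of distinct values.
symmetrize_coef <- function(hybrid.coef) {
  stopifnot(is.numeric(hybrid.coef))
  mirrored <- c(hybrid.coef, 1 - hybrid.coef)
  # Rounding is an assumption made here so that values such as
  # 0.3 and 1 - 0.7 are treated as the same coefficient.
  sort(unique(round(mirrored, 10)))
}

sym <- symmetrize_coef(c(0.25, 0.5))
print(sym)
```

With the input `c(0.25, 0.5)` used in the argument description, the sketch yields the three coefficients 0.25, 0.5 and 0.75.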

Value

The function snapclust returns a list with the following components:

$group: a factor indicating the maximum-likelihood assignment of individuals to groups; if identified, hybrids are labelled after hybridization coefficients, e.g. 0.5_A - 0.5_B for F1, 0.75_A - 0.25_B for backcross F1 / A, etc.
$ll: the log-likelihood of the model.
$proba: a matrix of group membership probabilities, with individuals in rows and groups in columns; each value corresponds to the probability that a given individual genotype was generated under a given group, under Hardy-Weinberg hypotheses.
$converged: a logical indicating if the algorithm converged; if FALSE, it is doubtful that the result is an actual maximum-likelihood estimate.
$n.iter: an integer indicating the number of iterations the EM algorithm was run for.
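
To make the layout of `$proba` concrete, the sketch below builds a small made-up membership matrix (individuals in rows, groups in columns) and derives a hard assignment and a "low confidence" flag from it. The numbers are invented for illustration, and taking the column with the largest probability is an assumption about how a matrix like this could be summarised, not a statement of how `snapclust` computes `$group` internally.

```r
# Toy membership probabilities: 4 individuals (rows), 2 groups (columns).
# All values are made up for illustration.
proba <- matrix(
  c(0.98, 0.02,
    0.55, 0.45,
    0.05, 0.95,
    0.30, 0.70),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), c("A", "B"))
)

# Most probable group for each individual (assumed hard-assignment rule)
best <- colnames(proba)[apply(proba, 1, which.max)]
group <- factor(best, levels = colnames(proba))

# Flag individuals whose highest membership probability is below 0.9
to.flag <- apply(proba, 1, max) < 0.9

print(group)
print(to.flag)
```

The same `apply(..., 1, max) < 0.9` rule is what the Examples section uses to flag potential hybrids on a real `snapclust` result.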

Author(s)

Thibaut Jombart <thibautjombart@gmail.com> and Marie-Pauline Beugin

Examples

## Not run: 
data(microbov)

## try function using k-means initialization
grp.ini <- find.clusters(microbov, n.clust=15, n.pca=150)

## run EM algo
res <- snapclust(microbov, 15, pop.ini = grp.ini$grp)
names(res)
res$converged
res$n.iter

## plot result
compoplot(res)

## flag potential hybrids
to.flag <- apply(res$proba, 1, max) < 0.9
compoplot(res, subset = to.flag, show.lab = TRUE,
          posi = "bottomleft", bg = "white")


## Simulate hybrids F1
zebu <- microbov[pop="Zebu"]
salers <- microbov[pop="Salers"]
hyb <- hybridize(zebu, salers, n=30)
x <- repool(zebu, salers, hyb)

## method without hybrids
res.no.hyb <- snapclust(x, k=2, hybrids=FALSE)
compoplot(res.no.hyb, col.pal=spectral, n.col=2)

## method with hybrids
res.hyb <- snapclust(x, k=2, hybrids=TRUE)
compoplot(res.hyb, col.pal =
          hybridpal(col.pal = spectral), n.col = 2)


## Simulate hybrids backcross (F1 / parental)
f1.zebu <- hybridize(hyb, zebu, 20, pop = "f1.zebu")
f1.salers <- hybridize(hyb, salers, 25, pop = "f1.salers")
y <- repool(x, f1.zebu, f1.salers)

## method without hybrids
res2.no.hyb <- snapclust(y, k = 2, hybrids = FALSE)
compoplot(res2.no.hyb, col.pal = hybridpal(), n.col = 2)

## method with hybrids F1 only
res2.hyb <- snapclust(y, k = 2, hybrids = TRUE)
compoplot(res2.hyb, col.pal = hybridpal(), n.col = 2)

## method with back-cross
res2.back <- snapclust(y, k = 2, hybrids = TRUE, hybrid.coef = c(.25,.5))
## plot the back-cross model (res2.back), not the F1-only result
compoplot(res2.back, col.pal = hybridpal(), n.col = 2)


## End(Not run)

adegenet

Exploratory Analysis of Genetic and Genomic Data

v2.1.3

GPL (>= 2)

Authors

Thibaut Jombart [aut] (<https://orcid.org/0000-0003-2226-8692>), Zhian N. Kamvar [aut, cre] (<https://orcid.org/0000-0003-1458-7108>), Caitlin Collins [ctb], Roman Lustrik [ctb], Marie-Pauline Beugin [ctb], Brian J. Knaus [ctb], Peter Solymos [ctb], Vladimir Mikryukov [ctb], Klaus Schliep [ctb], Tiago Maié [ctb], Libor Morkovsky [ctb], Ismail Ahmed [ctb], Anne Cori [ctb], Federico Calboli [ctb], RJ Ewing [ctb], Frédéric Michaud [ctb], Rebecca DeCamp [ctb], Alexandre Courtiol [ctb] (<https://orcid.org/0000-0003-0637-2959>)

Initial release
