gl.assign

Assign an individual of unknown provenance to population


Description

This script assigns an individual of unknown provenance to one or more target populations, first on the basis of an analysis of private alleles and then, if the assignment remains ambiguous, on the basis of a weighted likelihood index.

Usage

gl.assign(
  x,
  unknown,
  nmin = 10,
  dim = NULL,
  alpha = 0.05,
  threshold = 0,
  verbose = 3
)

Arguments

x

– name of the input genlight object [required]

unknown

– identity label of the focal individual whose provenance is unknown [required]

nmin

– minimum sample size for a target population to be included in the analysis [default 10]

dim

– number of dimensions to retain in the dimension reduction [default k, number of populations]

alpha

– probability level for bounding ellipses in the PCoA plot [default 0.05]

threshold

– populations to retain for consideration; those for which the number of loci at which the focal individual has private alleles is less than or equal to threshold [default 0]

verbose

– verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 3, or as specified using gl.set.verbosity]

Details

The algorithm first identifies those target populations for which the individual has no private alleles. If no single population emerges from this analysis, or if a threshold higher than 0 is chosen for the number of tolerable private alleles, then the following process is followed. (a) The space defined by the loci is ordinated to yield a series of orthogonal (independent) axes, a necessary condition for combining likelihoods calculated from each axis. (b) A workable subset of dimensions is chosen, normally equal to the number of target populations or the number of dimensions with substantive eigenvalues, whichever is the smaller. (c) The log-likelihood of the value for the unknown on each axis is calculated, weighted by the eigenvalue for that axis, and summed over all dimensions as an assignment index. The assignment index is calculated for a point on the boundary of the 95
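Steps (a) to (c) can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used by gl.assign: the genotypes are simulated, the ordination is an ordinary PCA via prcomp, the likelihood on each axis is assumed to be a normal density fitted to each population's scores, and the helper names simulate_pop and assignment_index are hypothetical.

```r
# Hypothetical toy data: two populations of 15 individuals, 10 SNP loci,
# genotypes coded as the count (0, 1, 2) of the alternate allele.
set.seed(42)
n_per_pop <- 15
n_loci <- 10

simulate_pop <- function(n, freq) {
  # One column per locus, one row per individual
  sapply(freq, function(p) rbinom(n, size = 2, prob = p))
}

geno <- rbind(simulate_pop(n_per_pop, rep(0.1, n_loci)),
              simulate_pop(n_per_pop, rep(0.9, n_loci)))
pop <- rep(c("A", "B"), each = n_per_pop)

# The unknown is drawn with population A's allele frequencies
unknown <- rbinom(n_loci, size = 2, prob = 0.1)

# (a) Ordinate the loci to obtain orthogonal axes
pca <- prcomp(geno, center = TRUE, scale. = FALSE)

# (b) Retain as many dimensions as there are target populations
n_dim <- length(unique(pop))
axes <- seq_len(n_dim)
eig <- pca$sdev[axes]^2
scores <- pca$x[, axes, drop = FALSE]

# Project the unknown onto the same axes
unknown_scores <- as.numeric((unknown - pca$center) %*%
                               pca$rotation[, axes, drop = FALSE])

# (c) Eigenvalue-weighted sum of per-axis log-likelihoods
#     (assumption: normal density fitted to each population on each axis)
assignment_index <- function(pop_scores, unknown_scores, eig) {
  loglik <- dnorm(unknown_scores,
                  mean = colMeans(pop_scores),
                  sd = apply(pop_scores, 2, sd),
                  log = TRUE)
  sum(eig * loglik)
}

index <- sapply(split(seq_along(pop), pop), function(rows) {
  assignment_index(scores[rows, , drop = FALSE], unknown_scores, eig)
})
print(index)
```

A larger (less negative) index indicates a population whose scores are more compatible with the unknown on the retained axes.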

There are three considerations to the assignment. First, consider only those populations for which the unknown has no private alleles. Private alleles are an indication that the unknown does not belong to a target population (provided that the sample size is adequate, say >=10).
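The private-allele screen can be illustrated with a small base R sketch. It assumes genotypes are coded as the count (0, 1, 2) of the alternate allele, and it treats an allele as private to the unknown when the unknown carries it but no sampled individual of the target population does. The helper count_private_alleles and the toy matrices are hypothetical and are not part of dartR.

```r
# Count loci at which the unknown carries an allele absent from a population.
# unknown:  numeric vector of genotypes (0, 1, 2), one per locus
# pop_geno: matrix of genotypes, individuals in rows, loci in columns
count_private_alleles <- function(unknown, pop_geno) {
  alt_in_pop <- colSums(pop_geno, na.rm = TRUE) > 0      # alternate allele seen
  ref_in_pop <- colSums(2 - pop_geno, na.rm = TRUE) > 0  # reference allele seen
  unknown_has_alt <- unknown > 0
  unknown_has_ref <- unknown < 2
  sum((unknown_has_alt & !alt_in_pop) | (unknown_has_ref & !ref_in_pop),
      na.rm = TRUE)
}

# Hypothetical toy data: 3 individuals per population, 4 loci
pop_a <- matrix(c(0, 0, 1,  2, 2, 2,  0, 1, 0,  0, 0, 0), nrow = 3)
pop_b <- matrix(c(0, 0, 0,  2, 1, 2,  0, 0, 0,  1, 0, 2), nrow = 3)
unknown <- c(1, 2, 0, 0)

counts <- sapply(list(A = pop_a, B = pop_b),
                 function(g) count_private_alleles(unknown, g))

# Retain populations with no more than `threshold` loci with private alleles
threshold <- 0
retained <- names(counts)[counts <= threshold]
print(counts)
print(retained)
```

With these toy data the unknown carries the alternate allele at the first locus, which is absent from population B, so only population A is retained at a threshold of 0.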

Second, consider the PCoA plot for populations where no private alleles have been detected, and the position of the unknown in relation to the confidence ellipses. Note that the plot shows only the top two dimensions of the ordination, so an unknown lying outside the confidence ellipse can be interpreted as lying outside the confidence envelope. However, an unknown that lies inside the confidence ellipse in two dimensions may still lie outside the confidence envelope in the remaining dimensions. The plot is therefore useful for eliminating populations from consideration, but does not provide confidence in an assignment.
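A small numerical example shows why a point inside the two-dimensional ellipse can still fall outside the envelope once a further dimension is considered. It is a sketch only, assuming a multivariate normal population with a known centroid and identity covariance, so that the squared Mahalanobis distance can be compared with a chi-square quantile. The values are invented for illustration.

```r
# Hypothetical population: centred at the origin with unit variance on 3 axes
centre <- c(0, 0, 0)
covariance <- diag(3)

# The unknown sits close to the centroid on axes 1 and 2, but far away on axis 3
unknown <- c(0.5, 0.5, 4)

# Squared Mahalanobis distance using the top two axes only, then all three
d2_2d <- mahalanobis(unknown[1:2], centre[1:2], covariance[1:2, 1:2])
d2_3d <- mahalanobis(unknown, centre, covariance)

# 95% boundary under the normal assumption: chi-square quantile, df = dimensions
inside_2d <- d2_2d <= qchisq(0.95, df = 2)
inside_3d <- d2_3d <= qchisq(0.95, df = 3)

print(c(inside_2d = inside_2d, inside_3d = inside_3d))
```

Here the unknown is inside the 95% ellipse in the plane of the first two axes and outside the 95% envelope in three dimensions.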

Third, consider the assignment probabilities. This approach calculates the squared Generalised Linear Distance (Mahalanobis distance) of the unknown from the centroid for each population, and calculates the probability associated with its quantile under the zero-truncated normal distribution. This index takes into account the position of the unknown in relation to the confidence envelope in all selected dimensions of the ordination.
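The distance calculation can be sketched with the base R function mahalanobis, which returns the squared distance of a point from a centroid given a covariance matrix. The scores below are simulated stand-ins for ordination scores. To turn each distance into a probability, the sketch swaps in the upper tail of a chi-square distribution, a common choice under multivariate normality; it does not reproduce the zero-truncated normal calculation described above.

```r
# Hypothetical ordination scores: 20 individuals per population on 3 axes
set.seed(1)
pop_a <- matrix(rnorm(60, mean = 0), ncol = 3)
pop_b <- matrix(rnorm(60, mean = 4), ncol = 3)
unknown <- c(0.5, -0.5, 0.2)

# Squared Mahalanobis distance of the unknown from each population centroid,
# using each population's own sample covariance matrix
d2 <- sapply(list(A = pop_a, B = pop_b), function(s) {
  mahalanobis(unknown, center = colMeans(s), cov = cov(s))
})

# Swapped-in probability: chi-square upper tail with df = number of axes
# (an assumption for illustration, not the method used by gl.assign)
p <- pchisq(d2, df = length(unknown), lower.tail = FALSE)

print(round(d2, 2))
print(signif(p, 3))
```

A small distance and a large probability indicate that the unknown sits well within a population's cloud of scores across all retained axes, not just the two shown in the plot.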

Each of these approaches provides evidence, none are 100

Value

A genlight object containing the focal individual (assigned to population "unknown") and the populations for which the focal individual is not distinctive (number of loci with private alleles less than or equal to the threshold).

Author(s)

Arthur Georges (Post to https://groups.google.com/d/forum/dartr)

Examples

# Test run with a focal individual from the Macleay River (EmmacMaclGeor)
library(dartR)
x <- gl.assign(testset.gl, unknown = "UC_00146", nmin = 10,
               alpha = 0.05, threshold = 1)

dartR

Importing and Analysing SNP and Silicodart Data Generated by Genome-Wide Restriction Fragment Analysis

v1.9.6
GPL-2
Authors
Bernd Gruber [aut, cre], Arthur Georges [aut], Jose L. Mijangos [aut], Peter J. Unmack [ctb], Oliver Berry [ctb], Lindsay V. Clark [ctb], Floriaan Devloo-Delva [ctb]
Initial release
2021-04-29
