
gl.report.secondaries

Report loci containing secondary SNPs in sequence tags


Description

SNP datasets generated by DArT include fragments with more than one SNP (that is, with secondaries) and record them separately with the same CloneID (=AlleleID). These multiple SNP loci within a fragment are likely to be linked, and so you may wish to remove secondaries.
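As a rough illustration of what "secondaries" means, the base R sketch below counts SNPs per sequence tag from a vector of CloneIDs and flags tags that carry more than one. The CloneID values are invented for the example, and keeping the first SNP on each tag is only one possible way of removing secondaries; neither is taken from the dartR source.

```r
# Toy CloneIDs, one entry per scored SNP (hypothetical values).
clone_id <- c("1001", "1001", "1002", "1003", "1003", "1003", "1004")

# Number of SNPs scored on each sequence tag.
snps_per_tag <- table(clone_id)

# Tags carrying more than one SNP contain secondaries.
tags_with_secondaries <- names(snps_per_tag)[snps_per_tag > 1]

# Keep only the first SNP encountered on each tag
# (an assumed, simple rule for dropping secondaries).
keep <- !duplicated(clone_id)
clone_id_filtered <- clone_id[keep]
```

After filtering, every CloneID appears exactly once, so no two retained SNPs share a sequence tag.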

Usage

gl.report.secondaries(x, boxplot = "adjusted", range = 1.5, verbose = 2)

Arguments

x

– name of the genlight object containing the SNP data [required]

boxplot

– if 'standard', plots a standard box and whisker plot; if 'adjusted', plots a boxplot adjusted for skewed distributions [default 'adjusted']

range

– specifies the range for delimiting outliers, as a multiple of the interquartile range [default 1.5]

verbose

– verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, or as specified using gl.set.verbosity]
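To make the range argument concrete, the sketch below uses base R's boxplot.stats, whose coef argument sets the whisker length as a multiple of the interquartile range for a standard boxplot. The counts are made up, and this is only an analogy for how such a multiplier delimits outliers; it does not show the skewness-adjusted variant or the function's internal code.

```r
# Hypothetical numbers of SNPs per sequence tag.
snps_per_tag <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 9)

# coef = 1.5 flags points lying more than 1.5 interquartile
# ranges beyond the hinges of the box.
stats <- boxplot.stats(snps_per_tag, coef = 1.5)

# Values treated as outliers under this rule.
outliers <- stats$out

# A larger multiplier widens the whiskers and flags fewer points.
outliers_wide <- boxplot.stats(snps_per_tag, coef = 4)$out
```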

Details

The script reports statistics associated with secondaries and the consequences of filtering them out, and provides three plots. The first is a box and whisker plot adjusted to account for skewness, the second is a bar graph of the frequency of secondaries per sequence tag, and the third shows the Poisson expectation for those frequencies, including an estimate of the zero class (the number of sequence tags with no SNP scored).

Heterozygosity in gl.report.heterozygosity is in a sense relative, because it is calculated against a background of only those loci that are polymorphic somewhere in the dataset. To allow comparison across studies and species, any measure of heterozygosity needs to accommodate loci that are invariant. However, the number of invariant loci is unknown, because SNPs are detected as single point mutational variants and invariant sequences are discarded, and because of the additional filtering applied before analysis. Modelling the counts of SNPs per sequence tag as a Poisson distribution in this script allows an estimate of the zero class, that is, the number of invariant loci. This estimate is reported, and its reliability can be assessed by how closely the observed frequencies match those under Poisson expectation in the associated graphs. The number of invariant loci can then optionally be provided to gl.report.heterozygosity via the parameter n.invariants.
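The idea of estimating the zero class can be sketched in base R by fitting a zero-truncated Poisson to the observed frequencies. This is one standard way to do it and is an assumption here; the documentation does not state how the function fits the Poisson model. The frequencies are invented for illustration.

```r
# Hypothetical frequencies: number of sequence tags carrying 1, 2, 3, 4 SNPs.
snp_count <- 1:4
n_tags    <- c(700, 220, 60, 20)

n_obs    <- sum(n_tags)                       # tags with at least one SNP
mean_obs <- sum(snp_count * n_tags) / n_obs   # mean SNPs per observed tag

# The mean of a zero-truncated Poisson is lambda / (1 - exp(-lambda));
# solve numerically for the lambda that matches the observed mean.
f <- function(lambda) lambda / (1 - exp(-lambda)) - mean_obs
lambda_hat <- uniroot(f, interval = c(1e-6, 50), tol = 1e-9)$root

# Scale up to the total number of tags, including the unseen zero class.
n_total     <- n_obs / (1 - exp(-lambda_hat))
n_invariant <- n_total * exp(-lambda_hat)     # estimated zero class

# Expected frequencies for comparison with the observed ones.
expected <- n_total * dpois(snp_count, lambda_hat)
```

Comparing expected with n_tags gives the same kind of check as the plot of observed against Poisson frequencies: a poor match suggests that the estimate of invariant loci should be treated with caution.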

Value

Returns a genlight object containing the loci with multiple SNP calls.

Author(s)

Arthur Georges (Post to https://groups.google.com/d/forum/dartr)

Examples

library(dartR)  # provides gl.report.secondaries and the bandicoot.gl example data
out <- gl.report.secondaries(bandicoot.gl)

dartR

Importing and Analysing SNP and Silicodart Data Generated by Genome-Wide Restriction Fragment Analysis

v1.9.6
GPL-2
Authors
Bernd Gruber [aut, cre], Arthur Georges [aut], Jose L. Mijangos [aut], Peter J. Unmack [ctb], Oliver Berry [ctb], Lindsay V. Clark [ctb], Floriaan Devloo-Delva [ctb]
Initial release
2021-04-29
