Report summary of Call Rate for loci or individuals
SNP datasets generated by DArT have missing values primarily arising from failure to call a SNP because of a mutation at one or both of the restriction enzyme recognition sites. This script reports the number of missing values for each of several percentiles. The script gl.filter.callrate() will filter out the loci with call rates below a specified threshold.
gl.report.callrate( x, method = "loc", boxplot = "adjusted", range = 1.5, verbose = NULL )
x |
– name of the genlight object containing the SNP or presence/absence (SilicoDArT) data [required] |
method |
– specify the type of report by locus (method="loc") or individual (method="ind") [default method="loc"] |
boxplot |
– if 'standard', plots a standard box and whisker plot; if 'adjusted', plots a boxplot adjusted for skewed distributions [default 'adjusted'] |
range |
– specifies the range for delimiting outliers [default = 1.5 interquartile ranges] |
verbose |
– verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2 or as specified using gl.set.verbosity] |
Tag Presence/Absence datasets (SilicoDArT) have missing values where it is not possible to determine reliably whether the sequence tag can be called at a particular locus.
The minimum, maximum and mean call rate are provided. The output also includes a histogram of call rate, accompanied by a box and whisker plot.
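As a rough illustration of what call rate means, the sketch below computes it in base R from a toy genotype matrix. The matrix `geno` and the variable names are hypothetical and are not part of dartR; the package's own implementation operates on a genlight object and may differ in detail.

```r
# Toy genotype matrix (hypothetical data): rows are individuals, columns are loci.
# Values 0/1/2 are SNP calls; NA marks a missing call.
geno <- matrix(
  c( 0,  1, NA, 2,
     1, NA, NA, 2,
     0,  1,  2, 2,
    NA,  1, NA, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("loc", 1:4))
)

# Call rate is the proportion of non-missing calls.
callrate_loc <- colMeans(!is.na(geno))  # one value per locus (method = "loc")
callrate_ind <- rowMeans(!is.na(geno))  # one value per individual (method = "ind")

# Minimum, maximum and mean call rate across loci.
summary_loc <- c(min  = min(callrate_loc),
                 max  = max(callrate_loc),
                 mean = mean(callrate_loc))
print(summary_loc)
```

The same matrix yields either report simply by averaging over columns or rows, which is the distinction the method argument draws.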
Refer to Tukey (1977), Exploratory Data Analysis, Addison-Wesley, for standard Box and Whisker Plots, and to Hubert & Vandervieren (2008), An Adjusted Boxplot for Skewed Distributions, Computational Statistics & Data Analysis 52:5186-5201, for adjusted Box and Whisker Plots.
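To show how the range argument delimits outliers in a standard (Tukey) boxplot, here is a minimal sketch using base R's boxplot.stats(), whose coef argument plays the same role. The call rate values are made up for illustration, and the adjusted boxplot for skewed distributions is not reproduced here.

```r
# Hypothetical per-locus call rates, with one clearly poor locus.
callrate <- c(0.98, 0.97, 0.99, 0.96, 1.00, 0.95, 0.97, 0.98, 0.40)

# coef = 1.5 flags values lying more than 1.5 interquartile ranges
# beyond the lower or upper hinge, as in a standard box and whisker plot.
stats <- boxplot.stats(callrate, coef = 1.5)

print(stats$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(stats$out)    # values flagged as outliers
```

Raising the multiplier widens the whiskers and flags fewer values; lowering it does the opposite.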
Returns a tabulation of CallRate against Threshold.
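One way to picture a tabulation of call rate against threshold is sketched below in base R. The column names, the thresholds and the call rate values are assumptions chosen for illustration; the table actually returned by gl.report.callrate() may be laid out differently.

```r
# Hypothetical per-locus call rates.
callrate_loc <- c(1.00, 0.95, 0.90, 0.72, 0.50, 0.98)

# Candidate thresholds, from strictest to most lenient (assumed values).
thresholds <- c(1.00, 0.95, 0.90, 0.80, 0.50)

# For each threshold, count the loci whose call rate meets or exceeds it.
retained <- vapply(thresholds,
                   function(t) sum(callrate_loc >= t),
                   integer(1))

tab <- data.frame(
  Threshold = thresholds,
  Retained  = retained,
  Percent   = 100 * retained / length(callrate_loc)
)
print(tab)
```

A table of this kind helps in choosing a threshold for gl.filter.callrate(), since it shows how many loci would survive each cut-off.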
Arthur Georges (Post to https://groups.google.com/d/forum/dartr)
# SNP data
out <- gl.report.callrate(testset.gl)
# Tag P/A data
out <- gl.report.callrate(testset.gs)