Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!
Get Started for Free

gl2fasta

Concatenates DArT trimmed sequences and outputs a fastA file.


Description

Concatenated sequence tags are useful for phylogenetic methods where information on base frequencies and transition and transversion ratios are required (for example, Maximum Liklihood methods). Where relevant, heterozygous loci are resolved before concatenation by either assigning ambiguity codes or by random allele assignment.

Usage

gl2fasta(
  x,
  method = 1,
  outfile = "output.fasta",
  outpath = tempdir(),
  probar = FALSE,
  verbose = NULL
)

Arguments

x

– name of the DArT genlight object [required]

method

– 1 | 2 | 3 | 4. Type method=0 for a list of options [method=1]

outfile

– name of the output file (fasta format) [output.fasta]

outpath

– path where to save the output file (set to tempdir by default)

probar

– if TRUE, a progress bar will be displayed for long loops [default = TRUE]

verbose

– verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log ; 3, progress and results summary; 5, full report [default 2 or as specified using gl.set.verbosity]

Details

Four methods are employed

Method 1 – heterozygous positions are replaced by the standard ambiguity codes. The resultant sequence fragments are concatenated across loci to generate a single combined sequence to be used in subsequent ML phylogenetic analyses.

Method=2 – the heterozyous state is resolved by randomly assigning one or the other SNP variant to the individual. The resultant sequence fragments are concatenated across loci to generate a single composite haplotype to be used in subsequent ML phylogenetic analyses.

Method 3 – heterozygous positions are replaced by the standard ambiguity codes. The resultant SNP bases are concatenated across loci to generate a single combined sequence to be used in subsequent MP phylogenetic analyses.

Method=4 – the heterozyous state is resolved by randomly assigning one or the other SNP variant to the individual. The resultant SNP bases are concatenated across loci to generate a single composite haplotype to be used in subsequent MP phylogenetic analyses.

Trimmed sequences for which the SNP has been trimmed out, rarely, by adaptor mis-identity are deleted.

The script writes out the composite haplotypes for each individual as a fastA file. Requires 'TrimmedSequence' to be among the locus metrics (@other$loc.metrics) and information of the type of alleles (slot loc.all e.g. "G/A") and the position of the SNP in slot position of the “'genlight“' object (see testset.gl@position and testset.gl@loc.all for how to format these slots.)

Value

A new gl object with all loci rendered homozygous

Author(s)

Bernd Gruber and Arthur Georges (Post to https://groups.google.com/d/forum/dartr)

Examples

gl <- gl.filter.reproducibility(testset.gl,t=1)
gl <- gl.filter.overshoot(gl,verbose=3)
gl <- gl.filter.callrate(testset.gl,t=.98)
gl <- gl.filter.monomorphs(gl)
gl2fasta(gl, method=1, outfile="test.fasta",verbose=3)

dartR

Importing and Analysing SNP and Silicodart Data Generated by Genome-Wide Restriction Fragment Analysis

v1.9.6
GPL-2
Authors
Bernd Gruber [aut, cre], Arthur Georges [aut], Jose L. Mijangos [aut], Peter J. Unmack [ctb], Oliver Berry [ctb], Lindsay V. Clark [ctb], Floriaan Devloo-Delva [ctb]
Initial release
2021-04-29

We don't support your browser anymore

Please choose more modern alternatives, such as Google Chrome or Mozilla Firefox.