
read.nexus.data

Read Character Data In NEXUS Format


Description

read.nexus.data reads a file with sequences in the NEXUS format. nexus2DNAbin is a helper function that converts the output of read.nexus.data into an object of class "DNAbin".

For the moment, only sequence data (DNA or protein) are supported.

Usage

read.nexus.data(file)
nexus2DNAbin(x)

Arguments

file

a file name specified by either a variable of mode character, or a double-quoted string.

x

an object output by read.nexus.data.

Details

This parser tries to read data from a file written in a restricted NEXUS format (see examples below).

Please see files ‘data.nex’ and ‘taxacharacters.nex’ for examples of formats that will work.

Some notable departures from the NEXUS standard (non-exhaustive list):

  • I. Comments must be either on separate lines or at the end of lines. Examples:
    [Comment] — OK
    Taxon ACGTACG [Comment] — OK
    [Comment line 1
    Comment line 2] — NOT OK!
    Tax[Comment]on ACG[Comment]T — NOT OK!

  • II. No spaces (or comments) are allowed in the sequences. Examples:
    name ACGT — OK
    name AC GT — NOT OK!

  • III. No spaces are allowed in taxon names, not even if names are in single quotes. That is, single-quoted names are not treated as such by the parser. Examples:
    Genus_species — OK
    'Genus_species' — OK
    'Genus species' — NOT OK!

  • IV. The trailing end that closes the matrix must be on a separate line. Examples:
    taxon AACCGGT
    end; — OK
    taxon AACCGGT;
    end; — OK
    taxon AACCCGT; end; — NOT OK!

  • V. Multistate characters are not allowed. That is, NEXUS allows you to specify multiple character states at a character position either as an uncertainty, (XY), or as an actual appearance of multiple states, {XY}. This information is not handled by the parser. Examples:
    taxon 0011?110 — OK
    taxon 0011{01}110 — NOT OK!
    taxon 0011(01)110 — NOT OK!

  • VI. The number of taxa must be on the same line as ntax. The same applies to nchar. Examples:
    ntax = 12 — OK
    ntax =
    12 — NOT OK!

  • VII. The word “matrix” cannot occur anywhere in the file before the actual matrix command, unless it is in a comment. Examples:
    BEGIN CHARACTERS;
    TITLE 'Data in file "03a-cytochromeB.nex"';
    DIMENSIONS NCHAR=382;
    FORMAT DATATYPE=Protein GAP=- MISSING=?;
    ["This is The Matrix"] — OK
    MATRIX

    BEGIN CHARACTERS;
    TITLE 'Matrix in file "03a-cytochromeB.nex"'; — NOT OK!
    DIMENSIONS NCHAR=382;
    FORMAT DATATYPE=Protein GAP=- MISSING=?;
    MATRIX
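
As a rough illustration of the restrictions listed above, the sketch below writes a small file that should satisfy them and reads it back. The taxon names, sequences, and temporary file are hypothetical, and the layout (a DATA block with the closing ";" and "END;" on their own lines) is only one arrangement assumed to be accepted by the parser.

```r
library(ape)

# Hypothetical minimal file that respects the restrictions above:
# NTAX and NCHAR share a line with their values, taxon names and
# sequences contain no spaces or comments, and the closing ";" and
# "END;" each sit on their own line.
nexus_lines <- c(
  "#NEXUS",
  "BEGIN DATA;",
  "  DIMENSIONS NTAX=3 NCHAR=8;",
  "  FORMAT DATATYPE=DNA MISSING=? GAP=-;",
  "  MATRIX",
  "  Taxon_A acgtacgt",
  "  Taxon_B acgtacga",
  "  Taxon_C acgaacgt",
  "  ;",
  "END;"
)

# Write to a temporary file so that no external data are needed
nex_file <- tempfile(fileext = ".nex")
writeLines(nexus_lines, nex_file)

x <- read.nexus.data(nex_file)
names(x)    # taxon names as read from the matrix
lengths(x)  # number of character states per taxon
```

Moving a sequence onto two space-separated chunks, or putting "END;" on the same line as the last taxon, would be expected to make the call fail or return incomplete data.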

Value

A list of sequences, each stored as a single vector of mode character in which each element is one (phylogenetic) character state.
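
To make the shape of the return value concrete, here is a small sketch that round-trips toy data through a temporary file and then converts the result with nexus2DNAbin. The object names (toy, nex_file, dna) and the data are hypothetical; write.nexus.data is the companion writer in ape and is used here only so that the example needs no external file.

```r
library(ape)

# Hypothetical toy data in the same shape as the return value:
# a named list with one character vector per taxon.
toy <- list(
  t1 = c("a", "c", "g", "t"),
  t2 = c("a", "c", "g", "a")
)

# Round trip through a temporary file
nex_file <- tempfile(fileext = ".nex")
write.nexus.data(toy, file = nex_file, format = "dna", interleaved = FALSE)
x <- read.nexus.data(nex_file)

str(x)    # a list with one character vector per taxon
x$t1[3]   # a single character state of taxon t1

# Convert the list to class "DNAbin" for use with other ape functions
dna <- nexus2DNAbin(x)
class(dna)
```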

Author(s)

Johan Nylander, Thomas Guillerme, and Klaus Schliep

References

Maddison, D. R., Swofford, D. L. and Maddison, W. P. (1997) NEXUS: an extensible file format for systematic information. Systematic Biology, 46, 590–621.

See Also

Examples

## Use read.nexus.data to read a file in NEXUS format into object x
## Not run:
x <- read.nexus.data("file.nex")
## End(Not run)

ape

Analyses of Phylogenetics and Evolution

v5.5
GPL-2 | GPL-3
Authors
Emmanuel Paradis [aut, cre, cph] (<https://orcid.org/0000-0003-3092-2199>), Simon Blomberg [aut, cph] (<https://orcid.org/0000-0003-1062-0839>), Ben Bolker [aut, cph] (<https://orcid.org/0000-0002-2127-0443>), Joseph Brown [aut, cph] (<https://orcid.org/0000-0002-3835-8062>), Santiago Claramunt [aut, cph] (<https://orcid.org/0000-0002-8926-5974>), Julien Claude [aut, cph] (<https://orcid.org/0000-0002-9267-1228>), Hoa Sien Cuong [aut, cph], Richard Desper [aut, cph], Gilles Didier [aut, cph] (<https://orcid.org/0000-0003-0596-9112>), Benoit Durand [aut, cph], Julien Dutheil [aut, cph] (<https://orcid.org/0000-0001-7753-4121>), RJ Ewing [aut, cph], Olivier Gascuel [aut, cph], Thomas Guillerme [aut, cph] (<https://orcid.org/0000-0003-4325-1275>), Christoph Heibl [aut, cph] (<https://orcid.org/0000-0002-7655-3299>), Anthony Ives [aut, cph] (<https://orcid.org/0000-0001-9375-9523>), Bradley Jones [aut, cph] (<https://orcid.org/0000-0003-4498-1069>), Franz Krah [aut, cph] (<https://orcid.org/0000-0001-7866-7508>), Daniel Lawson [aut, cph] (<https://orcid.org/0000-0002-5311-6213>), Vincent Lefort [aut, cph], Pierre Legendre [aut, cph] (<https://orcid.org/0000-0002-3838-3305>), Jim Lemon [aut, cph], Guillaume Louvel [aut, cph] (<https://orcid.org/0000-0002-7745-0785>), Eric Marcon [aut, cph] (<https://orcid.org/0000-0002-5249-321X>), Rosemary McCloskey [aut, cph] (<https://orcid.org/0000-0002-9772-8553>), Johan Nylander [aut, cph], Rainer Opgen-Rhein [aut, cph], Andrei-Alin Popescu [aut, cph], Manuela Royer-Carenzi [aut, cph], Klaus Schliep [aut, cph] (<https://orcid.org/0000-0003-2941-0161>), Korbinian Strimmer [aut, cph] (<https://orcid.org/0000-0001-7917-2056>), Damien de Vienne [aut, cph] (<https://orcid.org/0000-0001-9532-5251>)
Initial release
2021-04-24
