Combine a phylogenetic tree with data
phylo4d is a generic constructor which merges a phylogenetic tree with data frames to create a combined object of class phylo4d.
phylo4d(x, ...)

## S4 method for signature 'phylo4'
phylo4d(
  x,
  tip.data = NULL,
  node.data = NULL,
  all.data = NULL,
  merge.data = TRUE,
  metadata = list(),
  ...
)

## S4 method for signature 'matrix'
phylo4d(
  x,
  tip.data = NULL,
  node.data = NULL,
  all.data = NULL,
  merge.data = TRUE,
  metadata = list(),
  edge.length = NULL,
  tip.label = NULL,
  node.label = NULL,
  edge.label = NULL,
  order = "unknown",
  annote = list(),
  ...
)

## S4 method for signature 'phylo'
phylo4d(
  x,
  tip.data = NULL,
  node.data = NULL,
  all.data = NULL,
  check.node.labels = c("keep", "drop", "asdata"),
  annote = list(),
  metadata = list(),
  ...
)

## S4 method for signature 'phylo4d'
phylo4d(x, ...)

## S4 method for signature 'nexml'
phylo4d(x)
x | an object of class
... | further arguments to control the behavior of the constructor in the case of missing/extra data and where to look for labels in the case of non-unique labels that cannot be stored as row names in a data frame (see Details).
tip.data | a data frame (or object to be coerced to one) containing only tip data (Optional)
node.data | a data frame (or object to be coerced to one) containing only node data (Optional)
all.data | a data frame (or object to be coerced to one) containing both tip and node data (Optional)
merge.data | if both
metadata | any additional metadata to be passed to the new object
edge.length | Edge (branch) length. (Optional)
tip.label | A character vector of species names (names of "tip" nodes). (Optional)
node.label | A character vector of internal node names. (Optional)
edge.label | A character vector of edge (branch) names. (Optional)
order | character: tree ordering (allowable values are listed in
annote | any additional annotation data to be passed to the new object
check.node.labels | if
You can provide several data frames to define traits associated with tip and/or internal nodes. By default, data row names are used to link data to nodes in the tree, with any number-like names (e.g., “10”) matched against node ID numbers, and any non-number-like names (e.g., “n10”) matched against node labels. Alternative matching rules can be specified by passing additional arguments (listed in the Details section); these include positional matching, matching exclusively on node labels, and matching based on a column of data rather than on row names.
Matching rules will apply the same way to all supplied data frames. This means that you need to be consistent with the row names of your data frames. It is good practice to use tip and node labels (or node numbers if you use duplicated labels) when you combine data with a tree.
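As a minimal sketch of the default row-name matching described above (assuming the phylobase and ape packages are installed; the three-tip tree and the trait values are made up for illustration), row names that are tip labels link each row to its tip regardless of row order:

```r
library(phylobase)

# Hypothetical three-tip tree; tips "a", "b", "c" receive node IDs 1 to 3.
tr <- as(ape::read.tree(text = "((a:1,b:1):1,c:2);"), "phylo4")

# Row names are tip labels, deliberately given in a different order
# than the tips appear in the tree.
traits <- data.frame(size = c(30, 10, 20), row.names = c("c", "a", "b"))

# Default matching: non-number-like row names are matched to labels.
obj <- phylo4d(tr, tip.data = traits)

# Each tip ends up with the value from the row carrying its label.
tdata(obj, "tip")
```

Because the match is made on labels, reordering the rows of `traits` should not change the resulting object.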
If you provide both tip.data and node.data, the treatment of columns with common names will depend on the merge.data argument. If TRUE, columns with the same name in both data frames will be merged; when merging columns of different data types, coercion to a common type will follow standard R rules. If merge.data is FALSE, columns with common names will be preserved independently, with “.tip” and “.node” appended to the names. This argument has no effect if tip.data and node.data have no column names in common.
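The effect of merge.data on a shared column name can be sketched as follows (assuming phylobase and ape are installed; the tree, the column name `a`, and the values are hypothetical). Row names here are node ID numbers, so they are matched against node numbers under the default rules:

```r
library(phylobase)

# Hypothetical three-tip tree: tips are nodes 1 to 3, internal nodes 4 and 5.
tr <- as(ape::read.tree(text = "((a:1,b:1):1,c:2);"), "phylo4")

# Both data frames share the column name "a".
tip_df  <- data.frame(a = c(1, 2, 3), row.names = nodeId(tr, "tip"))
node_df <- data.frame(a = c(10, 20), row.names = nodeId(tr, "internal"))

# merge.data = TRUE: one column "a" covering tips and internal nodes.
merged <- phylo4d(tr, tip.data = tip_df, node.data = node_df,
                  merge.data = TRUE)

# merge.data = FALSE: the two columns are kept apart and renamed.
separate <- phylo4d(tr, tip.data = tip_df, node.data = node_df,
                    merge.data = FALSE)

names(tdata(merged))
names(tdata(separate))
```

Comparing the two sets of column names shows whether the shared column was combined or split into tip and node versions.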
If you provide all.data along with either of tip.data and node.data, it must have distinct column names, otherwise an error will result. Additionally, although supplying columns with the same names within data frames is not illegal, automatic renaming for uniqueness may lead to surprising results, so this practice should be avoided.
This is the list of additional arguments that can be used to control matching between the tree and the data:
match.data: (logical) should the rownames of the data frame be used to be matched against tip and internal node identifiers?
rownamesAsLabels: (logical), should the row names of the data provided be matched only to labels (TRUE), or should any number-like row names be matched to node numbers (FALSE and default)
label.type: character, rownames or column: should the labels be taken from the row names of dt or from the label.column column of dt?
label.column: iff label.type=="column", column specifier (number or name) of the column containing tip labels
missing.data: action to take if there are missing data or if there are data labels that don't match
extra.data: action to take if there are extra data or if there are labels that don't match
keep.all: (logical), should the returned data have rows for all nodes (with NA values for internal rows when type='tip', and vice versa) (TRUE and default) or only rows corresponding to the type argument
Rules for matching rows of data to tree nodes are determined jointly by the match.data and rownamesAsLabels arguments. If match.data is TRUE, data frame rows will be matched exclusively against tip and node labels if rownamesAsLabels is also TRUE, whereas any all-digit row names will be matched against tip and node numbers if rownamesAsLabels is FALSE (the default). If match.data is FALSE, rownamesAsLabels has no effect, and row matching is purely positional with respect to the order returned by nodeId(phy, type).
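The three matching modes can be contrasted with a small sketch (assuming phylobase and ape are installed; the tree, the number-like tip labels, and the values "p", "q", "r" are made up). The tips are relabelled so that their labels look like node numbers but disagree with them:

```r
library(phylobase)

tr <- as(ape::read.tree(text = "((a:1,b:1):1,c:2);"), "phylo4")

# Relabel the tips: node 1 becomes "3", node 2 becomes "1", node 3 becomes "2".
tipLabels(tr) <- c("3", "1", "2")

# All-digit row names, which could denote either labels or node numbers.
d <- data.frame(x = c("p", "q", "r"), row.names = c("1", "2", "3"),
                stringsAsFactors = FALSE)

# Row names matched against labels only.
by_label <- phylo4d(tr, tip.data = d, rownamesAsLabels = TRUE)

# Default: all-digit row names matched against node numbers.
by_id <- phylo4d(tr, tip.data = d, rownamesAsLabels = FALSE)

# No matching: rows assigned in the order of nodeId(tr, "tip").
by_pos <- phylo4d(tr, tip.data = d, match.data = FALSE)

# Value attached to the tip labelled "3" (node number 1) in each case.
as.character(tdata(by_label, "tip")["3", "x"])
as.character(tdata(by_id, "tip")["3", "x"])
as.character(tdata(by_pos, "tip")["3", "x"])
```

In this sketch the positional result coincides with the node-number result only because the rows of `d` happen to be in node-number order.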
An object of class phylo4d.
merges a tree of class phylo4 with a data.frame into a phylo4d object
merges a matrix of tree edges similar to the edge slot of a phylo4 object (or to $edge of a phylo object) with a data.frame into a phylo4d object
merges a tree of class phylo with a data.frame into a phylo4d object
Checking on matches between the tree and the data will be done by the validity checker (label matches between data and tree tips, number of rows of data vs. number of nodes/tips/etc.)
Ben Bolker, Thibaut Jombart, Steve Kembel, Francois Michonneau, Jim Regetz
coerce-methods for translation functions. The phylo4d class; phylo4 class and phylo4 constructor.
treeOwls <- "((Strix_aluco:4.2,Asio_otus:4.2):3.1,Athene_noctua:7.3);"
tree.owls.bis <- ape::read.tree(text=treeOwls)
try(phylo4d(as(tree.owls.bis,"phylo4"),data.frame(wing=1:3)), silent=TRUE)
obj <- phylo4d(as(tree.owls.bis,"phylo4"),data.frame(wing=1:3), match.data=FALSE)
obj
print(obj)

####

data(geospiza_raw)
geoTree <- geospiza_raw$tree
geoData <- geospiza_raw$data

## fix differences in tip names between the tree and the data
geoData <- rbind(geoData, array(, dim = c(1,ncol(geoData)),
                 dimnames = list("olivacea", colnames(geoData))))

### Example using a tree of class 'phylo'
exGeo1 <- phylo4d(geoTree, tip.data = geoData)

### Example using a tree of class 'phylo4'
geoTree <- as(geoTree, "phylo4")

## some random node data
rNodeData <- data.frame(randomTrait = rnorm(nNodes(geoTree)),
                        row.names = nodeId(geoTree, "internal"))

exGeo2 <- phylo4d(geoTree, tip.data = geoData, node.data = rNodeData)

### Example using 'merge.data'
data(geospiza)
trGeo <- extractTree(geospiza)
tDt <- data.frame(a=rnorm(nTips(trGeo)), row.names=nodeId(trGeo, "tip"))
nDt <- data.frame(a=rnorm(nNodes(trGeo)), row.names=nodeId(trGeo, "internal"))

(matchData1 <- phylo4d(trGeo, tip.data=tDt, node.data=nDt, merge.data=FALSE))
(matchData2 <- phylo4d(trGeo, tip.data=tDt, node.data=nDt, merge.data=TRUE))

## Example with 'all.data'
nodeLabels(geoTree) <- as.character(nodeId(geoTree, "internal"))
rAllData <- data.frame(randomTrait = rnorm(nTips(geoTree) + nNodes(geoTree)),
                       row.names = labels(geoTree, 'all'))

exGeo5 <- phylo4d(geoTree, all.data = rAllData)

## Examples using 'rownamesAsLabels' and comparing with match.data=FALSE
tDt <- data.frame(x=letters[1:nTips(trGeo)],
                  row.names=sample(nodeId(trGeo, "tip")))
tipLabels(trGeo) <- as.character(sample(1:nTips(trGeo)))
(exGeo6 <- phylo4d(trGeo, tip.data=tDt, rownamesAsLabels=TRUE))
(exGeo7 <- phylo4d(trGeo, tip.data=tDt, rownamesAsLabels=FALSE))
(exGeo8 <- phylo4d(trGeo, tip.data=tDt, match.data=FALSE))

## generate a tree and some data
set.seed(1)
p3 <- ape::rcoal(5)
dat <- data.frame(a = rnorm(5), b = rnorm(5), row.names = p3$tip.label)
dat.defaultnames <- dat
row.names(dat.defaultnames) <- NULL
dat.superset <- rbind(dat, rnorm(2))
dat.subset <- dat[-1, ]

## create a phylo4 object from a phylo object
p4 <- as(p3, "phylo4")

## create phylo4d objects with tip data
p4d <- phylo4d(p4, dat)
###checkData(p4d)
p4d.sorted <- phylo4d(p4, dat[5:1, ])
try(p4d.nonames <- phylo4d(p4, dat.defaultnames))
p4d.nonames <- phylo4d(p4, dat.defaultnames, match.data=FALSE)

## Not run:
p4d.subset <- phylo4d(p4, dat.subset)
p4d.subset <- phylo4d(p4, dat.subset)
try(p4d.superset <- phylo4d(p4, dat.superset))
p4d.superset <- phylo4d(p4, dat.superset)
## End(Not run)

## create phylo4d objects with node data
nod.dat <- data.frame(a = rnorm(4), b = rnorm(4))
p4d.nod <- phylo4d(p4, node.data = nod.dat, match.data=FALSE)

## create phylo4 objects with node and tip data
p4d.all1 <- phylo4d(p4, node.data = nod.dat, tip.data = dat, match.data=FALSE)
nodeLabels(p4) <- as.character(nodeId(p4, "internal"))
p4d.all2 <- phylo4d(p4, all.data = rbind(dat, nod.dat), match.data=FALSE)