Hmisc: Merge – R documentation

Merge

Merge Multiple Data Frames or Data Tables

Description

Merges an arbitrarily large series of data frames or data tables containing common id variables (keys for data tables). The number of observations and the number of unique ids in each individual dataset and in the final merged dataset are printed. The first data frame has special meaning in that all of its observations are kept whether or not they match ids in the other data frames. For all other data frames, by default non-matching observations are dropped. The first data frame is also the one against which counts of unique ids are compared. Sometimes merge drops variable attributes such as labels and units; Merge restores them. If all objects are of class data.table, faster merging is done using the data.table package's join operation. This assumes that all objects have identical key variables and that those keys are the variables on which to merge.
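
The row-retention rule described above (keep every observation of the first data frame, drop non-matching observations from later ones) can be sketched with base R's merge. This is only an illustration of that rule, not Hmisc's implementation; the helper name left_join_all is hypothetical, and the rows Merge itself returns may also depend on its `all` argument.

```r
# Illustration only: base R merge(), not Hmisc's internal code.
a <- data.frame(sid = 1:3, age = c(20, 30, 40))
b <- data.frame(sid = c(1, 2), bp = c(120, 130))
d <- data.frame(sid = c(1, 3, 4), wt = c(170, 180, 190))

# Hypothetical helper: fold the inputs left to right, always keeping
# every row of the accumulated (left) table and dropping rows of the
# next table whose sid has no match.
left_join_all <- function(...) {
  Reduce(function(x, y) merge(x, y, by = "sid", all.x = TRUE), list(...))
}

merged <- left_join_all(a, b, d)
print(merged)
```

In this sketch sid 3 is kept with a missing bp because it occurs in a, while sid 4 is dropped because it occurs only in d.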

Usage

Merge(..., id, all = TRUE, verbose = TRUE)

Arguments

`...`	two or more data frames or data tables
`id`	a formula containing all the identification variables such that the combination of these variables uniquely identifies subjects or records of interest. May be omitted for data tables; in that case the `key` function retrieves the id variables.
`all`	set to `FALSE` to drop observations not found in second and later data frames (only applies if not using `data.table`)
`verbose`	set to `FALSE` to suppress the printed information about observations
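
As a rough sketch of what the `id` formula encodes, the base R function all.vars extracts the variable names from a one-sided formula, and anyDuplicated can check whether their combination identifies each record uniquely. The data and the variable names sid and visit are made up for illustration; this is not a claim about how Merge processes `id` internally.

```r
# Hypothetical example data: one row per subject (sid) and visit.
dat <- data.frame(sid   = c(1, 1, 2),
                  visit = c(1, 2, 1),
                  bp    = c(120, 125, 130))

id <- ~ sid + visit          # one-sided formula naming the id variables
id_vars <- all.vars(id)      # character vector of the names in the formula

# anyDuplicated() returns 0 when no combination of id values repeats.
ids_unique <- anyDuplicated(dat[id_vars]) == 0
sid_alone_unique <- anyDuplicated(dat["sid"]) == 0

print(id_vars)
print(ids_unique)
print(sid_alone_unique)
```

Here sid alone repeats for subject 1, so both variables are needed in the formula to identify a record.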

Examples

## Not run: 
library(Hmisc)       # provides Merge
library(data.table)  # provides data.table and setkey

# Data frames: a is the first (master) data frame
a <- data.frame(sid=1:3, age=c(20,30,40))
b <- data.frame(sid=c(1,2,2), bp=c(120,130,140))  # sid 2 occurs twice
d <- data.frame(sid=c(1,3,4), wt=c(170,180,190))  # sid 4 is not in a
all <- Merge(a, b, d, id = ~ sid)
# For data.table, first file must be the master file and must
# contain all ids that ever occur.  ids not in the master will
# not be merged from other datasets.
a <- data.table(a); setkey(a, sid)
# data.table also does not allow duplicates without allow.cartesian=TRUE
b <- data.table(sid=1:2, bp=c(120,130)); setkey(b, sid)
d <- data.table(d); setkey(d, sid)
# id is omitted here; the id variables are taken from the keys
all <- Merge(a, b, d)

## End(Not run)
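
The Description mentions that variable attributes such as labels can be lost in a merge and are restored afterwards. A minimal sketch of that save-and-restore idea in base R follows; the attribute name "label" and the loop are illustrative assumptions, not Hmisc's actual code.

```r
a <- data.frame(sid = 1:3, age = c(20, 30, 40))
attr(a$age, "label") <- "Age in years"
b <- data.frame(sid = c(1, 2), bp = c(120, 130))

# Record each column's attributes before merging.
saved <- lapply(a, attributes)

m <- merge(a, b, by = "sid", all.x = TRUE)

# Reapply any recorded "label" to the same-named column after merging.
for (v in intersect(names(saved), names(m))) {
  lab <- saved[[v]][["label"]]
  if (!is.null(lab)) attr(m[[v]], "label") <- lab
}

print(attr(m$age, "label"))
```

Reapplying by column name keeps the sketch independent of whether a particular merge call actually dropped the attribute.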

Hmisc

Harrell Miscellaneous

v4.5-0

GPL (>= 2)

Authors

Frank E Harrell Jr <fh@fharrell.com>, with contributions from Charles Dupont and many others.

Initial release

2021-02-27