plyr: join – R documentation

Pricing

Become an expert in R — Interactive courses, Cheat Sheets, certificates and more!

Get Started for Free

Documentation

plyr

join

Join two data frames together.

Description

Join, like merge, is designed for the types of problems where you would use a sql join.

Usage

join(x, y, by = NULL, type = "left", match = "all")

Arguments

`x`	data frame
`y`	data frame
`by`	character vector of variable names to join by. If omitted, will match on all common variables.
`type`	type of join: left (default), right, inner or full. See details for more information.
`match`	how should duplicate ids be matched? Either match just the `"first"` matching row, or match `"all"` matching rows. Defaults to `"all"` for compatibility with merge, but `"first"` is significantly faster.

Details

The four join types return:

inner: only rows with matching keys in both x and y
left: all rows in x, adding matching columns from y
right: all rows in y, adding matching columns from x
full: all rows in x with matching columns in y, then the rows of y that don't match x.

Note that from plyr 1.5, join will (by default) return all matches, not just the first match, as it did previously.

Unlike merge, preserves the order of x no matter what join type is used. If needed, rows from y will be added to the bottom. Join is often faster than merge, although it is somewhat less featureful - it currently offers no way to rename output or merge on different variables in the x and y data frames.

Examples

first <- ddply(baseball, "id", summarise, first = min(year))
system.time(b2 <- merge(baseball, first, by = "id", all.x = TRUE))
system.time(b3 <- join(baseball, first, by = "id"))

b2 <- arrange(b2, id, year, stint)
b3 <- arrange(b3, id, year, stint)
stopifnot(all.equal(b2, b3))

plyr

Tools for Splitting, Applying and Combining Data

v1.8.6

MIT + file LICENSE

Authors

Hadley Wickham [aut, cre]

Initial release

join

Description

Usage

Arguments

Details

Examples

plyr

We don't support your browser anymore