[r] Can dplyr join on multiple columns or composite key?

I realize that dplyr v3.0 allows you to join on different variables:

left_join(x, y, by = c("a" = "b") will match x.a to y.b

However, is it possible to join on a combination of variables or do I have to add a composite key beforehand?

Something like this:

left_join(x, y, by = c("a c" = "b d") to match the concatenation of [x.a and x.c] to [y.b and y.d]

This question is related to r dplyr

The answer is


Updating to use tibble()

You can pass a named vector of length greater than 1 to the by argument of left_join():

library(dplyr)

d1 <- tibble(
  x = letters[1:3],
  y = LETTERS[1:3],
  a = rnorm(3)
  )

d2 <- tibble(
  x2 = letters[3:1],
  y2 = LETTERS[3:1],
  b = rnorm(3)
  )

left_join(d1, d2, by = c("x" = "x2", "y" = "y2"))