How to specify names of columns for x and y when joining in dplyr

Question

I have two data frames that I want to join using dplyr. One is a data frame containing first names:

    test_data <- data.frame(first_name = c("john", "bill", "madison", "abby", "zzz"),
                            stringsAsFactors = FALSE)

The other data frame contains a cleaned-up version of the Kantrowitz names corpus, identifying gender. Here is a minimal example:

    kantrowitz <- structure(list(name = c("john", "bill", "madison", "abby", "thomas"),
                                 gender = c("M", "either", "M", "either", "M")),
                            .Names = c("name", "gender"),
                            row.names = c(NA, 5L),
                            class = c("tbl_df", "tbl", "data.frame"))

I essentially want to look up the gender of each name in the test_data table using the kantrowitz table. Because I'm going to abstract this into a function, encode_gender, I won't know the name of the column in the data set that's going to be used, so I can't guarantee that it will be name, as in kantrowitz$name.

In base R I would perform the merge this way:

    merge(test_data, kantrowitz, by.x = "first_name", by.y = "name", all.x = TRUE)

That returns the correct output:

      first_name gender
    1       abby either
    2       bill either
    3       john      M
    4    madison      M
    5        zzz   <NA>

But I want to do this in dplyr because I'm using that package for all my other data manipulation. The by option of the various *_join functions only lets me specify one column name, but I need to specify two. I'm looking for something like this:

    library(dplyr)
    # either
    left_join(test_data, kantrowitz, by.x = "first_name", by.y = "name")
    # or
    left_join(test_data, kantrowitz, by = c("first_name", "name"))

What is the way to perform this kind of join using dplyr?

(Never mind that the Kantrowitz corpus is a bad way to identify gender. I'm working on a better implementation, but I want to get this working first.)

User · Accepted Answer

This feature has been added in dplyr v0.3. You can now pass a named character vector to the by argument in left_join (and the other joining functions) to specify which columns to join on in each data frame. With the example given in the original question, the code would be:

    left_join(test_data, kantrowitz, by = c("first_name" = "name"))
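Since the question mentions wrapping this in a function where the column name is not known in advance, here is a minimal sketch of how the named vector could be built at run time with setNames. The signature of encode_gender (the name_col and lookup arguments) is an assumption made for illustration, not something given in the question, and the lookup table is rebuilt here as a plain data frame.

```r
library(dplyr)

test_data <- data.frame(first_name = c("john", "bill", "madison", "abby", "zzz"),
                        stringsAsFactors = FALSE)
kantrowitz <- data.frame(name = c("john", "bill", "madison", "abby", "thomas"),
                         gender = c("M", "either", "M", "either", "M"),
                         stringsAsFactors = FALSE)

# Hypothetical wrapper: name_col is the name of the column in `data`
# that holds the first names, passed as a string.
encode_gender <- function(data, name_col, lookup = kantrowitz) {
  # setNames("name", name_col) builds c(<name_col> = "name"),
  # i.e. the named character vector that `by` expects.
  left_join(data, lookup, by = setNames("name", name_col))
}

result <- encode_gender(test_data, "first_name")
```

The key column keeps its name from the left-hand table (first_name here), and names with no match in the lookup table get NA for gender.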

User · Answer

This is more a workaround than a real solution. You can create a new object from test_data with a different column name:

    left_join("names<-"(test_data, "name"), kantrowitz, by = "name")

         name gender
    1    john      M
    2    bill either
    3 madison      M
    4    abby either
    5     zzz   <NA>
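The same workaround can probably be written more readably with dplyr's rename, which also returns a renamed copy and leaves the original data frame untouched. This is only a sketch of the idea, with the lookup table rebuilt as a plain data frame; the object names renamed and joined are made up for the example.

```r
library(dplyr)

test_data <- data.frame(first_name = c("john", "bill", "madison", "abby", "zzz"),
                        stringsAsFactors = FALSE)
kantrowitz <- data.frame(name = c("john", "bill", "madison", "abby", "thomas"),
                         gender = c("M", "either", "M", "either", "M"),
                         stringsAsFactors = FALSE)

# Rename the key column in a copy so both tables share the column "name".
renamed <- rename(test_data, name = first_name)

# With matching column names, a single unnamed `by` value is enough.
joined <- left_join(renamed, kantrowitz, by = "name")
```

Note that the result carries the column name from the lookup table (name) rather than first_name, which is one reason the named-vector form of by is usually preferable.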
