The data.table approach to an inner join is very time- and memory-efficient (and necessary for some larger data.frames):
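library(data.table)
dt1 <- data.table(df1, key = "CustomerId")
dt2 <- data.table(df2, key = "CustomerId")
# nomatch = 0L drops rows of dt2 that have no match in dt1, making this an inner join
joined.dt1.dt.2 <- dt1[dt2, nomatch = 0L]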
merge also works on data.tables (as it is generic and calls merge.data.table):
merge(dt1, dt2)
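To make the difference between the two data.table idioms concrete, here is a minimal sketch. The sample tables are hypothetical (they are not the df1 and df2 from the question), and CustomerId 7 is deliberately given no match so that the join types produce different results.

```r
library(data.table)

# Hypothetical sample data, chosen so that one key (7) has no match in dt1
dt1 <- data.table(CustomerId = 1:6,
                  Product = c("Toaster", "Toaster", "Toaster",
                              "Radio", "Radio", "Radio"),
                  key = "CustomerId")
dt2 <- data.table(CustomerId = c(2L, 4L, 7L),
                  State = c("Alabama", "Alabama", "Ohio"),
                  key = "CustomerId")

# Default X[Y]: one row per row of dt2; the unmatched key gets Product = NA
right_join <- dt1[dt2]

# nomatch = 0L drops the unmatched row, which gives an inner join
inner_join <- dt1[dt2, nomatch = 0L]

# merge() is an inner join by default; all.x = TRUE keeps every row of dt1
inner_merge <- merge(dt1, dt2)
left_merge  <- merge(dt1, dt2, all.x = TRUE)
```

When every key in dt2 has a match in dt1, as in the question's data, dt1[dt2] and the inner join return the same rows, which is why the distinction is easy to miss.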
Related data.table questions on Stack Overflow:
How to do a data.table merge operation
Translating SQL joins on foreign keys to R data.table syntax
Efficient alternatives to merge for larger data.frames R
How to do a basic left outer join with data.table in R?
Yet another option is the join function found in the plyr package:
library(plyr)
join(df1, df2, type = "inner")
# CustomerId Product State
# 1 2 Toaster Alabama
# 2 4 Radio Alabama
# 3 6 Radio Ohio
Options for type: inner, left, right, full.
From ?join: Unlike merge, [join] preserves the order of x no matter what join type is used.
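The ordering difference is easiest to see when x is not already sorted by the key. The following is a small sketch with hypothetical data frames (not the ones from the question), comparing a plyr left join with the equivalent base merge call.

```r
library(plyr)

# Hypothetical data: x is deliberately not sorted by CustomerId
x <- data.frame(CustomerId = c(6, 2, 4),
                Product = c("Radio", "Toaster", "Radio"))
y <- data.frame(CustomerId = c(2, 4, 6),
                State = c("Alabama", "Alabama", "Ohio"))

# plyr::join keeps the rows in the order they appear in x
joined <- join(x, y, by = "CustomerId", type = "left")

# base merge sorts the result on the key column by default (sort = TRUE)
merged <- merge(x, y, by = "CustomerId", all.x = TRUE)

joined$CustomerId
merged$CustomerId
```

Passing sort = FALSE to merge turns off the sorting, but its documentation then leaves the row order unspecified, so join is the safer choice when the order of x matters.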