Reduce makes this fairly easy:
merged.data.frame = Reduce(function(...) merge(..., all=T), list.of.data.frames)
Here's a fully example using some mock data:
set.seed(1)
list.of.data.frames = list(data.frame(x=1:10, a=1:10), data.frame(x=5:14, b=11:20), data.frame(x=sample(20, 10), y=runif(10)))
merged.data.frame = Reduce(function(...) merge(..., all=T), list.of.data.frames)
tail(merged.data.frame)
# x a b y
#12 12 NA 18 NA
#13 13 NA 19 NA
#14 14 NA 20 0.4976992
#15 15 NA NA 0.7176185
#16 16 NA NA 0.3841037
#17 19 NA NA 0.3800352
And here's an example using these data to replicate my.list
:
merged.data.frame = Reduce(function(...) merge(..., by=match.by, all=T), my.list)
merged.data.frame[, 1:12]
# matchname party st district chamber senate1993 name.x v2.x v3.x v4.x senate1994 name.y
#1 ALGIERE 200 RI 026 S NA <NA> NA NA NA NA <NA>
#2 ALVES 100 RI 019 S NA <NA> NA NA NA NA <NA>
#3 BADEAU 100 RI 032 S NA <NA> NA NA NA NA <NA>
Note: It looks like this is arguably a bug in merge
. The problem is there is no check that adding the suffixes (to handle overlapping non-matching names) actually makes them unique. At a certain point it uses [.data.frame
which does make.unique
the names, causing the rbind
to fail.
# first merge will end up with 'name.x' & 'name.y'
merge(my.list[[1]], my.list[[2]], by=match.by, all=T)
# [1] matchname party st district chamber senate1993 name.x
# [8] votes.year.x senate1994 name.y votes.year.y
#<0 rows> (or 0-length row.names)
# as there is no clash, we retain 'name.x' & 'name.y' and get 'name' again
merge(merge(my.list[[1]], my.list[[2]], by=match.by, all=T), my.list[[3]], by=match.by, all=T)
# [1] matchname party st district chamber senate1993 name.x
# [8] votes.year.x senate1994 name.y votes.year.y senate1995 name votes.year
#<0 rows> (or 0-length row.names)
# the next merge will fail as 'name' will get renamed to a pre-existing field.
Easiest way to fix is to not leave the field renaming for duplicates fields (of which there are many here) up to merge
. Eg:
my.list2 = Map(function(x, i) setNames(x, ifelse(names(x) %in% match.by,
names(x), sprintf('%s.%d', names(x), i))), my.list, seq_along(my.list))
The merge
/Reduce
will then work fine.