Merging list of dataframes with reshape merge_all

4 messages · Johannes Radinger, Rui Barradas, Hadley Wickham

#
Hi,

I'd like to merge multiple dataframes from a list of dataframes by some common
columns. Merging just two dataframes works with:

merge(df1,df2,by=c("col1","col2","col3"),all=TRUE)

For multiple dataframes in a list I tried to use the merge_all command
from the package reshape.
The documentation states that the command takes a list of dataframes
plus additional arguments, which are passed on to merge. So I tried
(just for the case of two dataframes):

merge_all(list(df1,df2),by=c("col1","col2","col3"),all=TRUE)

but I get the following error:
Error in merge.data.frame(dfs[[1]], dfs[[2]], all = TRUE, sort = FALSE,  :
  formal argument "all" matched by multiple actual arguments

What do I need to do to solve that problem?

PS: Just a related side-question: Why is merge_all not included in the
"newer" package reshape2 as this is considered to be a reboot of the
reshape package?

/johannes
#
Hello,

To "solve" the problem you can use all.x and all.y, but I think there are
other problems with merge_all: in the example below it doesn't include
df2$Y in the resulting data frame.


df1 <- data.frame(col1=1:10, col2=11:20, col3=21:30, X = rnorm(10))
df2 <- data.frame(col1=1:10, col2=11:20, col3=21:30, Y = rnorm(10))

merge_all(list(df1, df2), by=c("col1","col2","col3"), all.x=TRUE, all.y=TRUE)  # No Y column
merge(df1, df2, by=c("col1","col2","col3"), all.x=TRUE, all.y=TRUE)
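
As a sketch of why all.x and all.y get around the error: the call shown in the error message suggests that merge_all already passes all = TRUE to merge itself, so a second all supplied through the extra arguments clashes with it. The wrapper below (merge_pair is a made-up name, not the reshape source) reproduces this in base R under that assumption.

```r
# Hypothetical wrapper, NOT the reshape source: it hard-codes all = TRUE and
# forwards ... to merge(), as the call in the error message suggests.
merge_pair <- function(x, y, ...) merge(x, y, all = TRUE, sort = FALSE, ...)

# Small made-up data frames sharing one key column
a <- data.frame(col1 = 1:3, X = c(0.1, 0.2, 0.3))
b <- data.frame(col1 = 2:4, Y = c(1.1, 1.2, 1.3))

# Passing all = TRUE again supplies the argument twice, so merge() fails;
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(merge_pair(a, b, by = "col1", all = TRUE),
                error = function(e) conditionMessage(e))
print(err)

# all.x / all.y are different formal arguments, so they do not clash.
ok <- merge_pair(a, b, by = "col1", all.x = TRUE, all.y = TRUE)
print(ok)
```

This only accounts for the argument clash, not for the missing Y column.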


Contact the package maintainer for more info.

maintainer('reshape')
[1] "Hadley Wickham <h.wickham at gmail.com>"


Hope this helps,

Rui Barradas

On 11-01-2013 10:08, Johannes Radinger wrote:
#
Hi Rui,

Thank you for your answer.
As I also ran into these other problems myself, I decided to go
for a looping approach with the base merge command, always merging
the next dataframe into the result so far (a growing dataframe).
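
A minimal sketch of such a loop, using only base R. The data frames and the names dfs, keys and merged are made up for illustration, and a full outer merge (all = TRUE) is assumed.

```r
keys <- c("col1", "col2", "col3")

# Made-up example data: three data frames with partly overlapping keys
df1 <- data.frame(col1 = 1:3, col2 = 11:13, col3 = 21:23, X = c(0.1, 0.2, 0.3))
df2 <- data.frame(col1 = 2:4, col2 = 12:14, col3 = 22:24, Y = c(1.1, 1.2, 1.3))
df3 <- data.frame(col1 = 3:5, col2 = 13:15, col3 = 23:25, Z = c(2.1, 2.2, 2.3))
dfs <- list(df1, df2, df3)

# Start with the first data frame and merge each remaining one into it
merged <- dfs[[1]]
for (d in dfs[-1]) {
  merged <- merge(merged, d, by = keys, all = TRUE)
}
print(merged)

# The same fold written with Reduce()
merged2 <- Reduce(function(a, b) merge(a, b, by = keys, all = TRUE), dfs)
```

Both variants call merge once per additional data frame, so the result grows by one set of columns at each step.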

/johannes
On Fri, Jan 11, 2013 at 1:26 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
#
Because it doesn't work very well, as you've discovered.

There's an equivalent join_all in plyr.
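
A minimal sketch of the join_all route, assuming the plyr package is installed. The data frames are made up, and type = "full" is chosen here to mimic all = TRUE in merge.

```r
# Made-up example data; only runs if plyr is available
if (requireNamespace("plyr", quietly = TRUE)) {
  keys <- c("col1", "col2", "col3")
  df1 <- data.frame(col1 = 1:3, col2 = 11:13, col3 = 21:23, X = c(0.1, 0.2, 0.3))
  df2 <- data.frame(col1 = 2:4, col2 = 12:14, col3 = 22:24, Y = c(1.1, 1.2, 1.3))
  df3 <- data.frame(col1 = 3:5, col2 = 13:15, col3 = 23:25, Z = c(2.1, 2.2, 2.3))

  # join_all() joins the list element by element on the key columns;
  # type = "full" keeps rows that appear in any of the data frames
  joined <- plyr::join_all(list(df1, df2, df3), by = keys, type = "full")
  print(joined)
}
```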

Hadley