I have a list of files that I have called like so:
main_dir <- '/path/to/files/'
directories <- list.files(main_dir, pattern = '[[:alnum:]]', full.names=T)
filenames <- list.files(file.path(directories,"/tmpdir/"), pattern =
'[[:alnum:][:punct:]]_eat.txt+$', recursive = TRUE, full.names=T)
This lists around 35 files. Each has multiple columns, but they all
have three columns in common (Burger, Stall and Cost), which I want to
merge on using:
m1 <- Reduce(function(a, b) { merge(a, b,
by=c("Burger","Stall","Cost")) }, filenames)
However, I get the error:
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Is there something that I have obviously overlooked here?
Thanks in advance!
Using reduce to merge multiple files
2 messages · Kate Ignatius, Henrik Bengtsson
On Thu, Jun 12, 2014 at 10:16 AM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
I have a list of files that I have called like so:
main_dir <- '/path/to/files/'
directories <- list.files(main_dir, pattern = '[[:alnum:]]', full.names=T)
filenames <- list.files(file.path(directories,"/tmpdir/"), pattern =
'[[:alnum:][:punct:]]_eat.txt+$', recursive = TRUE, full.names=T)
This lists around 35 files. Each has multiple columns, but they all
have three columns in common (Burger, Stall and Cost), which I want to
merge on using:
m1 <- Reduce(function(a, b) { merge(a, b,
by=c("Burger","Stall","Cost")) }, filenames)
However, I get the error:
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Is there something that I have obviously overlooked here?
You're forgetting to read the data, i.e. you need to call read.table()
before merging.
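In the original call, Reduce() hands merge() two file paths (character strings), which have no Burger, Stall or Cost columns, hence the fix.by error. A minimal base-R sketch of the fix follows. It assumes whitespace-delimited files with a header row; the two toy files and their extra columns (Rating, Queue) are made up for illustration and are not from the original post.

```r
# Toy stand-ins for the real *_eat.txt files (hypothetical data).
tmp <- tempfile("eat_demo_")
dir.create(tmp)
write.table(data.frame(Burger = c("beef", "veg"), Stall = c("A", "B"),
                       Cost = c(5, 4), Rating = c(3, 5)),
            file.path(tmp, "one_eat.txt"), row.names = FALSE, quote = FALSE)
write.table(data.frame(Burger = c("beef", "veg"), Stall = c("A", "B"),
                       Cost = c(5, 4), Queue = c(10, 2)),
            file.path(tmp, "two_eat.txt"), row.names = FALSE, quote = FALSE)
filenames <- list.files(tmp, pattern = "_eat\\.txt$", full.names = TRUE)

# Read every file into a data frame first ...
tables <- lapply(filenames, read.table, header = TRUE,
                 stringsAsFactors = FALSE)

# ... then merge the data frames (not the file names) pairwise.
m1 <- Reduce(function(a, b) merge(a, b, by = c("Burger", "Stall", "Cost")),
             tables)
```

Note that merge() defaults to an inner join, so only rows whose Burger, Stall and Cost values appear in every file survive; passing all = TRUE would keep the rest.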
Here's an alternative (that does the same internally):
library("R.filesets")
m1 <- readDataFrame(filenames, colClasses=c("(Burger|Stall|Cost)"=NA))
If you know what data types the different columns hold, then you can
guide R to do the same faster and more memory-efficiently, e.g.
m1 <- readDataFrame(filenames, colClasses=c("(Burger|Stall)"="factor",
"Cost"="double"))
/Henrik