thr3ads.net - R help - [R] Using reduce to merge multiple files [Jun 2014]

Kate Ignatius

2014-Jun-12 17:16 UTC

[R] Using reduce to merge multiple files

I have a list of files that I have called like so:

main_dir <- '/path/to/files/'
directories <- list.files(main_dir, pattern = '[[:alnum:]]',
full.names=T)

filenames <- list.files(file.path(directories,"/tmpdir/"), pattern =
'[[:alnum:][:punct:]]_eat.txt+$', recursive = TRUE, full.names=T)

This lists around 35 Files.  Each has multiple columns but they all
have three columns in common: Burger, Stall and Cost which I want to
merge on using:

m1 <- Reduce(function(a, b) { merge(a, b,
by=c("Burger","Stall","Cost")) }, filenames)

However, I get the error:

Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns

Is there something that I have obviously overlooked here?

Thanks in advance!

Henrik Bengtsson

2014-Jun-13 00:13 UTC

On Thu, Jun 12, 2014 at 10:16 AM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
> I have a list of files that I have called like so:
>
> main_dir <- '/path/to/files/'
> directories <- list.files(main_dir, pattern = '[[:alnum:]]',
> full.names=T)
>
> filenames <- list.files(file.path(directories,"/tmpdir/"), pattern =
> '[[:alnum:][:punct:]]_eat.txt+$', recursive = TRUE, full.names=T)
>
> This lists around 35 Files.  Each has multiple columns but they all
> have three columns in common: Burger, Stall and Cost which I want to
> merge on using:
>
> m1 <- Reduce(function(a, b) { merge(a, b,
> by=c("Burger","Stall","Cost")) }, filenames)
>
> However, I get the error:
>
> Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
>
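> Is there something that I have obviously overlooked here?

You're forgetting to read the data: filenames is a character vector of
file paths, so merge() is being handed strings rather than data frames.
You need to call read.table() on each file before merging.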
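
As a rough sketch of the read-then-merge approach in base R: the temporary demo files, the extra1..extra3 columns, and the tab-delimited layout with a header row are assumptions made for illustration only; adjust the read.table() arguments to match the real files.

```r
# Hypothetical demo data: three small tab-delimited files in a temp directory.
tmp <- tempfile("eat_demo_")
dir.create(tmp)
keys <- data.frame(Burger = c("beef", "veggie"),
                   Stall  = c("A", "B"),
                   Cost   = c(5.5, 4),
                   stringsAsFactors = FALSE)
for (i in 1:3) {
  d <- keys
  d[[paste0("extra", i)]] <- c(i, i * 10)   # one file-specific column
  write.table(d, file.path(tmp, paste0("f", i, "_eat.txt")),
              sep = "\t", row.names = FALSE, quote = FALSE)
}
filenames <- list.files(tmp, pattern = "_eat\\.txt$", full.names = TRUE)

# Read every file into a data frame first (assumes header row, tab separator).
tables <- lapply(filenames, read.table, header = TRUE, sep = "\t",
                 stringsAsFactors = FALSE)

# Then fold merge() over the list of data frames, not over the file names.
m1 <- Reduce(function(a, b) merge(a, b, by = c("Burger", "Stall", "Cost")),
             tables)
```

If the real files share column names beyond the three keys, merge() will add suffixes to tell them apart, so it may be worth renaming or dropping such columns before the Reduce() step.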

Here's an alternative (that does the same internally):

library("R.filesets")
m1 <- readDataFrame(filenames,
colClasses=c("(Burger|Stall|Cost)"=NA))

If you know what data types the different columns hold, you can tell R
up front, which makes reading faster and more memory efficient, e.g.

m1 <- readDataFrame(filenames,
colClasses=c("(Burger|Stall)"="factor",
"Cost"="double"))
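
For comparison, a similar hint can be given to base read.table() through a named colClasses vector. This is only a sketch: the demo file, its Notes column, and the tab-delimited layout are made up for illustration, and here the names are assumed to match column names exactly rather than as regular expressions.

```r
# Hypothetical demo file with the three key columns plus one other column.
f <- tempfile(fileext = "_eat.txt")
write.table(data.frame(Burger = c("beef", "veggie"),
                       Stall  = c("A", "B"),
                       Cost   = c(5.5, 4),
                       Notes  = c("x", "y")),
            f, sep = "\t", row.names = FALSE, quote = FALSE)

# Named colClasses: listed columns get the stated class,
# unlisted columns (here Notes) are left for read.table() to guess.
d <- read.table(f, header = TRUE, sep = "\t",
                colClasses = c(Burger = "factor",
                               Stall  = "factor",
                               Cost   = "numeric"))
str(d)
```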

/Henrik

>
> Thanks in advance!
>
