On Thu, Jun 12, 2014 at 10:16 AM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
> I have a list of files that I have called like so:
>
> main_dir <- '/path/to/files/'
> directories <- list.files(main_dir, pattern = '[[:alnum:]]',
>     full.names=T)
>
> filenames <- list.files(file.path(directories,"/tmpdir/"),
>     pattern = '[[:alnum:][:punct:]]_eat.txt+$', recursive = TRUE,
>     full.names=T)
>
> This lists around 35 Files. Each has multiple columns but they all
> have three columns in common: Burger, Stall and Cost which I want to
> merge on using:
>
> m1 <- Reduce(function(a, b) { merge(a, b,
>     by=c("Burger","Stall","Cost")) }, filenames)
>
> However, I get the error:
>
> Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
>
> Is there something that I have obviously overlooked here?
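You're forgetting to read the data: 'filenames' is a character vector of
file paths, so merge() is being handed strings rather than data frames.
You need to call read.table() on each file before merging.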
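As a minimal base-R sketch of that read-then-merge step: the two small files written below are hypothetical stand-ins so the example runs on its own, and tab-separated files with a header row are an assumption, not something stated in the question.

```r
# Hypothetical stand-in files; in the thread, 'filenames' comes from list.files().
tmp <- tempfile("eat_demo_")
dir.create(tmp)
write.table(data.frame(Burger = c("beef", "veggie"), Stall = c("A", "B"),
                       Cost = c(5, 4), Rating = c(3, 5)),
            file.path(tmp, "one_eat.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Burger = c("beef", "veggie"), Stall = c("A", "B"),
                       Cost = c(5, 4), Queue = c(10, 2)),
            file.path(tmp, "two_eat.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
filenames <- list.files(tmp, pattern = "_eat\\.txt$", full.names = TRUE)

# Step 1: read every file into a data frame
# (assumed format: tab-separated with a header row).
tables <- lapply(filenames, read.table, header = TRUE, sep = "\t",
                 stringsAsFactors = FALSE)

# Step 2: merge the data frames pairwise on the three shared columns.
m1 <- Reduce(function(a, b) merge(a, b, by = c("Burger", "Stall", "Cost")),
             tables)
```

Note that merge() does an inner join by default, so rows whose Burger/Stall/Cost combination is missing from any one file are dropped; pass all=TRUE if they should be kept.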
Here's an alternative (that does the same internally):
  library("R.filesets")
  m1 <- readDataFrame(filenames,
                      colClasses=c("(Burger|Stall|Cost)"=NA))
If you know what data types the different columns hold, you can tell R
up front, which makes reading faster and more memory efficient, e.g.
  m1 <- readDataFrame(filenames,
                      colClasses=c("(Burger|Stall)"="factor",
                                   "Cost"="double"))
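The same idea of declaring column types up front also works with plain read.table(). This is a sketch in base R only, not a description of what readDataFrame() does: base R matches colClasses names against exact column names rather than regular expressions, and the small input file is made up for illustration.

```r
# Made-up tab-separated file with the three shared columns plus one extra.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Burger\tStall\tCost\tRating",
             "beef\tA\t5.50\t3",
             "veggie\tB\t4.00\t5"), tmp)

# Named colClasses: listed columns get the given class; columns that are
# not named (here 'Rating') are left for read.table() to guess.
d <- read.table(tmp, header = TRUE, sep = "\t",
                colClasses = c(Burger = "factor",
                               Stall  = "factor",
                               Cost   = "numeric"))
str(d)
```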
/Henrik
>
> Thanks in advance!