Dear all,

If the question is too easy, please forgive me; I am only a few weeks old in R.

I have worked on this question for a few days and still cannot figure it out. I have a folder with more than 50 tab-delimited files. Each file has a few hundred thousand rows (subjects), and the number of columns (variables) varies from file to file. The first row of each file holds the variable names.

I would like to merge all the files into one tab-delimited file by a common column named "Ident". Is there a good way to merge them all sequentially? By "sequentially" I mean merging file_1 and file_2 first, then merging the resulting data frame with file_3, and so on until all files are merged.

If that is too complicated, merging all the files in no particular order is also an acceptable alternative.

I am using R version 2.7.2 (2008-08-25) on x86_64-unknown-linux-gnu.

Thank you for any advice and help!

PingHsun
Richard.Cotton at hsl.gov.uk
2008-Nov-18 08:39 UTC
[R] sequencially merge multiple files in a folder
> Here I have a folder with more than 50 tab-delimited files. Each
> file has a few hundreds of thousands rows/subjects, and the number
> of columns/variables of each file varies. The 1st row consists of all
> the variable names.
>
> Now I would like to merge all the files into one tab-delimited file
> by a common column named "Ident".
> Is there any good way to sequentially merge all of them together?
> Here when I say "sequentially" I mean merging file_1 and file_2
> first and then merge the resulting data frame and file_3, and keep
> going on and on till all files are merged.

Read each of the tab-delimited files into R using read.delim:

    dfr1 <- read.delim("file1.txt")
    dfr2 <- read.delim("file2.txt")
    dfr3 <- read.delim("file3.txt")

(If the files are nicely named then you could do this in a loop.)

It's not entirely clear what you want to do next, since you haven't provided an example of your data. Either:

1. Use rbind to concatenate the data frames, if each frame has the same columns:

    masterdfr <- rbind(dfr1, dfr2)
    masterdfr <- rbind(masterdfr, dfr3)

2. Use merge to merge the data frames, if they have different columns:

    masterdfr <- merge(dfr1, dfr2)  # you'll need to specify some other arguments

If you want a clearer answer, you'll have to read the posting guide and provide more details about your data.

Regards,
Richie.

Mathematical Sciences Unit
HSL
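The loop mentioned in the reply above could look something like the following sketch. It is not taken from the original replies. It assumes the files all end in .txt, sit in one folder, and share the "Ident" column from the question. The folder name, file names, and the columns a, b, and c are made up so that the example runs on its own. Note that merge() with its default all = FALSE keeps only the Ident values that appear in every file.

```r
# Build a small throwaway folder so the sketch is self-contained.
# The folder, file names, and columns below are hypothetical.
data_dir <- file.path(tempdir(), "merge_loop_demo")
dir.create(data_dir, showWarnings = FALSE)
write.table(data.frame(Ident = 1:3, a = c(10, 20, 30)),
            file.path(data_dir, "file_1.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Ident = 1:3, b = c("x", "y", "z")),
            file.path(data_dir, "file_2.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Ident = 2:4, c = c(TRUE, FALSE, TRUE)),
            file.path(data_dir, "file_3.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Collect the full paths of all .txt files in the folder.
files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)

# Start with the first file, then merge each following file into the
# running result, one at a time, on the shared "Ident" column.
masterdfr <- read.delim(files[1])
for (f in files[-1]) {
  masterdfr <- merge(masterdfr, read.delim(f), by = "Ident")
}

print(masterdfr)
```

list.files returns names in alphabetical order, so "file_10.txt" would sort before "file_2.txt". If the merge order matters, the files vector would need to be reordered before the loop.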
If you know how to merge 2 of the files together, then you can use the Reduce function to do the merging of multiple files. You could use lapply to read all of the files into a list, then Reduce to merge them together, then output the result to a new file if a file is really what you want.

Another approach may be to use one of the database packages and an SQL query to merge everything together.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Ping-Hsun Hsieh
> Sent: Monday, November 17, 2008 10:38 PM
> To: r-help at r-project.org
> Subject: [R] sequencially merge multiple files in a folder
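One way to read the lapply/Reduce suggestion is sketched below. This is an illustration under stated assumptions, not code from the thread. The demo folder, the file names, the columns a, b, and c, and the output file name are all hypothetical. Using all = TRUE is a choice made for this sketch: it keeps every Ident value and fills missing entries with NA, which may or may not be what the original poster wanted.

```r
# Build a small throwaway folder so the sketch is self-contained.
# The folder, file names, and columns below are hypothetical.
data_dir <- file.path(tempdir(), "merge_reduce_demo")
dir.create(data_dir, showWarnings = FALSE)
write.table(data.frame(Ident = 1:3, a = c(10, 20, 30)),
            file.path(data_dir, "file_1.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Ident = 1:3, b = c("x", "y", "z")),
            file.path(data_dir, "file_2.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)
write.table(data.frame(Ident = 2:4, c = c(TRUE, FALSE, TRUE)),
            file.path(data_dir, "file_3.txt"),
            sep = "\t", row.names = FALSE, quote = FALSE)

# Read every file into a list of data frames.
files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)
dfr_list <- lapply(files, read.delim)

# The two-file merge that Reduce will apply repeatedly.
# all = TRUE keeps Ident values that are missing from some files.
merge_by_ident <- function(x, y) merge(x, y, by = "Ident", all = TRUE)

# Reduce folds from the left: merge(merge(file_1, file_2), file_3), ...
merged <- Reduce(merge_by_ident, dfr_list)

# Write the result as a tab-delimited file, outside the input folder
# so that a rerun does not pick it up as an input.
out_file <- file.path(tempdir(), "merged_all.txt")
write.table(merged, out_file, sep = "\t", row.names = FALSE, quote = FALSE)

print(merged)
```

Because Reduce works left to right through the list, this matches the "file_1 with file_2, then the result with file_3" order asked for in the question. With 50 large files, holding every data frame in the list at once may use a lot of memory, in which case merging inside a loop that reads one file at a time could be the safer route.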