Hi All,

I have data files in several folders and want to combine all of them into one file. Each folder contains several files that have the same structure but different names. First, within each folder, I want to concatenate (rbind) all files into one. While reading and concatenating the files, I want to add the folder name as a variable in each row. I am reading the folder names from a file; for demonstration I am using only two folders, as shown below.

Data\week1   # folder name 1
   WT13.csv
   WT26.csv
   ...
   WT10.csv
Data\week2   # folder name 2
   WT02.csv
   WT12.csv

Below please find my attempt,

folders=c("week1","week2")
for(i in folders){
  path=paste("\data\"", i , sep = "")
  setwd(path)
  Flist = list.files(path,pattern = "^WT")
  dataA = lapply(Flist, function(x)read.csv(x, header=T))
  Alldata = do.call("rbind", dataA) # combine all files
  Alldata$foldername=i # adding the folder name
}

The above works for one folder, but how can I do it for more than one folder?

Thank you in advance,
Hi

Help with such operations is rather tricky, as only you know the exact structure of your folders.

See some hints inline.

> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of Val
> Sent: Tuesday, November 5, 2019 4:33 AM
> To: r-help at R-project.org (r-help at r-project.org) <r-help at r-project.org>
> Subject: [R] File conca.
>
> Below please find my attempt,
>
> folders=c("week1","week2")
> for(i in folders){
> path=paste("\data\"", i , sep = "")
> setwd(path)

You should use

wd <- setwd(path)

which keeps the original directory for subsequent use.

> Flist = list.files(path,pattern = "^WT")
> dataA = lapply(Flist, function(x)read.csv(x, header=T))
> Alldata = do.call("rbind", dataA) # combine all files
> Alldata$foldername=i # adding the folder name
>

Now you can do

setwd(wd)

to return to the original directory.

> }
> The above works for one folder but how can I do it for more than one
> folders?

You also need to decide whether you want all data from all folders in one object called Alldata, or several Alldata objects, one for each folder.

In the second case you could use a list structure for Alldata. In the first case you could store the data from each folder in a temporary object and use rbind directly.

Something like

temp <- do.call("rbind", dataA)
temp$foldername <- i

Alldata <- temp

in the first cycle, and

Alldata <- rbind(Alldata, temp)

in the second and all others.

Or you could initialise Alldata manually first and use only

Alldata <- rbind(Alldata, temp)

in your loop.

Cheers
Petr

> Thank you in advance,
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
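Petr's second option (one object per folder, kept in a list) might look roughly like the sketch below. It is self-contained only because it first writes a few tiny CSV files into a temporary directory; the `base` path, the file contents and the `x` column are made up for illustration and do not come from the original post.

```r
# Hypothetical demo data: two folders with small WT*.csv files in a temp dir
base <- file.path(tempdir(), "petr_demo")
folders <- c("week1", "week2")
for (f in folders) {
  dir.create(file.path(base, f), recursive = TRUE, showWarnings = FALSE)
}
write.csv(data.frame(x = 1:2), file.path(base, "week1", "WT13.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(base, "week1", "WT26.csv"), row.names = FALSE)
write.csv(data.frame(x = 5:6), file.path(base, "week2", "WT02.csv"), row.names = FALSE)

Alldata <- list()                      # one element per folder
for (i in folders) {
  wd <- setwd(file.path(base, i))      # setwd() returns the previous directory
  Flist <- list.files(pattern = "^WT") # files in the current directory
  dataA <- lapply(Flist, function(x) read.csv(x, header = TRUE))
  setwd(wd)                            # go back before the next iteration
  temp <- do.call("rbind", dataA)      # combine the files of this folder
  temp$foldername <- i
  Alldata[[i]] <- temp                 # stored under the folder name
}
str(Alldata, max.level = 1)
```

If a single data frame is wanted later, the list can still be collapsed with `do.call("rbind", Alldata)`.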
I recommend not using setwd unless you have to (e.g. at the beginning of a script run by cron or another task scheduler). It is much simpler to build paths to directories and files using file.path.

On November 5, 2019 12:13:19 AM PST, PIKAL Petr <petr.pikal at precheza.cz> wrote:
>you should use
>wd <- setwd(path)
>
>which keeps the original directory for subsequent use
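As a rough illustration of Jeff's suggestion, the sketch below builds the folder path with file.path() and asks list.files() for full file names, so the working directory never changes. The temporary `base` directory and the toy files are assumptions made so the example runs on its own; they are not the poster's data.

```r
# Hypothetical demo data in a temp dir (not the poster's real files)
base <- file.path(tempdir(), "jeff_demo")
dir.create(file.path(base, "week1"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(base, "week1", "WT13.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(base, "week1", "WT26.csv"), row.names = FALSE)

path  <- file.path(base, "week1")                # no setwd() needed
Flist <- list.files(path, pattern = "^WT", full.names = TRUE)
dataA <- lapply(Flist, read.csv, header = TRUE)  # full paths can be read directly
week1 <- do.call("rbind", dataA)
week1$foldername <- basename(path)               # last path component, here "week1"
print(week1)
```

file.path() also inserts the separator itself, which avoids hand-written backslashes in path strings.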
Thank you Petr and Jeff for your suggestions. I made some improvements, but it still needs some tweaking.

I could not get the folder names added correctly to each row. Only the last folder name was added.

table(Alldata$foldername)

resulted in

week2
25500

Please see the complete code,

####################################################
folders=c("week1","week2")
for(i in folders){
  path=paste("\data\"", i , sep = "")
  wd <- setwd(path)
  Flist = list.files(path,pattern = "^WT")
  dataA = lapply(Flist, function(x)read.csv(x, header=T))
  setwd(wd)
  temp = do.call("rbind", Alldata)
  temp$foldername <- i
  Alldata <- temp
  Alldata <- rbind(Alldata, temp)
}
#######################################################

Any suggestions, please?

On Tue, Nov 5, 2019 at 2:13 AM PIKAL Petr <petr.pikal at precheza.cz> wrote:
> something like
>
> temp <- do.call("rbind", dataA)
> temp$foldername <- i
>
> Alldata <- temp
> in the first cycle
> and
> Alldata <- rbind(Alldata, temp)
> in second and all others.
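In Val's revised loop, `Alldata <- temp` runs on every pass, so anything accumulated from earlier folders is discarded just before the rbind; that would explain why only week2 shows up in the table. `do.call("rbind", Alldata)` also looks as if it was meant to be `do.call("rbind", dataA)`. One possible fix, sketched here on made-up files in a temporary directory (the `base` path and the `x` column are assumptions for the demo, and full paths are used instead of setwd), is to initialise Alldata once before the loop and only append inside it:

```r
# Hypothetical demo data: two folders with small WT*.csv files in a temp dir
base <- file.path(tempdir(), "val_demo")
folders <- c("week1", "week2")
for (f in folders) {
  dir.create(file.path(base, f), recursive = TRUE, showWarnings = FALSE)
}
write.csv(data.frame(x = 1:2), file.path(base, "week1", "WT13.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(base, "week1", "WT26.csv"), row.names = FALSE)
write.csv(data.frame(x = 5:6), file.path(base, "week2", "WT02.csv"), row.names = FALSE)

Alldata <- NULL                        # initialise once, outside the loop
for (i in folders) {
  Flist <- list.files(file.path(base, i), pattern = "^WT", full.names = TRUE)
  dataA <- lapply(Flist, function(x) read.csv(x, header = TRUE))
  temp <- do.call("rbind", dataA)      # dataA, not Alldata
  temp$foldername <- i
  Alldata <- rbind(Alldata, temp)      # append; never reset inside the loop
}
table(Alldata$foldername)              # one count per folder
```

Growing a data frame with rbind in a loop is fine for a handful of folders; with many folders, collecting the pieces in a list and calling do.call("rbind", ...) once at the end is usually faster.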