Hi All,
I have data files in several folders and want to combine all of these
files into one file. Each folder contains several files that have the
same structure but different names. First, within each folder, I want
to concatenate (rbind) all files into one. While reading and
concatenating the files, I want to add the folder name as a variable
in each row. I am reading the folder names from a file; for
demonstration I am using only two folders, as shown below.
Data\week1 # folder name 1
WT13.csv
WT26.csv ...
WT10.csv
Data\week2 #folder name 2
WT02.csv
WT12.csv
Below please find my attempt,
folders=c("week1","week2")
for(i in folders){
path=paste("\\data\\", i , sep = "")
setwd(path)
Flist = list.files(path,pattern = "^WT")
dataA = lapply(Flist, function(x)read.csv(x, header=T))
Alldata = do.call("rbind", dataA) # combine all files
Alldata$foldername=i # adding the folder name
}
The above works for one folder, but how can I do it for more than
one folder?
Thank you in advance,
Hi

Help with such operations is rather tricky, as only you know the exact
structure of your folders.

See some hints inline.

> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of Val
> Sent: Tuesday, November 5, 2019 4:33 AM
> To: r-help at R-project.org (r-help at r-project.org) <r-help at r-project.org>
> Subject: [R] File conca.
>
> Below please find my attempt,
>
> folders=c("week1","week2")
> for(i in folders){
> path=paste("\\data\\", i , sep = "")
> setwd(path)

You should use

wd <- setwd(path)

which keeps the original directory for subsequent use.

> Flist = list.files(path,pattern = "^WT")
> dataA = lapply(Flist, function(x)read.csv(x, header=T))
> Alldata = do.call("rbind", dataA) # combine all files
> Alldata$foldername=i # adding the folder name

Now you can do

setwd(wd)

to return to the original directory.

> }
> The above works for one folder, but how can I do it for more than one
> folder?

You also need to decide whether you want all data from all folders in one
object called Alldata, or several Alldata objects, one for each folder.

In the second case you could use a list structure for Alldata. In the
first case you could store the data from each folder in a temporary object
and use rbind directly.

Something like

temp <- do.call("rbind", dataA)
temp$foldername <- i

Alldata <- temp

in the first cycle, and

Alldata <- rbind(Alldata, temp)

in the second and all others.

Or you could initialise Alldata manually before the loop and use only

Alldata <- rbind(Alldata, temp)

in your loop.

Cheers
Petr

> Thank you in advance,
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
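Petr's hints can be put together roughly as follows. This is only a sketch under assumptions: base_dir, the toy CSV contents and the file names WT01.csv/WT02.csv are made up so that the example runs anywhere, and Alldata is initialised as NULL before the loop (the "initiate Alldata manually" variant).

```r
# Build a small throwaway folder tree so the sketch is runnable.
# (base_dir and the toy CSV contents are invented for illustration.)
base_dir <- file.path(tempdir(), "Data")
folders <- c("week1", "week2")
for (f in folders) {
  dir.create(file.path(base_dir, f), recursive = TRUE, showWarnings = FALSE)
  for (k in 1:2) {
    write.csv(data.frame(id = 1:3, value = k),
              file.path(base_dir, f, sprintf("WT%02d.csv", k)),
              row.names = FALSE)
  }
}

Alldata <- NULL                           # start empty; rbind(NULL, df) gives df
for (i in folders) {
  wd <- setwd(file.path(base_dir, i))     # setwd() returns the previous directory
  Flist <- list.files(pattern = "^WT")    # file names relative to the new directory
  dataA <- lapply(Flist, function(x) read.csv(x, header = TRUE))
  setwd(wd)                               # return to the original directory

  temp <- do.call("rbind", dataA)         # all files of this one folder
  temp$foldername <- i                    # tag every row with the folder name
  Alldata <- rbind(Alldata, temp)         # append to what was collected so far
}

table(Alldata$foldername)
# week1 week2
#     6     6

# If one object per folder is preferred, a named list is one option:
by_folder <- split(Alldata, Alldata$foldername)
```

The key point is that Alldata is created once, outside the loop, and only appended to inside it.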
I recommend not using setwd unless you have to (e.g. at the beginning of a
script run by cron or another task scheduler). It is much simpler to build
paths to directories and files using file.path.

On November 5, 2019 12:13:19 AM PST, PIKAL Petr <petr.pikal at precheza.cz> wrote:
>you should use
>wd <- setwd(path)
>
>which keeps the original directory for subsequent use

--
Sent from my phone. Please excuse my brevity.
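For comparison, here is a sketch of the setwd-free route suggested above, using file.path together with list.files(full.names = TRUE). The names base_dir and read_folder, and the toy CSV files, are assumptions made up for the example; they are not from the original post.

```r
# Toy folder tree so the sketch is runnable (invented data and locations).
base_dir <- file.path(tempdir(), "Data_nosetwd")
folders <- c("week1", "week2")
for (f in folders) {
  dir.create(file.path(base_dir, f), recursive = TRUE, showWarnings = FALSE)
  for (k in 1:2) {
    write.csv(data.frame(id = 1:3, value = k),
              file.path(base_dir, f, sprintf("WT%02d.csv", k)),
              row.names = FALSE)
  }
}

# Read every WT*.csv file in one folder and tag the rows with the folder name.
read_folder <- function(folder) {
  files <- list.files(file.path(base_dir, folder),
                      pattern = "^WT.*\\.csv$",
                      full.names = TRUE)   # full paths, so no setwd() is needed
  dat <- do.call(rbind, lapply(files, read.csv, header = TRUE))
  dat$foldername <- folder
  dat
}

# One data frame per folder, then stack them all.
Alldata <- do.call(rbind, lapply(folders, read_folder))
table(Alldata$foldername)
```

Because the working directory never changes, there is nothing to restore if one of the reads fails halfway through.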
Thank you, Petr and Jeff, for your suggestions.
I made some improvements, but the code still needs some tweaking. I could
not get the folder names added correctly to each row. Only the last
folder name was added.
table(Alldata$foldername) resulted in
week2
25500
Please see the complete code,
####################################################
folders=c("week1","week2")
for(i in folders){
path=paste("\\data\\", i , sep = "")
wd <- setwd(path)
Flist = list.files(path,pattern = "^WT")
dataA = lapply(Flist, function(x)read.csv(x, header=T))
setwd(wd)
temp = do.call("rbind", Alldata)
temp$foldername <- i
Alldata <- temp
Alldata <- rbind(Alldata, temp)
}
#######################################################
Any suggestion please?
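A likely reason only week2 shows up: in the loop above, Alldata <- temp runs on every pass, so whatever was accumulated so far is thrown away just before the rbind, and the final object holds the last folder's rows twice. Separately, do.call("rbind", Alldata) presumably should be do.call("rbind", dataA). The sketch below reproduces the effect with small in-memory data frames; make_temp is a hypothetical stand-in for the per-folder read, not part of the original code.

```r
# Hypothetical stand-in for: temp <- do.call("rbind", dataA); temp$foldername <- i
make_temp <- function(folder) {
  data.frame(value = 1:3, foldername = folder)
}

folders <- c("week1", "week2")

# Pattern as posted: Alldata is overwritten with temp on every pass.
Alldata_reset <- NULL
for (i in folders) {
  temp <- make_temp(i)
  Alldata_reset <- temp                        # discards the earlier folders
  Alldata_reset <- rbind(Alldata_reset, temp)  # stacks temp onto itself
}
table(Alldata_reset$foldername)   # only week2, and each row appears twice

# Initialise once before the loop, then only append inside it.
Alldata <- NULL
for (i in folders) {
  temp <- make_temp(i)
  Alldata <- rbind(Alldata, temp)
}
table(Alldata$foldername)         # week1 and week2, 3 rows each
```

If this is indeed what happened, the count of 25500 for week2 may also include every week2 row twice.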