Hi all,

I'm just a beginner with R but I have not been able to search for any
relevant answer to my problem. I apologize if it has in fact been asked
before.

Recently I've realized that I need to combine hundreds of pairs of data
frames. The filenames of the frames I need to combine have unique strings.
This is my best guess as to the approach to take:

    filenames<-list.files()
    filenames
    [1] "a1.csv" "a2.csv" "b1.csv" "b2.csv" "c1.csv" "c2.csv"

    alldata<-lapply(filenames, read.csv, header=TRUE)
    names(alldata)<-filenames
    summary(alldata)
           Length Class      Mode
    a1.csv 27     data.frame list
    a2.csv 27     data.frame list
    b1.csv 27     data.frame list
    b2.csv 27     data.frame list
    c1.csv 27     data.frame list
    c2.csv 27     data.frame list

My next step would be to cbind files that share a common string at the
beginning, such as:

    cbind(alldata[[1]],alldata[[2]])
    cbind(alldata[[3]],alldata[[4]])
    cbind(alldata[[5]],alldata[[6]])
    ...

but file list is hundreds of files long (but is sorted alphanumerically
such as in this example - not sure if this is relevant). If I had to
guess, I'd do something like this:

    which(names(alldata)==...)

to identify which elements to combine based on unique filename

OR

    x<-seq(1,length(alldata), 2)
    y=x+1
    z<-cbind(x,y)
    z
         x y
    [1,] 1 2
    [2,] 3 4
    [3,] 5 6

to use the frame created in z to combine based on rows,

then use a looped cbind function (or *apply function with nested cbind
function?) using the previously returned indexes to create my new combined
data frames, including a step to write the frames to a new unique filename
(not sure how to do that step in this context). These last steps I've
tried a lot of code but nothing worth mentioning as it has all failed
miserably.

I appreciate the help,

M
Matthew Ouellette <mouellette89 <at> gmail.com> writes:
> [snip]

Hi Matthew,

You could try using substr() if the cbind is based on a common string in
the file name. Just make sure that the strings in filenames are in the same
order as the data frames in datalist:

    # toy data frames standing in for the files read from disk
    a1 <- data.frame("col1" = seq(1,10, 1))
    a2 <- data.frame("col2" = seq(11,20, 1))
    b1 <- data.frame("col3" = seq(21,30, 1))
    b2 <- data.frame("col4" = seq(31,40, 1))

    filenames <- c("a1", "a2", "b1", "b2")
    # named datalist rather than list.files, to avoid confusion with the
    # base function list.files()
    datalist <- list(a1, a2, b1, b2)

    # the grouping key is the first character of each name
    first.letter <- substr(filenames, 1,1)
    unique.first.letter <- unique(first.letter)

    l.files <- list()
    for(i in seq_along(unique.first.letter)){
      # as.data.frame() on a list of data frames binds them column-wise
      l.files[[i]] = as.data.frame(datalist[first.letter == unique.first.letter[i]])
    }

HTH,

Ken
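The loop above stops before the step the original question asks about, writing each combined frame to its own file. A minimal sketch of that step, assuming the same toy data and single-character prefixes; the `_combined.csv` suffix and the use of `tempdir()` as output directory are illustrative choices, not something given in the thread:

```r
# Toy data standing in for the files read from disk (same shape as above)
frames <- list(
  a1 = data.frame(col1 = 1:10),
  a2 = data.frame(col2 = 11:20),
  b1 = data.frame(col3 = 21:30),
  b2 = data.frame(col4 = 31:40)
)

# Group the list by the first character of each name
prefix <- substr(names(frames), 1, 1)
groups <- split(frames, prefix)

# cbind the members of each group; unname() keeps the list names from
# being passed to cbind() as argument names
combined <- lapply(groups, function(grp) do.call(cbind, unname(grp)))

# Write each combined frame to a file named after its prefix
# (hypothetical naming scheme; adjust the directory and suffix as needed)
out_dir <- tempdir()
for (p in names(combined)) {
  write.csv(combined[[p]],
            file.path(out_dir, paste0(p, "_combined.csv")),
            row.names = FALSE)
}
```

Because the result of split() is named by prefix, the same names can be reused for the output files, which avoids tracking indexes by hand.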
Hello,

You can split the filenames vector according to a pattern,

    filenames <- c("a1.csv", "a2.csv", "b1.csv", "b2.csv", "c1.csv", "c2.csv")
    fnpattern <- gsub("[[:digit:]]", "", filenames)
    df.groups <- split(filenames, fnpattern)

and then use this list to process each of the groups of data.frames in
'alldata', possibly using lapply.

Hope this helps,

Rui Barradas

BustedAvi wrote
> [snip]

--
View this message in context: http://r.789695.n4.nabble.com/Multiple-cbind-according-to-filename-tp4631298p4631346.html
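The split-by-pattern suggestion can be carried through to the end of the task. The following is only a sketch under stated assumptions: the six CSV files are toy stand-ins created in a temporary directory, the group key is the filename with its extension and all digits removed, and the `pairs_in`, `pairs_out` and `_combined.csv` names are hypothetical.

```r
# Create six small CSV files in a temporary directory (toy stand-ins)
in_dir <- file.path(tempdir(), "pairs_in")
dir.create(in_dir, showWarnings = FALSE)
for (f in c("a1.csv", "a2.csv", "b1.csv", "b2.csv", "c1.csv", "c2.csv")) {
  write.csv(data.frame(x = 1:3, y = 4:6), file.path(in_dir, f),
            row.names = FALSE)
}

# Read every file into a list named by filename, as in the original post
filenames <- list.files(in_dir, pattern = "\\.csv$")
alldata <- lapply(file.path(in_dir, filenames), read.csv, header = TRUE)
names(alldata) <- filenames

# Group key: drop the extension, then drop every digit ("a1.csv" -> "a")
fnpattern <- gsub("[[:digit:]]", "", tools::file_path_sans_ext(filenames))
df.groups <- split(filenames, fnpattern)

# For each group, pick its data frames out of alldata and cbind them;
# cbind() keeps duplicated column names as they are
combined <- lapply(df.groups, function(fn) do.call(cbind, unname(alldata[fn])))

# Write one output file per group
out_dir <- file.path(tempdir(), "pairs_out")
dir.create(out_dir, showWarnings = FALSE)
for (key in names(combined)) {
  write.csv(combined[[key]],
            file.path(out_dir, paste0(key, "_combined.csv")),
            row.names = FALSE)
}
```

Grouping on the name rather than on odd/even positions means a missing or extra file does not shift every later pair. Note that stripping all digits would also merge names such as "x1a" and "x2a", so the pattern may need adjusting for real filenames.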