BARLAS Marios 247554
2015-Dec-04 10:51 UTC
[R] Ordering Filenames stored in list or vector
Hello everyone,

I am an R rookie and I'm learning as I program.

I am working on a script to process a large amount of data: I read the filenames matching a pattern in the folder I want and import their data.

filenames = list.files(path, pattern="*Q_Read_prist*")

myfiles = lapply(filenames, function(x) read.xlsx2(file=x, sheetName="Data", header=TRUE, FILENAMEVAR=x))

The problem is that R lists the files in a 'non-human' order:

Q_Read_prist#1@1.xls
Q_Read_prist#1@10.xls
Q_Read_prist#1@11.xls
Q_Read_prist#1@12.xls
Q_Read_prist#1@13.xls
Q_Read_prist#1@14.xls
Q_Read_prist#1@15.xls
Q_Read_prist#1@16.xls
Q_Read_prist#1@17.xls
Q_Read_prist#1@18.xls
Q_Read_prist#1@19.xls
Q_Read_prist#1@2.xls
Q_Read_prist#1@3.xls
Q_Read_prist#1@4.xls
Q_Read_prist#1@5.xls
Q_Read_prist#1@6.xls
Q_Read_prist#1@7.xls
Q_Read_prist#1@8.xls
Q_Read_prist#1@9.xls

I tried to order them using order() or sort(), but it doesn't seem to work. I had the same issue in MATLAB, but there I have a function to redefine the order in a "correct" way.

Does anyone know of a smart way to sort these from 1 to 19, ascending or descending?

Thanks in advance,
Mario
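For context, the ordering shown in the question is what a character-by-character comparison produces: the names are compared as text, so "10" sorts before "2" because the character "1" comes before "2". The sketch below illustrates this on a handful of names. It assumes the real file names contain "@" (which the list archive displays as " at "), and it uses sort(method = "radix") only to pin the comparison to C-locale rules, so the result does not depend on local collation settings.

```r
# Assumption: the real names contain "@"; five of them are rebuilt here.
filenames <- sprintf("Q_Read_prist#1@%d.xls", c(1L, 2L, 9L, 10L, 19L))

# Text comparison: names starting with "1" come before "2", so 10 and 19
# land ahead of 2 and 9.
as_text <- sort(filenames, method = "radix")
print(as_text)

# The same effect on bare digit strings, next to a true numeric sort.
print(sort(c("1", "2", "9", "10", "19"), method = "radix"))
print(sort(c(1, 2, 9, 10, 19)))
```

Any fix therefore has to get the number out of the name and compare it as a number, or pad it so that text order and numeric order agree.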
The thread below has a number of solutions. I personally like the one with sprintf().

https://stat.ethz.ch/pipermail/r-help/2010-July/246059.html

B.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
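One way to read the sprintf() suggestion is sketched below. This is an illustration of the idea only, not the code from the linked thread. If the numbering is known, the names can be generated directly in numeric order instead of being sorted afterwards. The count of 19 files and the "@" in the name template are assumptions taken from the listing in the original question.

```r
# Assumption: files are numbered 1..19 with no gaps.
n_files <- 19L

# Build the names in numeric order; no sorting step is needed.
filenames <- sprintf("Q_Read_prist#1@%d.xls", seq_len(n_files))
print(head(filenames, 3))

# Zero padding ("%02d") makes text order and numeric order coincide,
# which helps when you control how the files are named.
padded <- sprintf("Q_Read_prist#1@%02d.xls", seq_len(n_files))
print(identical(sort(padded, method = "radix"), padded))
```

If some numbers might be missing on disk, the generated names could be filtered with file.exists() before reading.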
> filenames <- c("Q_Read_prist#1@1.xls", "Q_Read_prist#1@10.xls", "Q_Read_prist#1@2.xls")
> filenames <- gtools::mixedsort(filenames, numeric.type="decimal")
> filenames
[1] "Q_Read_prist#1@1.xls"  "Q_Read_prist#1@2.xls"  "Q_Read_prist#1@10.xls"

/Henrik

On Fri, Dec 4, 2015 at 7:53 AM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
> The thread below has a number of solutions. I personally like the one with sprintf().
> https://stat.ethz.ch/pipermail/r-help/2010-July/246059.html
>
> B.
Mario,

I am certain there are more elegant solutions. This is an effort to make the process clear by dividing out each transformation used into separate lines.

## Start of code
library(stringi) # This is written in C and C++ (ICU library), is fast, and is well documented.

filenames <- c("Q_Read_prist#1@1.xls", "Q_Read_prist#1@10.xls",
               "Q_Read_prist#1@11.xls", "Q_Read_prist#1@12.xls",
               "Q_Read_prist#1@13.xls", "Q_Read_prist#1@14.xls",
               "Q_Read_prist#1@15.xls", "Q_Read_prist#1@16.xls",
               "Q_Read_prist#1@17.xls", "Q_Read_prist#1@18.xls",
               "Q_Read_prist#1@19.xls", "Q_Read_prist#1@2.xls",
               "Q_Read_prist#1@3.xls", "Q_Read_prist#1@4.xls",
               "Q_Read_prist#1@5.xls", "Q_Read_prist#1@6.xls",
               "Q_Read_prist#1@7.xls", "Q_Read_prist#1@8.xls",
               "Q_Read_prist#1@9.xls")

# Split each name at "@" and "."; the second piece is the numeric index.
indx_list <- stri_split_regex(filenames, pattern = "[@.]")
indx <- sapply(indx_list, function(x) {x[[2]]})
filenames_df <- data.frame(file_name = filenames, indx = indx,
                           stringsAsFactors = FALSE)
# Order the rows by the index converted to a number, then keep the names.
filenames_ordered <- filenames_df[order(as.numeric(filenames_df$indx)), "file_name"]
filenames_ordered
## end of code

Mark

R. Mark Sharp, Ph.D.
Director of Primate Records Database
Southwest National Primate Research Center
Texas Biomedical Research Institute
P.O. Box 760549
San Antonio, TX 78245-0549
Telephone: (210)258-9476
e-mail: msharp at TxBiomed.org
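The same split-and-order steps can also be written with base R only. The sketch below is a compact variant, not a replacement for the stringi version: order_by_index() is a hypothetical helper name, and the regular expression assumes every name has the form "<anything>@<digits>.<extension>". A name that does not match would yield NA with a warning from as.numeric().

```r
# Hypothetical helper: order file names by the digits between "@" and
# the extension.
order_by_index <- function(filenames, decreasing = FALSE) {
  # Keep only the captured digits, then compare them as numbers.
  idx <- as.numeric(sub("^.*@([0-9]+)\\.[^.]*$", "\\1", filenames))
  filenames[order(idx, decreasing = decreasing)]
}

filenames <- c("Q_Read_prist#1@1.xls", "Q_Read_prist#1@10.xls",
               "Q_Read_prist#1@2.xls")

ascending  <- order_by_index(filenames)
descending <- order_by_index(filenames, decreasing = TRUE)
print(ascending)
print(descending)
```

The decreasing argument covers the "ascending or descending" part of the original question.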
BARLAS Marios 247554
2015-Dec-07 10:59 UTC
[R] Ordering Filenames stored in list or vector
Thanks a lot for the clarifying code, Mark!

Actually, I took the lazy option: after some digging around I found a package called "naturalsort", which provides a pretty compact solution!

As a rookie, I have another question. My main interest in R is that I hope to script a good amount of data crunching on electrical measurements and get nice visualization options, all in the same tool. In your experience, is R a proper tool for such use? So far I have been using MATLAB + OriginPro for processing and visualization, but now that I am starting my PhD I want something more "integrated".

Thanks,
Mario

________________________________________
From: Mark Sharp [msharp at TxBiomed.org]
Sent: Friday, December 04, 2015 5:25 PM
To: BARLAS Marios 247554
Cc: r-help at r-project.org
Subject: Re: [R] Ordering Filenames stored in list or vector

Mario,

I am certain there are more elegant solutions. This is an effort to make the process clear by dividing out each transformation used into separate lines.
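Whichever sorting method is used, the names need to be reordered before the files are read, so that the imported list ends up in the same order. The sketch below walks through that workflow under stated assumptions: it writes three small CSV stand-ins into a temporary directory (the real files are .xls and would be read with read.xlsx2()), and the directory name is hypothetical. Note that the pattern argument of list.files() is a regular expression, not a shell wildcard; utils::glob2rx() converts a wildcard if that form is preferred.

```r
# Work in a scratch directory so the sketch is self-contained.
demo_dir <- file.path(tempdir(), "natural_order_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Create three small stand-in files (CSV instead of .xls), numbered 1, 2, 10.
for (i in c(1L, 2L, 10L)) {
  write.csv(data.frame(run = i),
            file.path(demo_dir, sprintf("Q_Read_prist#1@%d.csv", i)),
            row.names = FALSE)
}

# 'pattern' is a regular expression.
filenames <- list.files(demo_dir, pattern = "^Q_Read_prist.*\\.csv$")

# Reorder the names by their numeric index before reading anything.
idx <- as.numeric(sub("^.*@([0-9]+)\\.csv$", "\\1", filenames))
filenames <- filenames[order(idx)]

# Read in the sorted order and name each element after its file.
myfiles <- lapply(file.path(demo_dir, filenames), read.csv)
names(myfiles) <- filenames

# Check: the 'run' value stored in each file follows the sorted order.
runs <- vapply(myfiles, function(d) as.numeric(d$run[1]), numeric(1))
print(runs)
```

Naming the list elements after their files keeps each data frame traceable to its source when the results are later combined or plotted.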