Hi All--

I used the base R list.files function to read files from a directory. The file names are months (April, August, etc.), so the system reads them in alphabetical order, but I want to reorder them in calendar order (January, February, ... December). I thought I might be able to do it via regex or possibly the gtools package; I am wondering if there is an easier way.

Thanks--EK

Example

    path = "C:/Users/name/Downloads/MyFiles"
    file.names <- dir(path, pattern =".PDF")

Example output

Output:
"February.PDF" "January.PDF" "March.PDF"

Desired output
"January.PDF" "February.PDF" "March.PDF"
Hi

You could use a brute-force approach. Just print out file.names and work out the ordering vector by hand.

In the Czech locale it is

    oo <- c(6, 11, 1, 4, 5, 2, 3, 10, 12, 9, 7, 8)

In the English locale it is different :-)

After that,

    file.names[oo]

should give you the correct order of file names.

Cheers
Petr

> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of Ek Esawi
> Sent: Tuesday, October 9, 2018 3:44 PM
> To: r-help at r-project.org
> Subject: [R] Reorder file names read by list.files function
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
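The ordering vector does not have to be worked out by hand. A minimal sketch, assuming the files are named exactly after the English month names in month.name with a .PDF extension; file.names here is a stand-in built in memory rather than read from a directory:

```r
# Stand-in for dir(path, pattern = "\\.PDF$"): twelve names in alphabetical order.
# method = "radix" sorts as in the C locale, so the result does not depend on the locale.
file.names <- sort(paste0(month.name, ".PDF"), method = "radix")

# Position of each calendar month within the alphabetical listing.
oo <- match(paste0(month.name, ".PDF"), file.names)

oo              # the ordering vector for English month names
file.names[oo]  # the same names in calendar order
```

For English month names this gives an ordering vector that differs from the Czech one above, as expected.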
Hello,

You can use the built-in variable month.name to get the calendar order and match it against your file names.

    i <- match(sub("\\.PDF", "", file.names), month.name)
    file.names[order(i)]  # order(i), not i, so that a subset of the months also works
    #[1] "January.PDF"  "February.PDF" "March.PDF"

Hope this helps,

Rui Barradas
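When only some of the months are present, indexing with the result of match() directly can run past the end of the vector; wrapping it in order() avoids that. A minimal sketch, assuming file names of the form Month.PDF; the vector below is made up for illustration:

```r
# Hypothetical directory listing with only four months, in alphabetical order.
file.names <- c("April.PDF", "December.PDF", "February.PDF", "March.PDF")

# Month number of each file (4, 12, 2, 3); NA for names that are not months.
month.no <- match(sub("\\.PDF$", "", file.names), month.name)

# order() turns the month numbers into a permutation of seq_along(file.names).
sorted.names <- file.names[order(month.no)]
sorted.names
```

Names that are not month names get NA from match() and end up last, because order() puts NAs at the end by default.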
Instead of changing the order in which you read the files, perhaps your analysis will work if you sort the data after you read it in. This may require that you add the month names as a column in the data frames, or you may already have dates in the data that you could sort by.

One idea:

    fnames <- paste0( month.name, ".PDF" )
    resultdf <- do.call( rbind, lapply( fnames, function(fn) { read.csv( file.path( "datadir", fn ), as.is=TRUE ) } ) )

but that only works if there are exactly 12 files. If there could be fewer, perhaps:

    fnames <- list.files( "datadir" )
    sfnames <- fnames[ order( match( sub("\\.PDF", "", fnames ), month.name ) ) ]
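The "sort after reading" idea could look roughly like this. It is only a sketch: the small data frames stand in for whatever is actually read from each file, and the column names value and month are made up for illustration:

```r
# Hypothetical per-file results, keyed by file name, in the order the files were listed.
# In practice each element would come from reading one file.
results <- list(
  "February.PDF" = data.frame(value = 2),
  "January.PDF"  = data.frame(value = 1),
  "March.PDF"    = data.frame(value = 3)
)

# Tag every data frame with its month, stored as a factor whose levels are in calendar order.
tagged <- Map(function(df, fn) {
  df$month <- factor(sub("\\.PDF$", "", fn), levels = month.name)
  df
}, results, names(results))

# Combine, then sort by the month column.
resultdf <- do.call(rbind, tagged)
resultdf <- resultdf[order(resultdf$month), ]
rownames(resultdf) <- NULL
resultdf
```

Storing the month as a factor with levels = month.name is what makes order() sort by calendar position rather than alphabetically.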
Hi again,

I worked with Rui's idea of using the match function with month.name. I got numerical values for the months, then I sorted them and pasted on the PDF file extension. It gave me the file order I wanted, but now statements 8, 9, and 10 don't work, and I keep getting the error listed below. The dilemma is that if I add full.names=TRUE in statement 6, then statements 9 and 10 don't produce what they did earlier. If I put full.names=FALSE, then I am back to square one. Any idea is greatly appreciated.

The code

 1. install.packages("tabulizer")
 2. install.packages("stringr")
 3. library(stringr)
 4. library(tabulizer)
 5. path = "C:/Users/namei/Documents/TextMining/S2017"
 6. file.names <- dir(path, pattern =".PDF",full.names = TRUE)
 7. file.names <- str_remove(file.names,"\\s[0-9][0-9]")
 8. FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
 9. FNs1 <- paste0(month.name[FNs],".","PDF")
10. A <- lapply(FNs1, function(i) extract_tables(i))

Output and the error message.

> path = "C:/Users/eesawi/Documents/TextMining/S2017"
> file.names <- dir(path, pattern =".PDF",full.names = TRUE)
> file.names <- str_remove(file.names,"\\s[0-9][0-9]")
> FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))
> FNs1 <- paste0(month.name[FNs],".","PDF")
> A <- lapply(FNs1, function(i) extract_tables(i))

Error in normalizePath(path.expand(path), winslash, mustWork) :
  path[1]=".PDF": The system cannot find the file specified
Hello,

I would do something along the lines of

    # work in the directory where the files are located
    old_dir <- setwd(path)
    file.names <- list.files(pattern = "\\.PDF")

    [...]

    # When you are done, reset your wd
    setwd(old_dir)

Hope this helps,

Rui Barradas
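One way to make sure the working directory is restored even if something fails part-way is to wrap the work in a function and use on.exit(). A sketch under assumptions: list_month_files is a hypothetical helper name, and a temporary directory with empty files stands in for the real PDF folder:

```r
# Hypothetical helper: list the PDFs in `path` in calendar order,
# restoring the previous working directory when the function exits.
list_month_files <- function(path) {
  old_dir <- setwd(path)
  on.exit(setwd(old_dir), add = TRUE)
  file.names <- list.files(pattern = "\\.PDF$")
  file.names[order(match(sub("\\.PDF$", "", file.names), month.name))]
}

# Demonstration on a throwaway directory with empty stand-in files.
demo_dir <- file.path(tempdir(), "month-demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("March.PDF", "January.PDF", "February.PDF"))))

before <- getwd()
ordered <- list_month_files(demo_dir)
ordered
```

Because the names returned are relative to path, they are only usable while the working directory is set to path, or after being joined back onto it with file.path(path, ordered).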
Use basename(filename) to remove the leading parts of the full path to the file. E.g., replace

    FNs <- sort(match(sub("\\.PDF", "", file.names), month.name))

with (the untested)

    FNs <- sort(match(sub("\\.PDF", "", basename(file.names)), month.name))

Bill Dunlap
TIBCO Software
wdunlap tibco.com
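The ".PDF" in the error message can be traced step by step: with full.names = TRUE, sub() leaves the directory part in place, so nothing matches month.name, sort() drops the resulting NAs, and paste0() is then called on an empty vector. Below is a sketch of that, followed by a variant that matches on basename() but reorders the full paths. The paths are made up and no files are touched:

```r
# Made-up full paths, as dir(path, full.names = TRUE) would return them.
file.names <- c("C:/docs/S2017/February.PDF",
                "C:/docs/S2017/January.PDF",
                "C:/docs/S2017/March.PDF")

# Without basename(): nothing matches month.name, so every entry is NA ...
no.match <- match(sub("\\.PDF$", "", file.names), month.name)
# ... sort() drops the NAs, leaving integer(0), and paste0() then yields ".PDF".
broken <- paste0(month.name[sort(no.match)], ".", "PDF")

# With basename(): match on the file name only, but reorder the full paths,
# so the result can be passed straight to a function that opens the files.
month.no <- match(sub("\\.PDF$", "", basename(file.names)), month.name)
ordered.paths <- file.names[order(month.no)]
ordered.paths
```

Reordering the original full paths, rather than rebuilding names from month.name, avoids having to paste the directory back on afterwards.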
Thank you Jeff. It is an excellent idea, and I might try it out if nothing else works. And I don't have 12 files in each subdirectory.

EK
Thank you all. Bill's original idea worked well. I did not realize that I had to paste the full directory name onto the correctly ordered file names. Once that was done, it worked well. I will try Rui's idea, and I think Jeff's idea of rearranging the output after extracting the tables might work; I will try it and see.

Thank you all.

EK