Cecília Carmo
2023-Jul-05 10:12 UTC
[R] textual analysis - transforming several pdf to txt - naming the files
convertpdf2txt <- function(dirpath){

  files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
  files <- chartr("\\", "/", files)

  x <- lapply(files, function(x){
    pdftools::pdf_text(x) %>%
      paste0(collapse = " ") %>%
      stringr::str_squish()
  })
  new_names <- tools::file_path_sans_ext(files)
  new_names <- paste(new_names, "txt", sep = ".")
  setNames(x, new_names)
}

# apply function
# note that my test files are in "~/Temp"
txts <- convertpdf2txt(here::here("~", "Temp"))
names(txts)

Thank you very much, but the following error appeared:

Error: unexpected '}' in "}"

Cecília Carmo
Universidade de Aveiro
Rui Barradas
2023-Jul-05 15:43 UTC
[R] textual analysis - transforming several pdf to txt - naming the files
At 11:12 on 05/07/2023, Cecília Carmo wrote:
> convertpdf2txt <- function(dirpath){
>
>   files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
>   files <- chartr("\\", "/", files)
>
>   x <- lapply(files, function(x){
>     pdftools::pdf_text(x) %>%
>       paste0(collapse = " ") %>%
>       stringr::str_squish()
>   })
>   new_names <- tools::file_path_sans_ext(files)
>   new_names <- paste(new_names, "txt", sep = ".")
>   setNames(x, new_names)
> }
>
> # apply function
> # note that my test files are in "~/Temp"
> txts <- convertpdf2txt(here::here("~", "Temp"))
> names(txts)
>
> Thank you very much, but the following error appeared:
>
> Error: unexpected '}' in "}"
>
> Cecília Carmo
> Universidade de Aveiro

Hello,

I tested the code with a couple of PDFs and it ran with no errors or warnings.
That error says that a "}" is unbalanced, but in my code all the braces are balanced; RStudio checks this automatically.

Can you check the code in an editor with syntax highlighting?

Hope this helps,

Rui Barradas
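
As a complement to checking in an editor, R itself can report whether a piece of code parses, without running it. The sketch below is not from the thread: `parses_ok` is a hypothetical helper name and the two snippets are made-up examples. It relies only on base R's `parse()`, which raises an error when it meets a stray "}".

```r
# Hypothetical helper (not from the thread): returns TRUE if `code` parses as R.
# parse() only checks syntax; nothing in `code` is evaluated.
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) {
    # Show the parser's message, which includes the line and column of the problem
    message("Parse error: ", conditionMessage(e))
    FALSE
  })
}

# Made-up snippets for illustration
balanced    <- "f <- function(x) { x + 1 }"
extra_brace <- "f <- function(x) { x + 1 }\n}"   # one closing brace too many

parses_ok(balanced)      # braces match, so this parses
parses_ok(extra_brace)   # the stray "}" on line 2 triggers a parse error
```

For a script saved on disk, `parse(file = "path/to/script.R")` performs the same check; the path here is a placeholder. If the saved file parses cleanly, the error may have come from how the code was pasted or run rather than from the code itself.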