search for: pdf_text

Displaying 4 results from an estimated 4 matches for "pdf_text".

2023 Jul 05
1
textual analysis - transforming several pdf to txt - naming the files
...ade de Aveiro - Portugal
dirpath <- ("/Users/ceciliacarmo/documents/RTextualAnalysis/data/pdfs")
library(pdftools)
library(dplyr)
convertpdf2txt <- function(dirpath){
  files <- list.files(dirpath, full.names = T)
  x <- sapply(files, function(x){
    x <- pdftools::pdf_text(x) %>%
      paste0(collapse = " ") %>%
      stringr::str_squish()
    return(x)
  })
}
# apply function
txts <- convertpdf2txt(here::here("data", "pdf/"))
# add names to txt files
names(txts) <- paste0(here::here("data","pdftext"), 1:...
2023 Jul 05
1
textual analysis - transforming several pdf to txt - naming the files
convertpdf2txt <- function(dirpath){
  files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
  files <- chartr("\\", "/", files)
  x <- lapply(files, function(x){
    pdftools::pdf_text(x) %>%
      paste0(collapse = " ") %>%
      stringr::str_squish()
  })
  new_names <- tools::file_path_sans_ext(files)
  new_names <- paste(new_names, "txt", sep = ".")
  setNames(x, new_names)
}
# apply function
# note that my test files are in "...
2004 Jul 01
1
PDF text strangeness (PR#7043)
Hi R-developers, I have noticed a strange little bug/feature: I often create PDFs of plots, then edit them in Adobe Illustrator. Generally this works great, but whenever I have text that is aligned vertically (usually along the y-axis), the text is written out as lots of individual objects. When the text is horizontal (x-axis, other stuff), it is all one object. I would prefer one object
2008 Mar 29
1
A patch for extending pdf device to embed popup text and web links
...ons */ + /* * Fonts and encodings used on the device */
***************
*** 5149,5154 ****
--- 5154,5166 ----
  cidfontfamily defaultCIDFont;
  /* Record if fonts are used */
  Rboolean fontUsed[100];
+
+ /*
+  * Current text geometry information (stored in PDF_Text)
+  */
+ int text_size;
+ double text_a, text_b, text_x, text_y;
+ double text_ascent, text_descent, text_width;
  } PDFDesc;
***************
*** 5188,5197 ****
--- 5200,5217 ----
  static double PDF_StrWidth(const char *str, const pGEcontext gc, pDevDesc dd);
+ s...