Eric Berger
2023-Dec-30 11:58 UTC
[R] Help request: Parsing docx files for key words and appending to a spreadsheet
full_filename <- paste(filepath, filename,sep="/")

On Sat, Dec 30, 2023 at 1:45 PM Andy <phaedrusv at gmail.com> wrote:
> Thanks Ivan and Calum
>
> I continue to appreciate your support.
>
> Calum, I entered the code snippet you provided, and it returns 'file
> missing'. Looking at this, while the object 'full_filename' exists, what
> is happening is that the path from getwd() is being appended to the
> title of the article, but without the '/' between the end of the path
> name (here 'TEST') and the name of the article. In other words,
> full_filename is reading "~/TESTNow they want us to charge our electric
> cars from litter bins.docx", so logically, this file doesn't exist. To
> work, the '/' needs to be inserted to differentiate between the end of
> the path name and the start of the article name. I've tried both paste0,
> as you suggested, and paste but neither do the trick.
>
> Is this a result of me using the tkinter folder selection that you
> remarked on? I wanted to keep that so that the selection is interactive,
> but if there are better ways of doing this I am open to suggestions.
>
> Thanks again, both.
>
> Best wishes
> Andrew
>
> On 29/12/2023 22:25, CALUM POLWART wrote:
> >
> > help(read_docx) says that the function only imports one docx file. In
> > order to read multiple files, use a for loop or the lapply function.
> >
> > I told you people will suggest better ways to loop!!
> >
> > docx_summary(read_docx("Now they want us to charge our electric cars
> > from litter bins.docx")) should work.
> >
> > Ivan thanks for spotting my fail! Since the OP is new to all this I'm
> > going to suggest a little tweak to this code which we can then build
> > into a for loop:
> >
> > filepath <- getwd() # you will want to change this later. You are doing
> > # something with tcl to pick a directory which seems rather fancy! But
> > # keep doing it for now or set the directory here ending in a /
> >
> > filename <- "Now they want us to charge our electric cars from litter bins.docx"
> >
> > full_filename <- paste0(filepath, filename)
> >
> > #lets double check the file does exist!
> > if (!file.exists(full_filename)) {
> >   message("File missing")
> > } else {
> >   content <- read_docx(full_filename) |>
> >     docx_summary()
> >   # this reads docx for the full filename and
> >   # passes it ( |> command) to the next line
> >   # which summarises it.
> >   # the result is saved in a data frame object
> >   # called content which we shall show some
> >   # heading into from
> >
> >   head(content)
> > }
> >
> > Let's get this bit working before we try and loop
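
A short illustration of why the separator matters, offered as a sketch rather than as part of the original replies. It uses only base R. The folder "~/TEST" is taken from the path Andy quoted; file.path() is not mentioned in the thread and is shown here only as a base R alternative that also joins components with "/".

```r
# Directory and file name as described in the thread.
filepath <- "~/TEST"
filename <- "Now they want us to charge our electric cars from litter bins.docx"

# paste0() glues the pieces together with nothing in between,
# so the folder name runs straight into the file name.
joined_paste0 <- paste0(filepath, filename)

# paste() with sep = "/" inserts the missing separator (Eric's fix).
joined_paste <- paste(filepath, filename, sep = "/")

# file.path() (base R, an addition not from the thread) builds the
# same string by joining its arguments with "/".
joined_file_path <- file.path(filepath, filename)

print(joined_paste0)
print(joined_paste)
print(identical(joined_paste, joined_file_path))
```

If filepath already ended in "/", as Calum's comment suggests setting it, paste0() would give a usable path and paste(..., sep = "/") would produce a doubled slash instead.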
Andy
2023-Dec-30 12:12 UTC
[R] Help request: Parsing docx files for key words and appending to a spreadsheet
Hi Eric

Thanks for that. That seems to fix one problem (the lack of a separator), but introduces a new one when I complete the function Calum proposed:

Error in docx_summary() : argument "x" is missing, with no default

The whole code so far looks like this:

# Load libraries
library(tcltk)
library(tidyverse)
library(officer)

filepath <- setwd(tk_choose.dir())

filename <- "Now they want us to charge our electric cars from litter bins.docx"

#full_filename <- paste0(filepath, filename) # Calum's original suggestion

full_filename <- paste(filepath, filename, sep="/") # Eric's proposed fix

#lets double check the file does exist! # The rest here is Calum's suggestion
if (!file.exists(full_filename)) {
  message("File missing")
} else {
  content <- read_docx(full_filename)
  docx_summary()
  # this reads docx for the full filename and
  # passes it ( |> command) to the next line
  # which summarises it.
  # the result is saved in a data frame object
  # called content which we shall show some
  # heading into from
  head(content)
}

Running this, results in the error cited above.

Thanks as always :-)

On 30/12/2023 11:58, Eric Berger wrote:
> full_filename <- paste(filepath, filename,sep="/")
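
A possible reading of the error, sketched in base R so it runs without officer or a .docx file. In the code as posted, the |> from Calum's version no longer appears between read_docx(full_filename) and docx_summary(), so docx_summary() is called on its own with no argument. The functions read_stub() and summary_stub() and the path below are hypothetical stand-ins that only mimic the one-argument shape of read_docx(path) and docx_summary(x). The last part is a separate observation, not raised in the thread: setwd() returns the previous working directory, so filepath may not hold the folder that was just chosen.

```r
# Hypothetical stand-ins for read_docx(path) and docx_summary(x).
read_stub <- function(path) list(source = path)
summary_stub <- function(x) data.frame(source = x$source)

full_filename <- "~/TEST/example.docx"  # hypothetical path, never opened

# Without the pipe these are two separate statements. The second call
# receives no argument, so R stops as soon as x is needed.
msg <- tryCatch(
  {
    content <- read_stub(full_filename)
    summary_stub()
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With the pipe, the result on the left becomes the first argument
# of the call on the right.
content <- read_stub(full_filename) |>
  summary_stub()
print(content)

# Separate point: setwd() returns the directory that was current
# BEFORE the change, not the new one.
old_dir <- getwd()
chosen_dir <- tempdir()          # stands in for tk_choose.dir()
returned <- setwd(chosen_dir)
setwd(old_dir)                   # restore the original directory
print(identical(returned, old_dir))
```

With officer loaded, the piped form corresponds to Calum's original read_docx(full_filename) |> docx_summary(). For the directory, storing the result of tk_choose.dir() directly in filepath, and calling setwd() separately if it is needed at all, would avoid relying on setwd()'s return value.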