Eric Berger
2023-Dec-30 11:58 UTC
[R] Help request: Parsing docx files for key words and appending to a spreadsheet
full_filename <- paste(filepath, filename, sep="/")

On Sat, Dec 30, 2023 at 1:45 PM Andy <phaedrusv at gmail.com> wrote:
> Thanks Ivan and Calum
>
> I continue to appreciate your support.
>
> Calum, I entered the code snippet you provided, and it returns 'file
> missing'. Looking at this, while the object 'full_filename' exists, what
> is happening is that the path from getwd() is being appended to the
> title of the article, but without the '/' between the end of the path
> name (here 'TEST' and the name of the article. In other words,
> full_filename is reading
> "~/TESTNow they want us to charge our electric cars from litter bins.docx",
> so logically, this file doesn't exist. To work, the '/' needs to be
> inserted to differentiate between the end of the path name and the start
> of the article name. I've tried both paste0, as you suggested, and paste
> but neither do the trick.
>
> Is this a result of me using the tkinter folder selection that you
> remarked on? I wanted to keep that so that the selection is interactive,
> but if there are better ways of doing this I am open to suggestions.
>
> Thanks again, both.
>
> Best wishes
> Andrew
>
> On 29/12/2023 22:25, CALUM POLWART wrote:
> >
> >     help(read_docx) says that the function only imports one docx file. In
> >     order to read multiple files, use a for loop or the lapply function.
> >
> > I told you people will suggest better ways to loop!!
> >
> >     docx_summary(read_docx("Now they want us to charge our electric cars from litter bins.docx"))
> >     should work.
> >
> > Ivan thanks for spotting my fail! Since the OP is new to all this I'm
> > going to suggest a little tweak to this code which we can then build
> > into a for loop:
> >
> > filepath <- getwd() #you will want to change this later. You are doing
> > # something with tcl to pick a directory which seems rather fancy! But
> > # keep doing it for now or set the directory here ending in a /
> >
> > filename <- "Now they want us to charge our electric cars from litter bins.docx"
> >
> > full_filename <- paste0(filepath, filename)
> >
> > #lets double check the file does exist!
> > if (!file.exists(full_filename)) {
> >   message("File missing")
> > } else {
> >   content <- read_docx(full_filename) |>
> >     docx_summary()
> >   # this reads docx for the full filename and
> >   # passes it ( |> command) to the next line
> >   # which summarises it.
> >   # the result is saved in a data frame object
> >   # called content which we shall show some
> >   # heading into from
> >
> >   head(content)
> > }
> >
> > Let's get this bit working before we try and loop
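Why the separator matters: paste0() joins its arguments with nothing in between, so a directory that does not end in "/" runs straight into the file name, while paste(..., sep = "/") or base R's file.path() puts the separator in. A minimal sketch in base R, assuming a hypothetical directory "~/TEST" (borrowed from the path quoted above) rather than a real folder:

```r
# Hypothetical directory, matching the "~/TEST" example quoted above
filepath <- "~/TEST"
filename <- "Now they want us to charge our electric cars from litter bins.docx"

# paste0() adds no separator, so the folder and file names run together
joined_paste0 <- paste0(filepath, filename)

# paste() with sep = "/" inserts the missing separator
joined_paste <- paste(filepath, filename, sep = "/")

# file.path() is the base R helper for building paths; it also joins with "/"
joined_file_path <- file.path(filepath, filename)

print(joined_paste0)
print(joined_paste)
print(joined_file_path)
```

file.path() is probably the more conventional choice when paths are built in several places, since it keeps the separator out of the calling code.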
Andy
2023-Dec-30 12:12 UTC
[R] Help request: Parsing docx files for key words and appending to a spreadsheet
Hi Eric
Thanks for that. That seems to fix one problem (the lack of a
separator), but introduces a new one when I complete the function Calum
proposed:

Error in docx_summary() : argument "x" is missing, with no default
The whole code so far looks like this:
# Load libraries
library(tcltk)
library(tidyverse)
library(officer)

filepath <- setwd(tk_choose.dir())
filename <- "Now they want us to charge our electric cars from litter bins.docx"

#full_filename <- paste0(filepath, filename) # Calum's original suggestion
full_filename <- paste(filepath, filename, sep="/") # Eric's proposed fix

#lets double check the file does exist! # The rest here is Calum's suggestion
if (!file.exists(full_filename)) {
  message("File missing")
} else {
  content <- read_docx(full_filename)
  docx_summary()
  # this reads docx for the full filename and
  # passes it ( |> command) to the next line
  # which summarises it.
  # the result is saved in a data frame object
  # called content which we shall show some
  # heading into from
  head(content)
}
Running this results in the error cited above.
Thanks as always :-)
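The error message is consistent with the pipe having been lost between the two lines in the else branch: without |>, docx_summary() is called with no argument at all. A minimal base R sketch of the mechanism follows. It uses two hypothetical stand-in functions (read_stub and summary_stub are made up for illustration and are not part of officer) so that it runs without a docx file, and the native pipe needs R 4.1 or later:

```r
# Hypothetical stand-ins for read_docx() and docx_summary(); not officer functions
read_stub <- function(path) data.frame(text = c("para one", "para two"))
summary_stub <- function(x) data.frame(n_rows = nrow(x))

# With the pipe, the result of read_stub() becomes the first argument of summary_stub()
piped <- read_stub("some file.docx") |> summary_stub()

# The same call written without a pipe
nested <- summary_stub(read_stub("some file.docx"))

# Without the pipe, summary_stub() receives nothing and R reports the missing argument
err <- tryCatch(summary_stub(), error = function(e) conditionMessage(e))
print(err)
```

Applied to the code above, that would mean keeping read_docx(full_filename) |> docx_summary() as one piped expression, as in Calum's original snippet.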
On 30/12/2023 11:58, Eric Berger wrote:
> full_filename <- paste(filepath, filename,sep="/")
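One further thing that may be worth checking in the code above: setwd() changes the working directory but returns the previous one (invisibly), so filepath <- setwd(...) may not hold the folder that was just chosen. A small base R sketch, using tempdir() as a stand-in for the interactively chosen folder and a hypothetical file name, both assumptions for illustration:

```r
start_dir <- getwd()

# tempdir() stands in for the folder picked with tk_choose.dir()
chosen_dir <- tempdir()

# setwd() returns the directory that was current *before* the change
returned <- setwd(chosen_dir)
print(identical(returned, start_dir))

# Storing the chosen folder directly avoids the ambiguity
filepath <- chosen_dir
full_filename <- file.path(filepath, "example.docx")  # hypothetical file name

# Restore the original working directory
setwd(start_dir)
```

If the working directory does need to change as well, assigning the chosen folder first and then calling setwd(filepath) keeps both steps explicit.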