CALUM POLWART
2023-Dec-29 22:25 UTC
[R] Help request: Parsing docx files for key words and appending to a spreadsheet
> help(read_docx) says that the function only imports one docx file. In
> order to read multiple files, use a for loop or the lapply function.

I told you people will suggest better ways to loop!!

> docx_summary(read_docx("Now they want us to charge our electric cars from litter bins.docx"))
> should work.

Ivan thanks for spotting my fail!

Since the OP is new to all this I'm going to suggest a little tweak to this code which we can then build into a for loop:

    filepath <- getwd()
    # you will want to change this later. You are doing something with tcl
    # to pick a directory which seems rather fancy! But keep doing it for
    # now or set the directory here ending in a /

    filename <- "Now they want us to charge our electric cars from litter bins.docx"

    full_filename <- paste0(filepath, filename)

    # lets double check the file does exist!
    if (!file.exists(full_filename)) {
      message("File missing")
    } else {
      content <- read_docx(full_filename) |>
        docx_summary()
      # this reads docx for the full filename and
      # passes it ( |> command) to the next line
      # which summarises it.
      # the result is saved in a data frame object
      # called content which we shall show some
      # heading into from

      head(content)
    }

Let's get this bit working before we try and loop
Andy
2023-Dec-30 11:44 UTC
[R] Help request: Parsing docx files for key words and appending to a spreadsheet
Thanks Ivan and Calum

I continue to appreciate your support.

Calum, I entered the code snippet you provided, and it returns 'File missing'. Looking at this, while the object 'full_filename' exists, the path from getwd() is being joined directly to the title of the article, without the '/' between the end of the path name (here 'TEST') and the name of the article. In other words, full_filename reads "~/TESTNow they want us to charge our electric cars from litter bins.docx", so logically this file doesn't exist. To work, the '/' needs to be inserted between the end of the path name and the start of the article name. I've tried both paste0, as you suggested, and paste, but neither does the trick.

Is this a result of me using the tkinter folder selection that you remarked on? I wanted to keep that so that the selection is interactive, but if there are better ways of doing this I am open to suggestions.

Thanks again, both.

Best wishes
Andrew
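
For reference, a minimal sketch of the missing-separator problem described above, using base R only. getwd() returns the path without a trailing '/', paste0() joins its arguments with nothing in between, and paste() joins them with a space by default, so neither produces a valid path unless the separator is supplied. file.path() inserts it for you. The folder under tempdir() and the empty placeholder file are assumptions made purely so the example runs anywhere; they stand in for the interactively chosen directory and the real article.

```r
# Base R only. The folder and placeholder file are created just for this
# demonstration and are hypothetical stand-ins for the real directory.
folder <- file.path(tempdir(), "TEST")
dir.create(folder, showWarnings = FALSE)

filename <- "Now they want us to charge our electric cars from litter bins.docx"

# paste0() glues the two strings together with nothing between them,
# which reproduces the ".../TESTNow they want us..." path from the post
broken <- paste0(folder, filename)

# file.path() puts the "/" separator between its arguments
full_filename <- file.path(folder, filename)

# Create an empty placeholder so that file.exists() has something to find.
# It is NOT a valid Word document, so read_docx() would not accept it.
file.create(full_filename)

file.exists(broken)         # FALSE: the separator is missing
file.exists(full_filename)  # TRUE

# Alternative: let list.files() return ready-made full paths for every
# .docx file in the folder, which avoids pasting paths by hand
docx_files <- list.files(folder, pattern = "\\.docx$", full.names = TRUE)
basename(docx_files)
```

paste(folder, filename, sep = "/") would also work; file.path() is simply the more conventional choice. The full paths returned by list.files(..., full.names = TRUE) are probably the easiest starting point once the single-file case works and it is time to loop.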