Hello all,

I need some help with loading text-file data into R for analysis with packages like koRpus.

The problem I am facing is getting R to recognize a folder full of Word files (about 4,000) as data on which koRpus can then perform analyses such as Coleman-Liau indexing. If at all possible, I would prefer to make this work with Word files. The key difficulty is getting R to read the Word files in bulk (that is, all at the same time) so that koRpus can do its thing with them.

My attempts to make this work have all been in vain, but I know that packages like koRpus would be of limited use if there were no way to run them on a large collection of files all at once.

I hope this problem makes sense to someone, and that there is a workable solution to it.

Thanks,
Gordon
You may get a helpful response, but if not, I'd suggest posting the code you have for reading one file. Then lots of people could likely show you how to modify it to read all 4000 files.

Duncan Murdoch

On 02/11/2020 12:28 p.m., Gordon Ballingrud wrote:
> I need some help with loading text-file data into R for analysis with
> packages like koRpus.
> [...]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Thanks; that's a good point. Here is what I have been working with:

library(quanteda)
library(readtext)
texts <- readtext(paste0("/Users/Gordon/Desktop/WPSCASES/", "/word/*.docx"))

And the error message:

Error in list_files(file, ignore_missing, TRUE, verbosity) :
  File '' does not exist.

On Mon, Nov 2, 2020 at 3:15 PM Duncan Murdoch <murdoch.duncan at gmail.com> wrote:
> You may get a helpful response, but if not, I'd suggest posting code you
> have to read one file. Then lots of people could likely show you how to
> modify it to read all 4000 files.
>
> Duncan Murdoch
> [...]
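
One way to narrow down an error like "File '' does not exist" is to check, in base R, what the wildcard pattern actually matches before handing it to readtext(). The following is only a sketch under stated assumptions, not a confirmed diagnosis of the error above: it uses a throwaway folder under tempdir() with made-up file names in place of the real /Users/Gordon/Desktop/WPSCASES/word folder, and the names doc_dir, pattern, matched and docx_files are hypothetical. It builds the path with file.path(), which avoids the doubled slash that paste0("/Users/Gordon/Desktop/WPSCASES/", "/word/*.docx") produces, though whether that doubled slash is the actual cause has not been established in the thread.

```r
# Hypothetical stand-in for the real folder, which is not available here.
doc_dir <- file.path(tempdir(), "WPSCASES", "word")
dir.create(doc_dir, recursive = TRUE, showWarnings = FALSE)

# Create a few empty placeholder files (names are made up for illustration).
file.create(file.path(doc_dir, c("case1.docx", "case2.docx", "notes.txt")))

# file.path() joins path components with a single separator.
pattern <- file.path(doc_dir, "*.docx")

# Check what the wildcard actually matches before calling readtext().
matched <- Sys.glob(pattern)
print(basename(matched))

# Equivalent listing using a regular expression instead of a glob.
docx_files <- list.files(doc_dir, pattern = "\\.docx$", full.names = TRUE)

# Sanity checks: the folder exists and both listings agree.
stopifnot(dir.exists(doc_dir), length(docx_files) == length(matched))

# If the vector is non-empty it could then be passed on, for example:
# texts <- readtext::readtext(docx_files)  # assumes readtext is installed
```

If matched comes back empty when pointed at the real folder, the problem lies in the path or the file extension rather than in readtext or koRpus, which is a much smaller thing to debug.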