Dear Members & Experts,

Since the Dictionary() function is no longer available in the tm package, how can I use other functions to do the same as below? I want to capture a list of specific terms from a corpus. For example, if my corpus has 102 files, I want to see a list with the occurrences of prostatic, adenocarcinoma, and grade in all 102 files. When I use the function Dictionary(), I get the error:

Error: could not find function "Dictionary"

> d <- Dictionary(c("prostatic", "adenocarcinoma", "grade"))
> inspect(DocumentTermMatrix(docs, list(dictionary = d)))

But if I use the code below with inspect(), the output only shows the terms for 10 files instead of 102. I need a way to capture and return those terms, or whatever other terms I select, for all 102 files. I know I am close, but inspect() is not the right function.

> myTerms <- c("prostatic", "adenocarcinoma", "grade")
> inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))
<<DocumentTermMatrix (documents: 102, terms: 3)>>
Non-/sparse entries: 292/14
Sparsity           : 5%
Maximal term length: 14
Weighting          : term frequency (tf)
Sample             :
               Terms
Docs            adenocarcinoma grade prostatic
  Patient14.txt             11     6         3
  Patient15.txt              7    12         2
  Patient16.txt             13    16         4
  Patient19.txt              5    13         2
  Patient24.txt             11    12         4
  Patient25.txt              8     9         4
  Patient41.txt              8    10         4
  Patient46.txt              8    10         3
  Patient8.txt               9    12         2
  Patient9.txt               8    23         2

Thanks

Patrick Casimir, PhD
Health Analytics, Data Science, Big Data Expert & Independent Consultant
C: 954.614.1178

[[alternative HTML version deleted]]
Considering the deafening silence after three repeats, one explanation could be that you are asking the wrong group of people. It is also possible that your failure to follow the Posting Guide with regard to using plain text email and a reproducible example [1][2] means that readers who are not experts do not feel inclined to follow along with you and help you think of solutions. Keep in mind that supporting contributed packages like tm is technically not on topic here, though people often do feel the urge to help solve problems with them anyway.

With regard to asking the wrong group of people, I would suggest asking the maintainer of the tm package what they recommend. See the help for the maintainer function or read the CRAN Web page for that package.

[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
[2] http://adv-r.had.co.nz/Reproducibility.html

--
Sent from my phone. Please excuse my brevity.

On May 19, 2017 7:12:45 AM PDT, Patrick Casimir <patrcasi at nova.edu> wrote:
>Dear Members & Experts,
>
>
>Since the Dictionary () function is no longer available with the tm
>package. How do I use other functions to do the same as below? I want
>to capture a list of specific terms from a corpus. By example, if my
>corpus has 102 files. I want to see a list with occurrences of
>prostatic, adenocarcinoma, grade in all 102 files. When I use the
>function Dictionary (), I got the error: Error: could not find function
>"Dictionary"
>
>
>> d <- Dictionary(c("prostatic", "adenocarcinoma", "grade"))
>> inspect(DocumentTermMatrix(docs, list(dictionary = d)))
>
>
>But if I use the codes below using inspect, the dictionary only returns
>the terms for 10 files instead of 102. I need a way to get my
>dictionary to capture and return those terms for all 102 files or
>whatever other terms I select. I know I am close but inspect () is not
>the right function.
>
>
>> myTerms <- c("prostatic", "adenocarcinoma", "grade")
>> inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))
>
> <<DocumentTermMatrix (documents: 102, terms: 3)>>
> Non-/sparse entries: 292/14
> Sparsity           : 5%
> Maximal term length: 14
> Weighting          : term frequency (tf)
> Sample             :
>                Terms
> Docs            adenocarcinoma grade prostatic
>   Patient14.txt             11     6         3
>   Patient15.txt              7    12         2
>   Patient16.txt             13    16         4
>   Patient19.txt              5    13         2
>   Patient24.txt             11    12         4
>   Patient25.txt              8     9         4
>   Patient41.txt              8    10         4
>   Patient46.txt              8    10         3
>   Patient8.txt               9    12         2
>   Patient9.txt               8    23         2
>
>
>Thanks
>
>
>
>Patrick Casimir, PhD
>Health Analytics, Data Science, Big Data Expert & Independent
>Consultant
>C: 954.614.1178
>
>
>
>	[[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
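
On the inspect() question in the original post: the header of the quoted output already reports "documents: 102", and the label "Sample" suggests that inspect() is printing only part of a matrix that covers all 102 files. The following is a minimal, self-contained sketch of one way to get every row, assuming a tm version in which a plain character vector serves as the dictionary (as in the second snippet of the post). The three toy texts are hypothetical stand-ins for the patient files, which were not posted; as.matrix() is used here as one possible replacement for inspect(), not as the package maintainer's recommendation.

```r
library(tm)

# Hypothetical toy corpus standing in for the 102 patient files.
texts <- c(
  "prostatic adenocarcinoma grade 3 with adenocarcinoma in the left lobe",
  "no tumor seen",
  "high grade prostatic lesion of uncertain grade"
)
docs <- VCorpus(VectorSource(texts))

# A character vector of terms is passed as the dictionary.
myTerms <- c("prostatic", "adenocarcinoma", "grade")
dtm <- DocumentTermMatrix(docs, control = list(dictionary = myTerms))

# as.matrix() converts the sparse matrix to an ordinary matrix with one
# row per document, so no document is left out of the printout.
counts <- as.matrix(dtm)
print(counts)
```

With the real corpus, `counts` would have one row per file and could be saved with write.csv(counts, "term_counts.csv"), where the file name is only an example. For a much larger dictionary or corpus the dense matrix may use a lot of memory, so this sketch is best suited to a short list of terms.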