Hi Patrick,

How could anyone possibly answer this question with only the information you've provided? It's like showing me an empty cup and asking why it's empty. Maybe you didn't put anything in it. Maybe you did and then your dog drank it or your cat knocked it over or your girlfriend drank it. How would I possibly know?

Bottom line, you need to show exactly what you did to produce that result, preferably in the form of a few lines of code that we can run to reproduce your problem.

Finally, you may find it helpful to take some time to learn how to ask questions the smart way. http://catb.org/~esr/faqs/smart-questions.html is a good place to learn this important skill.

Best,
Ista

On Dec 6, 2016 7:58 AM, "Patrick Casimir" <patrcasi at nova.edu> wrote:

> <<DocumentTermMatrix (documents: 4, terms: 0)>>
> Non-/sparse entries: 0/0
> Sparsity           : 100%
> Maximal term length: 0
> Weighting          : term frequency (tf)
>
> [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Thanks Ista. See the code below. I am not sure why the DTM is showing 0 terms. I have 4 documents in the corpus, and I was able to apply transformations to the documents inside the corpus.

> cname <- file.path("C:\\Users\\Desktop\\Text Mining\\Cases\\MyCorpus")
> dir(cname)
[1] "case1.txt" "case2.txt" "case3.txt" "case4.txt"
> library(tm)
> docs <- Corpus(DirSource(cname))
> install.packages("magrittr", dependencies = TRUE)
> library(magrittr)  # needed for %>% and extract2() used in viewDocs
> viewDocs <- function(d, n) {d %>% extract2(n) %>% as.character() %>% writeLines()}
> viewDocs(docs, 1)
> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
> docs <- tm_map(docs, toSpace, "/|@|nn|")
> inspect(docs[1])
> docs <- tm_map(docs, removePunctuation)
> docs <- tm_map(docs, removeWords, stopwords("english"))
> inspect(docs[1])
> docs <- tm_map(docs, stripWhitespace)
> docs <- tm_map(docs, stemDocument)
> dtm <- DocumentTermMatrix(docs)
> dtm
<<DocumentTermMatrix (documents: 4, terms: 0)>>
Non-/sparse entries: 0/0
Sparsity           : 100%
Maximal term length: 0
Weighting          : term frequency (tf)

________________________________
From: Ista Zahn <istazahn at gmail.com>
Sent: Tuesday, December 6, 2016 9:09:37 AM
To: Patrick Casimir
Cc: r-help at r-project.org
Subject: Re: [R] Why is DocumentTermMatrix showing 0 term?
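Since toSpace in the code above is only a wrapper around gsub(), one quick check is to try the pattern on a plain string before handing it to tm_map(). The sketch below uses base R only. The names to_space and sample_text are made up for illustration because the contents of the real files were not posted, and whether the pattern has anything to do with the empty matrix is an assumption that the thread does not confirm.

```r
# Stand-alone version of the toSpace idea: replace every match with a space.
to_space <- function(x, pattern) gsub(pattern, " ", x)

# Hypothetical sample text; the real case1.txt ... case4.txt were not posted.
sample_text <- "appeal filed 12/06 by counsel@court"

# A pattern with only the two literal alternatives "/" and "@".
clean <- to_space(sample_text, "/|@")
print(clean)

# The pattern as it appears in the post ends in "|", which leaves an empty
# alternative. Printing the result shows what the regex engine does with it.
as_posted <- to_space(sample_text, "/|@|nn|")
print(as_posted)

# DocumentTermMatrix() drops tokens shorter than three characters by default
# (wordLengths = c(3, Inf)), so the token lengths are worth a look as well.
print(nchar(strsplit(as_posted, " +")[[1]]))
```

If the second result looks different from what was intended, running the same check after each later tm_map() step narrows down where the text gets lost.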
What is in docs? What does inspect(docs) say?

--Ista

On Tue, Dec 6, 2016 at 9:29 AM, Patrick Casimir <patrcasi at nova.edu> wrote:
> Thanks Ista. See the code below. I am not sure why the DTM is showing 0 terms. I
> have 4 documents in the corpus, and I was able to apply transformations
> to the documents inside the corpus.
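To answer "What is in docs?" in a form that others can run, one option is to build a tiny corpus in memory and look at the text itself as well as the resulting matrix. This is a minimal sketch, assuming the tm package is installed. The two texts are hypothetical stand-ins for the unposted files, and VCorpus(VectorSource(...)) is used here in place of the Corpus(DirSource(...)) call from the original post so that no files are needed.

```r
library(tm)

# Hypothetical stand-in texts; the real case files were not posted.
texts <- c("The court dismissed the appeal.",
           "The appeal was filed late.")
docs <- VCorpus(VectorSource(texts))

# Print the raw text of every document, not just the corpus summary.
for (i in seq_len(length(docs))) {
  writeLines(as.character(docs[[i]]))
}

# Count characters per document; a zero here would explain an empty matrix.
doc_chars <- vapply(
  seq_len(length(docs)),
  function(i) sum(nchar(as.character(docs[[i]]))),
  integer(1)
)
print(doc_chars)

# Build the matrix and report its dimensions and terms.
dtm <- DocumentTermMatrix(docs)
print(dim(dtm))
print(Terms(dtm))
```

Repeating the character count after each tm_map() step in the real script would show which transformation, if any, empties the documents.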
Fortune Nomination!

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Dec 6, 2016 at 6:09 AM, Ista Zahn <istazahn at gmail.com> wrote:
> How could anyone possibly answer this question with only the information
> you've provided? It's like showing me an empty cup and asking why it's
> empty. Maybe you didn't put anything in it. Maybe you did and then your dog
> drank it or your cat knocked it over or your girlfriend drank it. How would
> I possibly know?