Kum-Hoe Hwang
2009-Jan-09 14:21 UTC
[R] how to build TermDocMatrix in tm text mining package of R
Howdy Gurus,

I'd like to ask how to build a TermDocMatrix with the tm text mining package. It is not clear to me how to import a plain text file and then convert that text file into a TermDocMatrix.

How can I build a TermDocMatrix from a plain text document file for text association? Or are there any good manuals?

Thank you in advance,

--
Kum-Hoe Hwang, Ph.D.
Phone : 82-31-250-3516
Email : phdhwang@gmail.com
Tony Breyal
2009-Jan-09 15:39 UTC
[R] how to build TermDocMatrix in tm text mining package of R
Hi there,

I think something like the following is what you want:

### R start...
# if you put your plain text files in a folder like this
my.path <- 'C:\\Documents and Settings\\tony\\Desktop\\texts\\'

# then you can construct a simple tdm like this
library(tm)
my.corpus <- Corpus(DirSource(my.path), readerControl = list(reader = readPlain))
my.tdm <- TermDocMatrix(my.corpus)

# this should show how words are distributed in the first text document
my.tdm[1, ]
### R end.

By the way, there are some nice examples of using the tm package in the last R News issue (Volume 8/2, October 2008), under the section 'An Introduction to Text Mining in R':
http://cran.r-project.org/doc/Rnews/Rnews_2008-2.pdf

Hope that helps a little bit,
Tony Breyal
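For readers on a more recent tm release: as far as I know, TermDocMatrix() was later replaced by TermDocumentMatrix() (terms as rows) and DocumentTermMatrix() (documents as rows), so the code above may not run unchanged. The following is only a sketch under that assumption. The temporary folder, the three sample files, and the search term "mining" are made up for illustration and are not from the original posts; findAssocs() is used here as one possible way to look at the "text association" the question asks about.

```r
library(tm)

# Hypothetical sample data: three small plain text files in a temporary folder
my.path <- file.path(tempdir(), "texts")
dir.create(my.path, showWarnings = FALSE)
writeLines("text mining in R", file.path(my.path, "doc1.txt"))
writeLines("mining text documents", file.path(my.path, "doc2.txt"))
writeLines("plotting data in R", file.path(my.path, "doc3.txt"))

# Read every file in the folder as a plain text document
my.corpus <- VCorpus(DirSource(my.path),
                     readerControl = list(reader = readPlain, language = "en"))

# Terms as rows, documents as columns
my.tdm <- TermDocumentMatrix(my.corpus)

# Documents as rows, terms as columns
my.dtm <- DocumentTermMatrix(my.corpus)

# Print a summary and a sample of the term counts
inspect(my.tdm)

# Terms whose counts correlate with "mining" across the documents
findAssocs(my.tdm, "mining", corlimit = 0.5)
```

To use your own files, point my.path at the folder that holds them and drop the writeLines() calls. Note that, by default, very short words (fewer than three characters) are not kept as terms.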