Hi all,
I have a document-term matrix and I would like to draw a heatmap (geom_tile) of
the 20 words most associated with a specific word in it. Here is my dtm:
library(tm)
corpus=Corpus(VectorSource(data$Message))
corpus=tm_map(corpus,content_transformer(tolower))
corpus=tm_map(corpus,removePunctuation)
corpus=tm_map(corpus,removeWords,c(stopwords("english")))
corpus=tm_map(corpus,stemDocument,"english")
frequencies=DocumentTermMatrix(corpus)
frequencies=removeSparseTerms(frequencies,0.995)
frequencies
<<DocumentTermMatrix (documents: 16630, terms: 399)>>
Non-/sparse entries: 118557/6516813
Sparsity           : 98%
Maximal term length: 43
Weighting          : term frequency (tf)
And this is the word whose 20 most associated words in the dtm I am looking for:
word=c("problem")
corr <- c(0.7, 0.75, 0.1)
my_assocs <- findAssocs(frequencies, word,corr)
My problem is the ggplot call, which should contain only the 20 most associated
words. How should I pass these to ggplot?
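
For reference, here is a minimal sketch of one way this could go. It is not a definitive answer. It assumes a single low correlation limit (0) in place of the three-element corr vector, since as far as I can tell only one limit is used per term, and then keeps the first 20 entries of the result. The toy messages, msgs, dtm, top and plot_df are hypothetical names used only for illustration; they stand in for data$Message and frequencies above.

```r
library(tm)
library(ggplot2)

# Hypothetical toy messages standing in for data$Message
msgs <- c("problem with login page",
          "login problem again",
          "payment failed",
          "problem with payment page",
          "great service")
dtm <- DocumentTermMatrix(Corpus(VectorSource(msgs)))

word <- "problem"

# findAssocs() returns a named list with one element per query term;
# each element is a named numeric vector sorted by decreasing correlation.
# A limit of 0 keeps every term with a non-negative correlation.
assocs <- findAssocs(dtm, word, 0)[[word]]

# Keep at most the 20 strongest associations
top <- head(assocs, 20)

# Long-format data frame for ggplot: one row per associated term
plot_df <- data.frame(
  word        = word,
  term        = factor(names(top), levels = rev(names(top))),
  correlation = as.numeric(top)
)

# One tile per associated term, coloured by its correlation with the word
p <- ggplot(plot_df, aes(x = word, y = term, fill = correlation)) +
  geom_tile() +
  labs(x = NULL, y = NULL, fill = "Correlation")
print(p)
```

With the real data, dtm would be replaced by frequencies. The factor levels are reversed so that the strongest association is drawn at the top of the heatmap.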
Thanks for any help.
Elahe