Elahe chalabi
2018-Nov-15 15:14 UTC
[R] create a heatmap for findAssocs results based on time
Hi all,

I have the following data, for which I first create a document-term matrix and then add the available time to the DTM. To see how the correlations with the term "updat" differ between years, I would like a heatmap of the findAssocs results with time on the x-axis.

> library(tm)
> library(ggplot2)
> dput(df)
structure(list(Description = structure(c(5L, 8L, 6L, 4L, 1L,
2L, 7L, 9L, 10L, 3L), .Label = c("general topics done", "keep the general topics updated",
"rejected topic ", "several topics in hand", "this is a genetal topic",
"topic 333555 needs to be updated", "topic 5647 is handed over",
"topic is updated", "update the topic ", "updating the topic is done "
), class = "factor")), class = "data.frame", row.names = c(NA, -10L))
> corpus=Corpus(VectorSource(df$Description))
> corpus=tm_map(corpus,tolower)
> corpus=tm_map(corpus,removePunctuation)
> corpus=tm_map(corpus,removeWords,c(stopwords("english")))
> corpus=tm_map(corpus,stemDocument,"english")
> frequenciescontrol=DocumentTermMatrix(corpus)
> frequenciescontrol$time=c("2015","2015","2015","2015","2015","2016","2016","2016","2016","2016")
> findAssocs(frequenciescontrol, "updat", 0.01)

The heatmap should look like this:
y-axis: all the words correlated with "updat"
x-axis: years
legend: correlation

Thanks for any help.
Elahe
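
One possible way to get such a heatmap is sketched below, under several assumptions that are not part of the original post. The ten descriptions are taken in the row order implied by the dput output, and the years are kept in a separate vector rather than attached to the DTM. The DTM is split by year, findAssocs is run on each subset, and the results are stacked into a long data frame for geom_tile. The names descriptions, years, dtm, assoc_for_year and assoc_by_year are illustrative only. VCorpus with content_transformer(tolower) is used here in place of Corpus with a bare tolower, and stemDocument needs the SnowballC package to be installed.

```r
library(tm)       # stemDocument also requires the SnowballC package
library(ggplot2)

# Descriptions in the row order given by the dput() output (assumption:
# rows 1-5 belong to 2015 and rows 6-10 to 2016, as in the post).
descriptions <- c(
  "this is a genetal topic", "topic is updated",
  "topic 333555 needs to be updated", "several topics in hand",
  "general topics done", "keep the general topics updated",
  "topic 5647 is handed over", "update the topic ",
  "updating the topic is done ", "rejected topic "
)
years <- rep(c("2015", "2016"), each = 5)

# Same preprocessing steps as in the post.
corpus <- VCorpus(VectorSource(descriptions))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, "english")
dtm <- DocumentTermMatrix(corpus)

# Run findAssocs on the documents of one year and return a long data frame.
assoc_for_year <- function(y) {
  sub_dtm <- dtm[which(years == y), ]
  a <- findAssocs(sub_dtm, "updat", 0.01)[["updat"]]
  if (length(a) == 0) return(NULL)  # no term reaches the correlation limit
  data.frame(year = y, term = names(a), correlation = as.numeric(a),
             stringsAsFactors = FALSE)
}
assoc_by_year <- do.call(rbind, lapply(unique(years), assoc_for_year))

# Heatmap: years on the x-axis, correlated terms on the y-axis.
p <- ggplot(assoc_by_year, aes(x = year, y = term, fill = correlation)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "lightyellow", high = "red") +
  labs(x = "Year", y = "Terms correlated with \"updat\"", fill = "Correlation")
print(p)
```

Terms that do not reach the correlation limit in a given year simply have no tile for that year. With only five documents per year the correlations are very coarse, so this is meant as a starting point rather than a finished analysis.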