Displaying 20 results from an estimated 35 matches for "tm_map".
2012 Feb 26
2
tm_map help
Hi all,
I am trying to do some text mining with Twitter, and I get the following error when I use tm_map:
Error in structure(names(sapply(possibleCompletions, "[", 1)), names = x) :
'names' attribute [1] must be the same length as the vector [0]
Has anyone seen this error before? The code I have is shown below.
The error only occurs with #qantas; hashtags like #asx and #obama
work OK.
Appreciate any help.
Thanks,
Sachin
library(twitteR)
library(tm)
library(wordcloud)
hashTag<-function (hashTag, minFreq){
tweets<- searc...
2012 Oct 25
2
Text mining
...e(tm)
require(wordcloud)
tw.df=twListToDF(tweets)
RemoveAtPeople <- function(x){gsub("@\\w+", "",x)}
df<- as.vector(sapply(tw.df$text, RemoveAtPeople))
#The following is cribbed and seems to do what it says on the can
tw.corpus = Corpus(VectorSource(df))
tw.corpus = tm_map(tw.corpus, function(x) iconv(enc2utf8(x), sub = "byte"))
tw.corpus = tm_map(tw.corpus, tolower)
tw.corpus = tm_map(tw.corpus, removePunctuation)
tw.corpus = tm_map(tw.corpus, function(x) removeWords(x, c(stopwords("spanish"),"rt")))
tw.corpus = tm_map(tw.corpus,...
2009 Nov 12
2
package "tm" fails to remove "the" with remove stopwords
...fo() below.
Thanks!
Mark
require(tm)
myDocument <- c("the rain in Spain", "falls mainly on the plain", "jack and
jill ran up the hill", "to fetch a pail of water")
text.corp <- Corpus(VectorSource(myDocument))
#########################
text.corp <- tm_map(text.corp, stripWhitespace)
text.corp <- tm_map(text.corp, removeNumbers)
text.corp <- tm_map(text.corp, removePunctuation)
## text.corp <- tm_map(text.corp, stemDocument)
text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
dtm <- DocumentT...
2013 Sep 26
0
R hangs at NGramTokenizer
...; invisible(clusterEvalQ(cl, library(RTextTools)))
> myCorpus <-Corpus(DirSource("/home/neeph/Test/DMOZ_Business"), encoding="UTF-8", readerControl=list(reader=readPlain))
> removeURL <- function(x) gsub("http[[:alnum:]]*", "", x)
> myCorpus <- tm_map(myCorpus, removeURL)
> removeAmp <- function(x) gsub("&", "", x)
> myCorpus <- tm_map(myCorpus, removeAmp)
> removeWWW <- function(x) gsub("www[[:alnum:]]*", "", x)
> myCorpus <- tm_map(myCorpus, removeWWW)
> myCorpus <- tm_ma...
2014 Jul 29
2
wordcloud and word table [Making progress]
...n R: 3.1.1
require(tm)
require(wordcloud)
require(Rcpp)
tmpinformes<-data.frame(c("todo el informe 2005", "todo el informe
2013"), row.names=c("2005", "2013"))
ds<- DataframeSource(tmpText)
ds<- DataframeSource(tmpinformes)
corp = Corpus(ds)
corp = tm_map(corp,removePunctuation)
corp = tm_map(corp,content_transformer(tolower))
corp = tm_map(corp,removeNumbers)
corp = tm_map(corp, stripWhitespace)
corp = tm_map(corp, removeWords, sw)
corp = tm_map(corp, removeWords, stopwords("spanish"))
term.matrix<- TermDocumentMatrix(corp)
term.matrix...
2014 Jul 22
2
Help: Error in `colnames<-`(`*tmp*`, value = c(
...2)
> d1<-readLines(txt1, encoding="UTF-8")
> d1<-iconv(enc2utf8(d1), sub = "byte")
> d2<-readLines(txt2, encoding="UTF-8")
> d2<-iconv(enc2utf8(d2), sub = "byte")
> df<-c(d1,d2)
> corpus<-Corpus(VectorSource(df))
> d<-tm_map(corpus, content_transformer(tolower))
> d<-tm_map(d, stripWhitespace)
> d<-tm_map(d, removePunctuation)
> sw<-readLines("./StopWords.txt", encoding="UTF-8")
> sw<-iconv(enc2utf8(sw), sub="byte")
> d<-tm_map(d, removeWords, sw)
> d<-t...
2014 Nov 22
2
Problems with tm
Dear colleagues, I have a problem with the tm package, or with Windows
8.1, or with something beyond my control.
Some time ago, with Windows 7 and an earlier version of R, I ran this code:
library(tm)
data("crude")
crude <- tm_map(crude, tolower)
tdm<-TermDocumentMatrix(crude)
and it created tdm without problems. Now, if I run it, I get the following error:
Error: inherits(doc, "TextDocument") is not TRUE
But if I remove the line of code
crude <- tm_map(crude, tolower)
it creates tdm without any problem.
What is happ...
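A possible explanation, offered as a sketch rather than a confirmed diagnosis of this thread: from tm 0.6 onward, passing a plain function such as `tolower` to `tm_map` can leave bare character vectors in the corpus instead of text documents, which then fails the `inherits(doc, "TextDocument")` check. Wrapping the function in `content_transformer()`, as other snippets on this page do, keeps the document class intact. The example assumes tm 0.6 or later and its bundled `crude` data set.

```r
library(tm)
data("crude")

# content_transformer() applies tolower to each document's content
# while preserving the TextDocument class that TermDocumentMatrix expects.
crude <- tm_map(crude, content_transformer(tolower))

tdm <- TermDocumentMatrix(crude)
```

Built-in tm transformations such as `removePunctuation` or `stripWhitespace` already return text documents and do not need the wrapper.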
2012 Jan 13
4
Troubles with stemming (tm + Snowball packages) under MacOS
...1 / R 2.14.1 (I have tried several versions)
I have installed all the needed packages (tm, rJava, RWeka, Snowball)
+ dependencies. I have disabled AWT (as described in http://r.789695.n4.nabble.com/Problem-with-Snowball-amp-RWeka-td3402126.html)
with:
Sys.setenv(NOAWT=TRUE)
The command tm_map(reuters, stemDocument) gives the following errors :
- First time:
Error in .jnew(name) :
java.lang.InternalError: Can't start the AWT because Java was
started on the first thread. Make sure StartOnFirstThread is not
specified in your application's Info.plist or on the command line...
2012 Jan 27
2
tm package: handling contractions
...tried making a wordcloud of Obama's State of the Union address using
the tm package to process the text
sotu <- scan(file="c:/R/data/sotu2012.txt", what="character")
sotu <- tolower(sotu)
corp <-Corpus(VectorSource(paste(sotu, collapse=" ")))
corp <- tm_map(corp, removePunctuation)
corp <- tm_map(corp, stemDocument)
corp <- tm_map(corp, function(x)removeWords(x,stopwords()))
tdm <- TermDocumentMatrix(corp)
m <- as.matrix(tdm)
v <- sort(rowSums(m),decreasing=TRUE)
d <- data.frame(word = names(v),freq=v)
wordcloud(d$word,d$freq)
I en...
2012 Dec 13
2
Size of the term matrix and memory. tm package
...txt <- readLines("D:/Publico/Documents/texto1.txt",encoding="UTF-8")
txt = iconv(txt, to="ASCII//TRANSLIT")
# build a corpus
corpus <- Corpus(VectorSource(txt))
# convert to lowercase
corpus <- tm_map(corpus, tolower)
# strip extra whitespace
corpus <- tm_map(corpus, stripWhitespace)
# remove punctuation
corpus <- tm_map(corpus, removePunctuation)
# load the custom Spanish stop-word file and convert it to ASCII
sw &...
2014 Jun 17
2
Not a tm problem: your doc.corpus is empty
...ciologia/Soc Musica/Black metal/Analisis texto/Inmortal"
> inmortal = readLines(TEXTFILE)
> inmortal = readLines(TEXTFILE)
> length(inmortal)
> head(inmortal)
> tail(inmortal)
> library(tm)
> vec <- VectorSource(inmortal)
> corpus <- Corpus(vec)
> summary(corpus)
> inspect(corpus[1:7])
> corpus <- tm_map(corpus, tolower)
> corpus <- tm_map(corpus, removePunctuation)
> corpus <- tm_map(corpus, removeNumbers)
> corpus <- tm_map(corpus, removeWords, stopwords("english"))
> inspect(doc.corpus[1:2])
> library(SnowballC)
> corpus <- tm_map(corpus, stemDocument)
> corpus <- tm_map(...
2014 Jun 18
2
Not a tm problem: your doc.corpus is empty
...rtal"
> >> inmortal = readLines(TEXTFILE)
> >> inmortal = readLines(TEXTFILE)
> >> length(inmortal)
> >> head(inmortal)
> >> tail(inmortal)
> >> library(tm)
> >> vec <- VectorSource(inmortal)
> >> corpus <- Corpus(vec)
> >> summary(corpus)
> >> inspect(corpus[1:7])
> >> corpus <- tm_map(corpus, tolower)
> >> corpus <- tm_map(corpus, removePunctuation)
> >> corpus <- tm_map(corpus, removeNumbers)
> >> corpus <- tm_map(corpus, removeWords, stopwords("english"))
> >> inspect(doc.corpus[1:2])
> >> library(SnowballC)
> >> corpus <- tm_map(...
2014 Jul 25
3
wordcloud and word table
...uot;)
>pathname<-"C:/Users/d_2/Documents/Comision/PLAN de INSPECCIONES/Informes/"
>TDM<-function(informes, pathname) {
info.dir<-sprintf("%s/%s", pathname, informes)
info.cor<-Corpus(DirSource(directory=info.dir, encoding="UTF-8"))
info.cor.cl<-tm_map(info.cor, content_transformer(tolower))
info.cor.cl<-tm_map(info.cor.cl, stripWhitespace)
info.cor.cl<-tm_map(info.cor.cl,removePunctuation)
sw<-readLines("C:/Users/d_2/Documents/StopWords.txt", encoding="UTF-8")
sw<-iconv(enc2utf8(sw), sub = "byte")
i...
2011 Apr 18
0
Help with cleaning a corpus
Hi!
I created a corpus and started cleaning it with this piece of code:
txt <-tm_map(txt,removeWords, stopwords("spanish"))
txt <-tm_map(txt,stripWhitespace)
txt <-tm_map(txt,tolower)
txt <-tm_map(txt,removeNumbers)
txt <-tm_map(txt,removePunctuation)
But something happened: some of the documents in the corpus became empty,
which is a problem when I try to...
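One way to handle documents that end up empty after cleaning is to drop them before any further processing. The following is a minimal sketch, not the original poster's solution: it assumes tm 0.6 or later, and the sample vector `docs` is hypothetical, chosen so that two of its three entries contain nothing but stop words or digits.

```r
library(tm)

# Hypothetical sample: the first and third entries become empty after cleaning.
docs <- c("el la de", "analisis de texto con R", "123 456")
txt <- VCorpus(VectorSource(docs))

txt <- tm_map(txt, removeWords, stopwords("spanish"))
txt <- tm_map(txt, removeNumbers)
txt <- tm_map(txt, stripWhitespace)

# Collapse each document to one string and keep only those
# that still contain something other than whitespace.
contents <- vapply(txt, function(d) paste(as.character(d), collapse = " "),
                   character(1), USE.NAMES = FALSE)
keep <- nzchar(trimws(contents))
txt <- txt[keep]
```

Filtering on the corpus itself, rather than on the term matrix afterwards, keeps document indices and metadata aligned with what remains.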
2014 Jul 28
2
wordcloud and word table
...omision/PLAN de
> INSPECCIONES/Informes/"
> >
> >>TDM<-function(informes, pathname) {
> > info.dir<-sprintf("%s/%s", pathname, informes)
> > info.cor<-Corpus(DirSource(directory=info.dir, encoding="UTF-8"))
> > info.cor.cl<-tm_map(info.cor, content_transformer(tolower))
> > info.cor.cl<-tm_map(info.cor.cl, stripWhitespace)
> > info.cor.cl<-tm_map(info.cor.cl,removePunctuation)
> > sw<-readLines("C:/Users/d_2/Documents/StopWords.txt", encoding="UTF-8")
> > sw<-iconv(...
2017 Jun 12
0
count number of stop words in R
Defining data as you mentioned in your response causes the following error:
Error in UseMethod("tm_map", x) :
no applicable method for 'tm_map' applied to an object of class "character"
I can solve this error by using Corpus(VectorSource(my string)) and then using your command, but I cannot see the number of stop words in my string!
On Monday, June 12, 2017 8:36 AM, Patrick...
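The error above arises because `tm_map` has methods for corpus objects, not for plain character vectors. A minimal sketch of the wrapping step the poster describes, assuming tm 0.6 or later; `my_string` is a hypothetical stand-in for the poster's data:

```r
library(tm)

# Hypothetical input; tm_map() cannot be applied to this directly.
my_string <- "this is the text to clean"

# Wrap the character vector in a corpus first.
corp <- VCorpus(VectorSource(my_string))
corp <- tm_map(corp, removeWords, stopwords("english"))

# Pull the cleaned text back out as a character vector.
cleaned <- as.character(corp[[1]])
```

`removeWords` leaves the surrounding spaces in place, so `cleaned` may contain runs of blanks; `stripWhitespace` can be applied afterwards if that matters.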
2017 Jun 12
3
count number of stop words in R
...54.614.1178
________________________________
From: Elahe chalabi <chalabi.elahe at yahoo.de>
Sent: Monday, June 12, 2017 11:23:42 AM
To: Patrick Casimir; Bert Gunter
Cc: R-help Mailing List
Subject: Re: [R] count number of stop words in R
Thanks for your reply. I know the command
data <- tm_map(data, removeWords, stopwords("english"))
removes English stop words, but I don't know how to count the stop words in my string:
str="Mhm . Alright . There's um a young boy that's getting a cookie jar . And it he's uh in bad shape because uh the thing is falling over ....
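Counting, as opposed to removing, can be done without `tm_map` at all: tokenize the string and test each token for membership in the stop-word list. The sketch below is one possible approach, not the answer given in the thread. It uses only the visible beginning of the poster's string, and the tokenization rule (split on anything that is not a lowercase letter or an apostrophe) is an assumption.

```r
library(tm)

# Only the visible start of the original string; the rest is truncated in the source.
str <- "Mhm . Alright . There's um a young boy that's getting a cookie jar ."

# Lowercase, then split on runs of characters that are not letters or apostrophes.
tokens <- strsplit(tolower(str), "[^a-z']+")[[1]]
tokens <- tokens[nzchar(tokens)]

# Flag tokens that appear in tm's English stop-word list.
is_stop <- tokens %in% stopwords("english")
n_stop <- sum(is_stop)

# Frequency of each stop word found.
stop_counts <- table(tokens[is_stop])
```

Keeping apostrophes inside tokens matters here because the tm list stores contractions in their apostrophe form.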
2014 Jun 18
3
Not a tm problem: your doc.corpus is empty
...> > > >> readLines(TEXTFILE)
> > > >> length(inmortal)
> > > >> head(inmortal)
> > > >> tail(inmortal)
> > > >> library(tm)
> > > >> vec <- VectorSource(inmortal)
> > > >> corpus <- Corpus(vec)
> > > >> summary(corpus)
> > > >> inspect(corpus[1:7])
> > > >> corpus <- tm_map(corpus, tolower)
> > > >> corpus <- tm_map(corpus, removePunctuation)
> > > >> corpus <- tm_map(corpus, removeNumbers)
> > > >> corpus <- tm_map(corpus, removeWords, stopwords("english"))
> > > >> inspect(doc.corpus[1:2])
> > > >> library(Snow...
2010 Feb 16
0
tm package
Hi,
I'm using version 0.5.1 of tm package with R 2.10.1. It looks to me
as if after the following
reuters21578 <- Corpus(DirSource(corpusDir), readerControl =
list(reader = readReut21578XMLasPlain))
reuters21578 <- tm_map(reuters21578, stripWhitespace)
reuters21578 <- tm_map(reuters21578, tolower)
reuters21578 <- tm_map(reuters21578, removePunctuation)
reuters21578 <- tm_map(reuters21578, removeNumbers)
reuters21578.dtm <- DocumentTermMatrix(reuters21578)
that reuters21578.dtm does not i...
2011 Mar 24
2
Problem with Snowball & RWeka
Dear Forum,
When I try to use SnowballStemmer() I get the following error message:
"Could not initialize the GenericPropertiesCreator. This exception was
produced: java.lang.NullPointerException"
It seems to have something to do with either Snowball or RWeka; however, I
can't figure out what to do myself. If you could spend 5 minutes of your
valuable time to help me or give me a