similar to: tm: Why does adding local metadata take so long?

Displaying 20 results from an estimated 1000 matches similar to: "tm: Why does adding local metadata take so long?"

2010 Apr 23
2
Library (tm) Error: could not find function "TermDocMatrix".
Hi List, I have the following code and error. I have tried other code and get the same problem. > reut21578 <- system.file("texts", "crude", package = "tm") > (r <- Corpus(DirSource(reut21578), readerControl = list(reader = readReut21578XMLasPlain))) A corpus with 20 text documents > (r <- Corpus(DirSource(reut21578), readerControl =
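A minimal sketch of the renamed constructor: in tm 0.5 and later the old TermDocMatrix() was replaced by TermDocumentMatrix(), which is why the function cannot be found. This assumes a tm version that still ships readReut21578XMLasPlain (recent releases moved the Reuters readers into the separate tm.corpus.Reuters21578 package).

    library(tm)
    reut21578 <- system.file("texts", "crude", package = "tm")
    r <- Corpus(DirSource(reut21578),
                readerControl = list(reader = readReut21578XMLasPlain))
    tdm <- TermDocumentMatrix(r)   # replacement for the removed TermDocMatrix()
    inspect(tdm[1:5, 1:5])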
2012 May 29
1
package tm: reading XML files
Dear fellow R users, I'm using the package tm for text mining, and have a problem with reading in a corpus from XML files. When I copy the example from "Introduction to the tm package" of the small reuters subset "crude", everything goes well, and I get a corpus with the required meta data. When I read in the entire reuters21578 corpus in XML format however (or a
2009 Jan 10
1
Help needed for Loading "tm" package
Howdy Gurus again. Thanks to Tony.Breyal, I was able to write the following script for analyzing a text document. But I got an error with the "tm" package. I don't know why I get the error from the R script below; I think I followed the process in the R tm manual. I use R v2.8.1 and tm_0.3-3.zip under Win XP. Thanks in advance, Kum Hwang > # setting directory > my.path
2009 Oct 15
1
Problems with rJava and tm packages
I am looking to do some text analysis using R and have run into some issues with some of the packages. I'm not sure if it's my goofy Vista OS or what, but with R 2.8.1 I was relatively successful loading the text; the rJava package, however, was messed up somehow: > library(tm) > library(rJava) Error in if (!nchar(javahome)) stop("JAVA_HOME is not set and could not be determined from the
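For what it's worth, a common fix for this particular error is to point JAVA_HOME at a Java installation whose 32/64-bit architecture matches the R build before rJava is loaded. The path below is only a placeholder for wherever Java lives on the machine.

    Sys.setenv(JAVA_HOME = "C:/Program Files/Java/jre6")   # placeholder path, adjust locally
    library(rJava)
    .jinit()   # initialises the JVM; an error here usually means JAVA_HOME is still wrong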
2009 Mar 30
1
Help with tm association analysis and Rgraphviz installation.
Help with tm association analysis and Rgraphviz installation. THANK YOU IN ADVANCE. Question 1: I saved two txt files in C:\textfile, and each txt file contains only one text column; both have 100 records. I know the term “research” occurs 49 times, so I want to find out which other words are correlated with it, but I get tons of associations of ‘1’. I tried other terms, and no
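A sketch of the association lookup with findAssocs(); object names are placeholders for the poster's two files. The correlations are computed across documents, so with only two documents nearly every pair of co-occurring terms correlates perfectly, which is one plausible reason for the wall of 1s; splitting the material into more documents gives more informative scores.

    library(tm)
    docs <- Corpus(DirSource("C:/textfile"))              # the two text files
    tdm  <- TermDocumentMatrix(docs,
                               control = list(tolower = TRUE, removePunctuation = TRUE))
    findAssocs(tdm, "research", corlimit = 0.7)           # terms correlated with "research"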
2010 Jan 22
1
Invalid input error in tm package
Hello, I am working with the "tm" package. I have 2 PDF files saved in the directory D:/Files. I issued the following commands, for which I got some errors and warnings: surgj <- Corpus(DirSource("D:/Files"), readerControl = list(language = "ansi")) Warning messages: 1: In readLines(y, encoding = x$Encoding) : incomplete final
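A hedged sketch of how PDFs are usually fed to tm: DirSource() with the default plain-text reader simply runs readLines() over the binary PDFs, which is where these warnings come from. tm's readPDF() reader hands extraction to an external engine (xpdf's pdftotext here; pdftools is another option), so one of those must be installed.

    library(tm)
    surgj <- Corpus(DirSource("D:/Files", pattern = "\\.pdf$"),
                    readerControl = list(reader = readPDF(engine = "xpdf")))
    inspect(surgj)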
2011 Jan 24
1
Extracting information from text data
Hi R-Users, thanks in advance. I am using R-2.12.0 on Windows XP. I am trying to produce an n x m matrix from text data stored in different files, where n is the number of words (say w1, w2, …, wn) and m is the number of documents (say d1, d2, …, dm). A. Using package tm: I am using package tm to do the job; I have provided the code below: > my.corpus <- Corpus(DirSource(my.path),
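A sketch of the rest of the tm route to the n x m count matrix described above, assuming my.path is the directory of text files as in the post: the sparse term-document matrix can be densified with as.matrix() once it is small enough.

    library(tm)
    my.corpus <- Corpus(DirSource(my.path))
    tdm <- TermDocumentMatrix(my.corpus,
                              control = list(tolower = TRUE, removePunctuation = TRUE))
    m <- as.matrix(tdm)   # rows = words (w1..wn), columns = documents (d1..dm)
    dim(m)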
2009 Oct 02
1
text mining
The following code is derived from a paper titled "Text Mining Infrastructure in R" (http://www.jstatsoft.org/v25/i05/paper). The example below seems to load some default documents for analysis, some sort of Latin document. I cannot for the life of me figure out how to load my own document, let alone an entire corpus. I have searched the above document as well as related documentation.
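A sketch of swapping in your own material: the paper's examples pull bundled sample texts via system.file(), but DirSource() can point at any directory of plain text files ("~/my_texts" below is just a placeholder).

    library(tm)
    my.corpus <- Corpus(DirSource("~/my_texts"),
                        readerControl = list(reader = readPlain, language = "en"))
    summary(my.corpus)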
2010 Feb 16
0
tm package
Hi, I'm using version 0.5.1 of the tm package with R 2.10.1. It looks to me as if, after the following: reuters21578 <- Corpus(DirSource(corpusDir), readerControl = list(reader = readReut21578XMLasPlain)) reuters21578 <- tm_map(reuters21578, stripWhitespace) reuters21578 <- tm_map(reuters21578, tolower) reuters21578 <- tm_map(reuters21578, removePunctuation)
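The same preprocessing chain, sketched for current tm releases (0.6 and later), where base functions such as tolower have to be wrapped in content_transformer() so the mapped result is still a corpus of text documents; on the 0.5.x version in the post the bare tolower call was accepted.

    library(tm)
    reuters21578 <- Corpus(DirSource(corpusDir),
                           readerControl = list(reader = readReut21578XMLasPlain))
    reuters21578 <- tm_map(reuters21578, stripWhitespace)
    reuters21578 <- tm_map(reuters21578, content_transformer(tolower))
    reuters21578 <- tm_map(reuters21578, removePunctuation)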
2011 Jun 27
0
Extracting certain text using tm package
I have used the "tm" package to import a set of text documents using the following command: text <- Corpus(DirSource("."), readerControl = list(language = "ansi")) I would like to extract only a certain portion of the text in each document using certain keywords. For example, I would like to include all the text between the keywords <Start Text> and <End
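A sketch of one way to keep only the text between two marker strings, using a base-R regular expression mapped over the corpus; the closing marker "<End Text>" and the helper extract_between() are illustrative guesses, not anything the post confirms, and on older tm the content_transformer() wrapper can be dropped.

    library(tm)
    text <- Corpus(DirSource("."))

    extract_between <- function(x) {
      # collapse the document to one string, then grab the first span between the markers
      txt <- paste(x, collapse = "\n")
      m <- regmatches(txt, regexpr("(?s)(?<=<Start Text>).*?(?=<End Text>)", txt, perl = TRUE))
      if (length(m) > 0) m else txt   # leave the document untouched if no markers are found
    }
    text <- tm_map(text, content_transformer(extract_between))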
2009 Apr 17
0
question about the Text Mining package tm
Hello. I am trying to work with the text mining package tm. I have a directory called textsTweet1 which contains three files: short.txt, myTextFile.txt, and myTextFile.csv. short.txt contains one line: THE CAT IN THE HAT\n. myTextFile.txt contains some tweets from Twitter. The first few lines of myTextFile.txt are: @oliviamunn I miss a good Yakaniku...I miss Japan...I NEED COCO EVERYBODY. I NEED TO GET ON
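If the .csv copy is not meant to be mined, a sketch of restricting the corpus to the .txt files only; DirSource() accepts a filename pattern, and the directory name follows the post.

    library(tm)
    tweets <- Corpus(DirSource("textsTweet1", pattern = "\\.txt$"),
                     readerControl = list(reader = readPlain))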
2011 Feb 10
2
Help using "tm" text mining package - preprocessing
Thanks all for your help. I fear text mining is an abstract little corner of "R". I have imported 3228 text (.txt) files, each a news story, into R using [tm]: textd <- Corpus(DirSource("other/docs"), readerControl = list(reader = readPlain)) I can pre-process each individual document using tolower(textd[[1]]); however, when I try to run tmTolower() I get a "no such command"
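A sketch of the whole-corpus lower-casing: rather than a dedicated tmTolower() helper, recent tm versions expect plain tolower mapped over the corpus with tm_map() (wrapped in content_transformer() from tm 0.6 onward).

    library(tm)
    textd <- Corpus(DirSource("other/docs"),
                    readerControl = list(reader = readPlain))
    textd <- tm_map(textd, content_transformer(tolower))   # lower-case every document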
2011 May 18
0
text mining problem using TM package
Hi, I’m using R (the tm package) for text mining and I’m having problems filtering articles out of my data set by local metadata. Here is the code: data <- ("C:/… /19970331") rs <- ReutersSource(data, encoding = "UTF-8") RC <- VCorpus(DirSource(data), readerControl = list(reader = readRCV1asPlain, language = "en_US", load =
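A sketch of metadata-based filtering with tm_filter(), assuming a tm version that still ships readRCV1asPlain; "topics" is only an example tag name, so substitute whichever local metadata field the documents actually carry.

    library(tm)
    RC <- VCorpus(DirSource(data),
                  readerControl = list(reader = readRCV1asPlain, language = "en_US"))
    keep <- tm_filter(RC, function(doc) isTRUE(meta(doc, "topics") == "yes"))   # example predicate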
2017 Nov 07
0
Error when attempting to see "Corpus" metadata
R Project, I receive the error shown below when attempting to combine two Corpus objects, so I'm assuming I'm not combining the two variables (nb_pos and nb_neg) correctly in the following line: nb_all <- c(nb_pos, nb_neg, recursive=TRUE) # anyone see anything wrong with this line of code -------------------------------------------------- > library("tm") Loading
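For reference, a sketch of the concatenation on a current tm: c() on two VCorpus objects returns one combined corpus, and length()/meta() make it easy to check that nothing was dropped. How nb_pos and nb_neg were built is not shown in the post.

    library(tm)
    nb_all <- c(nb_pos, nb_neg)        # combine the two corpora
    length(nb_all)                     # should equal length(nb_pos) + length(nb_neg)
    meta(nb_all[[1]])                  # per-document metadata of the first document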
2009 Jan 15
1
How to solve the error "cannot allocate vector of size 1.1 Gb"
Hi Gurus, thanks to your good help, I have managed to start using the text mining package "tm" in R under Win XP. However, while running the tm package, I ran into another problem, this time with memory. What is the best way to solve this memory problem: increasing physical RAM, or some other recipe? ############################### ###### my R
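A sketch of the usual memory-saving recipe before throwing hardware at it: keep the term-document matrix sparse, restrict which terms enter it, and prune the rarest ones. The corpus name and thresholds below are illustrative.

    library(tm)
    tdm <- TermDocumentMatrix(my.corpus,
                              control = list(wordLengths = c(3, Inf),
                                             bounds = list(global = c(2, Inf))))
    tdm <- removeSparseTerms(tdm, 0.99)   # drop terms absent from more than 99% of documents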
2011 Sep 05
0
Stemming functions only work on the last word of plain text documents
Hello, I want to use the SnowballStemmer on a collection of plain text documents. However, when I apply it to my corpus using the tm_map function, it only stems the last word of each document (the problem is the same for wordStem, and stemDocument does not work at all). An example: > path <- c("c:\path\to\directory") # collection of plain text documents > corp <-
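For context, a sketch of why only the last word changes: wordStem() from the Snowball family stems each element of its input vector as if it were a single word, so handing it whole lines only affects the final token. Letting tm's stemDocument transformation split the text into words first (with SnowballC installed, on a reasonably current tm) avoids that.

    library(tm)
    library(SnowballC)   # supplies wordStem(), which stemDocument() uses
    corp <- Corpus(DirSource("c:/path/to/directory"))   # path as in the post
    corp <- tm_map(corp, stemDocument, language = "english")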
2011 May 21
1
DocumentTermMatrix error
Hi all, I have tried to create a DocumentTermMatrix with the tm package, but I get this error: Error in tolower(txt) : invalid input 'PROD Z LAHKO GNETNO MELJNO GLINO, ... in 'utf8towcs'. I tried doing it as shown in http://www.r-project.org/doc/Rnews/Rnews_2008-2.pdf (An Introduction to Text Mining), with this R code :
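A sketch of the usual cure for the utf8towcs error: the text (apparently Slovenian) is not valid UTF-8, so either declare the real encoding when the files are read or re-encode with iconv() before building the matrix. The directory path and the windows-1250 guess are assumptions, not something the post states.

    library(tm)
    corp <- Corpus(DirSource("path/to/texts", encoding = "windows-1250"))   # assumed source encoding
    corp <- tm_map(corp, content_transformer(function(x) iconv(x, to = "UTF-8", sub = "byte")))
    dtm  <- DocumentTermMatrix(corp, control = list(tolower = TRUE))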
2013 Sep 26
0
R hangs at NGramTokenizer
Hi: I am trying to construct a Document-Term Matrix from a corpus. The commands I used are: > library(parallel) > library(tm) > library(RWeka) > library(topicmodels) > library(RTextTools) > cl = makeCluster(detectCores()) > invisible(clusterEvalQ(cl, library(tm))) > invisible(clusterEvalQ(cl, library(RWeka))) > invisible(clusterEvalQ(cl, library(topicmodels))) >
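A sketch of the workaround most often reported for this hang: RWeka's Java-backed tokenizers do not get along with tm's parallel (mclapply) code path, so forcing a single core before building the matrix usually lets it finish. The corpus object name below is a placeholder.

    library(tm)
    library(RWeka)
    options(mc.cores = 1)   # keep tm from tokenizing in parallel alongside rJava
    BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))
    dtm <- DocumentTermMatrix(corpus, control = list(tokenize = BigramTokenizer))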
2012 Dec 13
2
Size of the term matrix and memory. TM package
Hello everyone! I am having some problems with the size of the term matrix I get. The commands I use are the following: # load libraries library(tm) library(wordcloud) library(Rstem) library(Snowball) # read the UTF-8 document and convert it to ASCII txt <-
2014 Jun 17
2
It is not a tm problem; your doc.corpus is empty
This is not a problem with tm, nor with SnowfallC, nor with mcapply (judging by the path you are on Linux; according to the manual, mcapply does not work well on Windows). You are not defining the objects you pass correctly: you pass doc.corpus instead of corpus (or you assign to corpus instead of doc.corpus). Debug your programs when an object error comes up, like the Error you posted. You have it temporarily fixed in