thr3ads.net - similar to: "error while usig "tm" package"

Displaying 20 results from an estimated 400 matches similar to: "error while usig "tm" package"

findFreqTerms vs minDocFreq in Package 'tm'

2011 Sep 12

I am using the 'tm' package for text mining and am having trouble finding the frequently occurring terms. From their definitions, findFreqTerms and minDocFreq appear to be equivalent commands: both try to identify the documents with terms appearing more often than a specified threshold. However, I am getting drastically different results from the two. I have given the results from both the

tm package: remove stopwords failing

2010 Mar 31

Hi, I just noticed, by inspecting the term matrix, that not all stopwords are removed. Does someone know how to fix that?
    library(tm)
    data("crude")
    d <- tm_map(crude, removeWords, stopwords(language='english'))
    dt <- DocumentTermMatrix(d, control=list(minWordLength=3, minDocFreq=2))
    inspect(dt)
I am using R version 2.10, tm package 0.5-3. Cheers, Welma

new to R: don't understand errors

2006 Oct 03

Hello all, I'm brand new to R, and I'm trying to quickly learn the rudiments for a couple of projects here at work. I'm working with the lsa package and trying to generate various semantic spaces. I seem to do well with small collections of clean text files, but now that I am trying to work with larger collections of less-than-perfect files, I'm getting errors

SVD Memory Issue

2011 Sep 13

I am trying to perform Singular Value Decomposition (SVD) on a Term Document Matrix I created using the 'tm' package. Eventually I want to do a Latent Semantic Analysis (LSA). There are 5677 documents with 771 terms (the DTM is 771 x 5677). When I try to do the SVD, it runs out of memory. I am using a 12GB Dual core Machine with Windows XP and don't think I can increase the memory

Problem with lsa package (data.frame) on Windows XP

2007 Aug 18

Dear R team, The following piece of code (which uses the lsa package) works fine on Mac OS X, but when I run the same code on Windows XP, it no longer works.
### code:
    library("lsa")
    matrix1 = textmatrix("C:\\Documents and Settings\\tine stalmans.TINE. 000\\LSA\\cuentos\\", stemming=TRUE, language="spanish", minWordLength=2, minDocFreq=1,

package "tm" fails to remove "the" with remove stopwords

2009 Nov 12

I am using code that previously worked to remove stopwords using package "tm". Even manually adding "the" to the list does not work to remove "the". This package has undergone extensive redevelopment with changes to the function syntax, so perhaps I am just missing something. Please see my simple example, output, and sessionInfo() below. Thanks! Mark require(tm)

Solution to: Error "... x must be atomic" when using lsa (latent semantic analysis) package

2008 Mar 25

In case someone else runs into this, I found the problem: it was caused by some zero-length text files. Make sure you have valid (non-empty) data files before loading them into the document-term matrix. Alex ---------- Forwarded message ---------- From: Alex McKenzie <ahmckenzie@gmail.com> Date: Mar 25, 2008 2:07 AM Subject: Error "... x must be atomic" when using lsa (latent
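The check Alex describes (no zero-length files in the input directory) can be sketched in base R. This is only an illustration under assumptions: the helper name `find_empty_files` and the demo directory are hypothetical, and nothing here is part of the lsa package itself.

```r
# Hypothetical helper (not part of lsa): list zero-length files in a directory.
find_empty_files <- function(dir_path) {
  files <- list.files(dir_path, full.names = TRUE)
  # Skip subdirectories; only regular files are checked
  files <- files[!file.info(files)$isdir]
  # Keep the files whose size on disk is zero bytes
  files[file.info(files)$size == 0]
}

# Demo on a throwaway directory (an assumption, purely for illustration)
corpus_dir <- tempfile("corpus_demo")
dir.create(corpus_dir)
writeLines("some text", file.path(corpus_dir, "doc1.txt"))  # non-empty file
file.create(file.path(corpus_dir, "doc2.txt"))              # zero-length file

empty <- find_empty_files(corpus_dir)
print(basename(empty))
```

Any files reported this way could be removed or filled in before the directory is passed on to build the document-term matrix.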

FW: new to R: don't understand errors

2006 Oct 04

Hello Jerad,
> It was suggested I contact you for possible help with this issue. Well,
> as you can see from the emails below, that is what I was told at R-help.
> Any insight into my lsa problems (also listed below) would be of great
> help.
From what I see, the problem probably does lie within the text files: for performance reasons, it was not possible to include any

Error "... x must be atomic" when using lsa (latent semantic analysis) package

2008 Mar 25

Hello, I'm trying to use the "lsa" (latent semantic analysis) package, and I am running into a problem that seems to be related to the number of documents being processed. Here's the code I'm running (after loading the lsa and rstem packages), and the error message:
> SnippetsPath <- "c:\\OED\\AuditExplain\\" # path where to find text snippets
>

Cannot allocate a vector of size...

2020 Feb 07

Good afternoon, I am doing a content analysis with the tm package. I am running this code:
    tdm<-TermDocumentMatrix(corpus,control=list(weighting =weightTf))
    tdm.reviews.m<-as.matrix(tdm)
The first line runs fine, but the second one gives me this error: Error: cannot allocate vector of size 14.0 Gb. How can I fix it? I am using the 64-bit version of

tm 0.1 uploaded to CRAN

2007 Jan 11

Dear useRs, a first version of tm has just been released on CRAN. tm provides a sophisticated framework for text mining applications within R. It offers functionality for managing text documents, abstracts the process of document manipulation and eases the usage of heterogeneous text formats in R. An advanced metadata management is implemented for collections of text documents to alleviate the

tm 0.1 uploaded to CRAN

2007 Jan 11

Cannot allocate a vector of size...

2020 Feb 10

Thank you very much, Xabier. I tried to work with the sparse matrix, but when I convert tdm to a matrix it also tells me "cannot allocate a vector of size 12 gb". I did tdm<-as.matrix(tdm). Is that the right way to work with the sparse matrix? Thanks! On Mon, 10 February 2020, 16:15, Xavier-Andoni Tibau Alberdi wrote: > I think Carlos's answer is much more

Cannot allocate a vector of size...

2020 Feb 07

This is the first time I have worked with this kind of data... I don't know whether that matrix can be split. How could I do it? Thank you very much! On Fri, 7 February 2020, 17:55, Xavier-Andoni Tibau Alberdi wrote: > It means that your data are very large and cannot be stored in RAM. > Do you have any alternatives for splitting the matrix? > > On Fri, 7 Feb 2020 17:26, <miriam.alzate en

tm_map help

2012 Feb 26

Hi all, I am trying to do some text mining with Twitter, and when I use tm_map I get the error: Error in structure(names(sapply(possibleCompletions, "[", 1)), names = x) : 'names' attribute [1] must be the same length as the vector [0]. Has anyone seen this error before? The code I have is shown below, and this error only occurs with #qantas, hashtags like #asx,

Cannot allocate a vector of size...

2020 Feb 10

Hello, The R file takes up 33 MB. The matrix I want to build takes up 14 GB. On the local disk (C) I have 400 GB available out of 670. I am not very experienced in working with this kind of data. What difference does working with a data.frame make? Thanks! On Fri, 7 February 2020, 18:07, Xavier-Andoni Tibau Alberdi wrote: > It depends on the operation you want to perform on the matrix. If you remove rows

tm package: handling contractions

2012 Jan 27

I tried making a wordcloud of Obama's State of the Union address using the tm package to process the text:
    sotu <- scan(file="c:/R/data/sotu2012.txt", what="character")
    sotu <- tolower(sotu)
    corp <- Corpus(VectorSource(paste(sotu, collapse=" ")))
    corp <- tm_map(corp, removePunctuation)
    corp <- tm_map(corp, stemDocument)
    corp <- tm_map(corp,

using package tm to find phrases

2009 Aug 13

I am using the package "tm" for text-mining of abstracts and would like to use it to find instances of gene names that may contain white space. For instance "gene regulatory protein 1". The default behavior of tm is to parse this into 4 separate words, but I would like to use the class constructor "dictionary" to define phrases such as just mentioned. Is this

Troubles with stemming (tm + Snowball packages) under MacOS

2012 Jan 13

Dear all, I am having trouble using the stemming algorithm provided by the tm (text mining) + Snowball packages. Here is my config: MacOS 10.5, R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions). I have installed all the needed packages (tm, rJava, RWeka, Snowball) + dependencies. I have deactivated AWT (as described in

data mining/text mining?

2007 Jun 08

Dear R-user, Could anybody tell me the key difference between data mining and text mining? Please list the packages for data/text mining, and give me an example of text mining with R (any related materials will be highly appreciated), because the vignette written by Ingo Feinerer seems too concise for me. Thanks _____________________________________________ Dr.Ruixin ZHU Shanghai