similar to: topicmodels error

Displaying 20 results from an estimated 1000 matches similar to: "topicmodels error"

2014 Jul 25
wordcloud and word table
Good evening, group. Kind regards. I have kept looking for a way to compare two documents from the years 2005 and 2013, which I can then represent with a wordcloud and with a table whose columns are the years of each report, "2005" and "2013", and whose rows are the words with the frequency of each of them
2014 Jul 28
wordcloud and word table
Hello, The reference you included (thanks for providing it) is quite clear and easy to follow. Have you been able to apply the same logic to your two speeches? To clear up doubts, a first step would be to attach the code you are using so we can see whether there is an obvious error. The proper way for us to help you, though, is with a reproducible example: code + data.
2014 Jul 29
wordcloud and word table [Making progress]
Good afternoon, group. Kind regards. Carlos J., thank you very much for your guidance. Indeed, I had realized that the reason colnames was not being applied was that there were no columns. The thing is, I cannot fully/clearly see at which point in the corpus-creation process this can be done. However, following the example of
2011 May 21
DocumentTermMatrix error
Hi all, I have tried to create a DocumentTermMatrix with the tm package, but I get this error: Error in tolower(txt) : invalid input 'PROD Z LAHKO GNETNO MELJNO GLINO, ... in 'utf8towcs' I tried doing this as shown in: (An Introduction to Text Mining), with this R code:
2011 May 11
filtering out unwanted words in a Term Document Matrix
Hi Y'all, I am using the text mining package (tm). I am trying to filter out all of the words in a Term Document Matrix that are not in a list of words that I am interested in. I am using the following code: z<-tm_intersect(txt.dtm, c("communications", "safety", "climate", "blood", "surface", "cleanliness",
2011 Sep 13
SVD Memory Issue
I am trying to perform Singular Value Decomposition (SVD) on a Term Document Matrix I created using the 'tm' package. Eventually I want to do a Latent Semantic Analysis (LSA). There are 5677 documents with 771 terms (the DTM is 771 x 5677). When I try to do the SVD, it runs out of memory. I am using a 12GB Dual core Machine with Windows XP and don't think I can increase the memory
2011 May 20
DocumentTermMatrix - text mining
Hi All, I have a data.frame that looks like the one below. I would like to do some text mining on it to possibly find some patterns between Opis, ACklasifikacija and Vodja. I looked over the tm package, which looks promising, more specifically DocumentTermMatrix or TermDocumentMatrix. But I cannot figure out how to change my data from data.frame to Corpus or VCorpus. Globina
2013 Sep 26
R hangs at NGramTokenizer
Hi: I am trying to construct a Document-Term Matrix from a corpus. The commands I used are: > library(parallel) > library(tm) > library(RWeka) > library(topicmodels) > library(RTextTools) > cl=makeCluster(detectCores()) > invisible(clusterEvalQ(cl, library(tm))) > invisible(clusterEvalQ(cl, library(RWeka))) > invisible(clusterEvalQ(cl, library(topicmodels))) >
2011 Sep 12
findFreqTerms vs minDocFreq in Package 'tm'
I am using the 'tm' package for text mining and am facing an issue with finding the frequently occurring terms. From the definition it appears that findFreqTerms and minDocFreq are equivalent commands and both try to identify the documents with terms appearing more than a specified threshold. However, I am getting drastically different results with both. I have given the results from both the
2015 Apr 12
Loop over many data frames
Jorge, dear R-help contributors, I have been trying to use a script for one of the steps in my analysis, which is to transform each of the corpora in my workspace into a TermDocumentMatrix object. I have a vector called bNames that lists all the corpora I want to convert to a TDM, and I built the following commands: tdm.n1 <- vector('list', length = length(bNames))
2009 Nov 12
package "tm" fails to remove "the" with remove stopwords
I am using code that previously worked to remove stopwords using package "tm". Even manually adding "the" to the list does not work to remove "the". This package has undergone extensive redevelopment with changes to the function syntax, so perhaps I am just missing something. Please see my simple example, output, and sessionInfo() below. Thanks! Mark require(tm)
2011 Apr 11
I have been using the rtmvt function in the {tmvtnorm} package and I'm getting the warning: "Acceptance rate is very low and rejection sampling becomes inefficient. Consider using Gibbs sampling." But I AM specifying the gibbs algorithm!!: rtmvt(M, mean=q[,,i,j], sigma=((u[i,j] + nu[i])/(p+nu[i]))*delta[,,i], df=ceiling(nu[i]+p), lower=c(0,0), algorithm="gibbs") Any
2009 Aug 17
Bayesian data analysis - help with sampler function
I have downloaded the Umacs (Universal Markov chain sampler) and submitted the following sample code from Kerman and Gelman. s <- Sampler(J=8, sigma.y=c(15,10,16,11,9,11,10,18), y=c(28,8,-3,7,-1,1,18,12), theta=Gibbs(theta.update,theta.init), V=Gibbs(V.update,mu.init), mu=Gibbs(mu.update,mu.init), tau=Gibbs(tau.update,tau.init),
2004 Mar 19
R_qsort_int_I() error
Hi, I want to use R_qsort_int_I() in my C function, but I am getting the following error. It looks like there is a conflict between Rmath.h, which I use to generate random numbers, and R_ext/Boolean.h. I would appreciate any help to fix this problem. gcc -ansi -g -o Gibbs gibbs.c subroutines.o rand.o vector.o -lm -lRmath -llapack -lblas -lfrtbegin -lg2c -lm -shared-libgcc In file included from
2006 Aug 11
about MCMC pack again...
Hello, thank you very much for your previous answers about the C++ code. I am interested in the application of the Gibbs Sampler in the IRT models, so in the function MCMCirt1d and MCMCirtkd. I've found the C++ source codes, as you suggested, but I cannot find anything about the Gibbs Sampler. All the files are for the Metropolis algorithm. Maybe I am not able to read them very well, by the
2020 Feb 10
Cannot allocate a vector of size...
Thank you very much, Xabier. I have tried working with the sparse matrix, but when converting tdm to a matrix it also tells me "cannot allocate a vector of size 12 gb". I did tdm<-as.matrix(tdm). Is that the right way to work with the sparse matrix? Thanks! On Mon, 10 February 2020, 16:15, Xavier-Andoni Tibau Alberdi wrote: > I think Carlos's answer is much more
2020 Feb 07
Cannot allocate a vector of size...
Good afternoon, I am doing a content analysis with the tm package. When running this code: tdm<-TermDocumentMatrix(corpus,control=list(weighting =weightTf)) tdm<-as.matrix(tdm) The first line runs fine, but on the second I get this error: Error: cannot allocate vector of size 14.0 Gb How can I fix it? I am using the 64-bit version of
2011 Feb 07
Question about checkTmvArgs function in rtmvnorm (package tmvtnorm)
Hello! I was wondering if it's possible to see the actual code of checkTmvArgs function that is part of the code for rtmvnorm (which is below - I just typed "rtmvnorm" on the prompt). I get an error: Error in checkTmvArgs(mean, sigma, lower, upper) : sigma must be a symmetric matrix At the same time I am pretty sure that the matrix I am passing as sigma is a var-covar matrix
2013 Oct 08
how to check the accuracy for maxent ?
I was going through this example of maxent use: # LOAD LIBRARY library(maxent) # READ THE DATA, PREPARE THE CORPUS, and CREATE THE MATRIX data <- read.csv(system.file("data/NYTimes.csv.gz",package="maxent")) corpus <- Corpus(VectorSource(data$Title[1:150])) matrix <- DocumentTermMatrix(corpus) # TRAIN/PREDICT
2010 Mar 31
tm package - remove stopwords failing
Hi, I just noticed by inspecting the term matrix that not all stopwords are removed. Does someone know how to fix that? library(tm) data("crude") d<-tm_map(crude, removeWords, stopwords(language='english')) dt<-DocumentTermMatrix(d,control=list(minWordLength=3, minDocFreq=2)) inspect( dt) I am using R version 2.10, tm package 0.5-3 cheers Welma