search for: findfreqterms

Displaying 6 results from an estimated 6 matches for "findfreqterms".

2011 Sep 12
1
findFreqTerms vs minDocFreq in Package 'tm'
I am using the 'tm' package for text mining and am having trouble finding frequently occurring terms. From the definitions, it appears that findFreqTerms and minDocFreq are equivalent and that both try to identify the documents with terms appearing more often than a specified threshold. However, I am getting drastically different results from the two. The results from both are given below: findFreqTerms identifies 3140 words that appea...
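One way to see why two "frequency" thresholds can disagree is to compute the competing definitions by hand. The sketch below uses base R only, on an invented three-document matrix; the counts and every variable name are illustrative assumptions. Which definition each tm function applies is also an assumption here (findFreqTerms thresholding a term's total count, minDocFreq thresholding counts inside each document) and should be checked against the help pages of the installed tm version.

```r
# Toy term-document matrix (terms in rows, documents in columns).
# The counts are invented for illustration only.
m <- matrix(c(3, 0, 0,
              1, 1, 1,
              0, 2, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("oil", "price", "market"),
                            c("doc1", "doc2", "doc3")))

# Total occurrences of each term over the whole corpus.
total_freq <- rowSums(m)

# Number of documents in which each term occurs at least once.
doc_freq <- rowSums(m > 0)

# Largest count of each term within any single document.
max_in_doc <- apply(m, 1, max)

# The same threshold selects different terms under each definition.
threshold <- 3
by_total  <- names(total_freq)[total_freq >= threshold]
by_docs   <- names(doc_freq)[doc_freq >= threshold]
by_in_doc <- names(max_in_doc)[max_in_doc >= threshold]
```

With this toy data the threshold of 3 keeps all three terms by total count, only "price" by document count, and only "oil" by within-document count, so large differences between two such filters are not surprising in themselves.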
2011 Nov 09
0
Min Frequency in findFreqTerms
I am using the 'tm' package for text mining. I use the function findFreqTerms to obtain the frequent words based on their frequency in the term-document matrix. The following is the example given on the help page for this function: library("tm"); data("crude"); tdm <- TermDocumentMatrix(crude); findFreqTerms(tdm, 2, 3) The first three columns of the doc...
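A lower and upper bound such as 2 and 3 can be read as a filter on each term's total count over all documents. The following base R sketch shows that reading on invented data; `terms_in_range` is a hypothetical helper written for this illustration, not a tm function, and whether a given tm release computes the bounds this way is an assumption to verify in its documentation.

```r
# Invented counts: terms in rows, documents in columns.
m <- matrix(c(1, 1, 0,
              5, 0, 0,
              1, 0, 0,
              1, 1, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), paste0("doc", 1:3)))

# Hypothetical helper: keep terms whose total count lies in [low, high].
terms_in_range <- function(counts, low = 0, high = Inf) {
  totals <- rowSums(counts)
  names(totals)[totals >= low & totals <= high]
}

selected <- terms_in_range(m, 2, 3)
```

Under this reading, terms "a" and "d" are selected even though neither reaches a count of 2 in any single column, which is why inspecting only a few columns of the matrix can make the result look wrong.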
2008 Oct 18
2
sorting matrix output alphabetically
...ould be generated. This would be quite long as the matrix is 176 x 2796, so I was hoping I could save the output in a csv file, which could be manipulated either in R or EXCEL. Any advice as to how I could do this is appreciated, Bob rec.matrix <- TermDocMatrix(recdata) recdata.matrix <- findFreqTerms(rec.matrix, 5, Inf) # creates a matrix new.matrix <- as(Data(rec.matrix), "matrix") tot <- colSums(new.matrix) sort(tot) unavail unwilling wheels 1 1 1 evans...
2012 Feb 29
1
TM reader with text
Hello everybody, I am trying to work with TM, but I have a problem with some special characters in French words. I think this is due to the way the PDF was converted to text, but I'm not entirely sure. Here is an example: findFreqTerms(tdm1,30) [33] "<U+F0A3>" "<U+FB01>n" "<U+FB01>nancement" "<U+FB01>nancier" "<U+FB01>nanci?re" "<U+FB01>nanci?res" "<U+FB01>nanciers" "<U+FB0...
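U+FB01 is the Unicode "fi" ligature, which PDF-to-text conversion often leaves in place of the two letters. A minimal base R sketch of one possible cleanup follows, assuming the text is held as UTF-8 character strings; the sample strings and variable names are invented for illustration. It does not handle "<U+F0A3>", a private-use code point whose meaning depends on the font used in the PDF.

```r
# Example strings containing the "fi" ligature (U+FB01); invented for illustration.
terms <- c("\ufb01nancement", "\ufb01n", "budget")

# Common typographic ligatures and their plain-letter equivalents.
from <- c("\ufb00", "\ufb01", "\ufb02")
to   <- c("ff", "fi", "fl")

# Replace each ligature literally (fixed = TRUE disables regex interpretation).
cleaned <- terms
for (i in seq_along(from)) {
  cleaned <- gsub(from[i], to[i], cleaned, fixed = TRUE)
}
```

Applying such a replacement to the document text before building the term-document matrix would let "financement" and its ligature variant count as the same term.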
2012 Dec 13
2
Size of the term matrix and memory. TM package
...anish") # create term matrix #a) terms as rows and documents as columns dtm <- DocumentTermMatrix(corpus) inspect(dtm[1000:1005,1000:1005]) # Terms with a minimum frequency of 30: findFreqTerms(dtm, lowfreq=30) # remove low-frequency terms inspect(removeSparseTerms(dtm, 0.4)) # word cloud m <- as.matrix(dtm) v <- sort(rowSums(m),decreasing=TRUE) df <- data.frame(word = names(v),freq=v) wordcloud(df$wo...
2014 Jul 29
2
wordcloud and word table [Making progress]
Good afternoon, group. Warm greetings, Carlos J., and thank you very much for your guidance. Indeed, I had realized that the reason colnames was not being applied was that there were no columns. The thing is that I cannot fully or clearly see at which point in the corpus-creation process this can be done. However, following the example of