thr3ads.net - similar to: "DocumentTermMatrix error"

Displaying 20 results from an estimated 110 matches similar to: "DocumentTermMatrix error"

2010 Nov 02

count different words in a field

Hi all, I started to ask this in the other post, but it is off topic, so here it is again. I have a data.frame (created with the help of this mailing list) that looks like this: 'data.frame': 22801 obs. of 15 variables: $ V1 : chr "HUMUS" "SLABO" "MALO" "SLABO" ... $ V2 : chr "IN" "GRANULIRAN"
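
One possible way to count the different words in a character column is sketched below in base R. The toy data frame only reuses the four V1 values visible in the post; the real data has 22801 rows, and splitting on whitespace is an assumption about how "words" are delimited.

```r
# Toy data: only the four V1 values shown in the post (assumption: the
# real column holds similar upper-case words, possibly several per field).
df <- data.frame(
  V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
  stringsAsFactors = FALSE
)

# Split every field on whitespace and flatten into one vector of words.
words <- unlist(strsplit(df$V1, "\\s+"))

# Frequency of each distinct word, and the number of distinct words.
word_counts <- table(words)
n_distinct <- length(word_counts)

print(word_counts)
print(n_distinct)
```

The same call works on any other character column by swapping `df$V1` for that column.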

DocumentTermMatrix - text minig

2011 May 20

Hi All, I have a data.frame that looks like the one below. I would like to do some text mining on it to possibly find some patterns between Opis, ACklasifikacija and Vodja. I looked over the tm package, which looks promising, more specifically DocumentTermMatrix or TermDocumentMatrix. But I cannot figure out how to change my data from a data.frame to a Corpus or VCorpus. Globina
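
A minimal sketch of one way to go from a data.frame column to a corpus, assuming a recent release of the tm package is installed. The column name `Opis` comes from the post; the two text values are hypothetical placeholders, since the actual data is not shown.

```r
library(tm)

# Hypothetical stand-in for the poster's data.frame; only the column
# name Opis is taken from the post, the values are invented.
df <- data.frame(
  Opis = c("first short description", "second short description"),
  stringsAsFactors = FALSE
)

# VectorSource() treats each element of the character vector as one document.
corpus <- VCorpus(VectorSource(df$Opis))

# Build the document-term matrix: one row per document, one column per term.
dtm <- DocumentTermMatrix(corpus)

print(dim(dtm))
print(Terms(dtm))
```

Other columns (for example ACklasifikacija or Vodja) could be kept alongside the matrix by row position, since the documents stay in the same order as the data.frame rows.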

Help using "tm" text mining package - preprocessing

2011 Feb 10

Thanks all for your help. I fear text mining is an abstract little corner of "R". I have imported 3228 text (.txt) files, each a news story, into R using [tm]: textd <- Corpus(DirSource("other/docs"), readerControl = list(reader = readPlain)) I can pre-process each individual document using tolower(textd[[1]]); however, when I try to run tmTolower() I get a no such command
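
As a hedged illustration: in recent tm releases, transformations are applied to a whole corpus with `tm_map()`, and a base function such as `tolower` is wrapped in `content_transformer()`. The sketch below uses two made-up documents rather than the poster's 3228 files, and it may not match the tm version in use when the post was written.

```r
library(tm)

# Two invented documents standing in for the imported news stories.
textd <- VCorpus(VectorSource(c("Some News STORY", "Another Story")))

# Apply tolower to every document; content_transformer() keeps the
# result a proper text document instead of a bare character vector.
textd <- tm_map(textd, content_transformer(tolower))

print(content(textd[[1]]))
print(content(textd[[2]]))
```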

1.0-test79

2005 Jul 22

http://dovecot.org/test/ Now checks that field alignments in indexes are as expected. test78 crashed if they were wrong; earlier versions ignored the problem (and crashed on 64bit systems). Now if it's wrong, it prints an error to the log file and recreates the index. That means you probably should delete all dovecot.index files to avoid tons of errors in log files. Only mbox users

Invalid input in 'utf8towcs' when saving script file

2012 Jul 05

Hello, When I try to save my script file before closing the R console session I get this error. Error: invalid input 'C:\Documents and Settings\xxxx\xxxx\datafile' in 'utf8towcs' Does anyone know what can cause this error? I use the RGui (R version 2.14.0) in Windows and the problem appears when I try to re-save the script file. Using Save As and renaming it works. Kind

topicmodels error

2010 Oct 11

I try to fit an LDA model to a TermDocumentMatrix with the topicmodels package, but R says: > Error in LDA(TDM, k = k, method = "Gibbs", control = list(seed = SEED, : > x is of class "TermDocumentMatrix" "simple_triplet_matrix" > class(TDM) > [1] "TermDocumentMatrix" "simple_triplet_matrix" I tried to use a matrix, but it doesn't work: > MAT
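
One likely explanation, offered as an assumption since the reply is not part of this listing: `LDA()` in topicmodels expects documents in rows (a DocumentTermMatrix), not terms in rows. The sketch below shows only the conversion step with tm, using two invented documents; the `LDA()` call itself is left as a comment because it needs the topicmodels package and a realistic corpus.

```r
library(tm)

# Two invented documents; TDM mirrors the object name used in the post.
docs <- VCorpus(VectorSource(c("apples and oranges", "oranges and pears")))
TDM <- TermDocumentMatrix(docs)

# Transposing a TermDocumentMatrix yields a DocumentTermMatrix.
DTM <- t(TDM)

print(class(DTM))
print(dim(DTM))

# With topicmodels installed, the fit would then be attempted on DTM, e.g.
# LDA(DTM, k = k, method = "Gibbs", control = list(seed = SEED))
```

Building the matrix directly with `DocumentTermMatrix(docs)` would avoid the transpose altogether.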

package "tm" fails to remove "the" with remove stopwords

2009 Nov 12

I am using code that previously worked to remove stopwords using package "tm". Even manually adding "the" to the list does not work to remove "the". This package has undergone extensive redevelopment with changes to the function syntax, so perhaps I am just missing something. Please see my simple example, output, and sessionInfo() below. Thanks! Mark require(tm)

importing multiple file form folder

2012 May 28

Hi all, I have a set of files (which is growing) in a folder. The files are text files... The form of the files is such: ...with numbers for Length (m) going up to 2000 ... Anyway, I just need the data from the first two columns (Length (m) and Temperature (C)), and no data before that... These Length (m) values are always the same. My final dataset should look like this: column 1 as Length(m) ;

possible bug in "R Editor"

2012 May 31

Dear all, I clicked "File-New Script" to open an R Editor, typed some commands in it and then saved it to a file. If the location where I tried to save the script contained Chinese characters, R Editor complained, Error: invalid input 'E:\Some.Chinese.Characters\new_file.R' in 'utf8towcs' > sessionInfo() R version 2.15.0 (2012-03-30) Platform: i386-pc-mingw32/i386

Locale problem with umlauts in factor levels in 2.7.0 (patched) from grid or lattice

2008 May 01

With 2.7.0 patched (not tested with 2.0.0), I get an error message in a program that ran correctly in R 2.6.2 when the grouping factor of a stripplot contains an Umlaut: I am aware that there are a few locale-changes in R 2.7.0, but I could not easily locate who's at fault Dieter library(lattice) dt = data.frame(x=rnorm(100),y=1:100,levs= as.factor(c("Gru","Gr?")))

how to check the accuracy for maxent ?

2013 Oct 08

I was going through this example of maxent use: http://cran.r-project.org/web/packages/maxent/maxent.pdf # LOAD LIBRARY library(maxent) # READ THE DATA, PREPARE THE CORPUS, and CREATE THE MATRIX data <- read.csv(system.file("data/NYTimes.csv.gz",package="maxent")) corpus <- Corpus(VectorSource(data$Title[1:150])) matrix <- DocumentTermMatrix(corpus) # TRAIN/PREDICT

R hangs at NGramTokenizer

2013 Sep 26

Hi: I try to construct a Document-Term Matrix from a corpus. The commands I used are: > library(parallel) > library(tm) > library(RWeka) > library(topicmodels) > library(RTextTools) > cl=makeCluster(detectCores()) > invisible(clusterEvalQ(cl, library(tm))) > invisible(clusterEvalQ(cl, library(RWeka))) > invisible(clusterEvalQ(cl, library(topicmodels))) >

tm package- remove stowords failling

2010 Mar 31

Hi, I just noticed by inspecting the term matrix that not all stopwords are removed; does someone know how to fix that? library(tm) data("crude") d<-tm_map(crude, removeWords, stopwords(language='english')) dt<-DocumentTermMatrix(d,control=list(minWordLength=3, minDocFreq=2)) inspect( dt) I am using R version 2.10, tm package 0.5-3 cheers Welma
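
One common cause of leftover stopwords, given here as an assumption rather than a diagnosis of this post, is capitalization: the English stopword list is lower case, so "The" at the start of a sentence is not matched unless the text is lower-cased first. The sketch below uses one invented sentence and a recent tm release; the `wordLengths` control is what I believe newer versions use in place of `minWordLength`.

```r
library(tm)

# One invented document containing both "The" and "the".
d <- VCorpus(VectorSource("The oil price rose and the market reacted"))

# Lower-case first so that capitalized stopwords match the list.
d <- tm_map(d, content_transformer(tolower))
d <- tm_map(d, removeWords, stopwords("english"))

# Keep terms of at least three characters (assumed modern equivalent
# of the minWordLength = 3 setting in the post).
dt <- DocumentTermMatrix(d, control = list(wordLengths = c(3, Inf)))

print(Terms(dt))
```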

convert list to Dataframe

2009 Nov 01

Hi. I have a huge list called twitter: > dim(twitter) NULL > str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday..

Size of the term matrix and memory. tm package

2012 Dec 13

Hello everyone! I have some problems with the size of the term matrix I obtain. The commands I use are the following: # load libraries library(tm) library(wordcloud) library(Rstem) library(Snowball) # read the UTF-8 document and convert it to ASCII txt <-

making my own group repo - Re: Back to Xfce

2018 Aug 06

I tried making my own repo with just the groups in them and I got an error.... I used: #createrepo -g /root/mygroups.xml /root/myrepo Saving Primary metadata Saving file lists metadata Saving other metadata Generating sqlite DBs Sqlite DBs complete Where /root/mygroup.xml is attached below. Then I made: cat > /etc/yum.repos.d/myrepo.repo << EOF [myrepo] name=My repo for armhfp

Back to Xfce

2018 Aug 06

On 08/06/2018 11:51 AM, Tony Schreiner wrote: > On Mon, Aug 6, 2018 at 11:33 AM Robert Moskowitz <rgm at htt-consult.com> > wrote: > >> >> On 08/06/2018 11:11 AM, Tony Schreiner wrote: >>> On Mon, Aug 6, 2018 at 10:55 AM Robert Moskowitz <rgm at htt-consult.com> >>> wrote: >>> >>>> Nicolas, >>>> >>>>

text mining

2011 May 26

Hi, how can I import a document of type "txt" using the tm package? I need to know the command for this, given that my document is not located in the tm package library. Thanks. -- View this message in context: http://r.789695.n4.nabble.com/text-mining-tp3552221p3552221.html Sent from the R help mailing list archive at Nabble.com.
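
A minimal sketch of reading plain-text files from an arbitrary folder with tm, assuming a recent release of the package. The folder and the two files are created in a temporary directory purely for illustration; in practice `doc_dir` would point at the folder that holds the real .txt files.

```r
library(tm)

# Create a throwaway folder with two small .txt files (illustration only).
doc_dir <- file.path(tempdir(), "tm_txt_example")
dir.create(doc_dir, showWarnings = FALSE)
writeLines("first news story", file.path(doc_dir, "a.txt"))
writeLines("second news story", file.path(doc_dir, "b.txt"))

# DirSource() lists the files in the folder; the pattern keeps only .txt.
corpus <- VCorpus(DirSource(doc_dir, pattern = "\\.txt$"))

print(length(corpus))
print(content(corpus[[1]]))
```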

R results explanation

2011 Jun 07

Hi all, this might be a stupid question, but still. Every time I find some new function it's pretty easy to understand how to use the syntax and to perform a test. Even the general idea of what the function does is pretty easy to understand, but I cannot find an explanation (detailed explanation) of the R output for each function. For example, a function fitdistr() in MASS package i

Invalid input error in tm package

2010 Jan 22

Hello, I am working on "tm" package. I have 2 pdf files saved in the directory D:/Files I issued the following commands (marked in red bold) for which I got some errors and warnings (marked in bold) *surgj <- Corpus(DirSource("D:/Files"), readerControl = list(language = "ansi"))* *Warning messages: 1: In readLines(y, encoding = x$Encoding) : incomplete final
