similar to: DocumentTermMatrix error

Displaying 20 results from an estimated 110 matches similar to: "DocumentTermMatrix error"

2010 Nov 02
2
count different words in a field
Hi all, I started to ask this in the other post, but it is off topic...so here it is again. I have a data.frame (created with the help of this mailing list) that looks like this: 'data.frame': 22801 obs. of 15 variables: $ V1 : chr "HUMUS" "SLABO" "MALO" "SLABO" ... $ V2 : chr "IN" "GRANULIRAN"
2011 May 20
1
DocumentTermMatrix - text mining
Hi All, I have a data.frame that looks like the one below. I would like to do some text mining on it to possibly find some patterns between Opis, ACklasifikacija and Vodja. I looked over the tm package, which looks promising, more specifically DocumentTermMatrix or TermDocumentMatrix. But I cannot figure out how to change my data from a data.frame to a Corpus or VCorpus. Globina
2011 Feb 10
2
Help using "tm" text mining package - preprocessing
Thanks all for your help. I fear text mining is an abstract little corner of "R". I have imported 3228 text (.txt) files, each a news story, into R using [tm]: textd <- Corpus(DirSource("other/docs"), readerControl = list(reader =readPlain)) I can pre-process each individual document using tolower(textd[[1]]) however, when I try to run tmTolower() I get a no such command
2005 Jul 22
5
1.0-test79
http://dovecot.org/test/ Now checks that field alignments in indexes are as expected. test78 crashed if they were wrong; earlier versions ignored the problem (and crashed on 64-bit systems). Now if an alignment is wrong, it prints an error to the log file and recreates the index. That means you should probably delete all dovecot.index files to avoid tons of errors in log files. Only mbox users
2012 Jul 05
1
Invalid input in 'utf8towcs' when saving script file
Hello, When I try to save my script file before closing the R console session I get this error. Error: invalid input 'C:\Documents and Settings\xxxx\xxxx\datafile' in 'utf8towcs' Does anyone know what can cause this error? I use the RGui (R version 2.14.0) in Windows and the problem appears when I try to re-save the script file. Using Save As and renaming it works. Kind
2010 Oct 11
2
topicmodels error
I am trying to fit an LDA model to a TermDocumentMatrix with the topicmodels package... but R says: > Error in LDA(TDM, k = k, method = "Gibbs", control = list(seed = SEED, : > x is of class “TermDocumentMatrix”“simple_triplet_matrix” > class(TDM) > [1] "TermDocumentMatrix" "simple_triplet_matrix" I tried to use a matrix... but it doesn't work: > MAT
2009 Nov 12
2
package "tm" fails to remove "the" with remove stopwords
I am using code that previously worked to remove stopwords using package "tm". Even manually adding "the" to the list does not work to remove "the". This package has undergone extensive redevelopment with changes to the function syntax, so perhaps I am just missing something. Please see my simple example, output, and sessionInfo() below. Thanks! Mark require(tm)
2012 May 28
6
importing multiple files from folder
Hi all, I have a set of files (which is growing) in a folder. The files are text files... The form of the files is such: ...with numbers for Length (m) going up to 2000 ... Anyway...I just need the data from the first two columns (Length (m) and Temperature (C)), and no data before that... These Length (m) values are always the same. My final dataset should look like this: column 1 as Length(m) ;
2012 May 31
1
possible bug in "R Editor"
Dear all, I clicked "File-New Script" to open an R Editor, typed some commands in it and then saved it to a file. If the location where I tried to save the script contained Chinese characters, R Editor complained: Error: invalid input 'E:\Some.Chinese.Characters\new_file.R' in 'utf8towcs' > sessionInfo() R version 2.15.0 (2012-03-30) Platform: i386-pc-mingw32/i386
2008 May 01
1
Locale problem with umlauts in factor levels in 2.7.0 (patched) from grid or lattice
With 2.7.0 patched (not tested with 2.0.0), I get an error message in a program that ran correctly in R 2.6.2 when the grouping factor of a stripplot contains an Umlaut: I am aware that there are a few locale-changes in R 2.7.0, but I could not easily locate who's at fault Dieter library(lattice) dt = data.frame(x=rnorm(100),y=1:100,levs= as.factor(c("Gru","Gr?")))
2013 Oct 08
1
how to check the accuracy for maxent ?
I was going through this example of maxent use: http://cran.r-project.org/web/packages/maxent/maxent.pdf # LOAD LIBRARY library(maxent) # READ THE DATA, PREPARE THE CORPUS, and CREATE THE MATRIX data <- read.csv(system.file("data/NYTimes.csv.gz",package="maxent")) corpus <- Corpus(VectorSource(data$Title[1:150])) matrix <- DocumentTermMatrix(corpus) # TRAIN/PREDICT
2013 Sep 26
0
R hangs at NGramTokenizer
Hi: I am trying to construct a Document-Term Matrix from a corpus. The commands I used are: > library(parallel) > library(tm) > library(RWeka) > library(topicmodels) > library(RTextTools) > cl=makeCluster(detectCores()) > invisible(clusterEvalQ(cl, library(tm))) > invisible(clusterEvalQ(cl, library(RWeka))) > invisible(clusterEvalQ(cl, library(topicmodels))) >
2010 Mar 31
1
tm package - removing stopwords failing
Hi, I just noticed, by inspecting the term matrix, that not all stopwords are removed. Does someone know how to fix that? library(tm) data("crude") d<-tm_map(crude, removeWords, stopwords(language='english')) dt<-DocumentTermMatrix(d,control=list(minWordLength=3, minDocFreq=2)) inspect( dt) I am using R version 2.10, tm package 0.5-3 cheers Welma [[alternative HTML
2009 Nov 01
4
convert list to Dataframe
Hi. I have a huge list called twitter: > dim(twitter) NULL > str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday..
2012 Dec 13
2
Size of the term matrix and memory. TM package
Hello everyone! I have some problems with the size of the term matrix I get. The commands I use are the following: # load libraries library(tm) library(wordcloud) library(Rstem) library(Snowball) # read the UTF-8 document and convert it to ASCII txt <-
2018 Aug 06
0
making my own group repo - Re: Back to Xfce
I tried making my own repo with just the groups in them and I got an error.... I used: #createrepo -g /root/mygroups.xml /root/myrepo Saving Primary metadata Saving file lists metadata Saving other metadata Generating sqlite DBs Sqlite DBs complete Where /root/mygroup.xml is attached below. Then I made: cat > /etc/yum.repos.d/myrepo.repo << EOF [myrepo] name=My repo for armhfp
2018 Aug 06
2
Back to Xfce
On 08/06/2018 11:51 AM, Tony Schreiner wrote: > On Mon, Aug 6, 2018 at 11:33 AM Robert Moskowitz <rgm at htt-consult.com> > wrote: > >> >> On 08/06/2018 11:11 AM, Tony Schreiner wrote: >>> On Mon, Aug 6, 2018 at 10:55 AM Robert Moskowitz <rgm at htt-consult.com> >>> wrote: >>> >>>> Nicolas, >>>> >>>>
2011 May 26
3
text mining
Hi, how can I import a document whose type is ".txt" using the package tm? It is the command to know that my document is not placed in the library package tm. Thanks.
2011 Jun 07
1
R results explanation
Hi all, this might be a stupid question, but still. Every time I find some new function it's pretty easy to understand how to use the syntax and to perform a test. Even the general idea of what the function does is pretty easy to understand, but I cannot find an explanation (detailed explanation) of the R output for each function. For example, a function fitdistr() in the MASS package I
2010 Jan 22
1
Invalid input error in tm package
Hello, I am working with the "tm" package. I have 2 PDF files saved in the directory D:/Files. I issued the following commands (marked in red bold), for which I got some errors and warnings (marked in bold): *surgj <- Corpus(DirSource("D:/Files"), readerControl = list(language = "ansi"))* *Warning messages: 1: In readLines(y, encoding = x$Encoding) : incomplete final