Displaying 14 results from an estimated 14 matches for "vcorpus".

2009 Sep 15
2
S3 objects in S4 slots
..., I am the maintainer of the stringkernels package and have come across a problem with using S3 objects in my S4 classes. Specifically, I have an S4 class with a slot that takes a text corpus as a list of character vectors. tm (version 0.5) saves corpora as lists with a class attribute of c("VCorpus", "Corpus", "list"). I don't actually need the class-specific attributes, I only care about the list itself. Here's a simplified example of my problem: > setClass("testclass", representation(slot="list")) [1] "testclass" > a = l...
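One possible workaround for the S3-in-S4 question above, sketched with base R only: since the poster says only the list itself matters, the S3 class attribute can be dropped with unclass() before the object goes into the slot. The slot name `docs` and the stand-in object `fake_corpus` are made up for illustration; nothing here is taken from the thread's replies or from the stringkernels package.

```r
library(methods)

# S4 class with a slot that expects a plain list (slot name is hypothetical)
setClass("testclass", slots = c(docs = "list"))

# Stand-in for a tm corpus: a plain list carrying the S3 class vector
# quoted in the post. A real corpus would come from the tm package.
fake_corpus <- structure(
  list("first text", "second text"),
  class = c("VCorpus", "Corpus", "list")
)

# unclass() removes the S3 class attribute and leaves the underlying list,
# which satisfies the "list" slot
obj <- new("testclass", docs = unclass(fake_corpus))
```

The trade-off is that the stored value no longer dispatches tm methods, which matches the stated requirement of keeping only the list.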
2015 Apr 10
5
Loop over many data frames
...h = length(names)) # names is the vector where I had already stored the list of txt files for(i in seq_along(txt)){ txt[[i]] <- Corpus(VectorSource(names[i])) } I get the object txt: > class(txt) [1] "list" If I extract only the first object of that list: > txt[[1]] <<VCorpus (documents: 1, metadata (corpus/indexed): 0/0)>> If I want to see the content of the first corpus: > inspect(txt[[1]]) <<VCorpus (documents: 1, metadata (corpus/indexed): 0/0)>> [[1]] <<PlainTextDocument (metadata: 7)>> qB001.txt it tells me things about the object, but...
2015 Apr 12
2
Loop over many data frames
...or(i in seq_along(txt)){ >> txt[[i]] <- Corpus(VectorSource(names[i])) >> } >> >> I get the object txt: >> > class(txt) >> [1] "list" >> >> If I extract only the first object of that list: >> > txt[[1]] >> <<VCorpus (documents: 1, metadata (corpus/indexed): 0/0)>> >> >> If I want to see the content of the first corpus: >> >> > inspect(txt[[1]]) >> <<VCorpus (documents: 1, metadata (corpus/indexed): 0/0)>> >> >> [[1]] >> <<PlainTextDocument (m...
2009 Nov 01
4
convert list to Dataframe
..." .. .. ..- attr(*, "names")= chr "LOGNAME" ..$ Children: NULL ..- attr(*, "class")= chr "MetaDataNode" - attr(*, "DMetaData")='data.frame': 1 obs. of 1 variable: ..$ MetaID: num 0 - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list" It contains tweets but in many languages. The "columns" are separated by semi-colons. I am using the tm package and it is a "corpus". It looks like this: 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1 day :p;Huddersfield/Linco...
2013 Jan 15
0
Function failure in tm
Hi all: I have a customized source reader for the package tm (that Milan Bouchet-Vallat has been instrumental in producing). I can get it to produce a corpus of class: "VCorpus" "Corpus" "list" class(mycorp[1]) returns "VCorpus" "Corpus" "list" and class(mycorp[[1]]) returns "PlainTextDocument" "TextDocument" "character" But now that I've got a corpus, none of the t...
2010 Jan 25
2
tm installation (PR#14193)
...aded 317 Kb * Installing *source* package 'tm' ... ** libs gcc -std=gnu99 -I/usr/share/R/include -fpic -g -O2 -c lazyTmMap.c -o lazyTmMap.o gcc -std=gnu99 -shared -o tm.so lazyTmMap.o -L/usr/lib/R/lib -lR ** R ** data ** inst ** preparing package for lazy loading Error in setOldClass(c("VCorpus", "Corpus", "list")) : inconsistent old-style class information for "list"; the class is defined but does not extend "oldClass" Error : unable to load R code in package 'tm' ERROR: lazy loading failed for package 'tm' * Removing '/home/ctelmo/R...
2011 Nov 17
3
merging corpora and metadata
...5 9439 ... ..$ selend : num [1:17] 2518 9702 2577 8881 10102 ... ..$ fname : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ... - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list" Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul -- Henri-Paul Indiogine Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiogine at gmail.com Skype: hindiogine Websit...
2010 Aug 17
1
TM Package - Corpus function - Memory Allocation Problems
I'm using R 2.11.1 on Win XP (32-bit) with 3 GB of RAM. My data set is (only) 16.0 MB. I want to create a VCorpus object using the Corpus function in the tm package, but I'm running into memory allocation issues: "Error: cannot allocate vector of size 372 Kb". My data is stored in a csv file which I've imported with "read.csv" and then used the following to create the Corpus (but i...
2011 May 20
1
DocumentTermMatrix - text mining
...I would like to do some text mining on it to possibly find some patterns between Opis, ACKlasifikacija and Vodja. I looked over the tm package, which looks promising, more specifically DocumentTermMatrix or TermDocumentMatrix. But I cannot figure out how to change my data from data.frame to Corpus or VCorpus. Globina ACKlasifikacija Opis GlobinaOd GlobinaDo Vodja 3671 8 GP SLABO GRADUIRAN PEŠČEN PROD DO r = 70 mm, PREVLADUJE DO r = 30 mm, GOST, SI...
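A minimal sketch of one way to get from a data.frame to a VCorpus, assuming the text lives in a character column (here `Opis`, as in the post) and that the tm package is installed. The two sample rows are invented for illustration; VectorSource() treats each element of the vector as one document.

```r
library(tm)

# Invented sample data; only the column names come from the post
df <- data.frame(
  Opis = c("gravel, poorly graded", "sandy clay, dense"),
  Vodja = c("A", "B"),
  stringsAsFactors = FALSE
)

# Each element of the character vector becomes one document.
# as.character() guards against the column having been read as a factor.
corp <- VCorpus(VectorSource(as.character(df$Opis)))

# One row per document, one column per term
dtm <- DocumentTermMatrix(corp)
```

The other columns (such as `Vodja`) are not carried over by this sketch; they would have to be attached as metadata separately if they are needed for filtering.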
2015 Apr 10
3
Loop over many data frames
Hello everyone! I'm working on a text mining project and, because of the resources available to me, I had to split the project's input text files into many small files. After turning each of these files into a separate corpus, I can clean each corpus, search for n-grams, build each TermDocumentMatrix, and finally combine everything into a single TDM. But
2011 May 18
0
text mining problem using TM package
Hi, I’m using R (TM package) for text mining and I’m having problems filtering articles out of my data set by local meta data. Here is the code: data <- ("C:/… /19970331") rs <- ReutersSource(data, encoding = "UTF-8") RC <- VCorpus(DirSource(data), readerControl = list(reader = readRCV1asPlain, language = "en_US", load = TRUE), dbControl = list(useDb = TRUE, dbName = "texts.db", dbType = "DB1")) tm_index(RC, FUN = sFilter, doclevel = F, useMe...
2012 Jan 08
2
cannot find package in Packages>>Install Packages
...")= chr "LOGNAME" >> ..$ Children: NULL >> ..- attr(*, "class")= chr "MetaDataNode" >> - attr(*, "DMetaData")='data.frame': 1 obs. of 1 variable: >> ..$ MetaID: num 0 >> - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list" >> >> It contains tweets but in many languages. The "columns" are separated by >> semi-colons. I am using the tm package and it is a "corpus". >> >> It looks like this: >> > > It is difficult to...
2014 Jul 28
2
wordcloud and word table
Hello, The reference you included (thanks for providing it) is quite clear and easy to follow. Were you able to apply the same logic to your two speeches? To clear up any doubts, a first step would be to attach the code you are using, to see whether there is an obvious error. The proper way for us to help you, though, is with a reproducible example: code + data.
2014 Jul 29
2
wordcloud and word table [Making progress]
...") >>> info.05<-iconv(enc2utf8(info.05), sub="byte") >>> info.13<-iconv(enc2utf8(info.13), sub="byte") >>> informes<-c(info.05, info.13) >>> corpus<-Corpus(VectorSource(informes)) >>> inspect(corpus[1:2]) >> <<VCorpus (documents: 2, metadata (corpus/indexed): 0/0)>> >> >> [[1]] >> <<PlainTextDocument (metadata: 7)>> >> Derecho a la seguridad ciudadana. Toda persona tiene derecho a la >> protección del Estado a través de los órganos de seguridad ciudadana >> r...