similar to: How to read plain text documents into a vector?

Displaying 20 results from an estimated 3000 matches similar to: "How to read plain text documents into a vector?"

2015 Apr 10
3
Loop over many data frames
Hi everyone! I am working on a text mining project and, because of the resources I have available, I had to split the project's input text files into many small files. After turning each of these files into a separate corpus, I can clean each corpus, look for n-grams, build each termDocumentMatrix and finally combine everything into a single TDM. But
2009 Oct 02
1
text mining
The following code is derived from a paper titled "Text Mining Infrastructure in R" (http://www.jstatsoft.org/v25/i05/paper). The example below seems to load some default documents for analysis, some sort of Latin document. I cannot for the life of me figure out how to load my own document, let alone an entire corpus. I have searched the above document as well as related documentation.
2015 Apr 10
5
Loop over many data frames
Jorge, thanks for the advice. Apparently I am not applying it correctly, because the object I get does not contain what I want. To explain: when I run txt <- vector('list', length = length(names)) #names is the vector where I had already stored the list of txt's for(i in seq_along(txt)){ txt[[i]] <- Corpus(VectorSource(names[i])) } I get the object txt: > class(txt) [1]
2009 Jul 17
3
Help with the text mining package (TM)
Dear all, I am writing to ask the following: I am doing a text mining project and need to import a series of texts in order to preprocess them, that is, remove stopwords, apply stemming, remove punctuation, etc. I can do the latter with the datasets that ship with the TM library. What I cannot manage is to import text from some source, even though there are functions
2009 Jan 15
1
How to Solve the Error (error: cannot allocate vector of size 1.1 Gb)
Hi, Gurus. Thanks to your help, I have managed to start using the text mining package "tm" in R under Windows XP. However, while running the tm package, I ran into another problem, this one with memory. What is the best way to solve this memory problem: adding physical RAM, or some other approach? ############################### ###### my R
2015 Apr 12
2
Loop over many data frames
Jorge, dear R-help contributors, I have been trying to use a script for one of the steps in my analysis, which is to transform each of the corpora in my workspace into a TermDocumentMatrix object. I have a vector called bNames that lists all the corpora I want to convert to TDMs, and I wrote the following commands: tdm.n1 <- vector('list', length = length(bNames))
2011 May 26
3
text mining
Hi, how can I import a document of type "txt" using the package tm? I need the command for the case where my document is not located in the tm package library. Thanks.
2009 Jan 09
1
[R] how to build TermDocMatrix in tm text mining package of R
Howdy Gurus, I'd like to ask a question about how to build a TermDocMatrix in the tm text mining package. It is not clear to me how to import a plain text file and then convert that text file into a TermDocMatrix, etc. How can I build a TermDocMatrix of a plain text document file for text association? Or are there any good manuals? Thank you in advance, -- Kum-Hoe Hwang, Ph.D.
2012 May 29
1
package tm: reading XML files
Dear fellow R users, I'm using the package tm for text mining, and have a problem with reading in a corpus from XML files. When I copy the example from "Introduction to the tm package" of the small reuters subset "crude", everything goes well, and I get a corpus with the required meta data. When I read in the entire reuters21578 corpus in XML format however (or a
2011 Sep 05
0
Stemming functions only work on the last word of plain text documents
Hello, I want to use the SnowballStemmer on a collection of plain text documents. However, when I apply it to my corpus using the tm_map function, it only stems the last word of each document (the problem is the same for wordStem, and stemDocument does not work at all). An example: > path <- c("c:\path\to\directory") # collection of plain text documents > corp <-
2003 Nov 08
2
malloc errors? out of memory with many files on HP-UX
Hi, folks. I've started getting these errors from rsync, and any help would be appreciated: >ERROR: out of memory in string_area_new buffer >rsync error: error allocating core memory buffers (code 22) at util.c(115) >ERROR: out of memory in string_area_new buffer >rsync error: error allocating core memory buffers (code 22) at util.c(115) >ERROR: out of memory in
2014 Jul 28
2
wordcloud and word table
Hello, The reference you included (thanks for providing it) is quite clear and easy to follow. Were you able to apply the same logic to your two speeches? The way to clear up doubts, to begin with, is for you to attach the code you are using so we can see whether there is an obvious error. The proper way for us to help you, though, is with a reproducible example: code + data.
2011 May 23
6
Reading Data from mle into excel?
Hi there, I ran the following code: vols=read.csv(file="C:/Documents and Settings/Hugh/My Documents/PhD/Swaption vols.csv" , header=TRUE, sep=",") X<-ts(vols[,2]) #X dcOU<-function(x,t,x0,theta,log=FALSE){ Ex<-theta[1]/theta[2]+(x0-theta[1]/theta[2])*exp(-theta[2]*t) Vx<-theta[3]^2*(1-exp(-2*theta[2]*t))/(2*theta[2]) dnorm(x,mean=Ex,sd=sqrt(Vx),log=log) }
2007 Aug 27
3
rsync out of memory at 8 MB although ulimit is 512MB
Hello again, I encountered something amazing. At first I thought there was not enough memory allowed through ulimit. ulimit is now set to (almost) 512MB, but rsync still runs out of memory at 8MB. Can anyone tell me why? This is my configuration: rsync version 2.6.2 from AIX 5.3 to SuSE Linux 9 (also has rsync 2.6.2) ulimit -a (AIX) ulimit -a AIX (source): -------------------------
2014 Jul 22
2
Help: Error in `colnames<-`(`*tmp*`, value = c(
Good afternoon, group. I am trying to compare two files from the same organization to find the differences between its report on the topic from 2005 and the one from 2013. All the commands work fine except the last one, "colnames", as shown in the following sequence: > pdf1<-"./PLAN de INSPECCIONES/05_seguridad_ciudadana.pdf" >
2009 Aug 13
1
using package tm to find phrases
I am using the package "tm" for text-mining of abstracts and would like to use it to find instances of gene names that may contain white space. For instance "gene regulatory protein 1". The default behavior of tm is to parse this into 4 separate words, but I would like to use the class constructor "dictionary" to define phrases such as just mentioned. Is this
2008 Aug 19
1
rsync hangs after aborting a process
Greetings. In testing an rsync backup script I'd created, I made a mistake and aborted the running script with a ctrl-C keyboard interrupt. The command that was running at the time was as follows: ${RSYNC_CMD} -aNHAXx --protect-args --fileflags --force-change --rsync-path="/usr/local/bin/rsync" <username>@<my.server.com>:${CPY_SRC} ${CPY_DEST} The expected data
2003 Jun 11
2
rsync limit to file size/file count
Hi, What are the limits to file size and file count when doing a rsync transfer using 2.5.6? I was trying to rsync about 500 GB of data with many files and many directories, but it has been stuck building the file list for several hours. First of all, is it possible to transfer 500 GB of data? Secondly, what would the limit for file count be when doing a rsync transfer? Any comments or help
2014 Jul 29
2
wordcloud and word table [Making progress]
Good afternoon, group. Kind regards, Carlos J., and many thanks for your guidance. Indeed, I had realized that the reason colnames was not being applied was that there were no columns. The thing is, I cannot fully/clearly see at what point in the corpus creation process this can be done. However, following the example of
2009 Aug 20
1
Creating a list of combinations
Dear R Users, I have 120 objects stored in R's memory and I want to pass the names of these objects to be held as a single object. The naming convention is month, year in sequence for all months from January 1986 to December 1995 (e.g. Jan86, Feb86, Mar86... through to Dec95). I hope to pass all these names (and their data I guess) to an object called file_list, however,