Displaying 20 results from an estimated 100 matches similar to: "tm::stemDocument function not work"
2003 Sep 07
4
data manipulation
Hi,
I am new to R, coming from a few years using Stata. I've been twisting my
brain and checking several R and S references over the last few days to
try to solve this data management problem: I have a data set with a unique
patient identifier that is repeated along multiple rows, a variable with
month of patient encounter, and a continuous variable for cost of
individual encounters. The data
2008 Jun 09
2
converting a data set to a format for time series analysis
I currently have a data set describing human subjects enrolled into an
international clinical trial, the name of the hospital enrolling this
human subject, the date when the subject was enrolled, and a vector
with variables representing characteristics of the site (e.g., number
of beds in a hospital). My data set looks like this:
subject hospital date_enrollment hospital_beds
1 hospitalA
2002 Jun 27
2
large survey data set
Hello,
I am analyzing a weighted, stratified, clustered survey data set
with approximately 1 million observations and 50 variables.
I am new to R (I'm a Stata user), and so far
couldn't find any documentation on how to handle survey data. In
other words, is there a specific package to handle a combination of
weights, clusters and strata. I am also
2012 Apr 13
4
Help with stemDocument
Hi, All:
I am new to R and tm package. I'm trying to do the stemming using tm_map()
and it doesn't seem to work:
*I used:*
> stemDocument(t_cmts[[100]])
*Where t_cmts is the corpus object, the result is:*
bottle loose box abt airpak sections top plastic bottle squashed nearly
flush neck previous shipments bottle wrapped securely bubble wrap wno
bottle damage packaging poor
2007 Oct 21
4
Input appreciated: R teaching idea + a way to improve R-wiki
Hi all,
I will be teaching a graduate-level course on R at CU Boulder next
semester. I have a teaching idea that might also help improve the R
wiki page... I wanted to know what you all thought of it and wanted to
solicit some advice about doing it.
During the latter part of the course, students will choose a topic of
interest (e.g., hierarchical linear modeling), and show how to achieve
it in
2011 Nov 04
1
Help: stemming and stem completion with package tm in R
Hi All
I came across a problem below when doing stemming and stem completion
with package tm in R. Word "mining" was stemmed to "mine" with
stemDocument(), and then completed to "miners" with stemCompletion().
However, I prefer to keep "mining" intact.
For stemCompletion(), the default type of completion is "prevalent",
which takes the most
2006 May 19
1
factor analysis - discrepancy in results from R vs. Stata
Hi,
I found a discrepancy between results in R and Stata for a factor analysis
with a promax rotation. For Stata:
. *rotate, factor(2) promax*
(promax rotation)
Rotated Factor Loadings
Variable | 1 2 Uniqueness
-------------+--------------------------------
pfq_amanag~y | -0.17802 0.64161 0.70698
pfq_bwalk_~ø | 0.72569 0.05570
2012 Jan 13
4
Troubles with stemming (tm + Snowball packages) under MacOS
Dear all,
I have some trouble using the stemming algorithm provided by the tm
(text mining) + Snowball packages.
Here is my config:
MacOS 10.5
R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions)
I have installed all the needed packages (tm, rJava, RWeka, Snowball)
+ dependencies. I have deactivated AWT (as written in
2012 Jun 25
2
rrdf package for mac not working
rrdf is incredibly helpful, but I've noticed that the rrdf package for Mac
hasn't been working for some time: http://goo.gl/5Ukpn . I'm wondering if there
is still a plan to maintain it in the long run, or if there is some other
alternative for reading RDF files.
2007 Oct 08
2
R and FDA trials
Yesterday I just noticed the new document on R and regulatory aspects
for biomedical research posted at
http://www.r-project.org/doc/R-FDA.pdf
Coming from an institution that performs a large number of clinical
trials for FDA and being an advocate of R myself, I have found that
the following issues usually come up when discussing the use of R for
FDA trials:
1. Most FDA submissions come down to
2011 Mar 24
2
Problem with Snowball & RWeka
Dear Forum,
when I try to use SnowballStemmer() I get the following error message:
"Could not initialize the GenericPropertiesCreator. This exception was
produced: java.lang.NullPointerException"
It seems to have something to do with either Snowball or RWeka; however, I
can't figure out what to do myself. If you could spend 5 minutes of your
valuable time to help me or give me a
2009 Nov 12
2
package "tm" fails to remove "the" with remove stopwords
I am using code that previously worked to remove stopwords using package
"tm". Even manually adding "the" to the list does not work to remove "the".
This package has undergone extensive redevelopment with changes to the
function syntax, so perhaps I am just missing something.
Please see my simple example, output, and sessionInfo() below.
Thanks!
Mark
require(tm)
2011 Jun 04
1
Problem with Snowball & RWeka
I too have this problem. Everything worked fine last year, but after
updating R and packages I can no longer do word stemming.
Unfortunately, I didn't save the old binaries, otherwise I would just
revert back.
Hoping someone finds a solution for R on Windows. Thanks!
There is a potential solution for R on Mac OS from Kurt Hornik copied
below, but I cannot get this to work on Windows.
2012 Jan 27
2
tm package: handling contractions
I tried making a wordcloud of Obama's State of the Union address using
the tm package to process the text
sotu <- scan(file="c:/R/data/sotu2012.txt", what="character")
sotu <- tolower(sotu)
corp <- Corpus(VectorSource(paste(sotu, collapse=" ")))
corp <- tm_map(corp, removePunctuation)
corp <- tm_map(corp, stemDocument)
corp <- tm_map(corp,
2012 Dec 13
2
Size of the term matrix and memory. TM package
Hello everyone!
I have some problems with the size of the term matrix I obtain. The commands I use are the following:
# load libraries
library(tm)
library(wordcloud)
library(Rstem)
library(Snowball)
# read the UTF-8 document and convert it to ASCII
txt <-
2014 Jun 17
2
It is not a tm problem: your doc.corpus is empty
It is not a problem with tm, SnowballC, or mclapply (judging by the path
you are using Linux; on Windows, according to the manual, mclapply does not work well).
You are not defining the objects you pass correctly. You pass doc.corpus instead of
corpus (or you assign to corpus instead of to doc.corpus).
Debug your programs whenever an object error appears, as indicated in the
error you posted.
For now you have it fixed in
2014 Jun 18
2
It is not a tm problem: your doc.corpus is empty
I think what you want to do needs this line of code right after
loading the tm package:
inmortal = unlist(strsplit(inmortal, " ", fixed = T))
This way, you work with words, and NOT with whole sentences...
Regards
Isidro Hidalgo Arellano
Observatorio Regional de Empleo
Consejería de Empleo y Economía
http://www.jccm.es
> -----Original Message-----
> From:
2014 Jun 18
3
It is not a tm problem: your doc.corpus is empty
Thank you very much, Isidro,
tonight I will reinstall R and let you know whether it worked. Forgive my newbie
ignorance, but I did not quite understand the part about notifying the developer.
I take it that means the package developers, right?
Regards!
ruben
On 18 June 2014 at 13:10, Isidro Hidalgo <ihidalgo@jccm.es> wrote:
> I have now seen that it doesn't work that way either.
> What I can tell you is that it has let me
2006 Dec 15
0
Machine accounts keep expiring
Hi,
I have a problem with Samba and LDAP; it's the first time that Samba
has worked so badly.
I set up a network with Samba and a few Windows clients. For the past four
months (the network was set up in January), every 10 to 12 days the
workstations drop out of the domain.
Users can't log in, and when I try logging in as administrator it asks me
to change its password.
So I must unjoin the
2013 Oct 04
2
Possible POSIXlt / wday glitch & bugs.r-project.org status
Wanted to raise two questions:
1. Is bugs.r-project.org down? I haven't been able to reach it for two or three days:
```
ping bugs.r-project.org
PING rbugs.research.att.com (207.140.168.137): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5