Displaying 20 results from an estimated 300 matches similar to: "Stemming functions only work on the last word of plain text documents"
2011 Nov 17
3
merging corpora and metadata
Greetings!
I lose all my metadata after concatenating corpora. This is an
example of what happens:
> meta(corpus.1)
  MetaID cid fid selfirst selend fname
1      0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2      0   1  14     9189   9702 WCPD-2003-01-13-Pg39.scrb
3      0   1  14     2109   2577 WCPD-2003-01-13-Pg39.scrb
....
....
17 0
2010 Apr 23
2
Library (tm) Error: could not find function "TermDocMatrix".
Hi List
I have the following code and error. I have tried other code and I get
the same problem.
> reut21578 <- system.file("texts", "crude", package = "tm")
> (r <- Corpus(DirSource(reut21578), readerControl = list(reader = readReut21578XMLasPlain)))
A corpus with 20 text documents
> (r <- Corpus(DirSource(reut21578), readerControl =
2009 Nov 01
4
convert list to Dataframe
Hi. I have a huge list called twitter:
> dim(twitter)
NULL
> str(twitter)
List of 1
$ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic
[1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For
Governance From Campaigner-in-chief: President obama jumps campaign 09
tuesday..
2010 Jan 22
1
Invalid input error in tm package
Hello,
I am working with the "tm" package.
I have 2 PDF files saved in the directory D:/Files
I issued the following commands (marked in red bold) for which I got some
errors and warnings (marked in bold)
*surgj <- Corpus(DirSource("D:/Files"), readerControl = list(language =
"ansi"))*
*Warning messages:
1: In readLines(y, encoding = x$Encoding) :
incomplete final
2010 Feb 18
2
Rearranging data
Hi!
I have just started learning R and joined this group only today. This is my first mail, and I wish to thank all of you for allowing me to be part of this group.
I have the following problem. I have an input.csv file such that
corp_id date investment_id rate
corp1 17-Feb 1 65
corp1 16-Feb 1 70
corp1 15-Feb
2010 Aug 09
1
TM Package - installation
Hi All,
I have been trying to do some text analytics in R using the tm package. I have
installed and loaded the package, along with its dependencies (slam,
RWeka, rJava). When I try to run a tm_map command, it gives me the error "Error in
.jnew(name) :
java.lang.NoClassDefFoundError: weka/core/stemmers/SnowballStemmer".
Can someone please shed some light on this? Am I missing something?
2012 Jan 13
4
Troubles with stemming (tm + Snowball packages) under MacOS
Dear all,
I am having trouble with the stemming algorithm provided by the tm
(text mining) + Snowball packages.
Here is my config:
MacOS 10.5
R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions)
I have installed all the needed packages (tm, rJava, RWeka, Snowball)
+ dependencies. I have deactivated AWT (as described in
2012 Jan 08
2
cannot find package in Packages>>Install Packages
Hi. I am trying to install a package called DMwR
http://cran.r-project.org/web/packages/DMwR/index.html
located here:
http://cran.r-project.org/bin/windows/contrib/r-release/DMwR_0.2.1.zip
on windows 7.
I am using R 2.10.1.
I also tried typing something like this but it did not work well.
install.packages(c("
http://cran.r-project.org/bin/windows/contrib/r-release/DMwR_0.2.1.zip
2020 May 19
5
FTS-lucene errors : language not available for stemming
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs.
Errors:
May 19 05:05:16 indexer-worker(gessel at blackrosetech.com)<62971><aPAEI3zLw17A/QAA0J78UA:EF25M3zLw1779QAA0J78UA>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming
May 19 05:05:16
2009 Jul 07
1
wordStem problems in R 2.9, Fedora 11; Linux Kernel 2.6.29.5-191.fc11.i586
Dear All,
I just updated from Fedora 9 to Fedora 11, kernel version
2.6.29.5-191.fc11.i586. I'm running R 2.9.
I successfully installed package Rstem from source (it always ran fine
for me in F9). However:
> wordStem(c("This","is","a","test"))
Error in wordStem(c("This", "is", "a", "test")) :
VECTOR_ELT()
2008 Jul 28
1
RStem with portuguese language
Greetings,
I have R 2.7.1 in MacOs and I believe UTF encoding is already installed.
At least:
> Sys.getenv()
shows several variables, including:
LANG "pt_PT.UTF-8"
I installed the Rstem and tm packages and when I try the following code:
> wordStem(c("aberração","aberrações"), language="portuguese")
[1] "aberra?\xc3"
2011 Mar 24
2
Problem with Snowball & RWeka
Dear Forum,
when I try to use SnowballStemmer() I get the following error message:
"Could not initialize the GenericPropertiesCreator. This exception was
produced: java.lang.NullPointerException"
It seems to have something to do with either Snowball or RWeka; however, I
can't figure out what to do myself. If you could spend 5 minutes of your
valuable time to help me or give me a
2006 Mar 17
4
hidden fields
Hi,
I'm a Rails newbie trying to develop a blog application with Rails. I'm
having some trouble finding the best way to automatically set a field value on
update and creation of a blog item.
In fact, my problem is very simple. I have a blog table with two columns
named create_date and mod_date. And I'd like:
1 - that
2006 May 10
3
migrations :timestamp becomes :datetime in mySql
For some reason, whenever I try to create a timestamp column with
migrations and MySQL I get a datetime column instead. That's kind of
annoying because I want the column to update every time the row gets
changed. Is this a bug, or is there something I can do about it?
(Obviously I can manually change my mysql table, but that kind of
defeats the point of migrations!)
In my migration
2011 Apr 29
0
Trying to get RWeka/Snowball to work
Hi!
I was trying to install RWeka to be able to use SnowballStemmer in a Mac OS
X 10.6.7 environment, but couldn't do it. I get error messages after:
> library(RWeka);
> install(Snowball);
> ## Test the supplied vocabulary for the default stemmer ('porter'):
> source <- readLines(system.file("words", "porter","voc.txt",
+
2011 Jun 04
1
Problem with Snowball & RWeka
I too have this problem. Everything worked fine last year, but after
updating R and packages I can no longer do word stemming.
Unfortunately, I didn't save the old binaries, otherwise I would just
revert back.
Hoping someone finds a solution for R on Windows. Thanks!
There is a potential solution for R on Mac OS from Kurt Hornik copied
below, but I cannot get this to work on Windows.
2011 Jun 09
2
Coercing Output from mget() into Proper Data Frame
Hello R-philes:
I have the following function that gets the output of mget() and
converts it to a data frame to return. What I am finding is that the
dimensions are wrong. Basically, I get:
  bridesmaid wed  u see  m gt lt like love X.0 dress pagetrack one go X3 get
1         56  35 27  30 24 20 20   23   28  17    25        16  16 28 15  26
Instead, I want something like:
[1] bridesmaid
2009 Aug 10
1
Sorting text docs based on document meta values in tm()
Hi all,
I wonder if there's any way to reshuffle the text collection by the document
meta values. For instance, if I have 5 documents that correspond to the
following meta data:
MetaID Sex Age
0 M 38
0 M 46
0 F 24
0 F 49
0 F 33
Can I reorder the text documents based on the ascending order of age? Thank
you very much!!
2012 Aug 08
0
Testing for a second order factor using SEM package
Hi!
The following model specification works when testing for first order
factors, but when I attempt to test for a second order factor by adding the
last 4 lines in the model, I get the error message below:
model.cfa.ru <- specifyModel()
sRU1 <- sRU, NA, 1
sRU2 <- sRU, lam12
sRU3 <- sRU, lam13
sRU4 <- sRU, lam14
sRU5 <- sRU, lam15
sRU6 <- sRU, lam16
sRU <-> sRU, mak1
2012 Feb 26
2
tm_map help
Hi all,
I am trying to do some text mining with Twitter, and I get the following error
when I use tm_map:
Error in structure(names(sapply(possibleCompletions, "[", 1)), names = x) :
'names' attribute [1] must be the same length as the vector [0]
Has anyone had/seen this error before? The code I
have is shown below and this error only occurs with #qantas, hashtags
like #asx,