similar to: Stemming functions only work on the last word of plain text documents

Displaying 20 results from an estimated 300 matches similar to: "Stemming functions only work on the last word of plain text documents"

2011 Nov 17
3
merging corpora and metadata
Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens: > meta(corpus.1) MetaID cid fid selfirst selend fname 1 0 1 11 2169 2518 WCPD-2001-01-29-Pg217.scrb 2 0 1 14 9189 9702 WCPD-2003-01-13-Pg39.scrb 3 0 1 14 2109 2577 WCPD-2003-01-13-Pg39.scrb .... .... 17 0
2010 Apr 23
2
Library (tm) Error: could not find function "TermDocMatrix".
Hi List, I have the following code and error. I have tried other code and I get the same problem. > reut21578 <- system.file("texts", "crude", package = "tm") > (r <- Corpus(DirSource(reut21578), readerControl = list(reader = readReut21578XMLasPlain))) A corpus with 20 text documents > (r <- Corpus(DirSource(reut21578), readerControl =
2009 Nov 01
4
convert list to Dataframe
Hi. I have a huge list called twitter: > dim(twitter) NULL > str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday..
2010 Jan 22
1
Invalid input error in tm package
Hello, I am working on "tm" package. I have 2 pdf files saved in the directory D:/Files I issued the following commands (marked in red bold) for which I got some errors and warnings (marked in bold) *surgj <- Corpus(DirSource("D:/Files"), readerControl = list(language = "ansi"))* *Warning messages: 1: In readLines(y, encoding = x$Encoding) : incomplete final
2010 Feb 18
2
Rearranging data
Hi! I have just started learning R and today only I have joined this group. This is my first mail and I wish to thank all of you for allowing me to be part of this group. I have following problem. I have an input.csv file such that corp_id      date    investment_id       rate corp1        17-Feb         1                 65 corp1        16-Feb         1                 70 corp1        15-Feb   
2010 Aug 09
1
TM Package - installation
Hi All, I have been trying to do some text analytics in R using the tm package. I have installed and loaded the package, along with its dependencies (slam, RWeka, rJava). When I try to run a tm_map command, it gives me the error "Error in .jnew(name) : java.lang.NoClassDefFoundError: weka/core/stemmers/SnowballStemmer". Can someone please shed some light on this? Am I missing something?
2012 Jan 13
4
Troubles with stemming (tm + Snowball packages) under MacOS
Dear all, I am having some trouble using the stemming algorithm provided by the tm (text mining) + Snowball packages. Here is my config: MacOS 10.5 R 2.12.0 / R 2.13.1 / R 2.14.1 (I have tried several versions) I have installed all the needed packages (tm, rJava, RWeka, Snowball) + dependencies. I have deactivated AWT (as written in
2012 Jan 08
2
cannot find package in Packages>>Install Packages
Hi. I am trying to install a package called DMwR http://cran.r-project.org/web/packages/DMwR/index.html located here: http://cran.r-project.org/bin/windows/contrib/r-release/DMwR_0.2.1.zip on windows 7. I am using R 2.10.1. I also tried typing something like this but it did not work well. install.packages(c(" http://cran.r-project.org/bin/windows/contrib/r-release/DMwR_0.2.1.zip
2020 May 19
5
FTS-lucene errors : language not available for stemming
I'm getting some log errors with clucene that I am having no luck tracking down on the interwebs. Errors: May 19 05:05:16 indexer-worker(gessel at blackrosetech.com)<62971><aPAEI3zLw17A/QAA0J78UA:EF25M3zLw1779QAA0J78UA>: Error: lucene index /mail/blackrosetech.com/gessel//lucene-indexes: IndexWriter::addDocument() failed (#4): language not available for stemming May 19 05:05:16
2009 Jul 07
1
wordStem problems in R 2.9, Fedora 11; Linux Kernel 2.6.29.5-191.fc11.i586
Dear All, I just updated from Fedora 9 to Fedora 11, kernel version 2.6.29.5-191.fc11.i586. I'm running R 2.9. I successfully installed package Rstem from source (it always ran fine for me in F9). However: > wordStem(c("This","is","a","test")) Error in wordStem(c("This", "is", "a", "test")) : VECTOR_ELT()
2008 Jul 28
1
RStem with portuguese language
Greetings, I have R 2.7.1 in MacOs and I believe UTF encoding is already installed. At least: > Sys.getenv() shows several variables, including: LANG "pt_PT.UTF-8" I installed the Rstem and tm packages and when I try the following code: > wordStem(c("aberra??o","aberra??es"), language="portuguese") [1] "aberra?\xc3"
2011 Mar 24
2
Problem with Snowball & RWeka
Dear Forum, when I try to use SnowballStemmer() I get the following error message: "Could not initialize the GenericPropertiesCreator. This exception was produced: java.lang.NullPointerException" It seems to have something to do with either Snowball or RWeka; however, I can't figure out what to do myself. If you could spend 5 minutes of your valuable time to help me or give me a
2006 Mar 17
4
hidden fields
Hi, I'm a Rails newbie trying to develop a blog application with Rails. I'm having some trouble finding the best way to automatically set a field value on update and creation of a blog item. In fact, my problem is very simple. I have a blog table with two columns named create_date and mod_date. And I'd like: 1 - that
2006 May 10
3
migrations :timestamp becomes :datetime in mySql
For some reason whenever I try and create a timestamp column with migrations and mysql I get a datetime column instead. That's kind of annoying because I want the column to update every time the row gets changed. Is this a bug, or is there something I can do about it? (Obviously I can manually change my mysql table, but that kind of defeats the point of migrations!) In my migration
2011 Apr 29
0
Trying to get RWeka/Snowball to work
Hi! I was trying to install RWeka to be able to use SnowballStemmer in a Mac OS X 10.6.7 environment... but couldn't do it... I get error messages after: > library(RWeka); > install(Snowball); > ## Test the supplied vocabulary for the default stemmer ('porter'): > source <- readLines(system.file("words", "porter","voc.txt", +
2011 Jun 04
1
Problem with Snowball & RWeka
I too have this problem. Everything worked fine last year, but after updating R and packages I can no longer do word stemming. Unfortunately, I didn't save the old binaries, otherwise I would just revert back. Hoping someone finds a solution for R on Windows. Thanks! There is a potential solution for R on Mac OS from Kurt Hornik copied below, but I cannot get this to work on Windows.
2011 Jun 09
2
Coercing Output from mget() into Proper Data Frame
Hello R-philes: I have the following function that gets the output of mget() and converts it to a data frame to return. What I am finding is that the dimensions are wrong. Basically, I get: bridesmaid wed u see m gt lt like love X.0 dress pagetrack one go X3 get 1 56 35 27 30 24 20 20 23 28 17 25 16 16 28 15 26 Instead, I want something like: [1] bridesmaid
2009 Aug 10
1
Sorting text docs based on document meta values in tm()
Hi all, I wonder if there's any way to reshuffle the text collection by the document meta values. For instance, if I have 5 documents that correspond to the following meta data: MetaID Sex Age 0 M 38 0 M 46 0 F 24 0 F 49 0 F 33 Can I reorder the text documents based on the ascending order of age? Thank you very much!! -- View
2012 Aug 08
0
Testing for a second order factor using SEM package
Hi! The following model specification works when testing for first order factors, but when I attempt to test for a second order factor by adding the last 4 lines in the model, I get the error message below: model.cfa.ru <- specifyModel() sRU1 <- sRU, NA, 1 sRU2 <- sRU, lam12 sRU3 <- sRU, lam13 sRU4 <- sRU, lam14 sRU5 <- sRU, lam15 sRU6 <- sRU, lam16 sRU <-> sRU, mak1
2012 Feb 26
2
tm_map help
Hi all, I am trying to do some text mining with Twitter and I am getting the following error when I use tm_map: Error in structure(names(sapply(possibleCompletions, "[", 1)), names = x) : 'names' attribute [1] must be the same length as the vector [0] Has anyone had/seen this error before? The code I have is shown below and this error only occurs with #qantas, hashtags like #asx,