similar to: Sorting text docs based on document meta values in tm()

Displaying 14 results from an estimated 14 matches similar to: "Sorting text docs based on document meta values in tm()"

2013 Sep 26
0
R hangs at NGramTokenizer
Hi: I try to construct a Document-Term Matrix from a corpus. The commands I used are: > library(parallel) > library(tm) > library(RWeka) > library(topicmodels) > library(RTextTools) > cl=makeCluster(detectCores()) > invisible(clusterEvalQ(cl, library(tm))) > invisible(clusterEvalQ(cl, library(RWeka))) > invisible(clusterEvalQ(cl, library(topicmodels))) >
2012 Feb 26
2
tm_map help
Hi all, I am trying to do some text mining with twitter and I am getting the error: Error in structure(names(sapply(possibleCompletions, "[", 1)), names = x) : 'names' attribute [1] must be the same length as the vector [0] When I use tm_map. Has anyone had/seen this error before? The code I have is shown below and this error only occurs with #qantas, hashtags like #asx,
2011 Nov 17
3
merging corpora and metadata
Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens: > meta(corpus.1) MetaID cid fid selfirst selend fname 1 0 1 11 2169 2518 WCPD-2001-01-29-Pg217.scrb 2 0 1 14 9189 9702 WCPD-2003-01-13-Pg39.scrb 3 0 1 14 2109 2577 WCPD-2003-01-13-Pg39.scrb .... .... 17 0
2009 Nov 01
4
convert list to Dataframe
Hi. I have a huge list called twitter: > dim(twitter) NULL > str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday..
2010 Apr 23
2
Library (tm) Error: could not find function "TermDocMatrix".
Hi List, I have the following code and error. I have tried other code and I get the same problem. > reut21578 <- system.file("texts", "crude", package = "tm") > (r <- Corpus(DirSource(reut21578), readerControl = list(reader = > readReut21578XMLasPlain))) A corpus with 20 text documents > (r <- Corpus(DirSource(reut21578), readerControl =
2011 Sep 05
0
Stemming functions only work on the last word of plain text documents
Hello, I want to use the SnowballStemmer on a collection of plain text documents. However, when I apply it to my corpus using the tm_map function it only stems the last word of each document (The problem is the for wordStem and stemDocument does not work at all). An example: > path <- c("c:\path\to\directory") # collection of plain text documents > corp <-
2009 Jul 20
9
rake error
When I run rake test:units I get this error: 292 tests, 350 assertions, 2 failures, 13 errors rake aborted! Command failed with status (1): [/usr/local/bin/ruby -I"lib:test" "/ usr/loc...] This error just showed up yesterday --- I have no idea how I caused it. Here is my gem list in case that helps: actionmailer (2.3.2, 2.2.2) actionpack (2.3.2, 2.2.2) activerecord (2.3.2, 2.2.2)
2010 Jan 22
1
Invalid input error in tm package
Hello, I am working on "tm" package. I have 2 pdf files saved in the directory D:/Files I issued the following commands (marked in red bold) for which I got some errors and warnings (marked in bold) *surgj <- Corpus(DirSource("D:/Files"), readerControl = list(language = "ansi"))* *Warning messages: 1: In readLines(y, encoding = x$Encoding) : incomplete final
2012 Jan 08
2
cannot find package in Packages>>Install Packages
Hi. I am trying to install a package called DMwR http://cran.r-project.org/web/packages/DMwR/index.html located here: http://cran.r-project.org/bin/windows/contrib/r-release/DMwR_0.2.1.zip on windows 7. I am using R 2.10.1. I also tried typing something like this but it did not work well. install.packages(c(" http://cran.r-project.org/bin/windows/contrib/r-release/DMwR_0.2.1.zip
2009 Aug 13
0
Efficiently Extracting Meta Data from TM Corpora
I'm using text miner (the "tm" package) to process large numbers of blog and message board postings (about 245,000). Does anyone have any advice for how to efficiently extract the meta data from a corpus of this size? TM does a great job of using MPI for many functions (e.g. tmMap) which greatly speed up the processing. However, the "meta" function that I need does not
2006 Feb 21
6
+ camping/session
Camping now comes with a sessioning class, checked in tonight. To get sessions working for your application: 1. require 'camping/session' 2. include Camping::Session in your application's toplevel module. 3. In your application's create method, add a call to Camping::Models::Schema.create_schema 4. Throughout your application, use the @state
2007 Sep 25
16
putting away HashWithIndifferentAccess
Hey, campineros. And many good handshakes to zimbatm for getting some patches applied. So, yeah, I'd really like to get rid of any serious dependencies with this 1.6 release. Anything that's not in stdlib has to go. Of course, camping-omnibus will still assume the whole ActiveRecord, Markaby, Mongrel setup that's in the history books. Metaid can be removed and
2006 Dec 01
1
Packages build for Solaris ? As CSW packages ?
Well imitation is the highest form of flattery they say. So I'm surprised to see these packages neatly built to install into /opt/csw correctly and yet they exist somewhere else and have nothing to do with us here at Blastwave. fascinating. I guess we can always send an email to the person doing this and just ask if they want those packages in testing and then into the catalog for
2011 Jul 06
7
Issue with puppet file serving api not parsing yaml content correctly
I am working on building a facter tag based node classifier similar to https://github.com/jordansissel/puppet-examples/tree/master/nodeless-puppet/. However, I have run into an issue where I cannot use puppet's require file ability to push the yaml file containing the facts file to the client because it would require two runs of puppet to pick up changes. Consequently, I have written into