similar to: readHTML within tm package

2010 Feb 04
1
How to read HTML or TEXT file with tm package
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100204/a3069c99/attachment.pl>
2010 Apr 23
2
Library (tm) Error: could not find function "TermDocMatrix".
Hi List, I have the following code and error. I have tried other code and get the same problem. > reut21578 <- system.file("texts", "crude", package = "tm") > (r <- Corpus(DirSource(reut21578), readerControl = list(reader = readReut21578XMLasPlain))) A corpus with 20 text documents > (r <- Corpus(DirSource(reut21578), readerControl =
2012 May 29
1
package tm: reading XML files
Dear fellow R users, I'm using the package tm for text mining, and have a problem with reading in a corpus from XML files. When I copy the example from "Introduction to the tm package" of the small reuters subset "crude", everything goes well, and I get a corpus with the required meta data. When I read in the entire reuters21578 corpus in XML format however (or a
2010 Feb 16
0
tm package
Hi, I'm using version 0.5.1 of the tm package with R 2.10.1. It looks to me as if after the following: reuters21578 <- Corpus(DirSource(corpusDir), readerControl = list(reader = readReut21578XMLasPlain)) reuters21578 <- tm_map(reuters21578, stripWhitespace) reuters21578 <- tm_map(reuters21578, tolower) reuters21578 <- tm_map(reuters21578, removePunctuation)
2012 Nov 18
4
panic fts_solr for bad attachment
Hi! I use dovecot 2.1.7 on Ubuntu 12.10 with fts_solr and decode2text.sh for indexing attachments. This works great in general. Just for one user there is a problem with an unknown bad attachment. I run "doveadm index -A '*'". After a while I receive: doveadm(xyz): Error: fts_solr: Invalid XML input at line 1: mismatched tag doveadm(xyz): Panic: file solr-connection.c: line
2012 Dec 31
5
2.1.12: Panic: file solr-connection.c: line 547 (solr_connection_post_more)
Hi all, I am having a problem indexing one of my mailboxes using the solr fts backend in dovecot 2.1.12. For many mailboxes it works just fine, but on one mailbox I currently always get a panic. Solr setup: Java: icedtea 6.1.11.5; Solr: 3.6.2 running in tomcat 7.0.32. Command to reproduce error: doveadm index -u my at user badmailbox. I already noticed that there have been some solr backend fixes
2013 Apr 28
3
Dovecot Solr Panic
Hello Everyone, I have a small base of users (30), but a lot of emails. I am again getting an error when indexing a virtual folder with a large number of folders. I appreciate this is a special case, but I am using dovecot and solr because, according to the documentation, it is the favourite way. One user has a large number of archive subfolders, organized by year, month and further subfolders, going back 5 years.
2007 May 19
1
php+mssql support HowTO
Hi, I took the following doc: http://karlkatzke.com/centos-44-ms-sql-freetds-and-php/ (which now appears offline, but was written by Karl Katzke) and ported it to CentOS 5.x (php-5.1.6) by modifying the php.spec that he supplied. I'd like to import this into the wiki under the HowTO section. My username is KyleODonnell. Let me know what you think. Regards, Kyle O'Donnell