Displaying 3 results from an estimated 3 matches for "readreut21578xml".

2012 May 29
1
package tm: reading XML files
...he meta data is lost, and the files are interpreted as plain text. I use the following command, where the indicated directory contains all Reuters-21578 documents as separate XML files: > reuters21578 <- Corpus(DirSource("C:/Data/Reuters/preprocessed"), readerContol=list(reader=readReut21578XML)) I'm running R 2.15.0 under Windows XP. Has anybody else encountered this problem and found a cause/solution? Best regards, -Ad Feelders
2009 Dec 11
0
readHTML within tm package
...be used to read HTML documents into a corpus. However, when I try to use that routine I get an error. When I run getReaders (below), readHTML isn't listed. > getReaders() [1] "readDOC" "readGmane" [3] "readPDF" "readReut21578XML" [5] "readReut21578XMLasPlain" "readPlain" [7] "readRCV1" "readTabular" Am I missing something? Is there an extra install I'm missing, or has the routine been removed or replaced? Thanks, Peter Oh, yes, ru...
2010 Feb 04
1
How to read HTML or TEXT file with tm package
... URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20100204/a3069c99/attachment.pl>