I'm hoping to use the tm package with some HTML documents. The
documentation and the tutorial material say that there is a
readHTML routine that can be used to read HTML documents into a corpus.
However, when I try to use that routine I get an error, and when I run
getReaders() (below), readHTML isn't listed.
> getReaders()
[1] "readDOC" "readGmane"
[3] "readPDF" "readReut21578XML"
[5] "readReut21578XMLasPlain" "readPlain"
[7] "readRCV1" "readTabular"
Am I missing something? Is there an extra install I need, or has the
routine been removed or replaced?
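
In case it helps anyone reproduce this, here is a minimal sketch of how one might check whether an installed package actually provides a given function, alongside the installed version. Note that has_function is a hypothetical helper written for this post, not part of tm; it only uses base R calls (requireNamespace, asNamespace, exists).

```r
# Hypothetical helper (not part of tm): TRUE if package `pkg` is installed
# and defines an object named `fn` in its namespace, exported or not.
has_function <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) &&
    exists(fn, envir = asNamespace(pkg), inherits = FALSE)
}

# Report the installed tm version and whether readHTML is defined in it.
# Guarded so the script still runs if tm is not installed.
if (requireNamespace("tm", quietly = TRUE)) {
  print(packageVersion("tm"))
  print(has_function("tm", "readHTML"))
}
```

If the second value printed is FALSE, the routine is simply not in the installed version, which would point to it having been removed or renamed rather than to a missing extra install.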
Thanks, Peter
Oh, yes: I'm running the latest R release on Mac OS X 10.6.2.
--
View this message in context:
http://n4.nabble.com/readHTML-within-tm-package-tp960778p960778.html
Sent from the R help mailing list archive at Nabble.com.