Marshall Feldman
2010-Mar-18 01:52 UTC
[R] Do colClasses in readHTMLTable (XML Package) work?
Hi,
I can't get the colClasses option to work in the readHTMLTable function
of the XML package. Here's a code fragment:
require("XML")
doc <- "http://www.nber.org/cycles/cyclesmain.html"
# The main table is the second one because it's embedded in the page table.
table <- getNodeSet(htmlParse(doc), "//table")[[2]]
xt <- readHTMLTable(
    table,
    header = c("peak","trough","contraction","expansion","trough2trough","peak2peak"),
    colClasses = c("character","character","character","character","character","character"),
    trim = TRUE
)
Does anyone know what's wrong?
Marsh Feldman
Duncan Temple Lang
2010-Mar-20 13:04 UTC
[R] Do colClasses in readHTMLTable (XML Package) work?
On 3/17/10 6:52 PM, Marshall Feldman wrote:
> Hi,
>
> I can't get the colClasses option to work in the readHTMLTable function
> of the XML package. Here's a code fragment:
>
> require("XML")
> doc <- "http://www.nber.org/cycles/cyclesmain.html"
> # The main table is the second one because it's embedded in the page table.
> table <- getNodeSet(htmlParse(doc), "//table")[[2]]
> xt <- readHTMLTable(
>     table,
>     header = c("peak","trough","contraction","expansion","trough2trough","peak2peak"),
>     colClasses = c("character","character","character","character","character","character"),
>     trim = TRUE
> )
>
> Does anyone know what's wrong?

The coercion of the table columns is done before the call to
as.data.frame. You can add stringsAsFactors = FALSE in the call to
readHTMLTable() and you'll get what you expect, I believe.

D.
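
For illustration, here is a minimal sketch of the suggestion above. It is not the original poster's exact call: to run without network access it parses a small inline table (hypothetical, with placeholder values) instead of the NBER page, and it lets readHTMLTable() take the column names from the table's own header row rather than passing header explicitly. It assumes extra arguments such as stringsAsFactors are forwarded to as.data.frame(). In R versions before 4.0.0 the default for stringsAsFactors was TRUE, which is why character columns could come back as factors even when colClasses asked for "character".

```r
library(XML)

# Hypothetical stand-in for the page's table; the cell values are placeholders.
html <- "<table>
<tr><th>peak</th><th>trough</th></tr>
<tr><td>alpha</td><td>beta</td></tr>
<tr><td>gamma</td><td>delta</td></tr>
</table>"

# Parse the HTML from a string rather than from a URL.
doc <- htmlParse(html, asText = TRUE)
tbl <- getNodeSet(doc, "//table")[[1]]

xt <- readHTMLTable(
    tbl,
    colClasses = c("character", "character"),
    trim = TRUE,
    stringsAsFactors = FALSE  # keep character columns from becoming factors
)

# Inspect the resulting column classes.
print(sapply(xt, class))
```

With the real page, the same extra argument would be added to the original call, keeping header and colClasses as they were.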