Marshall Feldman
2010-Mar-18 01:52 UTC
[R] Do colClasses in readHTMLTable (XML Package) work?
Hi,

I can't get the colClasses option to work in the readHTMLTable function of the XML package. Here's a code fragment:

    require("XML")
    doc <- "http://www.nber.org/cycles/cyclesmain.html"
    # The main table is the second one because it's embedded in the page table.
    table <- getNodeSet(htmlParse(doc), "//table")[[2]]
    xt <- readHTMLTable(
        table,
        header = c("peak", "trough", "contraction", "expansion", "trough2trough", "peak2peak"),
        colClasses = c("character", "character", "character", "character", "character", "character"),
        trim = TRUE
    )

Does anyone know what's wrong?

Marsh Feldman
Duncan Temple Lang
2010-Mar-20 13:04 UTC
[R] Do colClasses in readHTMLTable (XML Package) work?
On 3/17/10 6:52 PM, Marshall Feldman wrote:
> Hi,
>
> I can't get the colClasses option to work in the readHTMLTable function
> of the XML package. Here's a code fragment:
>
>     require("XML")
>     doc <- "http://www.nber.org/cycles/cyclesmain.html"
>     # The main table is the second one because it's embedded in the page table.
>     table <- getNodeSet(htmlParse(doc), "//table")[[2]]
>     xt <- readHTMLTable(
>         table,
>         header = c("peak", "trough", "contraction", "expansion", "trough2trough", "peak2peak"),
>         colClasses = c("character", "character", "character", "character", "character", "character"),
>         trim = TRUE
>     )
>
> Does anyone know what's wrong?

The coercion of the table columns is done before the call to as.data.frame. You can add stringsAsFactors = FALSE in the call to readHTMLTable() and you'll get what you expect, I believe.

D.
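
For anyone finding this thread later, here is a minimal sketch of the suggestion above. It assumes the XML package is installed, and it uses a small made-up inline table (placeholder values, not the NBER data) so that it runs without network access. The `html` string and the object names are illustrative only. The point it is meant to show is that `stringsAsFactors = FALSE` is passed on to as.data.frame, so columns coerced to character by colClasses should stay character rather than being turned into factors.

```r
library(XML)

# Hypothetical stand-in for the page in the original post (placeholder values).
html <- "<html><body><table>
<tr><td>a1</td><td>b1</td></tr>
<tr><td>a2</td><td>b2</td></tr>
</table></body></html>"

# Parse the string as HTML and pick out the table node.
doc <- htmlParse(html, asText = TRUE)
tbl <- getNodeSet(doc, "//table")[[1]]

xt <- readHTMLTable(
    tbl,
    header = c("peak", "trough"),
    colClasses = c("character", "character"),
    trim = TRUE,
    stringsAsFactors = FALSE  # forwarded to as.data.frame()
)

# Inspect the column classes of the result.
str(xt)
```

Note that since R 4.0.0 the default for stringsAsFactors in data.frame() and as.data.frame() is FALSE, so on a recent R the explicit argument may no longer be needed; it was needed with the R versions current when this thread was written.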