Displaying 2 results from an estimated 2 matches for "docc".
2008 Dec 31
1
Chinese characters encoding problem with XML
XML is a good tool for reading data from the web within R, but I wonder how I can get the encoding right.
library(XML)
url <- 'http://www.szitic.com/docc/jz-lmzq.html'
xml <- htmlTreeParse(url, useInternalNodes = TRUE)
q <- "//tbody/tr/td"
dat <- unlist(xpathApply(xml, q, xmlValue))
df <- as.data.frame(t(matrix(dat, 4)))
dt <- as.character(df[15, 1])
The first column of df contains dates in Chinese; dt is one of those dates.
Wh...
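A minimal sketch of one thing to check, assuming the page is served in GB2312 (the snippet above does not say which encoding the site uses, so this is a guess): text read under the wrong encoding can be re-encoded to UTF-8 with base R's iconv(), and the same assumption can be passed to the parser through its encoding argument. The byte values below are an illustrative two-character string, not data from the page.

```r
# Bytes of a two-character Chinese string as a GB2312 page would send them
# (illustrative values, not taken from the page in the question).
gb_bytes <- as.raw(c(0xD6, 0xD0, 0xCE, 0xC4))
gb_text <- rawToChar(gb_bytes)

# Re-encode to UTF-8 so R can display and compare the text correctly.
utf8_text <- iconv(gb_text, from = "GB2312", to = "UTF-8")

# With the XML package, the same assumed encoding can be given to the parser:
# xml <- htmlTreeParse(url, useInternalNodes = TRUE, encoding = "GB2312")
```

If the page actually uses a different encoding (GBK or UTF-8, for example), the from and encoding values would need to change accordingly.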
2012 Sep 14
0
htmlParse pop ups over web pages
...the routine bombs and I get an error message that the url doesn't exist.
Does the XML package (or perhaps another package) provide a way to deal with this issue?
I've tried:
for (i in seq_len(len_links)) {
. . .
# Parse once and skip this link if the parse failed
docc <- try(htmlParse(xurl), silent = TRUE)
if (inherits(docc, "try-error")) next
. . .
}
That doesn't work. Any ideas or help would be appreciated.
Thanks,
Winthrop
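One possible way to structure the error handling, sketched with tryCatch() so that each URL is parsed only once. safe_parse() is a hypothetical helper name, not a function from the XML package, and whether this addresses the pop-up pages described above is not known from the post.

```r
# safe_parse() is a hypothetical helper, not part of the XML package.
# It calls the parser once and returns NULL if the parser signals an error.
# The default parser is only looked up when no parser argument is supplied.
safe_parse <- function(xurl, parser = XML::htmlParse) {
  tryCatch(parser(xurl), error = function(e) NULL)
}

# Hypothetical use inside the loop from the question:
# for (i in seq_len(len_links)) {
#   docc <- safe_parse(xurl)
#   if (is.null(docc)) next
# }
```

Returning NULL keeps the loop body simple: a single is.null() check replaces the separate try() call and second parse.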