2011 Mar 29
Scrap JavaScript and styles from an HTML document
Hi,
I am developing a web crawler in R and need some help removing
JavaScript and style sheets from the HTML document of a web page.
I tried using the XML package, specifically the function xpathApply:
library(XML)
txt <- xpathApply(html,
                  "//body//text()[not(ancestor::script)][not(ancestor::style)]",
                  xmlValue)
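For anyone who wants to try the call above, here is a minimal self-contained sketch. The sample page string and the variable name `page` are made up for illustration, and it assumes `html` was produced by `htmlParse()` from the same XML package, which the post does not show. It also uses `xpathSApply` instead of `xpathApply` only to get a character vector back rather than a list.

```r
library(XML)

# Hypothetical sample page; in the crawler this would be the fetched page source.
page <- paste0(
  "<html><head><style>p { color: red; }</style></head>",
  "<body><p>Hello</p><script>var x = 1;</script>",
  "<div>World<style>.a { margin: 0; }</style></div></body></html>"
)

# Parse the string as HTML (asText = TRUE: the input is content, not a file name or URL).
html <- htmlParse(page, asText = TRUE)

# Select text nodes under <body> that have no <script> or <style> ancestor.
txt <- xpathSApply(
  html,
  "//body//text()[not(ancestor::script)][not(ancestor::style)]",
  xmlValue
)

# Trim each piece and drop the ones that are empty after trimming.
txt <- trimws(txt)
txt <- txt[nzchar(txt)]
print(txt)
```

The two `not(ancestor::...)` predicates are applied one after the other, so a text node is kept only if it sits inside neither a script nor a style element.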
The output comes out as text lines, without any html