search for: antujsrv

Displaying 2 results from an estimated 2 matches for "antujsrv".

2011 Mar 03
6
Developing a web crawler
Hi, I wish to develop a web crawler in R. I have been using the functionalities available under the RCurl package. I am able to extract the html content of the site but i don't know how to go about analyzing the html formatted document. I wish to know the frequency of a word in the document. I am only acquainted with analyzing data sets. So how should i go about analyzing data that is not
2011 Mar 29
2
Scrap java scripts and styles from an html document
Hi, I am working on developing a web crawler in R and I needed some help with regard to removal of javascripts and style sheets from the html document of a web page. i tried using the xml package, hence the function xpathApply library(XML) txt = xpathApply(html,"//body//text()[not(ancestor::script)][not(ancestor::style)]", xmlValue) The output comes out as text lines, without any html