Hello,
I am trying to scrape this web page:
http://toast.gasunie.de/gud/search.aspx?soid=GUD&lang=de
The page submits via the "POST" method and contains various HTML form
fields, mostly lists and a couple of radio buttons. After submitting, I
should be forwarded to a new page. Which selections are made in the form
does not really matter. I get quite far; please see the code:
library(RCurl)
library(RHTMLForms)
library(XML)
pageForms = getHTMLFormDescription("http://toast.gasunie.de/gud/search.aspx?soid=GUD&lang=de")
fun = createFunction(pageForms[[1]])
retSubmit = fun('ctl00$MainContent$GasQuality' = "H",
'ctl00$MainContent$PointList' = "H071",
'ctl00$MainContent$PointType' = "EN",
'ctl00$MainContent$Publishers' = "HourValues",
'ctl00$MainContent$ListHourValues' = "-1",
'ctl00_MainContent_webDatePickerFrom_input' = "01.06.2012",
'ctl00_MainContent_webDatePickerTo_input' = "01.06.2012")
retPage = htmlTreeParse(retSubmit, asText = TRUE)
retPage
This is how far I get: all form fields are set correctly, with the
exception of 'ctl00$MainContent$ListHourValues'. My question is: why does
this function not select ListHourValues correctly? I also think that the
function is not actually submitting the form, because if one submitted the
form by hand without selecting ListHourValues, the page would return a
MessageBox containing an error.
So the function basically returns the original page with the form fields
set to the right values, but it does not take the next step and return the
final result. Is it possible that I have to wrap the call that produces
retSubmit in a postForm() call?
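
To make that last question concrete, here is a minimal sketch of what I
imagine a direct postForm() submission would look like. It rests on
assumptions I have not verified for the Gasunie page: that the page is an
ASP.NET WebForms page whose hidden fields (such as __VIEWSTATE and
__EVENTVALIDATION) have to be sent back together with the selections, and
that the submit button has to be included by name. The HTML snippet, the
hidden field values, the button name 'ctl00$MainContent$btnSearch' and the
helper submitForm() are made up for illustration.

```r
library(XML)

# Made-up stand-in for the downloaded page (not the real Gasunie markup).
html <- '<html><body><form method="post" action="search.aspx">
<input type="hidden" name="__VIEWSTATE" value="abc123" />
<input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
<select name="ctl00$MainContent$ListHourValues">
<option value="-1">Alle</option><option value="6">06:00</option>
</select>
<input type="submit" name="ctl00$MainContent$btnSearch" value="Suchen" />
</form></body></html>'

doc <- htmlParse(html, asText = TRUE)

# Collect the name/value pairs of all hidden inputs.
hiddenNodes <- getNodeSet(doc, "//input[@type='hidden']")
hidden <- sapply(hiddenNodes, xmlGetAttr, "value")
names(hidden) <- sapply(hiddenNodes, xmlGetAttr, "name")

# List the values the hour list accepts, to check that "-1" is one of them.
hourOptions <- xpathSApply(
  doc, "//select[@name='ctl00$MainContent$ListHourValues']/option",
  xmlGetAttr, "value")

# Hidden fields plus my own selections and the (hypothetical) submit button.
params <- c(hidden,
            "ctl00$MainContent$ListHourValues" = "-1",
            "ctl00$MainContent$btnSearch" = "Suchen")

# Hypothetical helper: send everything as a URL-encoded POST via RCurl.
submitForm <- function(url, params) {
  RCurl::postForm(url, .params = as.list(params), style = "POST")
}

# Not run here, since it needs network access and the real field names:
# result <- submitForm("http://toast.gasunie.de/gud/search.aspx?soid=GUD&lang=de", params)
```

If the real page works this way, the hidden values would have to be read
from a fresh download of the page before each submission instead of being
hard-coded as above.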
Best
Sven