Kabel
2008-Sep-03 09:14 UTC
[Mechanize-users] JS, !(more Connections), is it possible with Mechanize?
Hello out there,

I'm using Ruby with Mechanize, and I really love it. I have managed to automate some useful things, but unfortunately I'm stuck on a problem at the moment. I'm not sure this is doable with Mechanize, but I hope so.

Here is what I want to do. I fill out some forms to get an e-mail address. Everything goes well until I reach the sixth page, where I have to accept the Terms of Use. To do that, I must click a button that executes a JavaScript function, which pops up a new window where I have to click "accept". Once this is done, a value is set in the first form. Without this value I can't proceed.

I have tried a lot of things, for example putting the value I get when I browse the site with my browser into my code (hardcoded, bad, I know). But this does not work. I didn't find the value in a cookie, so I think it is computed with an algorithm I don't know (so I can't calculate it on my side).

Now I have no idea why this isn't working, and I do not know how to continue. I think the problem is that I open a new connection when I fetch the page that the JavaScript calls.

Is it possible to open a page over the same connection? Or can I open a page in such a way that the site thinks the request comes from the same browser? I don't think so; why would session IDs exist if this worked...

As you can see, I'm stuck, and I would be very happy if someone could help me. Here is a little piece of my code, the part where I don't know how to proceed.

    def fiveForm
      page = agent.get('http://url/w_terms.cfm?accept=') # HERE I call the URL that is called by the JavaScript
      # ...
      agent = WWW::Mechanize.new
      # ...
      page = @@page # @@page contains the page with the form
      agent = WWW::Mechanize.new
      five_form = page.form('registration')
      five_form.checkboxes.name('q1')[2].check

      five_form.checkboxes.name('q7')[0].check
      five_form.qAuthentication = "24A88F71-F1F6-B693-0A571B7B45FB3CA0" # this is the hardcoded part

I have to extract it in some way. I need help, please.

Greetings,
Kabel
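On the connection question: a web session is normally tied to cookies that travel with each request, not to the network connection itself, so whether the server sees two requests as coming from the same browser usually depends on whether the same cookie store is reused. The following is a minimal sketch of that idea, assuming the site uses cookie-based sessions (not confirmed here). `TinyCookieJar` and the `SESSIONID` cookie are hypothetical names for illustration and are not part of Mechanize; a Mechanize agent keeps its own cookie jar, which is why creating a fresh agent starts without the earlier cookies.

```ruby
# Hypothetical stand-in for the cookie handling an HTTP agent does internally.
class TinyCookieJar
  def initialize
    @cookies = {}
  end

  # Record the name=value pair from each Set-Cookie header value,
  # ignoring attributes such as Path or HttpOnly.
  def store(set_cookie_headers)
    Array(set_cookie_headers).each do |header|
      name, value = header.split(';', 2).first.split('=', 2)
      @cookies[name.strip] = value.to_s.strip
    end
    self
  end

  # Build the Cookie request header, or nil when nothing is stored.
  def header
    return nil if @cookies.empty?

    @cookies.map { |name, value| "#{name}=#{value}" }.join('; ')
  end
end

# One jar reused across requests carries the session along.
shared_jar = TinyCookieJar.new
shared_jar.store('SESSIONID=abc123; Path=/; HttpOnly') # made-up cookie
p shared_jar.header

# A brand-new jar (like a brand-new agent) has nothing to send.
p TinyCookieJar.new.header
```

The practical consequence is to create the agent once and use that same object for every request, including the request for the pop-up URL.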
Mat Schaffer
2008-Sep-08 02:48 UTC
[Mechanize-users] JS, !(more Connections), is it possible with Mechanize?
On Sep 3, 2008, at 5:14 AM, Kabel wrote:
> Here is what I want to do. I fill out some forms to get an e-mail
> address. Everything goes well until I reach the sixth page, where I
> have to accept the Terms of Use. To do that, I must click a button
> that executes a JavaScript function, which pops up a new window where
> I have to click "accept". Once this is done, a value is set in the
> first form. Without this value I can't proceed.
>
> I have tried a lot of things, for example putting the value I get
> when I browse the site with my browser into my code (hardcoded, bad,
> I know). But this does not work. I didn't find the value in a cookie,
> so I think it is computed with an algorithm I don't know (so I can't
> calculate it on my side).
>
> Now I have no idea why this isn't working, and I do not know how to
> continue. I think the problem is that I open a new connection when I
> fetch the page that the JavaScript calls.
>
> Is it possible to open a page over the same connection? Or can I open
> a page in such a way that the site thinks the request comes from the
> same browser? I don't think so; why would session IDs exist if this
> worked...
>
> As you can see, I'm stuck, and I would be very happy if someone could
> help me.

Unless I missed something, Mechanize doesn't have the ability to evaluate JavaScript. So if the particular field has a value that's being inserted by JavaScript, Mechanize might not be the answer. Does the response have an <input> tag for the field in question with a value attribute you could grab? If not, and JavaScript is doing it, you'll probably have to either (a) reverse engineer the JavaScript into Ruby so you can calculate it yourself (Firebug is really handy for this), or (b) switch to something like Watir (or potentially Selenium) that drives an actual browser.
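To check whether the value is already present in the markup, the raw HTML of the response can be searched for the field. Below is a minimal sketch that uses only the Ruby standard library. The sample HTML, the token value, and the helper name `input_value` are made up for illustration, and the regex-based matching is only meant for quick inspection, not as a general HTML parser. With Mechanize itself, reading the field from the parsed form object would be the more natural route.

```ruby
# Return the value attribute of the first <input> tag whose name attribute
# equals field_name, or nil when no such tag exists.
def input_value(html, field_name)
  html.scan(/<input\b[^>]*>/i).each do |tag|
    # Collect quoted attributes of this tag into a hash, e.g. {"name" => "q1"}.
    attrs = tag.scan(/([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')/)
               .map { |key, dq, sq| [key.downcase, dq || sq] }
               .to_h
    return attrs['value'] if attrs['name'] == field_name
  end
  nil
end

# Made-up response body standing in for the registration page.
sample = '<form name="registration">' \
         '<input type="hidden" name="qAuthentication" value="EXAMPLE-TOKEN">' \
         '<input type="hidden" name="other" value="">' \
         '</form>'

p input_value(sample, 'qAuthentication')
p input_value(sample, 'other')
p input_value(sample, 'missing')
```

If the field comes back as an empty string in the raw response but has a value in the browser, that would suggest a script fills it in after the page loads.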
I also recommend getting a copy of Charles for this sort of work (charlesproxy.com, free for 30 minutes at a time, $50 for a license). It's an HTTP debugging proxy that you can point both your browser and Mechanize through. That will let you see exactly what's getting transferred over the wire and what might be different between your script and a real browser.

Hope that helps,
Mat
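Pointing a script through a debugging proxy mostly comes down to giving the HTTP client the proxy's address and port. Here is a minimal sketch with Ruby's standard `Net::HTTP`, assuming the proxy listens on the local machine at port 8888; both values, the constant names, and the helper `proxied_http` are assumptions for illustration, so check the proxy's own settings. Mechanize agents offer `set_proxy` for the same purpose.

```ruby
require 'net/http'

PROXY_HOST = '127.0.0.1' # assumption: the proxy runs on this machine
PROXY_PORT = 8888        # assumption: the port the proxy listens on

# Build (but do not open) a connection object whose requests would be
# routed through the debugging proxy.
def proxied_http(host, port = 80)
  Net::HTTP.new(host, port, PROXY_HOST, PROXY_PORT)
end

http = proxied_http('example.com')
puts http.proxy?
puts http.proxy_address
puts http.proxy_port
```

No network traffic happens until a request is actually made on the object, so the configuration can be inspected safely before running the real script.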
Chris Riddoch
2008-Sep-08 16:42 UTC
[Mechanize-users] JS, !(more Connections), is it possible with Mechanize?
On Sun, Sep 7, 2008 at 8:48 PM, Mat Schaffer <mat.schaffer at gmail.com> wrote:
> Unless I missed something, Mechanize doesn't have the ability to evaluate
> JavaScript. So if the particular field has a value that's being inserted by
> JavaScript, Mechanize might not be the answer. Does the response have an
> <input> tag for the field in question with a value attribute you could grab?
> If not, and JavaScript is doing it, you'll probably have to either (a)
> reverse engineer the JavaScript into Ruby so you can calculate it yourself
> (Firebug is really handy for this), or (b) switch to something like Watir (or
> potentially Selenium) that drives an actual browser.

The need for a JavaScript interpreter to really browse the web these days throws a wrench at Mechanize and tools like it. I've been thinking it's time someone made a standalone JavaScript interpreter with lots and lots of callbacks for host code such as Mechanize. It's not a simple problem - once a basic JavaScript interpreter is implemented, we'd need to hook it to a DOM, and deal with the compatibility issues of JScript vs JavaScript, etc. etc... I've been looking into actually doing this, but I don't have the time, energy, or suitable motivation to bother.

> I also recommend getting a copy of Charles for this sort of work
> (charlesproxy.com, free for 30 minutes at a time, $50 for a license). It's
> an HTTP debugging proxy that you can point both your browser and Mechanize
> through. That will let you see exactly what's getting transferred over the
> wire and what might be different between your script and a real browser.

For a free option, tcpdump is quite nice. For a graphical free option, try Wireshark. Unless, of course, you're working with SSL, in which case there are other complications. I'm sure there are free SSL proxy tools of this sort, but I haven't researched it myself.

--
epistemological humility
Chris Riddoch
Chris McMahon
2008-Sep-08 16:50 UTC
[Mechanize-users] JS, !(more Connections), is it possible with Mechanize?
> It's not a simple problem - once a basic JavaScript
> interpreter is implemented, we'd need to hook it to a DOM, and deal
> with the compatibility issues of JScript vs JavaScript, etc. etc...
> I've been looking into actually doing this, but I don't have the time,
> energy, or suitable motivation to bother.

Not to mention that Selenium and Watir/FireWatir already do a pretty good job of this.

> For a free option, tcpdump is quite nice. For a graphical free
> option, try Wireshark. Unless, of course, you're working with SSL, in
> which case there are other complications. I'm sure there are free SSL
> proxy tools of this sort, but I haven't researched it myself.

Actually no. If you could decode SSL on the fly in a proxy, SSL would be pretty useless. If you need to watch traffic, the traffic has to be over http, not https.

-Chris
Chris Riddoch
2008-Sep-08 17:21 UTC
[Mechanize-users] JS, !(more Connections), is it possible with Mechanize?
On Mon, Sep 8, 2008 at 10:50 AM, Chris McMahon <christopher.mcmahon at gmail.com> wrote:
> Actually no. If you could decode SSL on the fly in a proxy, SSL would
> be pretty useless. If you need to watch traffic, the traffic has to
> be over http, not https.

Er, perhaps I didn't phrase what I meant well.

    Browser <--SSL--> Proxy <--SSL--> web server

A proxy app could talk to the web server via SSL, and the web server would authenticate the proxy as though it were the client itself. From the web server's point of view, the proxy *is* the browser. The browser could request a page from the proxy, not the web server directly. The proxy, as an endpoint, can then legitimately decrypt both the SSL session with the browser and the separate SSL session with the server, and log everything.

The browser can verify that it's talking to the proxy, but not to the web server. The proxy can, in turn, verify the web server's certificates, but this has the flaw that the browser *has* to trust the proxy's word for it that the web server's certificate is valid. So I'd only trust such a proxy if it were running locally. If you have to watch the traffic, this is the way I'd go about doing it.

You're quite right that if you could just drop in a proxy server and have it silently decrypt passing SSL traffic, there wouldn't be much point to SSL.

Oh, and FireWatir looks really cool. I look forward to using it without needing to build my own Firefox, which it seems the jssh dependency requires?

--
epistemological humility
Chris Riddoch
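The trust step described above can be made concrete with Ruby's standard OpenSSL bindings. This is a minimal sketch, assuming the intercepting proxy presents a self-signed certificate; the common name `local-debug-proxy` and the helper `self_signed_cert` are hypothetical, and no traffic is intercepted here. It only shows the verification decision a client makes: the proxy's certificate is rejected until the client has been told explicitly to trust it.

```ruby
require 'openssl'

# Create a throwaway self-signed certificate, standing in for the one a
# local debugging proxy would present to the browser or script.
def self_signed_cert(common_name)
  key = OpenSSL::PKey::RSA.new(2048)
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2 # X.509 v3
  cert.serial = 1
  cert.subject = OpenSSL::X509::Name.parse("/CN=#{common_name}")
  cert.issuer = cert.subject # self-signed: issuer equals subject
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + 3600
  cert.sign(key, OpenSSL::Digest.new('SHA256'))
  cert
end

proxy_cert = self_signed_cert('local-debug-proxy')

# A client that has not been told about the proxy does not accept it.
default_store = OpenSSL::X509::Store.new
puts default_store.verify(proxy_cert)

# A client that deliberately trusts the proxy's certificate accepts it.
trusting_store = OpenSSL::X509::Store.new
trusting_store.add_cert(proxy_cert)
puts trusting_store.verify(proxy_cert)
```

This is why such a proxy is only reasonable on a machine you control: adding its certificate to the trust store means accepting whatever it says about the real server.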
Nick Grandy
2008-Sep-09 04:11 UTC
[Mechanize-users] JS, !(more Connections), is it possible with Mechanize?
Re: JavaScript with Mechanize, you guys should check out Johnson:

http://groups.google.com/group/johnson-talk
http://github.com/jbarnette/johnson/tree/master

I believe Aaron is planning on using Johnson as the JavaScript interpreter with Mechanize.

n

--
Nick Grandy