thr3ads.net - Mechanize users - [Mechanize-users] Capturing the result of submits [Mar 2009]

Ross Cameron

2009-Mar-24 23:23 UTC

[Mechanize-users] Capturing the result of submits

Hi

I apologize up front if this is a dumb question, because I guess Ajax,
and thus JavaScript, is involved.

Is there any way to capture the result of a submit if the current page
is modified as a result of the submit?

For example: a couple of input fields and a submit, and the result turns
up in a modified <div>, which it looks like Mechanize doesn't get.

I hope I haven't answered my own question!

Regards




Mat Schaffer

2009-Mar-25 01:17 UTC

[Mechanize-users] Capturing the result of submits

If the page doesn't refresh then JavaScript is involved. Of course,
that's not to say you couldn't parse the JavaScript response in Ruby
and get the information you're looking for. I've done it a lot with
good results. I actually scripted most of the major webmail systems
with Mechanize a few years back, and AOL's webmail was the only
JavaScript nut I couldn't crack.
-Mat
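
To illustrate the point about parsing the JavaScript response in Ruby,
here is a minimal sketch. It assumes the Ajax endpoint answers with JSON,
possibly wrapped in a JSONP-style callback; the callback name
`handleResult`, the field names, and the sample payload are all
hypothetical and not taken from any real site. A response that is an
HTML fragment would need an HTML parser instead.

```ruby
require 'json'

# Parse the body of an Ajax response. If the body is wrapped in a
# JSONP-style call such as `handleResult({...});`, strip the wrapper
# first; otherwise treat the whole body as JSON.
def parse_ajax_body(body)
  text = body.strip
  # Matches `name( ... )` with an optional trailing semicolon.
  if (m = text.match(/\A[\w$.]+\s*\((.*)\)\s*;?\z/m))
    text = m[1]
  end
  JSON.parse(text)
end

# Hypothetical response body, as it might be captured from the server.
sample = 'handleResult({"status":"ok","rows":[{"id":1,"name":"alpha"}]});'
data   = parse_ajax_body(sample)
names  = data["rows"].map { |row| row["name"] }
```

The same function accepts an unwrapped JSON body, so it can be pointed
at either kind of response once the actual format is known.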


Ross Cameron

2009-Mar-25 03:29 UTC

[Mechanize-users] Capturing the result of submits

Hi Matt

Many thanks. I sort of went and solved it in the case of a form GET
method by scripting the full path for the form action. This wasn't too
difficult because the action URL can be discovered by inspection. POST
is somewhat more difficult, but I assume there are ways of finding out
what is passed and setting those.

But what would be nicer, if you wouldn't mind, is pointing me in the
right direction to get at the JavaScript response - not sure how to do
that. That would nail it.

Regards
Ross
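
The GET approach described above, building the full request URL from the
form's action and its fields, might look roughly like the following
sketch. The action URL and the field names are hypothetical placeholders;
the real ones would come from inspecting the page.

```ruby
require 'uri'

# Build the URL a GET form would request: the form's action plus the
# URL-encoded field values as the query string.
def build_get_url(action, fields)
  uri = URI.parse(action)
  uri.query = URI.encode_www_form(fields)
  uri.to_s
end

# Hypothetical action URL and field names, for illustration only.
url = build_get_url("http://example.com/search",
                    { "q" => "ruby mechanize", "page" => "1" })
```

The resulting string can then be passed to the agent's `get` method in
place of submitting the form.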

-- 
Ross Cameron | Director
Roscommon Pty Ltd | ABN 85 099 499 840
p: +61 2 9016 4133 | m: +61 4 3312 9087 | f: +61 2 9420 4525
w: www.roscommonhq.com | AIM: rossppc


Mike Mondragon

2009-Mar-25 03:35 UTC

[Mechanize-users] Capturing the result of submits

On Tue, Mar 24, 2009 at 6:17 PM, Mat Schaffer <mat.schaffer at gmail.com> wrote:
> If the page doesn't refresh then javascript is involved. Of course, that's
> not to say you couldn't parse the javascript response in ruby and get the
> information you're looking for. I've done it a lot with good results. I
> actually scripted most of the major webmail systems with mechanize a few
> years back and AOL's webmail was the only javascript nut I couldn't crack.

I think a lot of people came up against the problem with scraping AOL
webmail. They had an edge case for URL formatting that Mechanize was
handling a bit differently than a real web browser. Here's the duck
punch on WWW::Mechanize::to_absolute_uri that can be used to scrape
AOL webmail properly.

http://github.com/contentfree/blackbook/blob/ca9d90ff1be576bdbb42a1c6b81940d81840ed9d/lib/blackbook/importer/page_scraper.rb

Mike

Ross Cameron

2009-Mar-25 03:53 UTC

[Mechanize-users] Capturing the result of submits

Mike

Most helpful. And a very elegant solution to the mechanize uri problem.

Regards
Ross


Mat Schaffer

2009-Mar-25 11:58 UTC

[Mechanize-users] Capturing the result of submits

On Mar 24, 2009, at 11:35 PM, Mike Mondragon wrote:
> I think a lot of people came up against the problem with scraping AOL
> webmail.  They had an edgecase for URL formatting that Mechanize was
> handling a bit differently than a real web browser.  Here's the duck
> punch on WWW::Mechanize::to_absolute_uri that can be used to scrape on
> AOL webmail properly.
>
> http://github.com/contentfree/blackbook/blob/ca9d90ff1be576bdbb42a1c6b81940d81840ed9d/lib/blackbook/importer/page_scraper.rb
>
> Mike

ha! Nice one, man. Sadly the project I was doing it for is long gone,
but thanks for this lovely gem. I'll sure be bookmarking this for
later!
-Mat

Mat Schaffer

2009-Mar-25 12:03 UTC

[Mechanize-users] Capturing the result of submits

On Mar 24, 2009, at 11:29 PM, Ross Cameron wrote:
> Hi Matt
>
> Many thanks. I sort of went and solved it in the case of a form GET
> method by scripting the full path for the form action. This wasn't
> too difficult because the action url can be discovered by
> inspection. POST is somewhat more difficult but I assume there are
> ways of finding out what is passed and setting those.
>
> But what would be nicer, if you wouldn't mind, is pointing me in the
> right direction to get at the JavaScript response - not sure how to
> do that. That would nail it.

I often use Charles in these situations (http://www.charlesproxy.com/).
There are other options too, like TamperData or Fiddler for Windows,
but Charles feels a bit more organized/reliable, and usually the
30-minute time limit is enough to get simple jobs done.

Once you've figured out the right request, the response can be
obtained from #body in Mechanize like usual.

-Mat
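
As a rough sketch of that last step: once a proxy has shown which
request the page's JavaScript sends, the same request can be issued
directly and the raw response read from #body. The helper name
`fetch_ajax_body`, the URL, and the field names below are hypothetical,
and the agent is passed in so the snippet does not depend on a
particular Mechanize release (the class was WWW::Mechanize at the time
of this thread and is Mechanize in later versions).

```ruby
# `agent` is expected to be a Mechanize instance; anything that responds
# to #post(url, fields) and returns an object with #body will work.
def fetch_ajax_body(agent, url, fields)
  page = agent.post(url, fields) # replay the request the browser made
  page.body                      # raw response text, before any parsing
end

# Usage, assuming the mechanize gem is installed and the (hypothetical)
# endpoint and parameters were discovered with a proxy such as Charles:
#   require 'mechanize'
#   body = fetch_ajax_body(Mechanize.new,
#                          "http://example.com/ajax/lookup",
#                          { "term" => "ruby" })
```

For a request the browser sends with GET, the agent's `get` method
would be used in the same way.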