Mike Mondragon
2007-Oct-10 22:01 UTC
[Mechanize-users] Scraping AOL Webmail to login and fetch contacts?
I'm helping with a gem that is going to be published under the contentfree project on rubyforge (http://rubyforge.org/projects/contentfree/).

The gem is called "blackbook" and basically it will go and fetch your contacts from the major webmail providers. So far Gmail, Yahoo!, and MSN have been completed.

We are trying to finish up with fetching contacts from AOL Webmail. However, it's a bit more difficult because of the javascript-like validation AOL has built into their sign-in service.

The only resource I've found that talks about the correct strategy to sign in to AOL via a scraping tool is here:
http://apsquared.net/blog/2007/04/30/scraping-aol-webmail-for-contacts/

However, we've not been able to recreate their experience with mechanize. Any suggestions or experience would be appreciated. Blackbook will be released onto rubyforge once we've completed AOL Webmail integration.

Thanks
Mike

--
Mike Mondragon
Work> http://sas.quat.ch/
Blog> http://blog.mondragon.cc/
Small URLs> http://hurl.it/
Mike Mondragon
2007-Nov-14 08:10 UTC
[Mechanize-users] Scraping AOL Webmail to login and fetch contacts?
On Oct 10, 2007 2:01 PM, Mike Mondragon <mikemondragon at gmail.com> wrote:
> I'm helping with a gem that is going to be published under the
> contentfree project on rubyforge
> (http://rubyforge.org/projects/contentfree/).
>
> The gem is called "blackbook" and basically it will go and fetch your
> contacts from the major webmail providers. So far Gmail, Yahoo!, and
> MSN have been completed.
>
> We are trying to finish up with fetching contacts from AOL Webmail.
> However, it's a bit more difficult because of the javascript-like
> validation AOL has built into their sign-in service.
>
> The only resource I've found that talks about the correct strategy to
> sign in to AOL via a scraping tool is here:
> http://apsquared.net/blog/2007/04/30/scraping-aol-webmail-for-contacts/
>
> However, we've not been able to recreate their experience with
> mechanize. Any suggestions or experience would be appreciated.
> Blackbook will be released onto rubyforge once we've completed AOL
> Webmail integration.
>
> Thanks
> Mike
>
> --
> Mike Mondragon
> Work> http://sas.quat.ch/
> Blog> http://blog.mondragon.cc/
> Small URLs> http://hurl.it/

Dave Myron paid a bounty to Marton Fabo to find a fix so that the upcoming blackbook Gem could scrape contacts properly from AOL webmail. Marton found a fix.

From the fix I was putting together a patch to submit to Mechanize, but I ran into the following failing tests.
First, here's the test that I wrote in test/tc_mech.rb that shows the broken behavior for the AOL webmail login. The test code as a pretty pastie:
http://pastie.caboo.se/private/q0xbdwhhhjamskjqm4niq

    def test_to_absolute_uri
      def @agent.public_to_absolute_uri(url)
        to_absolute_uri(url)
      end

      url = "http://localhost/?arg=val&jank=AAA%3D"
      assert_equal URI.parse(url), @agent.public_to_absolute_uri(url)

      # pattern of odd URL created by javascript validator in AOL webmail login
      # where to_absolute_uri strips out the last '=' encoded as %3D
      url = "http://localhost/?arg=val&jank=AAA%3D%3D"
      assert_equal URI.parse(url), @agent.public_to_absolute_uri(url)
    end

After I apply Marton's fix, test_to_absolute_uri passes, but now the "test_link_with_unusual_characters" test in test/tc_links.rb fails.

Here is Marton's fix applied to to_absolute_uri in lib/mechanize.rb as a pretty pastie:
http://pastie.caboo.se/private/cb5ara4rlnh9fxe8jl1dea

Any suggestions on how to proceed? Which solution is more valid: the existing to_absolute_uri method in mechanize.rb, or the fix that Marton has found for to_absolute_uri?

We can open up Mechanize in the Blackbook Gem to utilize Marton's solution, since it solves a problem specific to Blackbook's interaction with AOL webmail via Mechanize. If Marton's fix is more valid, we would like to contribute it to Mechanize to help the community out.

Thanks
Mike
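For anyone who wants to check the expected side of the test without Mechanize installed, here is a minimal sketch using only Ruby's standard uri library. It shows that URI.parse keeps both example URLs from the test unchanged, and what the encoded value decodes to. The helper name round_trips? is hypothetical and is not part of Mechanize. The sketch does not reproduce or explain what to_absolute_uri does internally.

```ruby
require "uri"

# Hypothetical helper (not part of Mechanize): true when URI.parse
# followed by to_s gives back exactly the string that went in.
def round_trips?(url)
  URI.parse(url).to_s == url
end

# The two URLs used in test_to_absolute_uri above.
urls = [
  "http://localhost/?arg=val&jank=AAA%3D",
  "http://localhost/?arg=val&jank=AAA%3D%3D"
]

urls.each do |url|
  puts "#{url} round-trips: #{round_trips?(url)}"
end

# Decoding the parameter value shows the two trailing '=' characters
# that the %3D%3D sequence stands for.
puts URI.decode_www_form_component("AAA%3D%3D")
```

Since the standard library leaves the trailing %3D in place for both URLs, the test's expected value is what plain URI.parse produces, so any loss of the final %3D would have to happen in the processing that to_absolute_uri applies before or after parsing.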