Felipe Jordão A. P. Mattosinho
2010-Jan-26 17:27 UTC
[Mechanize-users] Does Amazon.com block scraping?
Hi there,

Does anyone know if Amazon.com has any sort of server-side script that tries to block scraping activities? I first noticed that if I didn't change the agent alias, it would fetch a page exactly like the normal one, but without the initial search field (maybe a silly way to prevent scraping). After that, I changed to another alias and submitted a search. I got the result page as the response, but right after getting the page, I received a message that Amazon.com had closed my connection and redirected me to another place.

If anyone has succeeded in circumventing this protection on Amazon.com, please send me some info.

Regards,
Felipe
Hello Felipe,

Jimmy McGrath was kind enough to reply, twice, to your original email. (Thank you, Jimmy.) Please be kind enough to respond to his replies, and please do not resend your request when you've already received a response.

Thank you for your kind cooperation, and for using Mechanize.

2010/1/26 Felipe Jordão A. P. Mattosinho <felipemattosinho at terra.com.br>

> Hi there,
>
> Does anyone know if Amazon.com has any sort of server-side script that
> tries to block scraping activities? I first noticed that if I didn't change
> the agent alias, it would fetch a page exactly like the normal one, but
> without the initial search field (maybe a silly way to prevent scraping).
> After that, I changed to another alias and submitted a search. I got the
> result page as the response, but right after getting the page, I received a
> message that Amazon.com had closed my connection and redirected me to
> another place.
>
> If anyone has succeeded in circumventing this protection on Amazon.com,
> please send me some info.
>
> Regards,
> Felipe

--
mike dalessio
mike at csa.net