Shunquan Tan
2007-May-08 04:32 UTC
[Mechanize-users] the website ban me when I use mechanize to grab its content, how to solve it?
Dear all, I keep grabbing photos from one website. At first I sent the default user agent (WWW-Mechanize), but they obviously didn't like this behavior and banned that user agent. So I had to forge an IE user agent. That worked for some time, but the website administrator seems to have detected my behavior and banned me again. This time I cannot bypass the ban by simply modifying the user agent. I have already set Mechanize's user agent to the same string my IE browser uses, but it doesn't work: I can browse the website with IE, but I only get an empty page when I use Mechanize. The decision must be made on the server side, because my client receives no scripts or redirect headers. All I get is an empty page! So how can they tell that I am not using a standard browser, and how can I bypass this obstacle again? Any suggestions are appreciated. Thanks in advance, Harish
Andy Lester
2007-May-08 04:42 UTC
[Mechanize-users] the website ban me when I use mechanize to grab its content, how to solve it?
On May 7, 2007, at 11:32 PM, Shunquan Tan wrote:
> But the website administrator seemed to detect my behavior and
> banned me again. This time I can not bypass it by simply modifying the
> user agent.

So don't scrape the site. Mechanize wasn't created so you can abuse websites.

--
Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance