Jodi Showers
2009-Mar-30 15:09 UTC
my bug, nefarious scraper or a legitimate browser plugin?
I've been facing the following symptoms for some time.

I have links coded as :post or :put, so I can make sure that bots aren't hitting particular links.

But something is hitting them as :get, either through an error I've made (like link_to not working well in some browsers?), or because there are one or more plugins that preload URLs, or because I have scrapers.

Each day I get 50-100 error messages of this nature, where routes aren't found.

When I get many of these hits from the same IP address, I usually assume a scraper and then block that IP address... but I don't want to do this in every case if it's possible that something legitimate (a preloading browser plugin) is causing this to happen.

Does anybody else see this kind of behaviour? How do you handle it?

thanks.
Jodi

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
Frederick Cheung
2009-Mar-30 17:22 UTC
Re: my bug, nefarious scraper or a legitimate browser plugin?
On Mar 30, 4:09 pm, Jodi Showers <j...-vRiTP4Lz4TuakBO8gow8eQ@public.gmane.org> wrote:
> I've been facing the following symptoms for some time.
>
> I have links coded as :post or :put, so I can make sure that bots
> aren't hitting particular links.
>
> But something is hitting them as :get, either through an error I've
> made (like link_to not working well in some browsers?), or because
> there are one or more plugins that preload URLs, or because I have
> scrapers.
>

A browser with JS turned off would also do this (as would a Firefox plugin like NoScript that only enables JS for certain websites).

Fred

> Each day I get 50-100 error messages of this nature, where routes
> aren't found.
>
> When I get many of these hits from the same IP address, I usually
> assume a scraper and then block that IP address... but I don't want
> to do this in every case if it's possible that something legitimate
> (a preloading browser plugin) is causing this to happen.
>
> Does anybody else see this kind of behaviour? How do you handle it?
>
> thanks.
> Jodi
Jodi Showers
2009-Mar-30 18:31 UTC
Re: my bug, nefarious scraper or a legitimate browser plugin?
On 30-Mar-09, at 1:22 PM, Frederick Cheung wrote:
>
> On Mar 30, 4:09 pm, Jodi Showers <j...-vRiTP4Lz4TuakBO8gow8eQ@public.gmane.org> wrote:
>> I've been facing the following symptoms for some time.
>>
>> I have links coded as :post or :put, so I can make sure that bots
>> aren't hitting particular links.
>>
>> But something is hitting them as :get, either through an error I've
>> made (like link_to not working well in some browsers?), or because
>> there are one or more plugins that preload URLs, or because I have
>> scrapers.
>>
>
> A browser with JS turned off would also do this (as would a Firefox
> plugin like NoScript that only enables JS for certain websites).
>
> Fred

Thanks, Fred. Yes, good thought; that's likely the simplest explanation. I'll trap these requests, then see where that leads me.

Jodi
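One way the "trap" step could look, as a rough sketch only: record each unexpected GET with whatever the request exposes, then group by IP to see whether a client looks like a one-off browser or a repeat visitor. Everything here is an assumption for illustration and not from the thread: the `Hit` struct, the `summarize_hits` helper, the choice of fields, and the default threshold of 10. The IP addresses used with it would come from your own logs.

```ruby
# Hypothetical record of one unexpected GET against a POST/PUT-only URL.
Hit = Struct.new(:ip, :user_agent, :referer, :path)

# Group trapped hits by IP and summarise what each client looked like.
# The threshold is an arbitrary assumption; tune it against real traffic.
def summarize_hits(hits, threshold = 10)
  hits.group_by(&:ip).map do |ip, group|
    {
      :ip          => ip,
      :count       => group.size,
      :user_agents => group.map(&:user_agent).uniq,
      # True when none of this client's requests carried a referer.
      :no_referer  => group.all? { |h| h.referer.nil? || h.referer.empty? },
      :suspect     => group.size >= threshold
    }
  end.sort_by { |row| -row[:count] }
end
```

A single hit with a normal browser user agent and a referer from your own site fits the JS-off explanation; many hits from one IP with no referer lean towards a scraper. Neither signal is conclusive on its own.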
Jodi Showers
2009-Apr-01 14:47 UTC
Re: my bug, nefarious scraper or a legitimate browser plugin?
This latest info rules out JS-off or a NoScript plugin.

On 30-Mar-09, at 1:22 PM, Frederick Cheung wrote:
>
> On Mar 30, 4:09 pm, Jodi Showers <j...-vRiTP4Lz4TuakBO8gow8eQ@public.gmane.org> wrote:
>> I've been facing the following symptoms for some time.
>>
>> I have links coded as :post or :put, so I can make sure that bots
>> aren't hitting particular links.
>>
>> But something is hitting them as :get, either through an error I've
>> made (like link_to not working well in some browsers?), or because
>> there are one or more plugins that preload URLs, or because I have
>> scrapers.
>>
>
> A browser with JS turned off would also do this (as would a Firefox
> plugin like NoScript that only enables JS for certain websites).
>
> Fred

Here's the markup where the link is:

    <a style="margin-left: 0pt; padding-left: 0pt; font-size: 16px; margin-bottom: 60px;"
       onclick="Windows.overlayHideEffectOptions = {duration: 3};
                Dialog.info({url: 'http://homestars.com/messages/201895-contex-roofing-company-ltd/new_company',
                             options: {method: 'post'}},
                            {title:'Contact Us', className: 'bluelighting', width: 290, height: 450,
                             destroyOnClose: true, draggable: true, evalScripts: true});; return false;"
       href="#"><img style="margin-right: 5px;" alt="Email Contex Roofing Company Ltd." src="/images/contact_email.jpg"/><img style="margin-right: 5px;" alt="Phone Contex Roofing Company Ltd." src="/images/contact_phone.jpg"/><span style="font-size: 14px;">Contact: Contex Roofing Company Ltd.</span></a>

The bot/human is reaching the URL "http://homestars.com/messages/201895-contex-roofing-company-ltd/new_company", but you can see that href = '#', so a client without JS would never request that URL. Something is scanning the HTML, looking for URLs to harvest.

So it looks like either a bot or a page preloader. I don't mind preloaders, so I think I'll see if I can find a pattern in the plugins loaded; if I don't find a plugin, then this could work as a honeypot.
I guess I don't have a specific question, merely symptoms; I'm hopeful that someone may have faced such a thing.

Jodi
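To illustrate why href="#" does not hide the URL, here is a minimal sketch of how a naive harvester might behave. This is a guess at the technique, not a claim about what the actual client is doing: the `harvest_urls` helper and its regular expression are made up for the example, and the markup is an abbreviated version of the link quoted above.

```ruby
# Naive harvester: pull every absolute http(s) URL out of raw markup,
# regardless of which attribute (href, onclick, ...) it sits in.
def harvest_urls(html)
  html.scan(%r{https?://[^\s'"<>]+}).uniq
end

# Abbreviated form of the link from the message above (assumption: the
# trimmed attributes do not matter to a text-level scan like this one).
markup = <<-HTML
<a href="#" onclick="Dialog.info({url: 'http://homestars.com/messages/201895-contex-roofing-company-ltd/new_company', options: {method: 'post'}}); return false;">Contact</a>
HTML

puts harvest_urls(markup)
```

A scan like this never executes the JavaScript, so it never sees the method: 'post' option and would simply issue a GET. That is consistent with the routing errors described, and it is also why a URL that appears only inside an onclick handler could serve as a honeypot.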