I am trying to write an application that uses screen scraping. I first have to fetch a page and scrape it; then, using the link to a frame in that page, I have to fetch that frame and scrape it to get some relevant information. When I make the first request in the browser, it shows a "Please wait while loading..." message, and that message is what I get back from the Nokogiri open call. The browser shows the proper page once loading is complete. How do I make the open call in my code wait so that I get the proper response? I am using Nokogiri for scraping.

Thanks.

--
Posted via http://www.ruby-forum.com/.

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.

On May 19, 2011, at 9:14 PM, renu mehta wrote:
> [...]

Maybe you could do this in two passes. First, do a traditional download of the page source, similar to what wget does in spider mode. Then use a queue or similar to go through the downloaded pages and let Nokogiri parse each one (deleting the temp file when done).

Someone else may have a better suggestion, and more in-depth knowledge of the remote open options, but this might be an approach worth pursuing.

Walter
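
The two passes described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the method names (`download_pages`, `process_pages`, `frame_sources`) and the `fetcher:` hook are hypothetical, the download uses Ruby's standard `Net::HTTP` rather than wget, and the parsing step assumes the nokogiri gem is installed.

```ruby
require "net/http"
require "uri"
require "tempfile"

# Hypothetical default download step: plain HTTP GET of the page source.
DEFAULT_FETCHER = ->(url) { Net::HTTP.get(URI(url)) }

# Pass 1: save each URL's raw source to a temp file and queue it.
# `fetcher` is any callable taking a URL string and returning the body,
# so the download step can be swapped out or stubbed.
def download_pages(urls, fetcher: DEFAULT_FETCHER)
  queue = Queue.new
  urls.each do |url|
    file = Tempfile.new(["page", ".html"])
    file.write(fetcher.call(url))
    file.close
    queue << [url, file]
  end
  queue
end

# Pass 2: drain the queue, hand each saved page to the block, and delete
# the temp file afterwards, even if the block raised.
def process_pages(queue)
  results = {}
  until queue.empty?
    url, file = queue.pop
    begin
      results[url] = yield(File.read(file.path))
    ensure
      file.unlink
    end
  end
  results
end

# Example parser for pass 2 (assumes the nokogiri gem): collect the src
# attribute of every frame or iframe in the saved page.
def frame_sources(html)
  require "nokogiri"
  Nokogiri::HTML(html).css("frame, iframe").map { |f| f["src"] }.compact
end
```

Usage might look like `process_pages(download_pages([url])) { |html| frame_sources(html) }`, where `url` is the first page to fetch. Keeping the download and the parsing apart makes it easy to inspect the saved source by hand and see exactly what the server returned before Nokogiri touches it.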

On May 19, 2011, at 9:14 PM, renu mehta wrote:
> [...]

One more thought about this: are you sure that the page isn't using some JavaScript lazy-loading technique? "Please wait..." might be the actual source, and thus all that a crawler can open.

Walter
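
One way to test that suspicion is to look at the raw source itself. The sketch below is a rough, regex-based heuristic using only core Ruby; the method name `describe_source` and the `placeholder:` default are hypothetical, and a real HTML parser such as Nokogiri would be more robust for the same checks.

```ruby
# Summarize what the raw page source contains, to help judge whether the
# real content is loaded later by JavaScript or a meta refresh.
# Regex matching on HTML is only approximate; treat the result as a hint.
def describe_source(html, placeholder: "Please wait")
  {
    # Is the loading message literally part of the downloaded source?
    placeholder_present: html.include?(placeholder),
    # Number of opening <script> tags.
    script_tags: html.scan(/<script\b/i).size,
    # Does the page redirect itself via <meta http-equiv="refresh">?
    meta_refresh: html.match?(/<meta[^>]+http-equiv=["']?refresh/i),
    # src attributes of any <frame> or <iframe> tags.
    frame_srcs: html.scan(/<i?frame\b[^>]*\bsrc=["']([^"']+)["']/i).flatten
  }
end
```

If the placeholder is present alongside script tags and no frame sources, the content is probably filled in after load. Nokogiri only parses the HTML it is given and does not execute JavaScript, so waiting longer on the open call would not change what it receives.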