Hi,

I'm using Mechanize, and I've developed a lot of code around it. I'd like to be able to check the ETag header during a get to see if the page has changed, as well as some other HTTP header information. Can I do that without hacking Mechanize myself?

Does anyone have any examples of how to do this?

William
Aaron, thanks. Just checked it and that did the trick! Love the blog name by the way. :-)

William

On 1/3/07 9:35 PM, "Aaron Patterson" <aaron_patterson at speakeasy.net> wrote:
> Sure. You can access the response from the page object. Here's an
> example:
>
>   page = WWW::Mechanize.new().get('http://www.google.com/')
>
>   page.header.each_header do |k,v|
>     puts "#{k} #{v}"
>   end
>
> Hope that helps!
On Wed, Jan 03, 2007 at 07:20:10PM -0500, William Flanagan wrote:
> Hi,
>
> I'm using Mechanize, and I've developed a lot of code around it. I'd like
> to be able to check the Etag header during a get to see if the page has
> changed, as well as some other http header information. Can I do that
> without hacking Mechanize myself?
>
> Does anyone have any examples of how to do this?

Sure. You can access the response from the page object. Here's an example:

  page = WWW::Mechanize.new().get('http://www.google.com/')

  page.header.each_header do |k,v|
    puts "#{k} #{v}"
  end

Hope that helps!

-- 
Aaron Patterson
http://tenderlovemaking.com/
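For the original question of telling whether a page has changed, the ETag can be looked up by name and compared with a value saved from an earlier fetch. The following is only a minimal sketch: it assumes the header object behaves like a Net::HTTPResponse (which the each_header call suggests), it uses a hand-built response so it runs without a network connection, and etag_changed? is a hypothetical helper name, not part of Mechanize.

```ruby
require 'net/http'

# Hypothetical helper: true when the response's ETag differs from the one
# stored earlier. If either tag is missing we cannot prove the page is
# unchanged, so we report it as changed.
def etag_changed?(response, stored_etag)
  current = response['ETag'] # header lookup is case-insensitive
  return true if current.nil? || stored_etag.nil?
  current != stored_etag
end

# Hand-built response standing in for what a real GET would return.
response = Net::HTTPOK.new('1.1', '200', 'OK')
response['ETag'] = '"abc123"'

puts etag_changed?(response, '"abc123"')  # prints false
puts etag_changed?(response, '"old-tag"') # prints true
```

Note that this comparison only happens after the page has already been downloaded, so it detects changes but does not save any bandwidth.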
Is there any way to have a get option that uses the etag or not-modified header to not fetch a file if it hasn't changed? This would cut down on bandwidth usage. Something like:

  if WWW::Mechanize.new().get('http://www/some_huge_file_infrequently_changed',
                              :etag => etag, :updated => cached_time) do |page|
       # use the freshly fetched page
     end
  else
    puts "not modified"
  end

On 1/3/07, Aaron Patterson <aaron_patterson at speakeasy.net> wrote:
> Sure. You can access the response from the page object. Here's an
> example:
>
>   page = WWW::Mechanize.new().get('http://www.google.com/')
>
>   page.header.each_header do |k,v|
>     puts "#{k} #{v}"
>   end
>
> Hope that helps!
That's a good question. I wanted to read the tag so that I could do the analysis myself. To conserve bandwidth, you'd have to include the tag in the initial get request. I don't know how to do this.

Does anyone have sample code to do this, for the sake of the "Internet's" completeness?

William

On 1/4/07 12:22 PM, "Dominic Sisneros" <dsisnero at gmail.com> wrote:
> Is there any way to have a get option that uses the etag or not-modified
> header to not get a file if it hasn't changed.
>
> This will cut down on bandwidth usage
>
>   if WWW::Mechanize.new().get('http://www/some_huge_file_infrequently_changed',
>                               :etag => etag, :updated => cached_time) do |page|
>   else
>     puts "not modified"
>   end
On Thu, Jan 04, 2007 at 10:22:16AM -0700, Dominic Sisneros wrote:
> Is there any way to have a get option that uses the etag or not-modified
> header to not get a file if it hasn't changed.
>
> This will cut down on bandwidth usage
>
>   if WWW::Mechanize.new().get('http://www/some_huge_file_infrequently_changed',
>                               :etag => etag, :updated => cached_time) do |page|
>   else
>     puts "not modified"
>   end

It is possible, but not very easy right now. You can subclass Mechanize and implement the "set_headers" method to add an If-Modified-Since header. Then you'd probably have to write a pluggable parser that deals with the response code by looking up the cached page.

I was thinking of building this into Mechanize, but I didn't know if anyone wanted/needed it. How important is this to people?

-- 
Aaron Patterson
http://tenderlovemaking.com/
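The conditional request itself can be illustrated outside Mechanize. The sketch below uses Ruby's standard Net::HTTP rather than the set_headers and pluggable parser route described above, so it is not Mechanize's API. It sends the cached validators as If-None-Match and If-Modified-Since headers and treats a 304 response as "reuse the cached copy". The helper names (conditional_get_request, body_for, fetch_if_changed) and the example URL are assumptions made up for illustration.

```ruby
require 'net/http'
require 'uri'
require 'time' # provides Time#httpdate

# Hypothetical helper: build a GET request carrying whatever validators
# were cached from the previous fetch.
def conditional_get_request(url, etag: nil, last_modified: nil)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request['If-None-Match'] = etag if etag
  request['If-Modified-Since'] = last_modified.httpdate if last_modified
  request
end

# Hypothetical helper: pick the body to use based on the response class.
def body_for(response, cached_body)
  case response
  when Net::HTTPNotModified then cached_body   # 304: server sent no body
  when Net::HTTPSuccess     then response.body # 2xx: fresh content
  else raise "unexpected response: #{response.code}"
  end
end

# Hypothetical helper tying the two together; this one needs network access.
def fetch_if_changed(url, cached_body, etag: nil, last_modified: nil)
  uri = URI.parse(url)
  request = conditional_get_request(url, etag: etag, last_modified: last_modified)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  body_for(response, cached_body)
end

# Example call (not executed here; the URL is a placeholder):
#   body = fetch_if_changed('http://example.com/big_file', cached_body,
#                           etag: cached_etag, last_modified: cached_time)
```

Bandwidth is only saved when the server honours these headers and answers 304; a server that ignores them simply returns the full page with a 200, which the sketch handles as fresh content.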