Hey Mat,

On Mon, Jul 02, 2007 at 12:58:29PM -0400, Mat Schaffer wrote:
> I haven't spent the time to get a proper test case for this yet, but
> there appears to be a bug in the basic_auth code for mechanize
> 0.6.9. I've attached a CSV (from Charles) that illustrates the problem.
>
> Basically when running with basic_auth, there's a failed request
> that's followed up by a successful request. That last POST is an
> agent.submit(form) which gets repeated as a GET.

I think I know what the problem is. It looks like I forgot to repeat the request with the original method. So if you sent a POST and got a 401, mechanize will re-request with a GET.

For now, you should downgrade to 0.6.8 and I'll fix this bug. I'm sorry, everyone! :-(

--
Aaron Patterson
http://tenderlovemaking.com/
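
For readers following along, here is a minimal sketch of the behaviour described above: the follow-up request after a 401 should reuse the original method and body. This is not mechanize's code. The `Request` struct and the `retry_with_basic_auth` helper are hypothetical names made up for this illustration, and the sketch assumes Basic auth only.

```ruby
# Hypothetical request record; not part of the mechanize API.
Request = Struct.new(:http_method, :url, :body, :headers)

# Build the follow-up request after a 401. The method and body are copied
# from the original request, so a POST is replayed as a POST, not as a GET.
def retry_with_basic_auth(original, user, password)
  # pack("m0") produces Base64 without line breaks.
  token = ["#{user}:#{password}"].pack("m0")
  Request.new(
    original.http_method,
    original.url,
    original.body,
    original.headers.merge("Authorization" => "Basic #{token}")
  )
end

post = Request.new(:post, "http://example.com/form", "name=value", {})
retried = retry_with_basic_auth(post, "user", "pass")
puts retried.http_method               # prints "post"
puts retried.headers["Authorization"]  # prints "Basic dXNlcjpwYXNz"
```

The bug reported in this thread corresponds to building the second request with a fixed GET instead of copying `original.http_method`.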
I haven't spent the time to get a proper test case for this yet, but there appears to be a bug in the basic_auth code for mechanize 0.6.9. I've attached a CSV (from Charles) that illustrates the problem.

Basically, when running with basic_auth, there's a failed request that's followed up by a successful request. That last POST is an agent.submit(form), which gets repeated as a GET.

I'll email again if I get anything more conclusive.

Thanks,
Mat

-------------- next part --------------
A non-text attachment was scrubbed...
Name: basic auth bug.csv
Type: application/octet-stream
Size: 917 bytes
Desc: not available
Url : http://rubyforge.org/pipermail/mechanize-users/attachments/20070702/0d50bbc6/attachment.obj
On Mon, Jul 02, 2007 at 03:00:35PM -0400, Mat Schaffer wrote:
> On Jul 2, 2007, at 12:10 PM, Aaron Patterson wrote:
> > I think I know what the problem is. It looks like I forgot to repeat
> > the request with the original method. So if you sent a POST and got
> > a 401, mechanize will re-request with a GET.
> >
> > For now, you should downgrade to 0.6.8 and I'll fix this bug. I'm sorry,
> > everyone! :-(
>
> No problem at all! Thanks for the quick reply.
>
> But I'm a little concerned about the duplicate requests. I'm
> guessing you're doing it for security, because if you assume that you
> should send the basic auth every time, you risk the user sending
> their auth string to an unintended site.

Not exactly.

> Have you considered an optional third argument to basic_auth that
> would be a base path on which to send the information? I.e.,
> basic_auth(user, pass, 'http://www.mysite.com/blah') causes all
> requests that start with 'http://www.mysite.com/blah' to send
> authorization.
>
> Extra requests can eat a lot of time if you're doing a lot of
> operations, so it'd be nice to have the option to reduce the number.

You shouldn't get duplicate requests after the first 401 request. Basically the reason I want it to get the 401 is to determine if the site requires basic auth or digest auth. Once mechanize determines which scheme to use, it caches that setting for subsequent requests to that server.

--
Aaron Patterson
http://tenderlovemaking.com/
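
As a rough illustration of why the first 401 is useful: the server names the scheme it expects in the WWW-Authenticate header of that response. The sketch below reads the scheme from such a header value. `auth_scheme` is a hypothetical helper, not mechanize's code, and it assumes the header carries a single challenge.

```ruby
# Hypothetical helper, not mechanize's code: read the scheme name from the
# value of a WWW-Authenticate header. Assumes one challenge per header.
# Returns :basic, :digest, or nil for anything else.
def auth_scheme(www_authenticate)
  case www_authenticate.to_s.strip
  when /\ADigest(\s|\z)/i then :digest
  when /\ABasic(\s|\z)/i  then :basic
  end
end

p auth_scheme('Basic realm="example"')                               # => :basic
p auth_scheme('Digest realm="example", nonce="abc123", qop="auth"')  # => :digest
p auth_scheme(nil)                                                   # => nil
```

A client that guessed the scheme instead of reading it could not do digest auth at all, since the digest response depends on the nonce that only arrives with the 401.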
On Jul 2, 2007, at 12:10 PM, Aaron Patterson wrote:
> I think I know what the problem is. It looks like I forgot to repeat
> the request with the original method. So if you sent a POST and got
> a 401, mechanize will re-request with a GET.
>
> For now, you should downgrade to 0.6.8 and I'll fix this bug. I'm sorry,
> everyone! :-(

No problem at all! Thanks for the quick reply.

But I'm a little concerned about the duplicate requests. I'm guessing you're doing it for security, because if you assume that you should send the basic auth every time, you risk the user sending their auth string to an unintended site.

Have you considered an optional third argument to basic_auth that would be a base path on which to send the information? I.e., basic_auth(user, pass, 'http://www.mysite.com/blah') causes all requests that start with 'http://www.mysite.com/blah' to send authorization.

Extra requests can eat a lot of time if you're doing a lot of operations, so it'd be nice to have the option to reduce the number.

Thanks again,
Mat
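
The proposal above could look roughly like the following. This is only a sketch of the idea: mechanize's basic_auth does not take a third argument in this thread, and `ScopedBasicAuth` and `header_for` are hypothetical names invented for the example.

```ruby
# Hypothetical helper, not part of mechanize: holds credentials together
# with the URL prefix they may be sent to.
class ScopedBasicAuth
  def initialize(user, password, base_url)
    # pack("m0") produces Base64 without line breaks.
    @header = "Basic " + ["#{user}:#{password}"].pack("m0")
    @base_url = base_url
  end

  # Returns the Authorization header value for URLs that start with the
  # base URL, or nil so that nothing is sent anywhere else.
  def header_for(url)
    @header if url.start_with?(@base_url)
  end
end

auth = ScopedBasicAuth.new("user", "pass", "http://www.mysite.com/blah")
p auth.header_for("http://www.mysite.com/blah/report")  # => "Basic dXNlcjpwYXNz"
p auth.header_for("http://elsewhere.example/blah")      # => nil
```

One caveat with a plain string prefix: 'http://www.mysite.com/blah' also matches 'http://www.mysite.com/blahblah', and a base without a trailing slash, such as 'http://www.mysite.com', would match 'http://www.mysite.com.evil.example/'. A real implementation would want to compare the parsed host and path segments rather than raw strings.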
On Mon, Jul 02, 2007 at 05:49:31PM -0400, Mat Schaffer wrote:
>
> On Jul 2, 2007, at 1:38 PM, Aaron Patterson wrote:
> > You shouldn't get duplicate requests after the first 401 request.
> > Basically the reason I want it to get the 401 is to determine if the
> > site requires basic auth or digest auth. Once mechanize determines
> > which scheme to use, it caches that setting for subsequent requests to
> > that server.
>
> Hrm... perhaps I have something wrong in my script then. The CSV I
> sent you should show that it makes two requests for each page. This
> is all using the same agent, so I think there may be a problem there
> as well. If I get some time, I'll try to write up a simple case and
> send it along. Props if you beat me to it though :)

I see what the problem is. I'm caching the auth stuff based on URL. Since the URL changes with each request, it tries to re-auth. I'll just have it cache based on domain name, and that would take care of this problem.

--
Aaron Patterson
http://tenderlovemaking.com/
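
To make the difference between the two cache keys concrete, here is a small sketch of a scheme cache keyed on host name. It is not mechanize's implementation; `AuthSchemeCache` and its methods are hypothetical, and the sketch ignores ports and realms for simplicity.

```ruby
require "uri"

# Hypothetical cache, not mechanize's implementation: remembers the auth
# scheme per host so that only the first request to a host needs a 401.
class AuthSchemeCache
  def initialize
    @schemes = {}
  end

  def store(url, scheme)
    @schemes[key_for(url)] = scheme
  end

  def lookup(url)
    @schemes[key_for(url)]
  end

  private

  # Keying on the full URL would miss for every new page on the same site;
  # keying on the host does not.
  def key_for(url)
    URI.parse(url).host
  end
end

cache = AuthSchemeCache.new
cache.store("http://example.com/a/deep/page", :basic)
p cache.lookup("http://example.com/another/page")  # => :basic
p cache.lookup("http://other.example/")            # => nil
```

With the full URL as the key, the second lookup above would have returned nil and triggered another unauthenticated request, which matches the two-requests-per-page pattern reported in the CSV.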
On Jul 2, 2007, at 1:38 PM, Aaron Patterson wrote:
> You shouldn't get duplicate requests after the first 401 request.
> Basically the reason I want it to get the 401 is to determine if the
> site requires basic auth or digest auth. Once mechanize determines
> which scheme to use, it caches that setting for subsequent requests to
> that server.

Hrm... perhaps I have something wrong in my script then. The CSV I sent you should show that it makes two requests for each page. This is all using the same agent, so I think there may be a problem there as well. If I get some time, I'll try to write up a simple case and send it along. Props if you beat me to it though :)

-Mat
On Jul 2, 2007, at 3:57 PM, Aaron Patterson wrote:
> I see what the problem is. I'm caching the auth stuff based on URL.
> Since the URL changes with each request, it tries to re-auth. I'll
> just have it cache based on domain name, and that would take care of
> this problem.

Yeah, that makes sense. Come to think of it, I wonder how Firefox handles it if I came in on a similar path to my script (one deep URL followed by in-site links). Seems like the server has to return a base URL for the authentication or something.

I'll email you if I find anything conclusive.

Thanks again,
Mat