peter at rubyrailways.com
2010-Jan-26 08:24 UTC
[Mechanize-users] Nokogiri vs mechanize objects
Hey all,

Is it possible to 'cast' a Nokogiri object as a Mechanize one? I.e. I get back a Nokogiri Element after searching with an XPath, and now I'd like to click it (let's suppose it's an <a>). So something like

>> agent = WWW::Mechanize.new
>> ...
>> agent.get('github.com/')
>> ...
>> link = agent.page.search("//p[child::strong[contains(.,'GitHub')]]/a[1]")
=> <a href="help.github.com/post-receive-hooks">web hook</a>
>> link.click
NoMethodError: undefined method `click' for <a href="help.github.com/post-receive-hooks">web hook</a>:Nokogiri::XML::NodeSet
        from (irb):8

There has to be a Mechanize object which is the "alter-ego" of the Nokogiri element - how do I switch between the two?

Cheers,
Peter
peter at rubyrailways.com
2010-Jan-28 20:50 UTC
[Mechanize-users] Nokogiri vs mechanize objects
I suppose getting no answer after several days means "it's not possible" - so I hacked around a bit, and now it is!

>> require 'mechanize'
=> true
>> a = WWW::Mechanize.new
=> ...
>> a.get('google.com/ncr')
=> ...
>> a.page.at("//input[@name='q']").fill_textfield('ruby')
=> ...
>> a.page.at("//input[@name='btnG']").submit_form
=> ...
>> a.page.at("//a[@class='l']")
=> <a href="ruby-lang.org" class="l"><em>Ruby</em> Programming Language</a>

I use XPaths for everything, and this way I never have to wonder about properly matching a form element on the page, or about a button with funky text that blows up when going through iconv, and other problems. I used this all the time in Celerity and missed it in Mechanize.

Do you think this patch has a chance to get accepted into mechanize? If yes, I'd like to discuss it with the gem maintainer (whether the solution is OK conceptually, making sure there is good test coverage, etc.).

Cheers,
Peter
On Thu, Jan 28, 2010 at 3:50 PM, <peter at rubyrailways.com> wrote:
> Do you think this patch has a chance to get accepted into mechanize? If
> yes, I'd like to discuss it with the gem maintainer (whether the solution
> is OK conceptually, make sure there is good test coverage etc).

Yes, this is a patch I'd be interested in seeing. Discuss away! Do you have a github branch we can take a look at?

-- 
mike dalessio
mike at csa.net
peter at rubyrailways.com
2010-Jan-28 21:32 UTC
[Mechanize-users] Nokogiri vs mechanize objects
> Yes, this is a patch I'd be interested in seeing. Discuss away!

The idea is simple:

- There is a hash called nokogiri2mechanize, which maps Nokogiri::XML::Nodes to mechanize objects (this is one thing I'd like to ask: I think it's enough to map Form::Field::XXX and Page::Link?).
- Every time a new mechanize object we are interested in is created (i.e. Form::Field::XXX or Page::Link), an entry is added to nokogiri2mechanize.
- Nokogiri::XML::Node is monkey patched with #click etc. (nokogiri_utils.rb), so if one calls #click on a Nokogiri Node, the corresponding Mechanize object is looked up and the message is forwarded to it.

As simple as this sounds, the devil is in the details... Since this is the first time I have ever opened the mechanize source code, I am not sure about a few things:

- Are we really interested just in Form::Field::XXX and Page::Link and nothing more?
- How/when to populate nokogiri2mechanize? At the moment it's done in page.rb, lines 87 -> 91, by manually calling page#forms() and page#links(), but this doesn't feel intuitive to me.
- How do I make sure nokogiri_utils has all the methods we need? (This is related to the first question, i.e. identifying all the classes we want to represent in the mapping.) Since this patch is just a proof of concept, I didn't strive (well, didn't even try) to provide all the methods that will be needed.
- Testing: how to test all this? Write a test case for each method in nokogiri_utils.rb? What else?

> Do you have a github branch we can take a look at?

github.com/scrubber/mechanize/commit/8cf963c9d28ee09395dac7306a599cb5335ef007

As I said, it's just a proof of concept (I tested all the methods in nokogiri_utils.rb on several pages and they work), so obviously the code might be rough around the edges.
If you are wondering why the form is passed to all the buttons: it's so that I can submit the form using just the button, i.e.:

a.page.at("//input[@name='btnG']").submit_form  # no need to pass the form, since the button knows which form it belongs to

Cheers,
Peter
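
The design described in this message (a node-to-object lookup table, forwarding methods on the node, and buttons that remember their form) can be sketched in plain Ruby. Everything below is a stand-in under stated assumptions: FakeNode, FakeLink, FakeForm, FakeButton and NodeRegistry are hypothetical names, not classes from Nokogiri, Mechanize, or the linked commit, and the actual patch may be organised differently.

```ruby
# Hypothetical registry playing the role of the nokogiri2mechanize hash.
module NodeRegistry
  # compare_by_identity: a node is looked up by object identity,
  # so two distinct nodes with equal content never collide.
  @map = {}.compare_by_identity

  def self.register(node, wrapper)
    @map[node] = wrapper
  end

  def self.lookup(node)
    @map.fetch(node) do
      raise ArgumentError, "no wrapper registered for <#{node.name}>"
    end
  end
end

# Stand-in for a parsed DOM node (the role Nokogiri::XML::Node plays).
class FakeNode
  attr_reader :name, :attributes

  def initialize(name, attributes = {})
    @name = name
    @attributes = attributes
  end

  # Forwarding methods: look up the registered wrapper and delegate to it.
  def click
    NodeRegistry.lookup(self).click
  end

  def submit_form
    NodeRegistry.lookup(self).submit_form
  end
end

# Stand-in for a link wrapper; registers itself when it is created.
class FakeLink
  def initialize(node)
    @node = node
    NodeRegistry.register(node, self)
  end

  def click
    "GET #{@node.attributes['href']}"
  end
end

# Stand-in for a form; submitting just reports what would be sent.
class FakeForm
  def initialize(action)
    @action = action
  end

  def submit(button)
    "POST #{@action} via #{button.name}"
  end
end

# Stand-in for a button wrapper; it keeps a reference to its form,
# so the node alone is enough to submit.
class FakeButton
  attr_reader :name

  def initialize(node, form)
    @form = form
    @name = node.attributes['name']
    NodeRegistry.register(node, self)
  end

  def submit_form
    @form.submit(self)
  end
end

link_node = FakeNode.new('a', 'href' => 'help.github.com/post-receive-hooks')
FakeLink.new(link_node)
puts link_node.click

button_node = FakeNode.new('input', 'name' => 'btnG')
FakeButton.new(button_node, FakeForm.new('/search'))
puts button_node.submit_form
```

One design point the sketch makes visible: a single global table keeps every registered node reachable for as long as the process runs, so a per-page table may be worth considering when discussing where the mapping should live.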