Felipe Jordão A. P. Mattosinho
2010-Jan-18 21:40 UTC
[Mechanize-users] How to click on a link, in a specific part of the web page! Help
Hi everybody,

I am a new Ruby programmer and a new Mechanize and Nokogiri user. I am using both gems for my master's thesis, and since I can't find much documentation for Mechanize on the internet, I have a question.

    URL = 'http://reviews.cnet.com'
    SEARCH_FIELD_NAME = 'tsearch'
    XPATH_TO_RESULT_PAGE = "/html/body/div[2]/div/div[2]/div[3]/div[2]/form/ul"
    XPATH_TO_FIRST_LINK_RESULT_PAGE = "/html/body/div[2]/div/div[2]/div[3]/div[2]/form/ul/li/div[4]/a"

    @@mech = WWW::Mechanize.new
    # Creates an instance of Mechanize and selects the CNET website
    page = @@mech.get(URL)

    search_form = page.form(SEARCH_FIELD_NAME)
    search_form.query = query
    pre_page = @@mech.submit(search_form, search_form.buttons.first)

    pre_page.search(XPATH_TO_FIRST_LINK_RESULT_PAGE) do |result|
      @last_page = WWW::Mechanize::Page::Link.new(result, @@mech, @pre_page).click
    end

My problem is with the variable @last_page. I am not sure whether I am doing something wrong, but I believe I am. This was the only way I found to do what I wanted.

On the page held in pre_page, I search for the specific region where the results are shown. I cannot rely on the link text alone, because links containing my search term can appear anywhere on the page. That is why I want to restrict the click to one part of the page, and this was the only way I found to do it.

The problem is that I tried to build a new link from the search result (which is correct; it is the link I was looking for) and click on it to proceed to the next page. However, @last_page is always nil, so this is not working. If someone has a good idea or can tell me how to do this correctly, please send me a reply!

Best Regards
Jeremy Woertink
2010-Jan-19 00:15 UTC
[Mechanize-users] How to click on a link, in a specific part of the web page! Help
Well, first, your variables are all over the place. You have a mix of constants, class variables, instance variables, and local variables. That is not a problem in itself, but since you're new to Ruby, I would recommend sticking to one consistent scheme. With that being said, looking at

    @last_page = WWW::Mechanize::Page::Link.new(result, @@mech, @pre_page).click

you pass @pre_page, which is nil because it was never defined. Did you try creating the link with just the local pre_page variable?

~Jeremy Woertink
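To make the point about @pre_page concrete: in Ruby, reading an instance variable that was never assigned yields nil instead of raising an error, so the mistake passes silently. The sketch below uses a hypothetical class `Scraper`, a hypothetical helper `search_without_yield`, and placeholder values; none of these names come from Mechanize. It also shows a second thing that may be worth checking in the original snippet: a method that never yields silently ignores any block passed to it. If `search` only returns its matches (which, as far as I know, is how Nokogiri's `search` behaves), the `do ... end` body would never run and an explicit `.each` would be needed.

```ruby
# Hypothetical stand-in class; none of these names come from Mechanize.
class Scraper
  def run
    pre_page = 'results page' # local variable, assigned
    # @pre_page was never assigned, so it reads as nil (no error raised).
    [pre_page, @pre_page]
  end
end

# A method that returns its matches but never yields, so any block
# passed to it is silently ignored.
def search_without_yield(items)
  items
end

local_value, ivar_value = Scraper.new.run

# Block passed directly to the non-yielding method: never executed.
block_ran = false
search_without_yield([1, 2]) { |_item| block_ran = true }

# Block passed to .each on the returned collection: executed.
each_ran = false
search_without_yield([1, 2]).each { |_item| each_ran = true }

puts local_value.inspect
puts ivar_value.inspect
puts block_ran
puts each_ran
```

Here the local variable holds the string, the instance variable is nil, the block handed straight to the method never runs, and the block handed to `.each` does.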
Jeremy Woertink
2010-Jan-19 00:29 UTC
[Mechanize-users] How to click on a link, in a specific part of the web page! Help
Here, I got bored and decided to try your code out. I cleaned it up a little: http://pastie.org/784099

Try this and see if you get what you're looking for.

~Jeremy Woertink