Hello.
I wrote a script that visits people's profiles and then downloads their
.jpeg pictures.
It works fine as long as there is a direct link to the picture, which I can
access via
page = agent.click(page.link_with(:href => %r{profile/pic/34243242}))
which, for example, opens: http://page/2948717/1/main/ad5867be9a.jpeg
and then save it:
File.new(".../pic.jpeg", "a+") << agent.get_file(page)
But sometimes there's no direct link, although the picture is still in
their profile. In the page source, using nokogiri, I am locating its
<img> tag, i.e.:
<img id="photo_img" alt="zdjęcie"
src="http://page/2948717/1/main/ad5867be9a.jpeg">
and then scraping the contents of the src attribute
and loading the page:
page = agent.get(the_src_link)
But unfortunately I am getting an HTTP 403 Forbidden error.
From what I understand, once I invoke the 'get' method
it's treated as a new session and I lose my previous settings
(cookies, etc.).
How would I pass previous session data to the new link? There are a few methods
in the CookieJar class, but I really have no clue how to make them work.
Thanks in advance.