similar to: Scraping and saving.

Displaying 20 results from an estimated 300 matches similar to: "Scraping and saving."

2007 Jul 13
2
How do you handle pop ups?
When I click a link to download a file, a pop-up window comes up to save the file. Is there a way to enter a file name and click the submit button with mechanize? Thanks, Jeff
2007 Jul 23
4
Design ideas
I'm trying to use mechanize against a site that has four fields in the form. However, those four fields have to be filled in order: putting something in field one populates the drop-down for the second field. So I'm thinking that I'll probably have to call the page multiple times? What sort of things should I be doing to figure out how to interact with this page?
2007 Oct 26
2
Post problems
Hello, I'm having trouble using Mechanize to post a form. I took a Wireshark capture of the form being submitted by Firefox and by Mechanize. In both there is an HTTP POST and then a 'continuation of non-http traffic' packet. The obvious difference is that the Firefox continuation packet has some HTTP metadata (I don't know the proper terminology) where the
2007 Mar 18
1
Submitting a form sends a file. How do I save it?
I've been using Mechanize for a project that I've been working on, but this is the first time I'm having to use forms (I was only scraping previously). After I fill out the form and hit submit, it sends me information in the form of a text file to download. For the life of me, I can't see how to get access to it. When clicking on a link, you can put a
2007 Oct 04
2
newbie question with login form
Hi, I'm just starting to work with this incredible tool, but I've hit a first problem with the login process. I'm logging in to my app like this: @agent = WWW::Mechanize.new { |a| a.log = Logger.new("mech.log") } @agent.user_agent_alias = 'Mac Safari' @page = @agent.get("http://myappAdress/") @form = @page.forms.first
2008 Apr 29
6
Intercepting an onClick file download
Hi, I'm having some trouble downloading a .csv file from a particular website. The file isn't part of a URL; you need to click on a link in order to get the file sent. I don't know how to get mechanize to correctly identify that. Here is the link to the file I'm trying to retrieve: <td style="vertical-align: bottom; text-align: center;">
2007 Aug 21
7
Signin to LinkedIn
Hi, does anyone have the formula for getting logged into LinkedIn? Here's my current attempt: require 'rubygems' require 'mechanize' agent = WWW::Mechanize.new home_page = agent.get('http://www.linkedin.com') signin_page = agent.click home_page.links.text('Sign in') puts "\nSIGNIN PAGE"
2006 Nov 07
5
mechanize: 400 Bad Request
Hello, when trying to access a certain HTML frame, I get: "in `request': Unhandled response (WWW::Mechanize::ResponseCodeError)" and the page returns: "400 Bad Request" * Why? * How to solve this? With a browser, it works. In the logs below, I marked 4 lines with "***", where I see possible differences in the URI. But I don't know, if this is
2006 Apr 28
3
persistent cookies
Hello, I am trying to implement a "remember me" box for logins, but I can't seem to get it to work. I have tried the following 2 methods but neither seems to work. When I check the expiry time in Firefox it always says "end of session". What is the proper way to handle this so the session cookie "_session_id" doesn't expire for a year? I tried
2007 Apr 16
7
pdf-file to desktop with pdf/writer problem
Hi, I'm using pdf/Writer to generate a pdf-file, but whatever I do, it gets saved in the folder of my app, and not on the user's desktop. I've used: pdf = PDF::Writer.new pdf.select_font "Times-Roman" pdf.text "Hello.pdf", :font_size => 12, :justification => :left File.open("hello.pdf", "wb") { |f|
2005 Aug 05
4
Dev Tools
What tools does everyone use to develop? Are there any tools to see what is going to and coming back from an Ajax call that can be plugged into Firefox? -- Eric Fleming efleming@gmail.com
2005 Aug 03
2
Ajax in prototype.js
Can someone please explain to me what I need to do to create an Ajax class using the prototype lib. Thanks, Jon Whitcraft Web Application Developer Online Services - Indianapolis Motor Speedway (317) 492-8623
2006 Dec 07
6
Response To Form Submission Hanging
Hello, I am using Mechanize to post a form to a website. When I do this by hand in my browser the response takes about 35s to come back (it's a long page full of tables and graphics). When I do this with Mechanize, the server starts to respond and then appears to hang. The obvious conclusion is that my code is wrong but I am reasonably sure that I haven't altered it
2007 Nov 12
3
Weird error downloading a gzip'ed file
Hi all, I've been using mechanize for a while and it rocks. Docs are pretty clear and so far I've been able to do it on my own. However, I'm stuck in a weird situation in a script to download my contact list from Hotmail. I've used Firebug to check all URLs, and tested it by hand while logged in via browser. Even in the script everything works well until the
2010 Jan 26
1
Does Amazon.com block scraping?
Hi there, does anyone know if Amazon.com has any sort of server-side script that tries to block scraping activities? I first noticed that if I didn't change the agent alias, it would fetch a page exactly like the normal one, but without the initial search field (maybe a silly way to prevent scraping). After that, I changed to some other alias and submitted a search. I got the result page as
2007 Oct 10
1
Scraping AOL Webmail to login and fetch contacts?
I'm helping with a gem that is going to be published under the contentfree project on RubyForge (http://rubyforge.org/projects/contentfree/). The gem is called "blackbook" and basically it will go and fetch your contacts from the major webmail providers. So far Gmail, Yahoo!, and MSN have been completed. We are trying to finish up with fetching contacts from AOL Webmail. However
2006 May 10
2
Output Compression in Mongrel?
I'd like to implement output compression in Mongrel (à la Apache's mod_deflate). I have found a Rails plugin that, with minor modification, works. Is there even an advantage to moving the output compression from the Rails app to a Mongrel handler? Unless, of course, someone knows how to configure mod_proxy_balancer with mod_deflate... == Will Green Web Developer & IT
2011 Jan 13
2
send_file works on Rails2,SSL - except for IE7/6
Quirky stuff here. The current setup works fine for Firefox 3.6 and IE 8 but balks on earlier IEs. The files are PDFs, .doc files, and other binaries of up to several MB: A. the users' problem is a minor IE7 quirk: it works to download files from clicking links, but if you use the URL instead then the browser stops after a fraction of a second. Refreshing or hitting Return again
2010 Jan 25
4
Does Amazon.com blocks scraping?
Hi there, does anyone know if Amazon.com has any sort of server-side script that tries to block scraping activities? I first noticed that if I didn't change the agent alias, it would fetch a page exactly like the normal one, but without the initial search field (maybe a silly way to prevent scraping). After that, I changed to some other alias and submitted a search. I got the result page as
2006 Apr 16
9
'depot' app, where's session?
Many of you probably know the 'depot' app from the 'Agile Rails development' book. I have constructed the 'products' model, the Store controller, and the 'Cart' and 'Line Item' classes. I have told the Application controller about :cart and :line_item: model :cart model :line_item Here's part of