similar to: Tips on testing

Displaying 20 results from an estimated 3000 matches similar to: "Tips on testing"

2010 Jan 26
1
Does Amazon.com block scraping?
Hi there, does anyone know if Amazon.com has any sort of server-side script that tries to block scraping activities? I first noticed that if I didn't change the agent alias, it would fetch a page exactly like the normal one, but without the initial search field (maybe a silly way to prevent scraping). After that, I changed to some other alias and submitted a search. I got the result page as
2006 Nov 22
1
to_absolute_uri typo in 0.6.3?
I just started using Mechanize, and started using Ruby about thirty seconds before that, but one of the sites I'm scraping does a redirect on form submission to a badly-formed relative URL: index.cfm?action=bing&bang=boom=1|a=|b=|c= (etc.) Interestingly, Mechanize 0.6.2 handled this OK, but in 0.6.3 this causes a URI::InvalidURIError exception from URI.parse() in to_absolute_uri
2007 Oct 10
1
Scraping AOL Webmail to login and fetch contacts?
I'm helping with a gem that is going to be published under the contentfree project on rubyforge (http://rubyforge.org/projects/contentfree/). The gem is called "blackbook" and basically it will go and fetch your contacts from the major webmail providers. So far Gmail, Yahoo!, and MSN have been completed. We are trying to finish up with fetching contacts from AOL Webmail. However
2008 Jul 17
3
Convert data to utf-8
Hello, I'm trying to find a solution to convert everything returned by mechanize to utf-8, no matter if the original page is utf-8 or iso and I really don't know where to start from... agent = WWW::Mechanize.new { |a| a.log = Logger.new(File::join(RAILS_ROOT, "log/mechanize.log")) } one_page = agent.get("www.google.fr") My first problem is that one_page
2010 Jan 25
4
Does Amazon.com blocks scraping?
Hi there, does anyone know if Amazon.com has any sort of server-side script that tries to block scraping activities? I first noticed that if I didn't change the agent alias, it would fetch a page exactly like the normal one, but without the initial search field (maybe a silly way to prevent scraping). After that, I changed to some other alias and submitted a search. I got the result page as
2007 Sep 14
1
Unable to scrap gmail.com - EOFError: End of file reached
Hi all, I am so excited to use mechanize! It has opened a whole new world of projects for me :) I am trying to log in to the Gmail.com server, as described in http://schf.uc.org/articles/2007/02/14/scraping-gmail-with-mechanize-and-hpricot but am running into a few issues... irb(main):010:0> page = agent.submit form EOFError: end of file reached from
2008 Jun 12
1
setting request headers via get()
Hey all, Found an email thread from Jan 2007 discussing the inability to set request headers (like ETag and If-Modified-Since) through the API, and this is something that's bothering me a bit. Currently the "way" to do this is to subclass Mechanize and override set_headers(). That seems fine for headers that you'd like to send in every request or for classes of request,
2007 Mar 18
1
Submitting a form sends a file. How do I save it?
I've been using Mechanize for a project that I've been working on, but this is the first time I'm having to use forms (I was only scraping previously). So, after I fill out the form and hit submit, it sends me information in the form of a text file to download. For the life of me, I can't see how to get access to it. When clicking on a link, you can put a
2007 Nov 12
3
Weird error downloading a gzip'ed file
Hi all, I've been using mechanize for a while and it rocks. Docs are pretty clear and so far I've been able to do it on my own. However, I'm stuck in a weird situation in a script to download my contact list from hotmail. I've used Firebug to check all urls, and tested it by hand while logged in via browser. Even in the script everything works well until the
2006 May 22
2
How to execute time consuming code
Hello all, I have a screen scraping application (go to lots of sites, extract 10k stuff, integrate the results, put them into the DB etc). Now I want to use a Rails application as a frontend to this: the user can push a button which triggers the screen scraping app and view the results (preferably asynchronously, but that does not really matter right now). Questions: - Should the screen scraping app
2007 Apr 03
2
Scraping and saving.
Hi, I'm working to scrape and save some ebooks. Mechanize has been wonderful so far. The link I'm having trouble with is this one: http://www.webscription.net/SendZip.aspx?SKU=0671578499&ProductID=379&format=H When I click that in the browser it saves it to a file named H_1632.zip. How do I get that name from the page? I suspect to save this to a file I would just do
2006 Jan 27
1
Caching from screen scraping
Hi all, I need to do some screen scraping from my rails app. Given an ethernet (MAC) address, I scrape results from an internal web page that returns location and hostname. How can I cache the result from that screen scraping so as to be polite to the scrapee? I would like to expire the results daily. In perl, I would use Cache::File. Can I use rails caching for this? What's the best
2007 May 07
6
mock frameworks
Just curious - now that rspec (as of 0.9) lets you choose your mock framework, how many of you are actually using (or planning to use) mocha or flexmock? Anybody planning to use any other mock framework besides rspec, mocha or flexmock? Thanks, David
2007 Aug 31
48
Deprecating the mocking framework?
I saw in one of Dave C.'s comments to a ticket that "our current plan is to deprecate the mocking framework." I hadn't heard anything about that, but then again I haven't paid super close attention to the list. Are we planning on dumping the mock framework in favor of using Mocha (or any other framework one might want to plug in)? Pat
2007 Nov 04
3
Returning the mock associated with an expectation.
I was reading through the FlexMock docs and noticed the expectation method .mock, which returns the original mock associated with an expectation. It looks really handy for writing nice all-in-one mocks like: mock_user = mock('User').expects(:first_name).returns('Jonah').mock So I started playing around with mocha and found I could actually already do this!
2009 Feb 18
1
R as a web scraping tool using RCurl
Hi List, I am trying to leverage my knowledge of R in trying to use it for tasks that may not make R the best choice for these tasks. I wish to automate a web scraping task, which requires a multi-step procedure: 1) log in to a website 2) Go to a particular page 3) From the drop down menu, click on a particular link 4) From the tabulated data presented, choose relevant information based on a
2008 Jun 10
4
adding results from threads to a collection and returning it
Forgive me if this has been addressed somewhere, but I have searched and can't come up with anything. I am basically trying to distribute several web page scraping tasks among different threads, and have the results from each added to an Array which is ultimately returned by the backgroundrb worker. Here is an example of what I'm trying to do in a worker method: pages =
2006 Oct 25
5
Mocha, Stubba and RSpec
Hi, I've been reading with interest the threads trying to integrate Mocha and Stubba with RSpec. So far, I've made the two changes in spec_helper.rb suggested, but discovered another one that neither of the archives mentions: If you use traditional mocking: object = mock or the stub shortcut: object = stub(:method => :result), you run into namespace conflicts with
2012 Mar 05
2
How to choose a button and scrape the website data
Hi all, I'm working on scraping some website data to build a database. In most cases, I can use the package XML to get the dataset. However, some websites don't give an explicit address for the downloaded tables. To be more specific, for example, I'm interested in the website http://ets.aeso.ca/ The data we are scraping is the "Pool Weekly Summary" under the
2008 Jul 25
21
Problems with mock assigned to a constant
Hi all, Initially I thought this was a bug in the built-in mocking framework (and it still may be), but I better hash it out on the mailing list before I file/reopen the ticket: http://rspec.lighthouseapp.com/projects/5645/tickets/478-mocks-on-constants#ticket-478-6 I thought my example illustrated my problem, but obviously I was passing the wrong arguments to the mock. I revised my example to