similar to: OT: Scraper library recommendation

Displaying 20 results from an estimated 700 matches similar to: "OT: Scraper library recommendation"

2006 May 18
1
Unnecessary Gem modules loaded under Rails 1.1.2
All, Rails 1.1.2 Win XP Pro Rubyful Soup 1.0.4 htmltools 1.0.9 I am terribly confused as to what pulling in a gem does with respect to how many modules get loaded at runtime. I am using two gems in my app, Rubyful Soup and htmltools. RubyfulSoup requires one module from the htmltools gem (html/sgml-parser). My app requires the RubyfulSoup gem. When I started my app, something was causing
2006 Jan 25
0
screenscraping using htmltools and rexml
Hi, I need to do some screen scraping and I've spent a couple of hours getting htmltools and rexml to do the right thing. Here's the code: parser = HTMLTree::Parser.new(false, false) parser.feed(res.body) tree = parser.tree.html_node.as_rexml_document It works for one page, but for another I get "undefined method `add' for #<HTMLTree::Element:0x37f9cc8>" in
2009 Jun 07
17
ActiveRecord Classes
I'm having a little trouble with understanding how to work out the schematic for some of my classes using ActiveRecord when a file is in my lib directory: Brief example: Here's the outline of the files in use: ....app ........controllers ............application_controller.rb ............rushing_offenses_controller.rb ........models ............rushing_offense.rb ....lib
2010 Aug 01
0
ScrapeR Unanticipated XML objects
Dear All, I have come across a very surprising result as I have started to learn how to use R to pull data from the web for analysis. I am trying to isolate the table headers for the quarterly income statement (qtrinc) that I pulled from Google finance. I executed the following commands after installing the scrapeR package. require(scrapeR)
2009 Jun 06
5
Rake Tasks
Hi Everyone, I just need some further help clarifying a custom rake task I'm building and the logistics of how it should be working. I've created a custom rake task in libs/tasks called scraper.rake which so far just contains the following: desc "This task will parse data from ncaa.org and upload the data to our db" task :scraper => :environment do # code goes
2006 May 17
0
Rubyful-soup and 'malformed utf-8 character'
Hi Guys, I am trying to use Rubyful-soup for a simple webpage modification project. The issue is that when I try to display the modified html (generated by @soup.to_s) using RJS, an error pops up saying 'malformed utf-8 character'. I can fix this by using @soup.to_s.toutf8 but that causes some of the characters in the document to be messed up (ie '&nbsp becomes
2006 Jun 23
1
rubyful_soup works fine as an RB file but bugs in Rails
This is the code: 1 require 'rubyful_soup' 2 require 'open-uri' 3 4 url = "http://www.google.com/search?q=ruby" 5 open(url) { 6 |page| page_content = page.read() 7 soup = BeautifulSoup.new(page_content) 8 result = soup.find_all('a', :attrs => {'class' => 'l'}) 9 result.each {
2006 Mar 22
2
Successfully importing Rubyful Soup objects
All, At the top of my controller, I have: require 'rubygems' require_gem 'rubyful_soup' The rubyful_soup gem has been successfully installed. However, when I go to instantiate a class from it, using parser = BeautifulSoup.new(html) I get uninitialized constant BeautifulSoup Is there something else I need to do to see the symbols in the Rubyful Soup gem?
2016 Sep 28
2
Good Bye SAMBA?!?!?
On 28.09.2016 at 04:01, Steve Litt via samba wrote: > Why would ANYBODY type a command when they could perform a bunch of > mouse clicks. Better yet, you can automate Windows tools with a screen > scraper and a keyboard injector, or with a top notch language like > Powershell or Visual Basic *lol* why would ANYBODY click in a GUI when they have a console - and i mean that really
2006 Jun 05
6
HTML Parsing libraries
Hi, What is the best way to parse HTML? Or is there a simple way to convert a table to an array? I tried beautiful_soup and the built-in htmltools, but have trouble getting them to run. Any pointers? Thanks, Hari -- Posted via http://www.ruby-forum.com/.
2007 Jan 23
3
Someone getting RDig work for Linux?
I got this root@linux:~# rdig -c configfile RDig version 0.3.4 using Ferret 0.10.14 added url file:///home/myaccount/documents/ waiting for threads to finish... root@linux:~# rdig -c configfile -q "Ruby" RDig version 0.3.4 using Ferret 0.10.14 executing query >Ruby< Query: total results: 0 root@linux:~# my configfile I changed from config to cfg, because of maybe
2011 May 15
1
Find String Between Characters
Dear R Helpers, I am trying to isolate a set of characters between two other characters in a long string file. I tried some of the examples on the R help pages and elsewhere, but I am not able to get it. Your help would be much appreciated. require(scrapeR)
2012 May 28
1
Rcurl, postForm()
Dear colleagues, Could I get some assistance using postForm() to scrape the business names and addresses at this website: http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic
2006 Dec 31
0
backgroundrb 0.2.1 doesn't always load rails environment
I found this stack trace in my logs. My worker name is MiscWorker, and Qualifier is a Rails model. uninitialized constant MiscWorker::Qualifier: /Users/bryan/Workspace/sandbox/scraper-trunk/config/../vendor/rails/activerecord/lib/../../activesupport/lib/active_support/dependencies.rb:476:in `const_missing' /Users/bryan/Workspace/sandbox/scraper-trunk/lib/workers/
2011 Jan 26
1
Error handling with frozen RCurl function calls + Identification of frozen R processes
Dear list, I'm tackling an empiric research problem that requires me to address a whole bunch of conceptual and/or technical details at the same time which cuts time short for all the nitty-gritty details of the "components" involved. Having said this, I'm lacking the time at the moment to deeply dive into parallel computing and HTTP requests via RCurl and I hope you can help me
2017 Sep 28
1
rgl crash on windows 7
I have a co-worker who has installed R 3.4.2 on Windows 7. When this person tries to load the rgl package with library(rgl) A dialog box appears with the message: R for windows gui frontend has stopped working I suspect a conflict problem with a dll, but I'm not sure how to identify if this is the problem since R is crashing immediately. Interestingly, when we start R and do NOT load rgl,
2006 Apr 12
1
How best to handle non-serializable session data?
I have a piece of data that needs to persist across requests that is not serializable. It's a Rubyful soup parse tree and it's very expensive to instantiate and I need it for a while in my app. Therefore, by default, it can't be stored in the session since the default session storage mechanism is pstore. One option I have is to change the session storage mechanism
2006 May 16
0
htmltools 1.09 doesn't play nice with ActionPack strip_tags!
All, I've discovered an incompatibility between HTMLTools 1.09 (a very handy HTML parser) and ActionPack 1.12.1. Basically, they both do some HTML parsing and they both create a module named HTML::Tag, which causes confusion when said Tag object attempts to be instantiated in the ActionPack context. That said, now I get to choose which one's namespace to fiddle with. But a
2017 Oct 18
1
dygraphs, multiple graphs and shiny
Hi All: This is really getting into the weeds, but I am hoping someone will have a solution. I am trying to use dygraphs for R, within Shiny. The situation arises when I am combining a number of dygraphs into one plot. If I am just in an RNotebook, if you look at: https://stackoverflow.com/questions/30509866/for-loop-over-dygraph-does-not-work-in-r the solution to have the plot shown from a
2004 Oct 04
3
Cisco XML 411 Interface
Hi All, Did anyone come across a 411 XML service I can feed to the "service" button with XML? Some other feed I can manipulate to XML query? Assaf Benharoosh