Displaying 20 results from an estimated 700 matches similar to: "OT: Scraper library recommendation"
2006 May 18
1
Unnecessary Gem modules loaded under Rails 1.1.2
All,
Rails 1.1.2
Win XP Pro
Rubyful Soup 1.0.4
htmltools 1.0.9
I am terribly confused as to what pulling in a gem does with respect to
how many modules get loaded at runtime.
I am using two gems in my app, Rubyful Soup and htmltools.
RubyfulSoup requires one module from the htmltools gem
(html/sgml-parser).
My app requires the RubyfulSoup gem.
When I started my app, something was causing
2006 Jan 25
0
screenscraping using htmltools and rexml
Hi,
I need to do some screen scraping and I've spent a couple of hours getting
htmltools and rexml to do the right thing. Here's the code:
parser = HTMLTree::Parser.new(false, false)
parser.feed(res.body)
tree = parser.tree.html_node.as_rexml_document
It works for one page, but for another I get "undefined method `add' for
#<HTMLTree::Element:0x37f9cc8>" in
2009 Jun 07
17
ActiveRecord Classes
I'm having a little trouble with understanding how to work out the
schematic for some of my classes using ActiveRecord when a file is in my
lib directory:
Brief example:
Here's the outline of the files in use:
....app
........controllers
............application_controller.rb
............rushing_offenses_controller.rb
........models
............rushing_offense.rb
....lib
2010 Aug 01
0
ScrapeR Unanticipated XML objects
Dear All,
I have come across a very surprising result as I have started to learn how
to use R to pull data from the web for analysis.
I am trying to isolate the table headers for the quarterly income
statement (qtrinc) that I pulled from Google finance. I executed the
following commands after installing the scrapeR package.
require(scrapeR)
2009 Jun 06
5
Rake Tasks
Hi Everyone,
I just need some further help clarifying a custom rake task I'm building
and the logistics of how it should be working.
I've created a custom rake task in libs/tasks called scraper.rake which
so far just contains the following:
desc "This task will parse data from ncaa.org and upload the data to our db"
task :scraper => :environment do
# code goes
2006 May 17
0
Rubyful-soup and 'malformed utf-8 character'
Hi Guys,
I am trying to use Rubyful-soup for a simple webpage modification
project. The issue is that when I try to display the modified html
(generated by @soup.to_s) using RJS, an error pops up saying
'malformed utf-8 character'.
I can fix this by using @soup.to_s.toutf8 but that causes some of the
characters in the document to be messed up (i.e. ' becomes
2006 Jun 23
1
rubyful_soup works fine as an RB file but bugs in Rails
This is the code:
require 'rubyful_soup'
require 'open-uri'

url = "http://www.google.com/search?q=ruby"
open(url) { |page|
  page_content = page.read()
  soup = BeautifulSoup.new(page_content)
  result = soup.find_all('a', :attrs => {'class' => 'l'})
  result.each {
2006 Mar 22
2
Successfully importing Rubyful Soup objects
All,
At the top of my controller, I have:
require 'rubygems'
require_gem 'rubyful_soup'
The rubyful_soup gem has been successfully installed.
However, when I go to instantiate a class from it, using
parser = BeautifulSoup.new(html)
I get
uninitialized constant BeautifulSoup
Is there something else I need to do to see the symbols in the Rubyful
Soup gem?
2016 Sep 28
2
Good Bye SAMBA?!?!?
On 28.09.2016 at 04:01, Steve Litt via samba wrote:
> Why would ANYBODY type a command when they could perform a bunch of
> mouse clicks. Better yet, you can automate Windows tools with a screen
> scraper and a keyboard injector, or with a top notch language like
> Powershell or Visual Basic
*lol*
why would ANYBODY click in a GUI when they have a console - and I mean
that really
2006 Jun 05
6
HTML Parsing libraries
Hi,
What is the best way to parse HTML?
Or is there a simple way to convert a table to an array?
I tried beautiful_soup and the built-in htmltools, but have trouble
getting them to run.
Any pointers?
Thanks, Hari
2007 Jan 23
3
Has anyone gotten RDig to work on Linux?
I got this
root@linux:~# rdig -c configfile
RDig version 0.3.4
using Ferret 0.10.14
added url file:///home/myaccount/documents/
waiting for threads to finish...
root@linux:~# rdig -c configfile -q "Ruby"
RDig version 0.3.4
using Ferret 0.10.14
executing query >Ruby<
Query:
total results: 0
root@linux:~#
my configfile
I changed from config to cfg, because of maybe
2011 May 15
1
Find String Between Characters
Dear R Helpers,
I am trying to isolate a set of characters between two other characters in
a long string file. I tried some of the examples on the R help pages and
elsewhere, but I am not able to get it. Your help would be much
appreciated.
require(scrapeR)
2012 May 28
1
Rcurl, postForm()
Dear colleagues,
Could I get some assistance using postForm() to scrape the business names and addresses at this website:
http://www.brantford.ca/business/LocalBusinessCommunity/Pages/BusinessDirectorySearch.aspx
I've read through (http://www.omegahat.org/RCurl/RCurlJSS.pdf) and scoured the web for tutorials, but I can't crack it. I'm aware that this is probably a pretty basic
2006 Dec 31
0
backgroundrb 0.2.1 doesn't always load rails environment
I found this stack trace in my logs. My worker name is MiscWorker,
and Qualifier is a Rails model.
uninitialized constant MiscWorker::Qualifier: /Users/bryan/
Workspace/sandbox/scraper-trunk/config/../vendor/rails/activerecord/
lib/../../activesupport/lib/active_support/dependencies.rb:476:in
`const_missing'
/Users/bryan/Workspace/sandbox/scraper-trunk/lib/workers/
2011 Jan 26
1
Error handling with frozen RCurl function calls + Identification of frozen R processes
Dear list,
I'm tackling an empirical research problem that requires me to address a whole
bunch of conceptual and/or technical details at the same time which cuts
time short for all the nitty-gritty details of the "components" involved.
Having said this, I'm lacking the time at the moment to deeply dive into
parallel computing and HTTP requests via RCurl and I hope you can help me
2017 Sep 28
1
rgl crash on windows 7
I have a co-worker who has installed R 3.4.2 on Windows 7. When this
person tries to load the rgl package with
library(rgl)
A dialog box appears with the message:
R for windows gui frontend has stopped working
I suspect a conflict problem with a dll, but I'm not sure how to identify
if this is the problem since R is crashing immediately.
Interestingly, when we start R and do NOT load rgl,
2006 Apr 12
1
How best to handle non-serializable session data?
I have a piece of data that needs to persist across requests that is not
serializable. It's a Rubyful Soup parse tree and it's very expensive to
instantiate and I need it for a while in my app.
Therefore, by default, it can't be stored in the session since the
default session storage mechanism is pstore.
One option I have is to change the session storage mechanism
2006 May 16
0
htmltools 1.09 doesn't play nice with ActionPack strip_tags!
All,
I've discovered an incompatibility between HTMLTools 1.09 (a very handy
HTML parser) and ActionPack 1.12.1.
Basically, they both do some HTML parsing and they both create a module
named HTML::Tag, which causes confusion when said Tag object attempts to
be instantiated in the ActionPack context.
That said, now I get to choose which one's namespace to fiddle with.
But a
2017 Oct 18
1
dygraphs, multiple graphs and shiny
Hi All:
This is really getting into the weeds, but I am hoping someone will have a solution. I am trying to use dygraphs for R, within Shiny.
The situation arises when I am combining a number of dygraphs into one plot. If I am just in an RNotebook, if you look at:
https://stackoverflow.com/questions/30509866/for-loop-over-dygraph-does-not-work-in-r
the solution to have the plot shown from a
2004 Oct 04
3
Cisco XML 411 Interface
Hi All,
Has anyone come across a 411 XML service I can feed to the "service"
button with XML?
Or some other feed I can convert into an XML query?
Assaf Benharoosh