search for: rubyful

Displaying 8 results from an estimated 8 matches for "rubyful".

2006 Mar 22
2
Successfully importing Rubyful Soup objects
All, At the top of my controller, I have: require 'rubygems' require_gem 'rubyful_soup' The rubyful_soup gem has been successfully installed. However, when I go to instantiate a class from it, using parser = BeautifulSoup.new(html) I get uninitialized constant BeautifulSoup Is there something else I need to do to see the symbols in the Rubyful Soup gem? Thanks,...
2006 May 17
0
Rubyful-soup and 'malformed utf-8 character'
Hi Guys, I am trying to use Rubyful-soup for a simple webpage modification project. The issue is that when I try to display the modified html (generated by @soup.to_s) using RJS, an error pops up saying 'malformed utf-8 character'. I can fix this by using @soup.to_s.toutf8 but that causes some of the characters in...
2006 May 16
3
Best way to handle namespace collisions?
All, I have a little namespace collision here. I am trying to use both RubyfulSoup (an HTML parser - which I highly recommend by the way) and the ActionView::Helpers::TextHelper class. Within the TextHelper class, there's an attempt to create a new "Tag" object. However, Tag is also defined in the RubyfulSoup gem and it is _this_ Tag class whose initi...
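A minimal sketch of the kind of collision described in this post, assuming a top-level Tag class coexisting with a module that defines its own Tag. DemoHelpers and both Tag classes are hypothetical stand-ins invented for illustration; they are not the real RubyfulSoup or ActionView code, and the actual cause in the thread may differ.

```ruby
# Hypothetical stand-in for a parser library that defines Tag at the top level.
class Tag
  attr_reader :parser, :name

  def initialize(parser, name)
    @parser = parser
    @name = name
  end
end

# Hypothetical stand-in for a helper module with its own, unrelated Tag.
module DemoHelpers
  class Tag
    attr_reader :name

    def initialize(name)
      @name = name
    end
  end

  # Inside this module, a bare Tag resolves lexically to DemoHelpers::Tag.
  def self.build_local
    Tag.new("p")
  end

  # A leading :: forces the lookup to start from the top level instead.
  def self.build_top_level
    ::Tag.new(:some_parser, "p")
  end
end

local = DemoHelpers.build_local
top = DemoHelpers.build_top_level
```

Writing the constant with its full path (DemoHelpers::Tag or ::Tag) removes the ambiguity about which class is instantiated, at the cost of some verbosity.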
2006 Jan 10
1
OT: Scraper library recommendation
...... I'm writing a website scraper script that needs to download a web page, traverse the (X)HTML tree and finally insert data and HTML pieces into a DB. Eventually this data will be served up as RSS and/or Atom. I'm currently using html/tree (htmltools); I've also tried Rubyful Soup; both have their own shortcomings. What do you people suggest? Regarding htmltools: I had to tweak it quite a bit, as it wouldn't recognize XHTML-style "empty" tags (for instance, it dislikes <link ... />). What's even worse, I can't seem to get it t...
2007 Jan 23
3
Someone getting RDig work for Linux?
...iven values are the defaults # set to true to get stack traces on errors cfg.verbose = true # content extraction options cfg.content_extraction = OpenStruct.new( # HPRICOT configuration # this is the html parser used by default from RDig 0.3.3 upwards. # Hpricot by far outperforms Rubyful Soup, and is at least as flexible when # it comes to selection of portions of the html documents. :hpricot => OpenStruct.new( # css selector for the element containing the page title :title_tag_selector => 'title', # might also be a proc returnin...
2006 Feb 13
2
How to parse HTML doc in Ruby?
Hi, I want to parse an HTML document using Ruby. I tried REXML, but it failed to load the document because the HTML is not well-formed. Can you please suggest a good parser that I can use to parse HTML pages in Ruby? Thanks, Karika. -- Posted via http://www.ruby-forum.com/.
2006 May 18
1
Unnecessary Gem modules loaded under Rails 1.1.2
All, Rails 1.1.2 Win XP Pro Rubyful Soup 1.0.4 htmltools 1.0.9 I am terribly confused as to what pulling in a gem does with respect to how many modules get loaded at runtime. I am using two gems in my app, Rubyful Soup and htmltools. RubyfulSoup requires one module from the htmltools gem (html/sgml-parser). My app requires the R...
2006 Apr 12
1
How best to handle non-serializable session data?
I have a piece of data that needs to persist across requests that is not serializable. It's a Rubyful Soup parse tree and it's very expensive to instantiate and I need it for a while in my app. Therefore, by default, it can't be stored in the session since the default session storage mechanism is pstore. One option I have is to change the session storage mechanism to in-memor...
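One way to picture the in-memory approach this post raises is a process-local cache that holds the expensive object, while the session stores only a serializable key. This is a rough sketch under stated assumptions: ParseTreeCache, CACHE, and the plain Hash standing in for the session are all hypothetical names, and a bare Object stands in for the parse tree. It only works while every request is served by the same process, and the cached data is lost on restart.

```ruby
require 'securerandom'

# Hypothetical process-local cache for objects that cannot be serialized.
class ParseTreeCache
  def initialize
    @store = {}
    @lock = Mutex.new
  end

  # Stores the object and returns a random string key that is safe
  # to keep in a serialized session.
  def put(object)
    key = SecureRandom.hex(16)
    @lock.synchronize { @store[key] = object }
    key
  end

  # Returns the cached object, or nil if the key is unknown.
  def fetch(key)
    @lock.synchronize { @store[key] }
  end

  # Removes the entry so the object can be garbage collected.
  def delete(key)
    @lock.synchronize { @store.delete(key) }
  end
end

CACHE = ParseTreeCache.new

session = {}       # stand-in for the framework's session hash
tree = Object.new  # stand-in for the expensive parse tree

session[:tree_key] = CACHE.put(tree)      # first request: cache it
same_tree = CACHE.fetch(session[:tree_key])  # later request: look it up
```

Because fetch returns nil for an unknown key, calling code would need to rebuild the tree when the lookup misses, for example after a server restart.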