thr3ads.net - search: "rubyful"

Displaying 8 results from an estimated 8 matches for "rubyful".

Successfully importing Rubyful Soup objects

2006 Mar 22

All, At the top of my controller, I have: require 'rubygems' require_gem 'rubyful_soup' The rubyful_soup gem has been successfully installed. However, when I go to instantiate a class from it, using parser = BeautifulSoup.new(html) I get uninitialized constant BeautifulSoup Is there something else I need to do to see the symbols in the Rubyful Soup gem? Thanks,...

Rubyful-soup and 'malformed utf-8 character'

2006 May 17

Hi Guys, I am trying to use Rubyful-soup for a simple webpage modification project. The issue is that when I try to display the modified html (generated by @soup.to_s) using RJS, an error pops up saying 'malformed utf-8 character'. I can fix this by using @soup.to_s.toutf8 but that causes some of the characters in...

Best way to handle namespace collisions?

2006 May 16

All, I have a little namespace collision here. I am trying to use both RubyfulSoup (an HTML parser, which I highly recommend by the way) and the ActionView::Helpers::TextHelper class. Within the TextHelper class, there's an attempt to create a new "Tag" object. However, Tag is also defined in the RubyfulSoup gem and it is _this_ Tag class whose initi...
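
The kind of collision described in this post can be sketched in plain Ruby. This is only an illustration under stated assumptions: `ParserLib` and `HelperLib` are hypothetical stand-ins, not the real Rubyful Soup or ActionView code, and the constructor signatures are invented for the example. It shows that two classes named `Tag` can coexist when each lives in its own module, and that an unqualified `Tag` resolves lexically to the enclosing module's class.

```ruby
# Hypothetical stand-in for a parser library that defines its own Tag.
module ParserLib
  class Tag
    attr_reader :name

    # Assumed signature: takes a parser object and a tag name.
    def initialize(parser, name)
      @parser = parser
      @name = name
    end
  end
end

# Hypothetical stand-in for a helper library that also defines a Tag.
module HelperLib
  class Tag
    attr_reader :name

    # Assumed signature: takes only a tag name.
    def initialize(name)
      @name = name
    end
  end

  # Constant lookup checks the lexical scope first, so the unqualified
  # Tag below resolves to HelperLib::Tag, not ParserLib::Tag.
  def self.build(name)
    Tag.new(name)
  end
end

helper_tag = HelperLib.build("p")
parser_tag = ParserLib::Tag.new(:some_parser, "p")

puts helper_tag.class
puts parser_tag.class
```

When one of the two classes is defined at the top level instead of inside a module, an unqualified `Tag` elsewhere can pick up the wrong class, which matches the symptom the poster describes; fully qualifying the constant is one way to disambiguate.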

OT: Scraper library recommendation

2006 Jan 10

... I'm writing a website scraper script that needs to download a web page, traverse the (X)HTML tree and finally insert data and HTML pieces into a DB. Eventually this data will be served up as RSS and/or Atom. I'm currently using html/tree (htmltools); I've also tried Rubyful Soup; both have their own shortcomings. What do you people suggest? Regarding htmltools: I had to tweak it quite a bit, as it wouldn't recognize XHTML-style "empty" tags (for instance, it dislikes <link ... />). What's even worse, I can't seem to get it t...

Someone getting RDig work for Linux?

2007 Jan 23

...iven values are the defaults # set to true to get stack traces on errors cfg.verbose = true # content extraction options cfg.content_extraction = OpenStruct.new( # HPRICOT configuration # this is the html parser used by default from RDig 0.3.3 upwards. # Hpricot by far outperforms Rubyful Soup, and is at least as flexible when # it comes to selection of portions of the html documents. :hpricot => OpenStruct.new( # css selector for the element containing the page title :title_tag_selector => 'title', # might also be a proc returnin...

How to parse HTML doc in Ruby?

2006 Feb 13

Hi, I want to parse an HTML doc using Ruby. I tried REXML, but it failed to load the HTML doc because the doc is not well-formed. Can you please suggest a good parser that I can use to parse HTML pages in Ruby? Thanks, Karika. -- Posted via http://www.ruby-forum.com/.

Unnecessary Gem modules loaded under Rails 1.1.2

2006 May 18

All, Rails 1.1.2, Win XP Pro, Rubyful Soup 1.0.4, htmltools 1.0.9. I am terribly confused as to what pulling in a gem does with respect to how many modules get loaded at runtime. I am using two gems in my app, Rubyful Soup and htmltools. RubyfulSoup requires one module from the htmltools gem (html/sgml-parser). My app requires the R...

How best to handle non-serializable session data?

2006 Apr 12

I have a piece of data that needs to persist across requests that is not serializable. It's a Rubyful Soup parse tree and it's very expensive to instantiate and I need it for a while in my app. Therefore, by default, it can't be stored in the session since the default session storage mechanism is pstore. One option I have is to change the session storage mechanism to in-memor...
