wbsmith83-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2007-Jan-26 14:15 UTC
Mechanize out of buffer space
I am trying to scrape a site and then its children to get data I relate in tables. The only problem is that I keep getting an "OUT OF BUFFER SPACE" error. Is there a way to clear the buffer after each iteration, or am I doing something wrong? Here's the code:

require 'rubygems'
require 'mechanize'
require 'active_record'

ActiveRecord::Base.establish_connection(
  #connection goes here
)

class Major < ActiveRecord::Base
  has_many :courses
end

class Course < ActiveRecord::Base
  belongs_to :major
end

class Sections
  def scrape(url)
    agent = WWW::Mechanize.new
    page = agent.get(url)
    table = (page/'//table')[6]
    (table/"tr").each do |major|
      @newMajor = Major.new
      @newMajor.title = (major/'//td').first.inner_html
      @newMajor.abbrev = (major/'acronym').inner_html
      @newMajor.link_to = (major/'a').to_s.split('"')[1]
      puts title,abbrev,link_to
    end
  end
end

class Classes
  attr_writer :major_id

  def scrape(url)
    agent = WWW::Mechanize.new
    page = agent.get("http://courses.tamu.edu/"+url.to_s)
    (page/"//td[@class='sectionheading']").each do |course|
      course = course.inner_html.strip.split(' ')
      course.pop
      @newCourse = Course.new
      @newCourse.major_id = @major_id
      @newCourse.course_no = course[1]
      @newCourse.name = course.slice!(3,course.length).join(' ')
      @newCourse.save
    end
  end
end

AllMajors = Major.find(:all)
AllMajors.each do |course|
  start = Time.now
  newClass = Classes.new
  newClass.major_id = course.id
  newClass.scrape(course.link_to)
  puts "Added courses for #{course.title}"
  finish = Time.now
  puts "Took #{finish-start} seconds"
end
puts "Finished scraping courses"

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to rubyonrails-talk-unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---
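The heading handling inside Classes#scrape can be tried on its own, without Mechanize or ActiveRecord, which may help rule it out when chasing the buffer error. The following is only a sketch: the method name parse_heading and the sample heading string are made up for illustration, since the post does not show what a real "sectionheading" cell on the site looks like.

```ruby
# Sketch of the token handling from Classes#scrape, using plain strings only.
# parse_heading and the sample text below are hypothetical, not from the post.
def parse_heading(text)
  tokens = text.strip.split(' ')
  tokens.pop                                        # drop the last token, as the original loop does
  course_no = tokens[1]                             # second token is taken as the course number
  name = tokens.slice!(3, tokens.length).join(' ')  # fourth token onward becomes the name
  { :course_no => course_no, :name => name }
end

# Assumed heading layout: department, number, separator, name words, trailing token.
sample = "  ACCT 209 - SURVEY OF ACCT PRIN (3-0)  "
result = parse_heading(sample)
puts result[:course_no]
puts result[:name]
```

If a heading has fewer than three tokens left after the pop, slice! returns nil and the join call raises NoMethodError, so a guard may be worth adding if such headings can occur.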
wbsmith83-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
2007-Jan-26 17:08 UTC
Re: Mechanize out of buffer space
After delving into the Hpricot source, it turns out there's a predefined buffer size, and you can't change it without editing the source and recompiling.