I wonder if a Ruby on Rails developer has encountered this before: suppose there is a long article (say 100,000 words), and I need to write a Ruby file to display page 1, 2, or page 38 of the article, via

display.html.erb?page=38

but the number of words per page can change over time (for example, it is 500 words per page right now, but next month we should be able to change it to 300 words per page easily). What is a good way to divide the long article and store it in the database?

P.S. The design may get complicated if we want to display 500 words but include whole paragraphs. That is, if we have already shown word 480 but the paragraph has 100 more words remaining, show those 100 words anyway, even though that exceeds the 500-word limit.

-- Posted via http://www.ruby-forum.com/.
Make each page a text file, put them all in a directory (document/1.txt, document/2.txt, etc.), and then you won't even have to use the database.
Jonathan Rochkind
2009-Jun-01 03:09 UTC
Re: way to divide long article and store in database
Jian Lin wrote:
> I wonder if a Ruby on Rails developer has encounter this before: suppose
> it is a long article (say 100,000 words), and I need to write a Ruby
> file to display page 1, 2, or page 38 of the article, by
>
> display.html.erb?page=38
>
> but the number of words for each page can change over time (for example,
> right now if it is 500 words per page, but next month, we can change it
> to 300 words per page easily

Why divide it in the database? Store it in one field in the database, and when you fetch it, just perform the logic to find page=38 and then display that. If actual testing indicates that's too slow with the actual quantity of data you expect, then you'd have to perform a word-boundary calculation when inserting the value into the db, and store the results as an 'index' to the text somehow. Either way, I don't see any reason to actually split up the text in the db.

Unless you want to let the user _search_ for, say, word X on page N of the text. But then you're getting into complicated enough text-searching land that I'd investigate using something like lucene/solr to index your text, instead of an rdbms, and see what support for page-boundary-based searching lucene/solr have.
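To make the "store it in one field and find the page on fetch" idea concrete, here is a minimal sketch in plain Ruby. The helper name `page_of` and its parameters are made up for illustration, and it assumes a "word" is simply any run of non-whitespace characters:

```ruby
# Hypothetical helper: return the text of one page from an article held in a
# single string (for example, one text column fetched from the database).
# Assumption: a "word" is any run of non-whitespace characters.
def page_of(text, page, words_per_page = 500)
  return "" if page < 1                    # pages are numbered from 1
  words = text.split                       # split on any whitespace
  start = (page - 1) * words_per_page
  slice = words[start, words_per_page]     # nil when start is past the end
  slice ? slice.join(" ") : ""
end

# Usage: a fake 1,200-word article made of the words w1 .. w1200.
article = (1..1200).map { |n| "w#{n}" }.join(" ")
third_page = page_of(article, 3, 500)      # the last, shorter page
```

Because nothing about the page size is stored, switching from 500 to 300 words per page is only a change to the argument. The cost is that the whole text is fetched and split on every request.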
Jonathan Rochkind wrote:
> Why divide it in the database? Store it one field in the database, and
> when you fetch it from the database just perform the logic to find
> page=38 and then display that.

Is it true that if all 100,000 words are in one record (one row), then the whole field needs to be retrieved every time? If we assume one word is about 6 characters long (with the space), then that is about 600 KB per read. I hope to make it "read as needed": 500 words, or about 3 KB, read per page each time.
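For reference, the arithmetic is 100,000 words x 6 characters = 600,000 characters (about 600 KB) for the whole article, against 500 x 6 = 3,000 characters (about 3 KB) for one page. One way to get "read as needed" without splitting the text is the 'index' mentioned earlier in the thread: store the character offset where each page starts, then read only that range. A rough sketch follows; `page_offsets` and `read_page` are hypothetical names, and the slice is taken from an in-memory string here, whereas a real application would presumably pass the same offsets to the database's substring function:

```ruby
# Hypothetical helper: character offset at which each page starts.
# Assumption: a "word" is any run of non-whitespace characters.
def page_offsets(text, words_per_page = 500)
  offsets = []
  count = 0
  text.scan(/\S+/) do
    # Record where the first word of every page begins.
    offsets << Regexp.last_match.begin(0) if count % words_per_page == 0
    count += 1
  end
  offsets
end

# Hypothetical helper: cut one page out of the text using the stored offsets.
def read_page(text, offsets, page)
  return "" if page < 1
  start = offsets[page - 1]
  return "" if start.nil?                  # page is past the end
  stop = offsets[page] || text.length      # the last page runs to the end
  text[start...stop].strip
end

# Usage: the offsets would be recomputed whenever the page size changes.
article = (1..1200).map { |n| "w#{n}" }.join(" ")
offsets = page_offsets(article, 500)
second_page = read_page(article, offsets, 2)
```

Changing the page size then means rebuilding a small array of integers rather than rewriting the article itself.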
nodoubtarockstar
2009-Jun-01 06:26 UTC
Re: way to divide long article and store in database
If you *must* split it up in the database, changing your mind from 500 to 300 words is going to suck, since every page row has to be regenerated. That said, a "pages" association or something of the like would be very simple. For instance:

    class Article < ActiveRecord::Base
      has_many :pages
      validates_presence_of :text
      after_create :split_into_pages

      private

      # Split the text into 500-word pages, numbered from 1.
      # \S+ keeps punctuation attached to each word.
      def split_into_pages
        i = 0
        words = text.scan(/\S+/)
        words.each_slice(500) do |x|
          pages.create(:page => i += 1, :text => x.join(" "))
        end
      end
    end

    class Page < ActiveRecord::Base
      belongs_to :article
    end

Someone probably has a MUCH prettier method of doing this; this was just kind of on-the-fly...

Cheers!
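Note that a fixed each_slice(500) split cuts pages in the middle of paragraphs. The whole-paragraph rule from the P.S. in the original question could be sketched roughly as below, in plain Ruby with no Rails. The name `paginate_by_paragraph` is made up, and the sketch assumes paragraphs are separated by blank lines:

```ruby
# Hypothetical helper: group whole paragraphs into pages. A page is closed as
# soon as it holds at least `min_words` words, so it may run over the limit
# rather than break a paragraph.
# Assumption: paragraphs are separated by one or more blank lines.
def paginate_by_paragraph(text, min_words = 500)
  pages = []
  current = []
  count = 0
  text.split(/\n\s*\n/).each do |para|
    para = para.strip
    next if para.empty?
    current << para
    count += para.split.size
    if count >= min_words
      pages << current.join("\n\n")
      current = []
      count = 0
    end
  end
  pages << current.join("\n\n") unless current.empty?   # leftover short page
  pages
end

# Usage: paragraphs of 300, 300 and 100 words with a 500-word target give a
# 600-word first page and a 100-word second page.
paragraphs = [300, 300, 100].map { |n| (["word"] * n).join(" ") }
pages = paginate_by_paragraph(paragraphs.join("\n\n"), 500)
```

The resulting strings could be stored one per Page row, or only their boundaries could be stored if the text stays in a single field.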