Hi All,

I have a Ferret index containing some cached RSS feeds. I have a nightly cron script to cache the feeds, and I'd like to update the index with the latest feeds.

I see the Index class has an update method, but I can't work out how to get the id of the relevant document to pass in.

Let's say I have a file called "google_news.xml". I want to go:

    my_index.update(google_id, google_doc)

I'm sure this is way too easy and I'm being massively dumb, but any hints/advice gratefully received.

Many Thanks,
Steven

--
Posted via http://www.ruby-forum.com/.
The way I usually handle updates like this is to store the filename in the index as a separate field in the document. You can then search the index for that filename, get the document id of the matching entry, and update accordingly.

On 6/15/06, steven <shingler at gmail.com> wrote:
> I see the Index class has an update method, but I can't work out how to
> get the id of the relevant document to pass in.
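The lookup-then-update flow described above can be sketched in plain Ruby. This is only an illustration of the idea: ToyIndex, find_ids and the :file field are made-up names for this sketch, not Ferret's API, and document numbers here are simply array positions.

```ruby
# Toy in-memory stand-in for a search index; NOT Ferret's API.
# Document numbers are simply positions in an array.
class ToyIndex
  def initialize
    @docs = []
  end

  # Append a document (a Hash of field => value) and return its number.
  def add(doc)
    @docs << doc
    @docs.size - 1
  end

  # Return the numbers of all documents whose field equals value exactly.
  def find_ids(field, value)
    @docs.each_index.select { |i| @docs[i][field] == value }
  end

  # Replace the document stored under the given number.
  def update(id, doc)
    @docs[id] = doc
  end

  def [](id)
    @docs[id]
  end

  def size
    @docs.size
  end
end

index = ToyIndex.new
index.add(file: 'google_news.xml', content: 'old headlines')
index.add(file: 'other_feed.xml', content: 'unrelated')

# Nightly refresh: look the document up by its filename field, then update it.
google_doc = { file: 'google_news.xml', content: 'new headlines' }
google_id = index.find_ids(:file, 'google_news.xml').first

if google_id
  index.update(google_id, google_doc)
else
  google_id = index.add(google_doc) # first run: nothing to replace yet
end

puts index[google_id][:content]
```

The filename has to be matched exactly for this to work, which is why it is kept in its own field rather than mixed into the feed text.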
In this case it sounds like the RSS feed URL is your natural primary key. You could add an untokenized 'id' field to your documents and then retrieve and update them using URLs as keys. You could even use a more natural field name if you create the index with some optional params.

Example:

    require 'ferret'

    url = 'http://feeds.feedburner.com/RidingRails'

    # :id_field makes 'url' the key used for lookups and updates.
    index = Ferret::Index::Index.new(:path => "#{RAILS_ROOT}/db/ferret", :id_field => 'url')

    # The key field is stored and left untokenized so it matches exactly.
    document = Ferret::Document::Document.new
    document << Ferret::Document::Field.new('url', url, Ferret::Document::Field::Store::YES, Ferret::Document::Field::Index::UNTOKENIZED)
    document << Ferret::Document::Field.new('content', 'Rails are great!', Ferret::Document::Field::Store::YES, Ferret::Document::Field::Index::TOKENIZED)
    index << document

    # Fetch by key, change the content, and write it back under the same key.
    document = index[url]
    puts document['url'] == url # true
    document['content'] = 'I agree'
    index.update(url, document)
    index[url]['content'] == 'I agree' # true
    index.size == 1 # true

--
Sergei Serdyuk
Red Leaf Software LLC
web: http://redleafsoft.com
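For the nightly cron job, the same keyed update can be run once per cached feed. The sketch below is plain Ruby and only models the behaviour: KeyedToyIndex and the fetched hash are hypothetical stand-ins, not Ferret classes, and treating update as "drop the old document for this key, then add the new one" is an assumption made for the illustration.

```ruby
# Toy keyed index; a stand-in for illustration, NOT Ferret's implementation.
class KeyedToyIndex
  def initialize(id_field)
    @id_field = id_field
    @docs = []
  end

  # Add a document (a Hash of field => value).
  def <<(doc)
    @docs << doc
    self
  end

  # Drop any document stored under this key, then add the new version.
  def update(key, doc)
    @docs.reject! { |d| d[@id_field] == key }
    @docs << doc
  end

  # Look a document up by its key.
  def [](key)
    @docs.find { |d| d[@id_field] == key }
  end

  def size
    @docs.size
  end
end

index = KeyedToyIndex.new('url')
index << { 'url' => 'http://feeds.feedburner.com/RidingRails',
           'content' => 'Rails are great!' }

# Hypothetical result of tonight's fetch: feed URL => freshly cached text.
fetched = {
  'http://feeds.feedburner.com/RidingRails' => 'I agree',
  'http://example.com/other.rss' => 'A feed seen for the first time'
}

# One keyed update per feed: known URLs are replaced, new URLs are added.
fetched.each do |url, content|
  index.update(url, { 'url' => url, 'content' => content })
end

puts index.size
puts index['http://feeds.feedburner.com/RidingRails']['content']
```

Because the key is the URL, the job does not need to track document numbers between runs, and re-running it does not create duplicates.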