Hi.

First of all, I would like to say "thank you" to David for his really valuable work. Ferret is a great project and it has a great future.

Now for my questions as a Ferret beginner.

How do I remove ALL documents from an index? Deleting the index files is not a solution. I am interested in something like index.remove_index. What is the usual way of doing it?

What is the name of the default key field (the field that can later be used in a call such as index.remove("23"))? In some docs I have seen the name :id, in others :key.

What is the difference between storing a field as a string and as an integer? For example, how should the id field be stored? As an integer? ( index << {:id=>self.id.to_s} )?

How does index.update() work? What if a document with the given id is not found? Will such a document be created?

What is the best practice for using Rails hooks with Ferret? I tried the following code, but it does not seem to work correctly: the document is indexed twice after an object update. Could you help me write the right Rails hook methods?

  def after_save
    index = FerretConfig::INDEX
    index.remove(self.id.to_s)
    index.update(self.id.to_s, self.to_document)

    index.optimize
  end

  def before_destroy
    index = FerretConfig::INDEX
    index.remove(self.id.to_s)
    index.optimize
  end

  def to_document
    doc = Document.new
    doc << Field.new('id', self.id.to_s, Field::Store::YES, Field::Index::UNTOKENIZED)
    doc << Field.new('body_en', self.body_en, Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO, false, 1.0)
    doc << Field.new('title_en', self.title_en, Field::Store::YES, Field::Index::TOKENIZED, Field::TermVector::NO, false, 3.0)

-- 
anatol (http://pomozov.info)
Hi, David.

Thank you for the answers. They were helpful. I had just missed :key=>:id in the index creation.

Is there a FAQ page in the wiki where I could add this information?

I have just added Ferret to a Rails app and it seems to work correctly. I could also add some code examples to the Ferret wiki if you don't mind.

And again, thank you, David, for Ferret.

-- 
anatol (http://pomozov.info)
On 11/26/05, Anatol Pomozov <anatol.pomozov at gmail.com> wrote:
> Hi.
>
> First of all, I would like to say "thank you" to David for his really
> valuable work. Ferret is a great project and it has a great future.

Hi Anatol,
You're welcome. I hope you find it fills your needs.

> Now for my questions as a Ferret beginner.
>
> How do I remove ALL documents from an index? Deleting the index files is
> not a solution. I am interested in something like index.remove_index.
> What is the usual way of doing it?

Perhaps this is the best solution:

  index.size.times {|i| index.delete(i)}

> What is the name of the default key field (the field that can later be
> used in a call such as index.remove("23"))? In some docs I have seen the
> name :id, in others :key.

The id field used in update, doc and delete is always "id". I may make this an option in future. This field is used when you pass a string to any of those three methods. If you pass an integer, the document number in the index is assumed.

If you want to use a different field as your key, for example "key", you can use the query_delete method:

  index.query_delete("key:23")

> What is the difference between storing a field as a string and as an
> integer? For example, how should the id field be stored? As an integer?
> ( index << {:id=>self.id.to_s} )?

All fields are stored as strings. Even if you use index << {:id=>self.id}, id will be converted into a string.

> How does index.update() work? What if a document with the given id is
> not found? Will such a document be created?

update only updates a document if it already exists. What you are looking for is add_document ("<<") combined with the :key option. For example:

  index = Index::Index.new(:key => :id)

  index << {:id => 23, :data => "This is the data..."}
  index << {:id => 23, :data => "This is the new data..."}

You can even use this when indexing multiple tables, like this:

  index = Index::Index.new(:key => [:id, :table])

  index << {:id => 23, :table => "content", :data => "This is the data..."}
  index << {:id => 23, :table => "content", :data => "This is the new data..."}

Note that :key is nil by default, so adding a document will always create a new one rather than replace an existing one.

> What is the best practice for using Rails hooks with Ferret? I tried the
> following code, but it does not seem to work correctly: the document is
> indexed twice after an object update. Could you help me write the right
> Rails hook methods?
>
>   def after_save
>     index = FerretConfig::INDEX
>     index.remove(self.id.to_s)
>     index.update(self.id.to_s, self.to_document)
>
>     index.optimize
>   end
>
>   def before_destroy
>     index = FerretConfig::INDEX
>     index.remove(self.id.to_s)
>     index.optimize
>   end
>
>   def to_document
>     doc = Document.new
>     doc << Field.new('id', self.id.to_s, Field::Store::YES,
>                      Field::Index::UNTOKENIZED)
>     doc << Field.new('body_en', self.body_en, Field::Store::YES,
>                      Field::Index::TOKENIZED, Field::TermVector::NO, false, 1.0)
>     doc << Field.new('title_en', self.title_en, Field::Store::YES,
>                      Field::Index::TOKENIZED, Field::TermVector::NO, false, 3.0)

Unfortunately I haven't had enough time to play with Ferret + Rails yet. There is a tutorial by Jan Prill here:

http://wiki.rubyonrails.com/rails/pages/HowToIntegrateFerretWithRails

I'm not sure where the remove method comes from. Perhaps you've mapped it to delete. Also, personally, I wouldn't call optimize every time like that unless updates are very rare. It's not really necessary. Of course there is a trade-off between update speed and query speed. You should try it with and without the optimize call to see what works best for you. This is what I would do:

  # create the index with the :key option plus whatever other options
  # you're using
  FerretConfig::INDEX = Index::Index.new(:key => :id)

  def after_save
    FerretConfig::INDEX << self.to_document
  end

  def before_destroy
    # NOTE: the "to_s" is necessary here so that Ferret
    # knows to use the id field
    FerretConfig::INDEX.delete(self.id.to_s)
  end

If you find a better way to do this, please contribute to Jan Prill's Rails wiki entry. I'll work on better integration between Ferret and Rails when I finish working on the performance. I'm currently very busy integrating my C indexer, which has been a little harder than I thought. Currently 2000 lines of C this week and still growing.

Please let me know if you run into any more problems.

Cheers,
Dave
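
For readers who want to see the :key behaviour from the reply above in isolation, here is a minimal sketch in plain Ruby. It is only a toy model of what the reply describes: TinyIndex and its methods are hypothetical names made up for this illustration, they are not part of Ferret, and nothing here reflects how Ferret stores or looks up documents internally.

```ruby
# Toy in-memory stand-in for an index, for illustration only.
# TinyIndex is a hypothetical class and is NOT Ferret's implementation.
class TinyIndex
  # key: nil (the default), a single field name, or an array of field names.
  def initialize(key: nil)
    @key_fields = Array(key) # nil => [], :id => [:id], [:id, :table] stays as is
    @docs = []
  end

  # Add a document. With key fields set, any stored document whose key
  # fields all match is replaced; without them the document is appended.
  def <<(doc)
    doc = doc.transform_values(&:to_s) # model "all fields are stored as strings"
    unless @key_fields.empty?
      @docs.reject! { |d| @key_fields.all? { |f| d[f] == doc[f] } }
    end
    @docs << doc
    self
  end

  # Delete by the value of the :id field (the string form discussed above).
  def delete(id)
    @docs.reject! { |d| d[:id] == id.to_s }
    self
  end

  def size
    @docs.size
  end

  # Return copies so callers cannot mutate the stored documents.
  def docs
    @docs.map(&:dup)
  end
end

# No key: the same id added twice yields two documents.
plain = TinyIndex.new
plain << { id: 23, data: "This is the data..." }
plain << { id: 23, data: "This is the new data..." }

# key: :id, so the second add replaces the first.
keyed = TinyIndex.new(key: :id)
keyed << { id: 23, data: "This is the data..." }
keyed << { id: 23, data: "This is the new data..." }

# Composite key: rows from different tables may share an id.
multi = TinyIndex.new(key: [:id, :table])
multi << { id: 23, table: "content",  data: "This is the data..." }
multi << { id: 23, table: "comments", data: "A comment..." }
multi << { id: 23, table: "content",  data: "This is the new data..." }

puts plain.size
puts keyed.size
puts multi.size
```

In this toy model the unkeyed index ends up with two documents for id 23, which matches the "indexed twice" symptom from the original question, while the keyed variants keep one document per key. An after_save hook that simply adds the document then behaves as an insert-or-replace, which is the design choice the reply recommends.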
On 11/26/05, Anatol Pomozov <anatol.pomozov at gmail.com> wrote:
> Hi, David.
>
> Thank you for the answers. They were helpful. I had just missed :key=>:id
> in the index creation.
>
> Is there a FAQ page in the wiki where I could add this information?

At the moment there is a HowTos page here:

http://ferret.davebalmain.com/trac/wiki/HowTos

It's not very well organized yet. It's another thing I just haven't gotten around to. Please feel free to add to it. Also, I just added a Powered By page, so please add your site there when it goes live. And if you write any articles or blog entries, please link to them on the FerretArticles page.

Thanks,
Dave

> I have just added Ferret to a Rails app and it seems to work correctly. I
> could also add some code examples to the Ferret wiki if you don't mind.
>
> And again, thank you, David, for Ferret.
>
> --
> anatol (http://pomozov.info)
> _______________________________________________
> Ferret-talk mailing list
> Ferret-talk at rubyforge.org
> http://rubyforge.org/mailman/listinfo/ferret-talk