thr3ads.net - Ferret talk - [Ferret-talk] Several questions about Ferret. [Nov 2005]

Anatol Pomozov

2005-Nov-26 02:36 UTC

[Ferret-talk] Several questions about Ferret.

Hi.

First of all, I would like to say "thank you" to David for his really
valuable work. Ferret is a great project and it has a great future.

Now, here are my questions as a Ferret beginner.

How do I remove ALL documents from an index? Removing the files is not a
solution. I am interested in something like index.remove_index. What is the
usual way of doing it?

What is the name of the default key field (the field that can later be used
in a call such as index.remove("23"))? In some docs I have seen the name :id,
in others :key.

What is the difference between storing a field as a string and as an
integer? For example, how should the id field be stored? As an integer?
( index << {:id=>self.id.to_s} )

How does index.update() work? What if a document with the given id is not
found? Will such a document be created?


What is the best practice for using Rails hooks with Ferret? I tried the
following code, but it does not seem to work correctly: the document is
indexed twice after an object update. Could you help me write the right
Rails hook methods?
  def after_save
    index = FerretConfig::INDEX
    index.remove(self.id.to_s)
    index.update(self.id.to_s, self.to_document)

    index.optimize
  end

  def before_destroy
    index = FerretConfig::INDEX
    index.remove(self.id.to_s)
    index.optimize
  end

  def to_document
    doc = Document.new
    doc << Field.new('id', self.id.to_s, Field::Store::YES,
                     Field::Index::UNTOKENIZED)
    doc << Field.new('body_en', self.body_en, Field::Store::YES,
                     Field::Index::TOKENIZED, Field::TermVector::NO, false, 1.0)
    doc << Field.new('title_en', self.title_en, Field::Store::YES,
                     Field::Index::TOKENIZED, Field::TermVector::NO, false, 3.0)


--
anatol (http://pomozov.info)

Anatol Pomozov

2005-Nov-26 09:33 UTC

Hi, David.

Thank you for the answers. They were helpful. I had just missed :key=>:id in
index creation.

Is there a FAQ page in the wiki where I could add this information?

I have just added Ferret to a Rails app and it seems to work correctly. I
could also add some code examples to the Ferret wiki if you don't mind.

And again, thank you, David, for Ferret.



--
anatol (http://pomozov.info)

David Balmain

2005-Nov-26 17:23 UTC

On 11/26/05, Anatol Pomozov <anatol.pomozov at gmail.com> wrote:
> Hi.
>
> First of all, I would like to say "thank you" to David for his really
> valuable work. Ferret is a great project and it has a great future.

Hi Anatol,
You're welcome. I hope you find it fills your needs.

> Now, here are my questions as a Ferret beginner.
>
> How do I remove ALL documents from an index? Removing the files is not a
> solution. I am interested in something like index.remove_index. What is the
> usual way of doing it?

Perhaps this is the best solution;

    index.size.times {|i| index.delete(i)}

> What is the name of the default key field (the field that can later be used
> in a call such as index.remove("23"))? In some docs I have seen the name :id,
> in others :key.

The id field used in update, doc and delete is always "id". I may make
this an option in the future. This field is used when you pass a string to
any of those three methods. If you pass an integer, the document
number in the index is assumed.

If you want to use a different field as your key, for example "key",
you can use the query_delete method;

    index.query_delete("key:23")
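
To make the string-versus-integer rule above concrete, here is a minimal sketch in plain Ruby. It does not use Ferret at all: ToyIndex, live_docs and the simplified "field:value" parsing in query_delete are hypothetical stand-ins written only for illustration, and the array position merely plays the role of the document number.

```ruby
# Toy in-memory stand-in for an index. This is NOT Ferret's implementation;
# it only mimics the argument rules described in the reply above.
class ToyIndex
  def initialize
    @docs = [] # the array position plays the role of the document number
  end

  # Every field value is kept as a string, as the thread describes.
  def <<(fields)
    @docs << fields.map { |name, value| [name, value.to_s] }.to_h
    self
  end

  # String argument: match on the :id field.
  # Integer argument: treat it as a document number.
  def delete(arg)
    case arg
    when String
      @docs.map! { |doc| doc && doc[:id] == arg ? nil : doc }
    when Integer
      @docs[arg] = nil if arg >= 0 && arg < @docs.size
    else
      raise ArgumentError, "expected a String id or an Integer document number"
    end
    self
  end

  # Hypothetical simplification: only a single "field:value" term is handled.
  def query_delete(query)
    field, value = query.split(":", 2)
    @docs.map! { |doc| doc && doc[field.to_sym] == value ? nil : doc }
    self
  end

  # Documents that have not been deleted.
  def live_docs
    @docs.compact
  end
end

index = ToyIndex.new
index << {:id => 23, :key => "a"}
index << {:id => 7,  :key => "23"}
index << {:id => 99, :key => "z"}

index.delete("23") # string: removes the document whose id field is "23"
index.delete(2)    # integer: removes document number 2 (the one with id 99)
ids_left = index.live_docs.map { |doc| doc[:id] }

index.query_delete("key:23") # removes the document whose key field is "23"
```

Note that delete("23") leaves the document whose key field is "23" alone; only the query on the key field removes it.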

> What is the difference between storing a field as a string and as an
> integer? For example, how should the id field be stored? As an integer?
> ( index << {:id=>self.id.to_s} )

All fields are stored as strings. Even if you use index <<
{:id=>self.id}, id will be converted into a string.

> How does index.update() work? What if a document with the given id is not
> found? Will such a document be created?

update only updates a document if it already exists. What you are
looking for is add_document ("<<") combined with the :key option. For
example;

    index = Index::Index.new(:key => :id)

    index << {:id => 23, :data => "This is the data..."}
    index << {:id => 23, :data => "This is the new data..."}

You can even use this when indexing multiple tables like this;

    index = Index::Index.new(:key => [:id, :table])

    index << {:id => 23, :table => "content", :data => "This is the data..."}
    index << {:id => 23, :table => "content", :data => "This is the new data..."}

Note that :key is nil by default so adding a new document will always
add a new document.
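
The replace-on-add behaviour of :key described above can be modelled in a few lines of plain Ruby. This is only a sketch of the semantics as stated in this thread, not Ferret's code: KeyedToyIndex and its docs accessor are hypothetical names, and the real library may differ in details.

```ruby
# Toy model of the :key option. NOT Ferret itself, just the rule from the
# thread: with a key, adding a document replaces any earlier document that
# has the same values in all key fields; without a key, every add appends.
class KeyedToyIndex
  attr_reader :docs

  def initialize(options = {})
    # :key may be nil (the default), one field name, or an array of names.
    @key = options[:key] && Array(options[:key])
    @docs = []
  end

  def <<(fields)
    doc = fields.map { |name, value| [name, value.to_s] }.to_h
    if @key
      @docs.reject! { |old| @key.all? { |field| old[field] == doc[field] } }
    end
    @docs << doc
    self
  end

  def size
    @docs.size
  end
end

keyed = KeyedToyIndex.new(:key => [:id, :table])
keyed << {:id => 23, :table => "content",  :data => "This is the data..."}
keyed << {:id => 23, :table => "content",  :data => "This is the new data..."}
keyed << {:id => 23, :table => "comments", :data => "Same id, different table"}

plain = KeyedToyIndex.new # :key is nil, so nothing is ever replaced
plain << {:id => 23, :data => "This is the data..."}
plain << {:id => 23, :data => "This is the new data..."}
```

In this model the keyed index ends up with two documents (one per table) while the index without a key keeps both copies of id 23, which matches the "indexed twice" symptom from the original question.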
>
> What is the best practice for using Rails hooks with Ferret? I tried the
> following code, but it does not seem to work correctly: the document is
> indexed twice after an object update. Could you help me write the right
> Rails hook methods?
>   def after_save
>     index = FerretConfig::INDEX
>     index.remove(self.id.to_s)
>     index.update(self.id.to_s, self.to_document)
>
>     index.optimize
>   end
>
>   def before_destroy
>     index = FerretConfig::INDEX
>     index.remove(self.id.to_s)
>     index.optimize
>   end
>
>   def to_document
>     doc = Document.new
>     doc << Field.new('id', self.id.to_s, Field::Store::YES,
>                      Field::Index::UNTOKENIZED)
>     doc << Field.new('body_en', self.body_en, Field::Store::YES,
>                      Field::Index::TOKENIZED, Field::TermVector::NO, false, 1.0)
>     doc << Field.new('title_en', self.title_en, Field::Store::YES,
>                      Field::Index::TOKENIZED, Field::TermVector::NO, false, 3.0)

Unfortunately I haven't had enough time to play with Ferret + Rails
yet. There is a tutorial by Jan Prill here;

http://wiki.rubyonrails.com/rails/pages/HowToIntegrateFerretWithRails

I'm not sure where the remove method comes from. Perhaps you've mapped
it to delete. Also, personally, I wouldn't use optimize all the time
like that unless updates are very rare. It's not really necessary. Of
course there is a trade-off between update speed and query speed. You
should play around with and without the optimize to see what works best
for you. This is what I would do;

   # create the index with the :key option plus whatever other options
   # you're using;
   FerretConfig::INDEX = Index::Index.new(:key => :id)

   def after_save
     FerretConfig::INDEX << self.to_document
   end

   def before_destroy
     # NOTE: the "to_s" is necessary here so that Ferret
     #knows to use the id field
     FerretConfig::INDEX.delete(self.id.to_s)
   end
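
For anyone who wants to see the flow of these two hooks without a Rails app or Ferret installed, here is a self-contained sketch. Everything in it is a hypothetical stand-in: HookToyIndex imitates an index created with :key => :id, ToyFerretConfig imitates the FerretConfig constant, Article is a plain Ruby object rather than an ActiveRecord model, to_document returns a hash instead of a Document, and the hooks are called by hand where Rails would call them.

```ruby
# Toy stand-in for an index created with :key => :id. NOT Ferret itself.
class HookToyIndex
  def initialize(options = {})
    @key = options[:key]
    @docs = []
  end

  # Adding a document replaces any earlier document with the same key value.
  def <<(fields)
    doc = fields.map { |name, value| [name, value.to_s] }.to_h
    @docs.reject! { |old| old[@key] == doc[@key] } if @key
    @docs << doc
    self
  end

  # Only the string form (match on the :id field) is modelled here.
  def delete(id)
    raise ArgumentError, "this toy only accepts a String id" unless id.is_a?(String)
    @docs.reject! { |doc| doc[:id] == id }
    self
  end

  def size
    @docs.size
  end
end

module ToyFerretConfig
  INDEX = HookToyIndex.new(:key => :id)
end

# Plain Ruby object standing in for a Rails model.
class Article
  attr_accessor :id, :title_en, :body_en

  def initialize(id, title_en, body_en)
    @id = id
    @title_en = title_en
    @body_en = body_en
  end

  def to_document
    {:id => id, :title_en => title_en, :body_en => body_en}
  end

  def after_save
    ToyFerretConfig::INDEX << to_document
  end

  def before_destroy
    # to_s so that the id field, not a document number, is used
    ToyFerretConfig::INDEX.delete(id.to_s)
  end
end

article = Article.new(1, "Hello", "First draft")
article.after_save                 # first save: document added
article.body_en = "Second draft"
article.after_save                 # second save: document replaced, not duplicated
size_after_two_saves = ToyFerretConfig::INDEX.size

article.before_destroy
size_after_destroy = ToyFerretConfig::INDEX.size
```

Because the keyed add already replaces the old document, the hooks need neither a separate remove before the add nor an optimize after every save.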

If you find a better way to do this, please contribute to Jan Prill's
Rails wiki entry. I'll work on better integration between Ferret and
Rails when I finish working on the performance. I'm currently very
busy integrating my C indexer, which has been a little harder than I
thought. Currently 2000 lines of C this week and still growing.

Please let me know if you run into any more problems.

Cheers,
Dave

David Balmain

2005-Nov-26 17:23 UTC

On 11/26/05, Anatol Pomozov <anatol.pomozov at gmail.com> wrote:
> Hi, David.
>
> Thank you for the answers. They were helpful. I had just missed :key=>:id in
> index creation.
>
> Is there a FAQ page in the wiki where I could add this information?

At the moment there is a howtos page here;

http://ferret.davebalmain.com/trac/wiki/HowTos

It's not very well organized yet. It's another thing I just haven't
gotten around to. Please feel free to add to it.

Also, I just added a Powered By page so please add your site there
when it goes live. And if you write any articles or blog entries,
please link to them on the FerretArticles page.

Thanks,
Dave

> I have just added Ferret to a Rails app and it seems to work correctly. I
> could also add some code examples to the Ferret wiki if you don't mind.
>
> And again, thank you, David, for Ferret.
