On 3/5/07, Chad Thatcher <chad at zulu.net> wrote:
> Hi, I have a question about the delete() method docs.
>
> I am re-indexing data on the fly so I would like to delete any existing
> indexed data for a particular resource before re-indexing it using
> index.delete(id).
>
> The delete() method api doc says:
>
> "Delete the document referenced by the document number id if id is an
> integer or all of the documents which have the term id if id is a term..
>
> id: The number of the document to delete"
>
> I am a little confused by what this means.
Is this any clearer?
# Deletes a document/documents from the index. The method for determining
# the document to delete depends on the type of the argument passed.
#
# If +arg+ is an Integer then delete the document based on the internal
# document number.
#
# If +arg+ is a String then search for the documents with +arg+ in the
# +id+ field. The +id+ field is either :id or whatever you set the :id_field
# parameter to when you create the Index object.
> At the time of deletion all
> I have is my own ID of the resource which was previously indexed in
> ferret with my own field :id. If I supply my own ID will the correct
> indexed data be deleted? Or does this ID refer to Ferret's own internal
> ID for the resource?
In this case, since your id is probably an integer you will need to
convert it to a string or Ferret will delete the documents by internal
document number rather than your own ID for the resource.
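A minimal sketch of that conversion, assuming your resource ids are integers in your own application. The helper name delete_resource is hypothetical and not part of Ferret; it only wraps the index.delete call discussed above so that the argument is always a String.

```ruby
# Hypothetical helper, not part of Ferret. It forces the id to a String so
# that index.delete matches on the id field instead of treating the value
# as an internal document number.
def delete_resource(index, resource_id)
  index.delete(resource_id.to_s)
end
```

With a real index this would be called as delete_resource(index, 42), which ends up calling index.delete('42').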
> One other question while I am on the subject - will deleting a resource
> that does not exist raise an error? I ask this because I would like to
> index new data structures that haven't been indexed before and would
> like to avoid checking in the index first whether or not it exists
> before attempting to delete.
Yes, if you delete by internal document number. No, if you are
deleting by term, i.e. passing your own document id which is stored in
the *id* field. So in your case you should be fine. I should also
mention that you can set the :key parameter to :id:
index = Ferret::Index::Index.new(:key => :id)
This way, whenever you add a document with an id that already exists
in the index it will replace the existing document.
For example:
require 'rubygems'
require 'ferret'
index = Ferret::I.new(:key => :id)
[
  {:id => '1', :text => 'one'},
  {:id => '2', :text => 'Two'},
  {:id => '3', :text => 'Three'},
  {:id => '1', :text => 'One'}
].each {|doc| index << doc}
puts index.size # => 3
puts index['1'].load.inspect # => {:text=>"One", :id=>"1"}
puts index.search('id:1').to_s(:text)
# => TopDocs: total_hits = 1, max_score = 1.287682 [
# 3 "One": 1.287682
# ]
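If you would rather do the delete-then-add step yourself instead of relying on :key, something like the following might work. It is only a sketch: the reindex helper and its parameter names are made up for illustration, and it assumes the index stores your own identifier in the :id field. Because the delete is by term (a String), no existence check should be needed first.

```ruby
# Hypothetical helper, not part of Ferret. Deletes any document whose id
# field matches resource_id, then adds the fresh fields under the same id.
def reindex(index, resource_id, fields)
  key = resource_id.to_s              # a String means "delete by id field"
  index.delete(key)                   # fine even if nothing was indexed yet
  index << fields.merge(:id => key)   # store the id as a String as well
end
```

A call such as reindex(index, 1, :text => 'One') would then replace whatever was previously indexed under id '1'.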
Hope that helps,
Dave
--
Dave Balmain
http://www.davebalmain.com/