thr3ads.net - Ferret talk - [Ferret-talk] Parallel indexing doesn't work? [Jan 2008]

Noah M. Daniels

2008-Jan-09 21:02 UTC

[Ferret-talk] Parallel indexing doesn't work?

Hi,

I'm trying to get parallelized Ferret indexing working for my AAF
indices, based on the example in the O'Reilly Ferret shortcut.
However, the resulting indices after merging seem to have no actual
documents.

I made minimal changes to the example in the Ferret shortcut PDF,
and indeed can't get that to work either. I'd appreciate any help
anyone can give! Thanks!

The example is below:

#!/usr/bin/env ruby

require 'rubygems'
require 'ferret'
include Ferret::Index

5.times do |i|
   name = "index#{i}"
   puts name
   i = Ferret::I.new(:path => "/tmp/#{i}", :create => true)
   i << {:name => name}
   i.close
end
readers = []
readers << IndexReader.new("/tmp/0")
readers << IndexReader.new("/tmp/1")
readers << IndexReader.new("/tmp/2")
readers << IndexReader.new("/tmp/3")
readers << IndexReader.new("/tmp/4")
index_writer = IndexWriter.new(:path => "/tmp/test")
index_writer.add_readers(readers)
index_writer.close()
readers.each {|reader| reader.close()}
i = Ferret::I.new(:path => '/tmp/test')
res = i.search('name*')
puts res.inspect # gives me:
# #<struct Ferret::Search::TopDocs total_hits=0, hits=[], max_score=0.0,
# searcher=#<Ferret::Search::Searcher:0x58a6ec>>

puts res.hits.size # gives me: 0

Jens Kraemer

2008-Jan-09 21:24 UTC

Hi!

It seems to me you're indexing strings starting with 'index' but you're
searching for 'name'? Or maybe correcting this was already one of your
minimal changes?

If not, try changing that line:
> res = i.search('name*')
to
> res = i.search('index*')

cheers,
Jens

-- 
Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/     - The new free film database
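
The mismatch Jens points out can be illustrated without Ferret at all. The following is a toy model only, not Ferret's implementation: documents are plain hashes, and a trailing-`*` query is matched by prefix against the terms stored in the fields. The helper name `prefix_hits` and the constant `DOCS` are made up for this sketch.

```ruby
# Toy model of a trailing-wildcard search. Each document is a Hash of
# field name => text, mirroring the {:name => "index0"} documents above.
DOCS = (0...5).map { |n| { :name => "index#{n}" } }

# Returns the documents in which some term starts with the query prefix.
# If `field` is given, only that field is examined; otherwise all fields are.
def prefix_hits(docs, query, field = nil)
  prefix = query.chomp("*").downcase
  docs.select do |doc|
    fields = field ? [field] : doc.keys
    fields.any? do |f|
      doc[f].to_s.downcase.split(/\W+/).any? { |term| term.start_with?(prefix) }
    end
  end
end

puts prefix_hits(DOCS, "name*").size          # prints 0: "name" is a field name, not a term
puts prefix_hits(DOCS, "index*").size         # prints 5
puts prefix_hits(DOCS, "index*", :name).size  # prints 5
```

In Ferret's own query syntax, restricting a wildcard to one field is written with a field prefix, as in `name:index*`.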

Noah M. Daniels

2008-Jan-09 21:37 UTC

Thanks, Jens. Good catch; this little example works correctly after
making that change.

However, my ActsAsFerret index merging does _not_ work, and I'm
wondering if it has something to do with AAF's handling of documents
in an index?

Let's call my indexed class Company...


Company.find_by_contents('*')
=> #<ActsAsFerret::SearchResults:0x2b1699108878 @current_page=nil,
@total_hits=3, @results=[], @total_pages=1, @per_page=3>


Yet on each partial index prior to merging, that query returns a
bunch of results, as one would expect.

Now, here's how I've built that index... any idea why the merged index
is broken?

module FerretHelpers
  def merge_ferret_index_partitions(model)

    model_dir = File.basename(model.aaf_configuration[:ferret][:path])

    final_index_path = "/tmp/merged_parallel_ferret_index/#{model_dir}"

    partial_index_path = "/tmp/partial_indices/#{model_dir}"

    paths = Dir.glob("#{partial_index_path}/*")

    paths.each do |path|
      i = Ferret::I.new(:path => path, :create => true)
      name = path.split('/').last
      i << {:name => name}
      i.close
    end

    readers = []
    paths.each {|path| readers << IndexReader.new(path) }
    index_writer = IndexWriter.new(:path => final_index_path)
    index_writer.add_readers(readers)
    index_writer.close()
    readers.each {|reader| reader.close()}
    index = Ferret::Index::Index.new(:path => final_index_path)
    index.optimize
    index.close

  end
end
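
In the helper above, the `paths.each` loop opens every partial index with `:create => true`, which tells Ferret to create a fresh index at that path, so whatever the partition held before is replaced by a single `{:name => ...}` document. A minimal sketch of the merge step on its own follows. It uses only the reader and writer calls already present in the helper; the method name `merge_partitions` and the two optional class parameters are inventions of this sketch, added so it can be run against stand-ins when Ferret is not installed.

```ruby
begin
  require 'ferret'
rescue LoadError
  # Ferret is not installed; the method can still be run with stand-in classes.
end

# Merge existing partial indices into the index at final_index_path.
# The partials are only opened for reading; nothing here recreates them.
# reader_class / writer_class are hypothetical parameters that default to
# Ferret's IndexReader and IndexWriter.
def merge_partitions(paths, final_index_path, reader_class = nil, writer_class = nil)
  reader_class ||= Ferret::Index::IndexReader
  writer_class ||= Ferret::Index::IndexWriter

  readers = paths.map { |path| reader_class.new(path) }
  begin
    writer = writer_class.new(:path => final_index_path)
    begin
      writer.add_readers(readers)
    ensure
      writer.close   # close the writer before the readers, as in the helper
    end
  ensure
    readers.each { |reader| reader.close }
  end
  readers.size       # number of partitions merged
end
```

With Ferret available, a call would look like `merge_partitions(Dir.glob("#{partial_index_path}/*"), final_index_path)`; the optimize step from the original helper could follow unchanged.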



Noah M. Daniels

2008-Jan-09 21:58 UTC

OK, further update: there was an obvious and stupid bug in my code
that was overwriting the partial indices. Now that that's fixed, I
get the proper number of results for a search:

Company.find_by_contents('*')
=> #<ActsAsFerret::SearchResults:0x2b0902ab4628 @current_page=nil,
@total_hits=247, @results=[], @total_pages=1, @per_page=247>

However, why is @results empty?

Similarly, find_id_by_contents also returns empty documents, it seems:

>> Company.find_id_by_contents('*')
=> [247, [{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil},
{:model=>"Company", :data=>{}, :score=>1.0, :id=>nil}]]

when I would have expected:

>> Company.find_id_by_contents('*')
=> [247, [{:model=>"Company", :score=>1.0, :id=>"189", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"2", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"192", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"4", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"6", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"7", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"8", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"37", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"13", :data=>{}},
{:model=>"Company", :score=>1.0, :id=>"21", :data=>{}}]]

Thanks for the help, and sorry for the silly previous bugs :)
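
To put a number on the symptom, the hits can be tallied by whether they carry an id. This is a small sketch that assumes only the `[total_hits, hits]` shape visible in the console output above; `hits_missing_id` is a hypothetical helper, not part of acts_as_ferret.

```ruby
# Summarise a [total_hits, hits] result where each hit is a Hash
# with an :id key, as in the console output shown in this post.
def hits_missing_id(result)
  total_hits, hits = result
  missing = hits.count { |hit| hit[:id].nil? }
  { :total_hits => total_hits, :returned => hits.size, :missing_id => missing }
end

# Sample data shaped like the broken result from the merged index.
broken = [247, Array.new(10) { { :model => "Company", :data => {}, :score => 1.0, :id => nil } }]
p hits_missing_id(broken)
```

On a console it could be called as `hits_missing_id(Company.find_id_by_contents('*'))`; a nonzero `:missing_id` would confirm that the hits come back without the id that AAF needs to load records.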


Jens Kraemer

2008-Jan-11 18:53 UTC

Hi!

On Wed, Jan 09, 2008 at 04:58:20PM -0500, Noah M. Daniels wrote:
> Ok, further update -- there was an obvious and stupid bug in my code
> that was overwriting the partial indices. So now when that's fixed, I
> get the proper number of results for a search:
>
> Company.find_by_contents('*')
> => #<ActsAsFerret::SearchResults:0x2b0902ab4628 @current_page=nil,
> @total_hits=247, @results=[], @total_pages=1, @per_page=247>

Strange. Did you try to access the merged index with plain Ferret to see
if this works? Additionally, are your partial indexes OK, and do they
deliver results with contents when you search only one of them?


Cheers,
Jens
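
One way to follow this suggestion is to open the merged index directly and look at which stored fields the first few documents carry. The sketch below is duck-typed: it works on anything that responds to `size` and `[]`, which `Ferret::Index::Index` does, and it calls `load` on documents that offer it, since Ferret returns lazily loaded documents. The method name `stored_field_report` is hypothetical, and checking the `:id` field is an assumption based on the `:id` values in the console output earlier in the thread.

```ruby
# Report the stored fields of the first `limit` documents in an index.
# `index` must respond to #size and #[] (as Ferret::Index::Index does).
def stored_field_report(index, limit = 5)
  count = [index.size, limit].min
  (0...count).map do |doc_id|
    doc = index[doc_id]
    doc.load if doc.respond_to?(:load)   # force lazily loaded fields, if any
    { :doc_id => doc_id, :fields => doc.keys, :id => doc[:id] }
  end
end

# With Ferret installed, usage might look like this (path as in the helper
# posted earlier in the thread):
#   index = Ferret::Index::Index.new(:path => final_index_path)
#   stored_field_report(index).each { |row| p row }
#   index.close
```

Running it once against a partial index and once against the merged index would show whether the stored fields differ between the two.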

Noah M. Daniels

2008-Jan-11 19:09 UTC

Hi, Jens,

I'll try what you suggested with plain Ferret. What about
ferret_browser; what should I be looking for? The partial indices are
each fine prior to merging; they deliver results with contents when
searching only one of them.

Thanks again


