Hi,

In ferret, and especially when using acts_as_ferret, it is easy to specify many fields. What is the cost of using a lot of fields from a performance perspective? Is each field searched separately, or are they combined together in the inverted index?

As an extreme example, if I made every word in my documents a separate field (so the first word in each document was field 1, the second word was field 2, etc.), would this be significantly less efficient than treating the entire document as a single field?

I am not doing something quite as bad as this hypothetical example, but I am investigating different ways to organize some data.

Thanks,
Chris
On 3/6/07, Waters, Chris <cwaters at networkchemistry.com> wrote:
> In ferret, and especially when using acts_as_ferret, it is easy to specify
> many fields. What is the cost of using a lot of fields from a performance
> perspective? Is each field searched separately, or are they combined
> together in the inverted index.

Hi Chris,

Each field is searched separately so the more fields you search the longer the search will take. Also note that there shouldn't be any difference in the time to search a single field whether you have 1 field or 1 million. It will only take longer if you search all 1 million fields.

> As an extreme example, if I made every word in my documents a separate field
> (so the first word in each document was field 1 and the second word was
> field 2, etc) would this be significantly less efficient than treating the
> entire document as a single field?
>
> I am not doing something quite as bad as this hypothetical example, but I am
> investigating different ways to organize some data.

I'm not sure exactly what you want to do but you may want to look at span queries. These queries allow you to search based on the positions of the terms in the document. But perhaps your hypothetical is misleading me.

Cheers,
Dave

--
Dave Balmain
http://www.davebalmain.com/
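To make the cost model in Dave's answer concrete, here is a minimal toy sketch in plain Ruby. It is not Ferret's implementation and does not use the Ferret API; the class `ToyFieldIndex` and its methods are hypothetical names invented for illustration. It assumes each field keeps its own term-to-postings table, so a lookup in one field never touches the others, while a search across all fields repeats the lookup once per field. The `search_first` method is only a rough stand-in for the position-based matching that span queries offer.

```ruby
# Toy model only: one postings table per field. Not Ferret's real data
# structure; every name here is hypothetical.
class ToyFieldIndex
  def initialize
    # field name => term => list of [doc_id, position] pairs
    @postings = Hash.new do |fields, field|
      fields[field] = Hash.new { |terms, term| terms[term] = [] }
    end
  end

  # doc is a hash of field name => text
  def add(doc_id, doc)
    doc.each do |field, text|
      text.downcase.split.each_with_index do |term, pos|
        @postings[field][term] << [doc_id, pos]
      end
    end
  end

  # Looks only at the named field's postings, however many fields exist.
  def search_field(field, term)
    return [] unless @postings.key?(field) && @postings[field].key?(term)
    @postings[field][term].map(&:first).uniq
  end

  # Repeats the single-field lookup once per field, so the work grows
  # with the number of fields searched.
  def search_all(term)
    @postings.keys.flat_map { |field| search_field(field, term) }.uniq
  end

  # Position-based match: the term must occur before position `limit`.
  def search_first(field, term, limit)
    return [] unless @postings.key?(field) && @postings[field].key?(term)
    @postings[field][term].select { |_, pos| pos < limit }.map(&:first).uniq
  end
end

index = ToyFieldIndex.new
index.add(1, { title: "ferret search", body: "many fields cost more when all are searched" })
index.add(2, { title: "span queries", body: "search by term position" })

p index.search_field(:title, "search")    # prints [1]
p index.search_all("search")              # prints [1, 2]
p index.search_first(:body, "search", 1)  # prints [2]
```

In this model, putting each word in its own field would replace the position lookup in `search_first` with one field per position, and any query that does not know the position in advance would have to visit every field.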
Thanks, that answers my question. My example was purely hypothetical, but I really am contemplating having hundreds of fields.

Regards,
Chris