Is there any way to speed up phrase searches? What sort of performance should I expect? Currently, when I search against a 5.3 GB flint database, it takes 4.5 minutes for a simple 2-word phrase. Is that reasonable performance? I've seen some other threads about Xapian phrase performance, but none seem to indicate what relatively "normal" performance should be, and I'm not sure what I should expect.

I'm using the Perl interface to Xapian 0.9.2. I'm building my own Query objects rather than using QueryParser since we use ':' as part of our field prefix. What popped out at me was that my query looks like:

    Xapian::Query((FIELD:term1 PHRASE 2 FIELD:term2))

while QueryParser's looks like:

    Xapian::Query((term1(pos=1) PHRASE 2 term2(pos=2)))

Where is the position information coming from, and how do I add it to my query? Will it help, or is it irrelevant? The query object (at least in the Perl interface) only allows me to build queries of the form:

    ("term")
or
    (OP, "term", "term"...)
or
    (OP, query, query...)

I can't seem to specify position.

Thanks,

Alex

Hi Alex,

What is your hardware? How many documents do you have in your flint database?

Wow... I am complaining about search times > 0.5s and you have search times > 1 min! I don't use phrase searching, but that sounds like really bad performance.

Regards,

David LEVY

On Mon, Feb 20, 2006 at 02:27:53PM -0500, Alex Deucher wrote:
> Is there any way to speed up phrase searches? What sort of
> performance should I expect? Currently when I search against a 5.3 GB
> flint database it takes 4.5 minutes for a simple 2 word phrase. Is
> that reasonable performance?

No. Phrase searches involving two common terms can be slow, especially where an AND query matches many documents but the two terms don't often occur as a phrase, but 4.5 minutes is clearly ludicrous.

A more concrete example would be useful - what's the query, and what are the term frequencies for the two terms involved?

> I'm using the perl interface to xapian 0.9.2. I'm building my own
> Query objects rather than using QueryParser since we use ':' as part
> of our field prefix.

Can't you just set the prefix map to include the ":"? i.e.

    queryparser.add_prefix("field", "FIELD:");

> Xapian::Query((FIELD:term1 PHRASE 2 FIELD:term2))
>
> while QueryParser's looks like:
>
> Xapian::Query((term1(pos=1) PHRASE 2 term2(pos=2)))
>
> where is the position information coming from and how do I add it to
> my query?

There are optional parameters on the Query from term name constructor, one of which sets the query position.

> Will it help or is it irrelevant?

I think it's only used to sort the query terms which match a particular document into order (they may need to be reordered to build the query - e.g. 'hello +world' -> 'world AND_MAYBE hello').

> The query object (at least the perl interface) only allows me to build
> queries of the form:

For Perl, see the "new_term" method of "Search::Xapian::Query" - added in 0.9.2.3.

Cheers,
Olly
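
To illustrate the point about two common terms: the sketch below is a toy positional index in plain Python, not Xapian's matcher, and the names build_index and phrase_search are made up for this example. It only aims to show why the cost of a phrase search tends to follow the size of the AND set (documents containing both terms) rather than the number of documents that actually contain the phrase.

```python
from collections import defaultdict


def build_index(docs):
    """Map term -> {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split(), start=1):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, first, second):
    """Return (matching doc ids, number of AND candidates examined)."""
    postings_a = index.get(first, {})
    postings_b = index.get(second, {})
    # Step 1: documents containing both terms (the AND set).
    candidates = sorted(set(postings_a) & set(postings_b))
    matches = []
    # Step 2: position lists must be read for every candidate,
    # whether or not the phrase turns out to be present.
    for doc_id in candidates:
        following = set(postings_b[doc_id])
        if any(pos + 1 in following for pos in postings_a[doc_id]):
            matches.append(doc_id)
    return matches, len(candidates)


# Hypothetical sample data: both terms occur in every document,
# but only two documents contain them as an adjacent phrase.
docs = {
    1: "the quick brown fox",
    2: "brown is quick to fade",
    3: "a quick look at brown paper",
    4: "quick brown bread recipe",
}
index = build_index(docs)
matches, examined = phrase_search(index, "quick", "brown")
print(matches, examined)
```

Here every document lands in the AND set, so position data is read for all four even though only two match. With two very frequent terms in a multi-gigabyte database, that second step is where the time is likely to go, which is why the term frequencies matter for diagnosing this.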