Andreas Marienborg
2008-Oct-13 13:56 UTC
[Xapian-discuss] a strange type of alias/expanded term
Hello

I was wondering if there is any way I can coax queryparser into something like this, so I don't have to pre-parse the query myself: (pseudo code)

    my $query_string = 'jazz oslo today';

    $qp->add_alias('today' => 'D20081013');

    my $q = $qp->parse($query_string);

    is($q->get_description, '(jazz AND oslo AND D20081013)');

Basically I want to somehow expand today to today's date, this week to a range, tomorrow to something etc, but I'm not sure how I might best do it.

The other option, to pre-process, is doable I guess, but it might be more error-prone?

- andreas
On Mon, Oct 13, 2008 at 03:56:23PM +0200, Andreas Marienborg wrote:
> I was wondering if there is any way I can coach queryparser into
> something like this, so I don't have to pre-parse the query myself:
> (pseudo code)
>
> my $query_string = 'jazz oslo today';
>
> $qp->add_alias('today' => 'D20081013');
>
> my $q = $qp->parse($query_string);
>
> is($q->get_description, '(jazz AND oslo AND D20081013)');

This version is arguably slightly better since the date should act as a boolean filter term:

    ((jazz AND oslo) FILTER D20081013)

Both will match the same documents, but the weightings will be slightly different.

Not sure about the FILTER version, but the AND version can probably be achieved using synonyms:

http://xapian.org/docs/synonyms.html

Untested, but try something like:

    # Only need to do this once per day...
    $db->clear_synonyms("today");
    $db->add_synonym("today", "D20081013");

    $qp->set_database($db);
    my $q = $qp->parse_query($query_string,
        FLAG_PHRASE|FLAG_BOOLEAN|FLAG_LOVEHATE|FLAG_AUTO_SYNONYMS);

> basicly I want to somehow expand today to todays date, this week to a
> range, tomorrow to something etc, but not sure how I might best do it?

If you define multiple synonyms for the same word (by calling add_synonym() multiple times with the same first argument), they're ORed (and multi-word synonyms are supported with FLAG_AUTO_MULTIWORD_SYNONYMS), so `this week' is doable by defining it as a synonym for 7 D-prefix terms.

For `this year' you probably want to add Y-prefix terms with just the year to avoid an OR of 365 or 366 date terms...

> the other option, to pre-process, is doable I guess, but it might be
> more error-prone?

Yes, preprocessing input to the QueryParser like that is best avoided.

Cheers,
Olly
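
As a rough illustration of the term lists described above (one D-prefix term for `today', seven for `this week', a single Y-prefix term for `this year'), here is a minimal Python sketch. It only builds the term strings and does not touch Xapian at all. The function names, the Monday-to-Sunday week, and the DYYYYMMDD / YYYYY term formats are assumptions extrapolated from the D20081013 example in the thread, not anything the thread confirms.

```python
from datetime import date, timedelta


def day_term(d):
    """Return a D-prefixed date term such as 'D20081013' (assumed format)."""
    return "D" + d.strftime("%Y%m%d")


def synonym_terms(today):
    """Map each alias to the list of terms it could be expanded to."""
    # Assumption: "this week" means the Monday..Sunday week containing `today`.
    monday = today - timedelta(days=today.weekday())
    return {
        "today": [day_term(today)],
        "tomorrow": [day_term(today + timedelta(days=1))],
        "this week": [day_term(monday + timedelta(days=i)) for i in range(7)],
        # Assumption: documents are also indexed with a year-only term.
        "this year": ["Y" + today.strftime("%Y")],
    }


if __name__ == "__main__":
    # Print the alias table for the date used in the thread.
    for alias, terms in synonym_terms(date(2008, 10, 13)).items():
        print(alias, "->", " OR ".join(terms))
```

Each term in a list would then be registered with one add_synonym() call using the alias as the first argument, as suggested in the reply above, and the table would need rebuilding once per day.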
Andreas Marienborg
2008-Oct-16 11:03 UTC
[Xapian-discuss] a strange type of alias/expanded term
On Oct 16, 2008, at 5:32 AM, Olly Betts wrote:
>
> Untested, but try something like:
>
> # Only need to do this once per day...
> $db->clear_synonyms("today");
> $db->add_synonym("today", "D20081013");
>
> $qp->set_database($db);
> my $q = $qp->parse_query($query_string,
> FLAG_PHRASE|FLAG_BOOLEAN|FLAG_LOVEHATE|FLAG_AUTO_SYNONYMS);

One question regarding this: I have pretty persistent db-objects (I usually call reopen on exceptions). What happens if another process changes these synonyms? Are they stored in the DB, or just in the object instance?

Is a reopen on each search the best approach to reread synonyms if they are persisted on disk?

- andreas