Jorge Cardoso Leitão
2014-May-17 14:25 UTC
[Xapian-discuss] Search for exact match on document
Hi.

I'm using Xapian, and I'm having some difficulty doing the following. I have a db with a field "name":

1 with name="foo"
2 with name="foo bar"
3 with name="foo bar for you u2"

I index this in the database using a prefix FNAME indicating the field the strings belong to. delve -a returns something like

FNAMEfoo FNAMEbar ...

My question is: how can I query by name for entry 2 without matching entry 1 or 3?

I tried using Query((FNAMEfoo PHRASE 2 FNAMEbar)), but this returns two entries (naturally).

I know that, to obtain entry 1 alone, I can query Query('"XNAME8"') (i.e. using the name between "'s), but for entry 2 I'm not able to construct the query.

I'm using the Python Xapian bindings.

More generally, the question can be posed in the following terms: how can I query exact matches to fields?

Thank you very much,
Jorge
On Sat, May 17, 2014 at 04:25:06PM +0200, Jorge Cardoso Leitão wrote:
> More generally, the question can be posed on the following terms: how can I
> query exact matches to fields?

I'd suggest indexing special terms to mark the start and end of fields for which you want to do exact matching, such that the start term's position is one before the first real position in that field, and the end term's position is one after the last real position.

If you make these terms FNAME^ and FNAME$, then you can build a suitable phrase query in Python like so:

    terms = ['FNAME^', 'FNAMEfoo', 'FNAMEbar', 'FNAME$']
    xapian.Query(xapian.Query.OP_PHRASE, terms)

You can also use these special terms to anchor a match to just the start or just the end of a field.

Cheers,
Olly
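The indexing side of this suggestion might look roughly like the sketch below. It is only an illustration under stated assumptions, not code from the thread: the helper names (anchored_postings, index_field, exact_match_terms) are hypothetical, the field text is simply lowercased and split on whitespace with no stemming, and positions start at 1 (pass a different first_pos if the document already uses those positions for other fields). The only Xapian call it relies on is Document.add_posting(term, position).

```python
def anchored_postings(text, prefix="FNAME", first_pos=1):
    """Return (term, position) pairs for one field value.

    The start marker sits one position before the first word and the
    end marker one position after the last word.
    """
    # Assumption: lowercase + whitespace split, no stemming.
    words = text.lower().split()
    postings = [(prefix + "^", first_pos)]
    for offset, word in enumerate(words, start=1):
        postings.append((prefix + word, first_pos + offset))
    postings.append((prefix + "$", first_pos + len(words) + 1))
    return postings


def index_field(doc, text, prefix="FNAME", first_pos=1):
    """Add the anchored postings to a xapian.Document (hypothetical helper)."""
    for term, pos in anchored_postings(text, prefix, first_pos):
        doc.add_posting(term, pos)


def exact_match_terms(text, prefix="FNAME"):
    """Term list to hand to an OP_PHRASE query for an exact field match."""
    return [term for term, _ in anchored_postings(text, prefix)]
```

With documents indexed this way, exact_match_terms("foo bar") yields the same term list as in the reply above, which can then be passed to xapian.Query(xapian.Query.OP_PHRASE, terms). Dropping the last term from that list would give a start-anchored phrase, and dropping the first an end-anchored one.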
Jorge Cardoso Leitão
2014-May-19 13:34 UTC
[Xapian-discuss] Search for exact match on document
Olly, thank you for your reply and for the suggestion: it is indeed a nice solution.

Is there a common place with such tips? Maybe it is outside the scope of Xapian and more related to "how to index a corpus", but since most Xapian users need to know both things, a set of common use cases on indexing could help (e.g. on Read the Docs).

In any case, thank you for doing Xapian; I'm the new maintainer of the Xapian backend to Django-Haystack, and I'm eager to bring it to the Django community again (the development stalled for 2 years, so it became unusable and died out).

Cheers,
Jorge

On Mon, May 19, 2014 at 12:02 PM, Olly Betts <olly at survex.com> wrote:
> On Sat, May 17, 2014 at 04:25:06PM +0200, Jorge Cardoso Leitão wrote:
> > More generally, the question can be posed on the following terms: how can I
> > query exact matches to fields?
>
> I'd suggest indexing special terms to mark the start and end of fields
> for which you want to do exact matching, such that the start term's
> position is one before the first real position in that field, and the
> end term's position is one after the last real position.
>
> If you make these terms FNAME^ and FNAME$, then you can build a suitable
> phrase query in Python like so:
>
>     terms = ['FNAME^', 'FNAMEfoo', 'FNAMEbar', 'FNAME$']
>     xapian.Query(xapian.Query.OP_PHRASE, terms)
>
> You can also use these special terms to anchor a match to just the start
> or just the end of a field.
>
> Cheers,
> Olly