jarrod roberson
2006-May-31 02:06 UTC
[Xapian-discuss] Phrase Query vs AND Query? Why don't these find the same things?
PHRASE QUERY

Xapian::Query((MBOX:12345678-1234-1234-1234-1234567890ab AND (LP:backup:(pos=1) PHRASE 6 LP:c::(pos=2) PHRASE 6 LP:program files:(pos=3) PHRASE 6 LP:Mozilla Firefox:(pos=4) PHRASE 6 LP:res:(pos=5) PHRASE 6 LP: table-add-column-after-hover.gif:(pos=6))))

AND QUERY

Xapian::Query((MBOX:12345678-1234-1234-1234-1234567890ab AND LP:backup:(pos=1) AND LP:c::(pos=2) AND LP:program files:(pos=3) AND LP:Mozilla Firefox:(pos=4) AND LP:res:(pos=5) AND LP: table-add-column-after-hover.gif:(pos=6)))

Given that I have done .add_posting() correctly with the LP: prefixed terms, and the second (AND) query returns the set of data I expect, why does the first query NOT return ANYTHING?

Here is what delve tells me:

>delve -r 1 -1 -d -v -k /index/wfs/
Values for record #1:
1:/backup/c:/program files/Mozilla Firefox/res/table-add-column-after-hover.gif
Data for record #1:
/wfs/0/+/D/0+DM0wZ6y_mHubbNJjQXrgqW64s=/table-add-column-after-hover.gif /12345678-1234-1234-1234-1234567890ab
Term List for record #1:
CDATE:1135755971 1 1
ETAG:DA39A3EE5E6B4B0D3255BFEF95601890AFD80709 1 1
LANG:en 1 1
LEN:0 1 1
LP:Mozilla Firefox 1 1
LP:backup 1 1
LP:c: 1 1
LP:program files 1 1
LP:res 1 1
LP:table-add-column-after-hover.gif 1 1
MBOX:12345678-1234-1234-1234-1234567890ab 1 1
MDATE:1135755971 1 1
MIME:application/octet-stream 1 1
NAME:table-add-column-after-hover.gif 1 1
TYPE:0 1 1

Here is what my test program outputs with the AND QUERY. The numbers in [] after the LP: prefixed terms are their posting positions:

found 1
DOCID:1
PPATH:/wfs/0/+/D/0+DM0wZ6y_mHubbNJjQXrgqW64s=/table-add-column-after-hover.gif/12345678-1234-1234-1234-1234567890ab
LPATH:/backup/c:/program files/Mozilla Firefox/res/table-add-column-after-hover.gif
CDATE:1135755971 (Wed, 28 Dec 2005 07:46:11 GMT)
ETAG:DA39A3EE5E6B4B0D3255BFEF95601890AFD80709
LANG:en
LEN:0
LP:Mozilla Firefox [4]
LP:backup [1]
LP:c: [2]
LP:program files [3]
LP:res [5]
LP:table-add-column-after-hover.gif [6]
MBOX:12345678-1234-1234-1234-1234567890ab
MDATE:1135755971 (Wed, 28 Dec 2005 07:46:11 GMT)
MIME:application/octet-stream
NAME:table-add-column-after-hover.gif
TYPE:0 (file)

Does anyone have any ideas how to get a PHRASE QUERY to return the data? Also, what will ultimately be "better" given the context of these queries, the AND QUERY or the PHRASE QUERY?
Olly Betts
2006-Jun-01 01:24 UTC
[Xapian-discuss] Phrase Query vs AND Query? Why don't these find the same things?
On Tue, May 30, 2006 at 09:06:48PM -0400, jarrod roberson wrote:
> given that I have done .add_posting() correctly with the LP: prefixed terms
> and the second AND Query returns the set of data I expect, why does the
> first query NOT return ANYTHING?

I can't reproduce this. The attached Python script finds one match (and it also does so using quartz or flint instead of the inmemory backend).

So it seems the circumstances required to trigger this are somewhat subtle. Can you produce a self-contained example, similar to phrase.py, which shows this?

> also what will ultimately be "better" given the context of these queries, the
> AND QUERY or the PHRASE QUERY?

I'm not sure I understand what you're trying to search for here. But generally a PHRASE query is better when the terms are only useful in a particular order, or when there are common false positives which include all the terms but aren't relevant.

A phrase query is slower than an AND, so that's a reason to favour AND if there's no other reason to pick one or the other.

Cheers,
Olly

-------------- next part --------------
A non-text attachment was scrubbed...
Name: phrase.py
Type: text/x-python
Size: 599 bytes
Desc: not available
Url : http://lists.tartarus.org/pipermail/xapian-discuss/attachments/20060601/37fd1833/phrase.py
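
As a rough illustration of the AND versus PHRASE distinction discussed in this thread, here is a toy model in plain Python. It is not Xapian's implementation and does not use the Xapian API, and it is not the scrubbed phrase.py attachment. The function names (matches_and, matches_phrase), the dict-of-positions document layout, and the window rule (last position minus first position must be less than the window) are all assumptions made for this sketch. It is not a diagnosis of the problem reported above.

```python
from itertools import product

def matches_and(doc, terms):
    """AND: every term is indexed in the document, with or without positions."""
    return all(t in doc for t in terms)

def matches_phrase(doc, terms, window):
    """PHRASE (toy rule, an assumption): the terms occur in the given order
    and the span from the first to the last position is less than `window`."""
    if not matches_and(doc, terms):
        return False
    # Try every combination of one recorded position per term.
    for combo in product(*(doc[t] for t in terms)):
        in_order = all(a < b for a, b in zip(combo, combo[1:]))
        if in_order and combo[-1] - combo[0] < window:
            return True
    return False

# Terms and positions taken from the test program output in this thread.
terms = [
    "LP:backup",
    "LP:c:",
    "LP:program files",
    "LP:Mozilla Firefox",
    "LP:res",
    "LP:table-add-column-after-hover.gif",
]

# Hypothetical documents: term -> list of positions.
doc_in_order = {t: [i] for i, t in enumerate(terms, start=1)}
doc_reversed = {t: [len(terms) - i] for i, t in enumerate(terms)}
doc_no_positions = {t: [] for t in terms}  # terms indexed without positions

for name, doc in [("in order", doc_in_order),
                  ("reversed", doc_reversed),
                  ("no positions", doc_no_positions)]:
    print(name, matches_and(doc, terms), matches_phrase(doc, terms, 6))
```

In this model, AND accepts all three documents, while PHRASE accepts only the one whose terms carry positions in the queried order. The phrase check also does more work per candidate document than the AND check, which is in line with the remark above that a phrase query is slower than an AND.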