Hi,

I just went through the overview of Query Processing in Xapian ( http://www.xapian.org/docs/overview.html & http://www.xapian.org/docs/intro_ir.html ). According to the documentation, Xapian does not incorporate proximity-based searching or proximity-based relevance. During query processing, each term's weight is evaluated, every document is ranked according to the terms' weights, and the query operators (AND, OR, NOT, etc.) are applied. Nowhere does it mention the use of term positions.

My question is: are positions used for relevance during searching, or are they only used for proximity-based searching such as PHRASE or NEAR? In either case, how is this information used?

Regards,
Sharjeel
On Tue, Dec 05, 2006 at 11:58:57AM +0500, Sharjeel Ahmed Qureshi wrote:
> I just went through the overview of Query Processing in Xapian (
> http://www.xapian.org/docs/overview.html &
> http://www.xapian.org/docs/intro_ir.html ). According to the
> documentation, Xapian does not incorporate Proximity based searching and
> proximity based relevance.

Proximity searching is mentioned in overview.html, for example here:

    Xapian::Query::OP_NEAR
        Return documents where the terms are within the specified
        distance of each other.

    Xapian::Query::OP_PHRASE
        Return documents where the terms are within the specified
        distance of each other and in the given order.

> My question is if positions are used for relevance during searching or
> are they only used for proximity based searching such as PHRASE or
> NEAR?

There's no proximity based relevance at present.

Cheers,
Olly
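To make the difference between the two operators concrete, here is a toy sketch in plain Python. It is not Xapian's implementation and does not use the Xapian API: the function names, the two-term restriction, and the convention that "distance" means the difference between word positions are all assumptions made for illustration, and Xapian's exact window convention may differ. It only shows that both operators decide whether a document matches, and that PHRASE additionally requires the given order.

```python
def near(pos_a, pos_b, window):
    """Toy NEAR: some occurrence of A and some occurrence of B lie
    within `window` positions of each other, in either order.
    (Assumed distance convention; not Xapian's actual code.)"""
    return any(abs(a - b) <= window for a in pos_a for b in pos_b)


def phrase(pos_a, pos_b, window):
    """Toy PHRASE: like near(), but A must occur before B."""
    return any(0 < b - a <= window for a in pos_a for b in pos_b)


# Hypothetical document "bar x foo": "bar" at position 1, "foo" at position 3.
foo_positions = [3]
bar_positions = [1]

print(near(foo_positions, bar_positions, 5))    # close enough, order ignored
print(phrase(foo_positions, bar_positions, 5))  # "foo" does not precede "bar"
print(phrase(bar_positions, foo_positions, 5))  # "bar" precedes "foo"
```

Both functions return only a yes/no answer, which fits the point above: positions filter which documents match, but do not feed into the relevance weight.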
Another, probably more general, question would be: How can proximity-weighted searches be done with Xapian?

--Philip

On 12/4/06, Sharjeel Ahmed Qureshi <sharjeel.ahmed@vahzay.com> wrote:
> Hi,
>
> I just went through the overview of Query Processing in Xapian (
> http://www.xapian.org/docs/overview.html &
> http://www.xapian.org/docs/intro_ir.html ). According to the
> documentation, Xapian does not incorporate Proximity based searching and
> proximity based relevance. During the query processing, each term's
> weight is evaluated, every document is ranked according to term's weight
> and also query operator (AND, OR, NOT) etc is applied. Nowhere does it
> mention use of positions of terms. My question is if positions are used
> for relevance during searching or are they only used for proximity based
> searching such as PHRASE or NEAR? In any case how is this information
> used?
>
> Regards,
> Sharjeel
On Tue, Dec 05, 2006 at 09:46:45PM -0800, Philip Neustrom wrote:
> Another, probably more general, question would be: How can
> proximity-weighted searches be done with Xapian?

You could take a proximity-weighted query for "foo" and "bar" and generate a query like:

    (foo AND bar) OR (foo NEAR/5 bar)

That would match all documents containing both "foo" and "bar", but give more weight to those where they occur within 5 words of each other.

Cheers,
Olly
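The query generation suggested above could be sketched as a small helper. This is a minimal illustration in plain Python, not part of Xapian: the function name `proximity_boosted_query` and its parameters are hypothetical, and the resulting string is assumed to be handed to a query parser that understands AND, OR and NEAR/n, with positional information present in the index. The NEAR clause is OR-ed in so that it acts as a boost rather than a filter; AND-ing it in would restrict the results to documents where the NEAR clause itself matches.

```python
def proximity_boosted_query(term_a, term_b, window=5):
    """Build a query string that requires both terms and adds extra
    weight when they occur within `window` words of each other.
    (Hypothetical helper for illustration only.)"""
    if window < 1:
        raise ValueError("window must be at least 1")
    required = f"{term_a} AND {term_b}"       # every match has both terms
    boost = f"{term_a} NEAR/{window} {term_b}"  # extra weight when close
    return f"({required}) OR ({boost})"


print(proximity_boosted_query("foo", "bar"))
# prints: (foo AND bar) OR (foo NEAR/5 bar)
```

Since a NEAR match implies both terms are present, the OR does not widen the result set beyond documents containing both terms; it only lets the NEAR clause contribute additional weight where it matches.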