Hello all,
I am very sorry that I did not include the xapian-devel mailing list in my previous mail.
Thanks for responding to my mail.
Mohd Azeem
NIT UK
________________________________
From: Olly Betts <olly at survex.com>
To: Mohd Azeem <azeem201001 at yahoo.in>
Cc: Parth Gupta <parthg.88 at gmail.com>
Sent: Saturday, 31 March 2012 11:40 AM
Subject: Re: GSoC, Xapian Project Weighting Schemes
Please DON'T mail individual mentors privately - use the xapian-devel
mailing list instead.
On Sat, Mar 31, 2012 at 01:35:16PM +0800, Mohd Azeem wrote:
> Presently Xapian
> provides the ability to rank search result by the mathematical
> formulas like tf*idf and BM25.
Actually, you can already rank results by incoming hyperlink counts, or
any query-independent factor(s) you want to keep track of, and you can
combine that with term-based weights.  This is done by creating a
PostingSource subclass and using it in the query:
http://xapian.org/docs/postingsource.html
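
To illustrate the idea of adding a query-independent weight to a
term-based weight, here is a toy sketch in plain Python.  It does not use
the Xapian API at all: the function name, the link_scale parameter and the
sample data are all hypothetical, and a real implementation would supply
the per-document weight from a PostingSource subclass as described at the
URL above.

```python
# Toy model only: plain Python, NOT the Xapian API.
# All names and numbers here are hypothetical and purely illustrative.

def combine_additively(term_weights, link_counts, link_scale=0.1):
    """Rank doc ids by term-based weight plus a scaled in-bound link count."""
    scores = {}
    for docid, term_weight in term_weights.items():
        # Query-independent contribution; documents with no recorded
        # in-bound links contribute 0.
        static_weight = link_scale * link_counts.get(docid, 0)
        # The two contributions are added, not multiplied.
        scores[docid] = term_weight + static_weight
    # Highest combined score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


# Hypothetical term-based weights (e.g. from BM25) keyed by document id.
term_weights = {1: 2.0, 2: 1.8, 3: 0.5}
# Hypothetical in-bound hyperlink counts keyed by document id.
link_counts = {2: 10, 3: 1}

ranking = combine_additively(term_weights, link_counts)
print(ranking)
```

In this sketch document 2 overtakes document 1 because its link-based
contribution outweighs the small gap in term-based weight; how strongly
links should count (link_scale here) is a tuning decision.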
> weight S= S1(Weight calculated by BM25) * S2(weight of document
> calculated based on
You can't multiply the factors like this with a PostingSource, only add
them - is there any theoretical or experimental basis for multiplying
the weight contributions in this situation?
So your suggested project would involve counting up in-bound hyperlinks,
and writing a simple PostingSource class to use them, plus perhaps
adding a new query operator which multiplies weights.  Unfortunately
that doesn't seem like it would be nearly enough work for a GSoC
project.
Thanks for the suggestion though.
Cheers,
    Olly