On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote:
> During the indexing with omindex, the only thing you need to make sure of
> is indexing with prefix 'S' for title, as explained in the Letor
> documentation: xapian-letor/docs/letor.rst
>
> Previously when I edited omindex.cc it was modified as can be seen here
> <http://trac.xapian.org/browser/svn/branches/gsoc2011-parth/xapian-applications/omega/omindex.cc>
> on line 838 and block 1532-1559.
>
> But now we have the same as xapian-letor/bin/xapian-letor-update.cc, so
> before starting with questletor.cc you need to run it once for each db, and
> in this case all you need to make sure of is that the line below is in
> omindex.cc while indexing.
>
>   indexer.index_text(title, 1, "S");

On current trunk, we index the title with prefix "S" by default in
omindex, though with a wdf inc of 5 rather than 1:

  indexer.index_text(title, 5, "S");

So I don't think you need that change to omindex now.

Cheers,
    Olly
> On current trunk, we index the title with prefix "S" by default in
> omindex, though with a wdf inc of 5 rather than 1:
>
>   indexer.index_text(title, 5, "S");
>
> So I don't think you need that change to omindex now.

Yes, but please make sure to change 5 to 1, otherwise divide the final count
statistics by 5. :)

Parth.

> Cheers,
>     Olly
On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote:
> > On current trunk, we index the title with prefix "S" by default in
> > omindex, though with a wdf inc of 5 rather than 1:
> >
> >   indexer.index_text(title, 5, "S");
> >
> > So I don't think you need that change to omindex now.
>
> Yes, but please make sure to change 5 to 1, otherwise divide the final count
> statistics by 5. :)

We really need to resolve any instances where letor requires code in
other parts of Xapian to be patched.

In this case, possibly the bias on the title should be done differently,
but won't this just mean both the wdfs and the field length for the S
prefix are 5 times larger, and it won't matter?

Cheers,
    Olly
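
To make the scaling question concrete, here is a rough sketch of the two cases being discussed. The feature shapes and function names below are hypothetical and chosen only for illustration; they are not taken from the xapian-letor source. The sketch also assumes the title field length is the sum of the S-prefixed wdfs, so that it grows by the same factor as the wdfs do.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical feature shapes, for illustration only (assumption: not the
// actual xapian-letor feature code).

// Raw count feature: log(1 + title wdf of the query terms).
// This one changes when the wdf inc changes.
double raw_count_feature(unsigned title_wdf) {
    return std::log(1.0 + title_wdf);
}

// Length-normalised feature: log(1 + title wdf / title length).
// Numerator and denominator scale together, so the wdf inc cancels out.
double normalised_feature(unsigned title_wdf, unsigned title_len) {
    if (title_len == 0) return 0.0;  // empty title: define the feature as 0
    return std::log(1.0 + static_cast<double>(title_wdf) / title_len);
}

// Effect of indexing with index_text(title, wdf_inc, "S"): every
// occurrence contributes wdf_inc instead of 1.
unsigned scaled(unsigned count, unsigned wdf_inc) {
    return count * wdf_inc;
}
```

Under these assumptions, a title with 2 query-term occurrences out of 8 terms gives the same normalised value whether the wdf inc is 1 or 5, which is the "it won't matter" case. A raw count feature does differ, and dividing the stored wdf by 5 recovers the wdf-inc-1 value, which is the adjustment Parth describes.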