Hi guys,

I am Wenjin from the Graduate School of the Chinese Academy of Sciences, pursuing a master's degree. My current research interests include using data mining and information retrieval technology to analyze software engineering (SE) data and support SE.

I am very interested in the "Weight Schemes" project, and in the last few days I have learnt some details about the DFR model family by reading some papers and web pages. I found that the Terrier Project (http://terrier.org/) has implemented most of the DFR schemes in Java. After briefly reading the source of Terrier's package (org.terrier.matching.models), I think "Weight Schemes" could imitate that package, in C++ of course.

It would be better to implement a generic DFR weighting model that allows any DFR scheme to be generated and evaluated, since DFR is a framework, or model family, which contains many basic models and different normalizations. So I would like to know what our "Weight Schemes" project includes.

Wenjin Wu
On Mon, Mar 28, 2011 at 10:24:30AM +0800, wuwenjin wrote:
> I am very interested in the "Weight Schemes" project, and in the last few
> days I have learnt some details about the DFR model family by reading some
> papers and web pages. I found that the Terrier Project (http://terrier.org/)
> has implemented most of the DFR schemes in Java.

Yes, Terrier implements a number of DfR schemes.

> After briefly reading the source of Terrier's package
> (org.terrier.matching.models), I think "Weight Schemes" could imitate that
> package, in C++ of course.

Xapian's weighting schemes are structured in a particular way to allow for various optimisations. You need to subclass Xapian::Weight and implement various methods to implement them, so I would suggest just starting from the formulae - trying to directly translate Terrier's code to C++ will give you Java-esque C++ code which doesn't actually fit where you need it.

You can see the Xapian::Weight API here:

http://trac.xapian.org/browser/trunk/xapian-core/include/xapian/weight.h

> It would be better to implement a generic DFR weighting model that allows
> any DFR scheme to be generated and evaluated, since DFR is a framework, or
> model family, which contains many basic models and different normalizations.

An interesting idea. Although it's a family of weighting schemes, I suspect you'd find you ended up switching between different implementations for each DfR scheme internally, and that's better done by subclassing really. But probably worth thinking about further.

Cheers,
    Olly
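
As an illustration of what "starting from the formulae" might look like, here is a minimal, self-contained sketch of one DfR scheme, InL2 (basic model In, Laplace after-effect, Normalisation 2), written as plain C++ functions. It deliberately does not use Xapian::Weight or any Terrier code: the struct TermStats, its fields, and all function names are hypothetical and chosen only for this example, and a real Xapian weighting scheme would obtain these statistics through the Xapian::Weight API instead.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the statistics the formula needs.
// These names are illustrative only; they are not Xapian or Terrier types.
struct TermStats {
    double tf;           // frequency of the term within this document
    double doc_len;      // length of this document
    double avg_doc_len;  // average document length in the collection
    double num_docs;     // N: number of documents in the collection
    double doc_freq;     // n_t: number of documents containing the term
};

// Normalisation 2: tfn = tf * log2(1 + c * avg_doc_len / doc_len).
inline double normalisation2(const TermStats& s, double c) {
    assert(s.doc_len > 0.0 && c > 0.0);
    return s.tf * std::log2(1.0 + c * s.avg_doc_len / s.doc_len);
}

// Basic model In (inverse document frequency):
// tfn * log2((N + 1) / (n_t + 0.5)).
inline double basic_model_in(const TermStats& s, double tfn) {
    return tfn * std::log2((s.num_docs + 1.0) / (s.doc_freq + 0.5));
}

// Laplace after-effect (L): 1 / (tfn + 1).
inline double after_effect_laplace(double tfn) {
    return 1.0 / (tfn + 1.0);
}

// InL2 weight of one term in one document, composed from the three
// parts above. The default c = 1.0 is an assumption for this sketch.
inline double inl2_weight(const TermStats& s, double c = 1.0) {
    const double tfn = normalisation2(s, c);
    return after_effect_laplace(tfn) * basic_model_in(s, tfn);
}
```

For example, with tf = 3 in a document of average length, N = 2, n_t = 1 and c = 1, the normalised frequency is 3 and inl2_weight returns 0.75. Each DfR scheme swaps in a different basic model, after-effect or normalisation, which is the per-scheme switching discussed above.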
Hi Olly,

I have submitted my proposal for "Weighting Schema". If you get some time to read it, I would appreciate your suggestions.

http://socghop.appspot.com/gsoc/proposal/review/google/gsoc2011/kevinking/1001#

Regards,
Wenjin Wu