Hey guys, hi. :)

I've finished implementing the PL2 scheme. The bounds I have implemented for it are as tight as I could make them, given the nature of the scheme and my mathematical skills. However, tight bounds for the other named DFR schemes will be easier to implement because their formulas are much simpler than PL2's. I will send in a pull request in a couple of days, once I'm done with the tests and the documentation. I'll now start working on the DPH scheme as described in Section 3 here:
http://trec.nist.gov/pubs/trec18/papers/uglasgow.BLOG.ENT.MQ.RF.WEB.pdf

Now that GSoC is approaching, I want to start working on my proposal to make it as detailed as possible. My aim is to implement document weighting and query expansion using the DFR framework (currently, we have a hard-coded formula for query expansion). I hope to complete as many named DFR schemes as I can before the application period starts, so that during GSoC I can focus on implementing the DFR framework, which will allow users to create any DFR scheme they want, and on implementing query expansion using that framework.

I hope to be able to do the following work by the end of GSoC:

1.) About 8 frequently used named DFR schemes: those mentioned on the Terrier homepage and those mentioned by Olly on IRC. Each of these will be an independent weighting scheme subclassed from Xapian::Weight.

2.) A DFR framework which allows the creation of any DFR scheme by choosing a probabilistic model, a risk gain normalization and a term frequency normalization.

3.) Query expansion using the DFR schemes, allowing the user to choose any named scheme to expand the query (just like we do for MSet).

I hope to be able to finish at least 50% of 1.) before April ends so that I can focus on 2.) and 3.) during the summer.

Please do comment on this and let me know what you think. Also, I have no prior experience with writing proposals, so could you please tell me what a proposal for something like this should include?
I'd really appreciate your help.

Regards,
Aarsh

On Fri, Mar 15, 2013 at 10:02:24PM +0530, aarsh shah wrote:
> Please do comment on this and let me know what you think.

We discussed the proposal plan on IRC since, but if there's anything you want to discuss more, please say.

> Also,I have no
> prior experience with writing proposals and so,please can you tell me what
> a proposal for something like this should include ? I'd really appreciate
> your help.

For the benefit of others reading the list, there are 3 great links to resources on putting together a good GSoC proposal here:

http://trac.xapian.org/wiki/GSoCApplicationTemplate

The application template itself hasn't been updated from 2012 yet. The dates in there need updating, and I suspect we'll tweak the content a bit, but it should give a rough idea of what to expect.

The other thing I'd recommend is getting your application in early on in the application period. We'll try to review applications as they come in, and give feedback so you can revise and improve them (which you can do right up until the deadline). However, the rate at which new applications come in inevitably increases towards the deadline, so it's likely to take us longer to respond with comments, and there's less time left for you to act on the feedback you get.

Cheers,
    Olly