Hey all,

I am Aditi Gupta, an aspiring Google Summer of Code 2015 participant. I browsed through the project ideas for GSoC 2015 on the wiki page and found two of them particularly interesting: 'Weighting Schemes' and 'Learning to Rank'. Since I took the Information Retrieval course at my university last semester, these two projects appealed to me the most.

As part of my project for that course, I developed a prototype automatic image annotation system modeled on the Latent Dirichlet Allocation (LDA) probabilistic topic modelling approach. As a possible extension of the features the Xapian library takes into account when weighting terms, the way LDA assigns scores could be considered as a means of improving the precision and recall of search results. Also, if some context-specific parameters can be modeled in the weighting schemes, it might help improve the performance of the system.

I am also currently doing a study-oriented project under one of my professors on automatic text summarization, and have done a literature review of the various techniques used to assign relevance scores to phrases and words in a sentence in order to pick salient sentences from a text. Hopefully this background can get me started on finding relevant extensions to the weighting and ranking schemes. Any suggestions and guidance on this front would be really appreciated.

I have Xapian 1.2.19 built on my machine and am currently going through the Getting Started guide to get into the nitty-gritty of things.

Looking forward to a constructive discussion.

Cheers,
Aditi
On 14 Feb 2015, at 19:06, aditi gupta <aditiguptabits at gmail.com> wrote:

> I am Aditi Gupta, an aspiring Google Summer of Code 2015 participant. I had browsed through the plausible project ideas for GSoC 2015 on the wiki page and particularly found two ideas very interesting. The projects 'Weighting Schemes' and 'Learning to Rank' have seemed to capture my imagination.

Aditi, welcome! The getting started guide you've found is a good place to get a feel for how Xapian fits together. If you haven't found it, we also have a guide for potential GSoC students <http://trac.xapian.org/wiki/GSoC%20Guide>, which has some useful information and suggestions as well.

Note that we're still working on the Learning to Rank project for this year. It will likely focus on stability, tests, and getting it ready for distribution and general use, so if you were drawn to it because you're interested in the Information Retrieval side, you may be better off looking at Weighting Schemes instead.

However, as the project page says, if you have an idea that's not on the list, please talk to us. We aren't the source of all good ideas for things to do with Xapian!

Best,
James

-- 
James Aylett, occasional trouble-maker
xapian.org