Hi,

I'd like to share my proposal for GSoC and get feedback on it.

https://docs.google.com/document/d/1A4HF2lZBnLh1TUY3Y2DDUfz-nzbIL1NNAo8Adl3gN-8/edit?usp=sharing

Thanks,
Uppinder Chugh

On Mon, Feb 26, 2018 at 2:14 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
> In particular, I have the following doubts:
>
> a) Is wrapping Xapian::Mset matcher::get_set(..) suitable in this scenario
> and with the API? Also, how can I let the user manually enable
> diversification while configuring the result set via the Matcher API?
>
> b) Should I include the LC clustering algorithm in xapian-core/cluster (as
> there's the base class Cluster which can be inherited) or make it part of
> the diversification implementation?
>
> c) Apart from the proposed methods, I'd be writing automated tests and
> examples, and documenting the new feature. Some tips here would be
> appreciated, as I've never written tests for code. Also, for documentation,
> I believe only getting-started-with-xapian needs to be updated with
> examples of using the new feature.
>
> Apart from the above, if I'm missing something or didn't go into enough
> detail, please let me know. :)
Thanks for selecting my proposal for GSoC. I'm looking forward to contributing further to Xapian.

I've posted this in the IRC but didn't receive any reply, so I'm presuming this must've been missed and thus posting it here.

As proposed, I plan to use the ClueWeb09 Category B dataset for evaluating diversification. A hosted copy is available (http://lemurproject.org/clueweb09.php/index.php#Services), but accessing it requires a license. The license is free and is granted to an organisation that applies online (http://lemurproject.org/clueweb09/organization_agreement.clueweb09.worder.Mar30-18.pdf). If a maintainer could have a look at this, that would be great.

The website mentions that it takes around 2 weeks to obtain the license, and as discussed in the interview, I might evaluate the GLS-MPT implementation before moving on to optimizations (C2-GLS).

On Sat, Mar 10, 2018 at 12:08 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
> Hi, I'd like to share my proposal for GSoC and get feedback on it.
>
> https://docs.google.com/document/d/1A4HF2lZBnLh1TUY3Y2DDUfz-nzbIL1NNAo8Adl3gN-8/edit?usp=sharing
>
> Thanks,
> Uppinder Chugh
We are equally excited about working with you over the summer.

I think you missed Olly's reply on IRC; you can find it in the logs here:
https://botbot.me/freenode/xapian/2018-04-24/?msg=99336093&page=1

- olly icebyte[m]: i think that probably needs to go through SFC (https://sfconservancy.org/) as the "legal entity"
- 2:05 am <https://botbot.me/freenode/xapian/msg/99336095/> icebyte[m]: i can talk to them about it

- Gaurav

On Fri, Apr 27, 2018 at 12:23 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
> Thanks for selecting my proposal for GSoC, looking forward to
> contributing further to Xapian. I've posted this in the IRC but didn't
> receive any reply, so I'm presuming this must've been missed and thus
> posting it here.