We are equally excited about working with you over the summer.

I think you missed Olly's reply on IRC; you can find it in the logs here:
https://botbot.me/freenode/xapian/2018-04-24/?msg=99336093&page=1

- olly
  icebyte[m]: i think that probably needs to go through SFC (https://sfconservancy.org/) as the "legal entity"
- 2:05 am <https://botbot.me/freenode/xapian/msg/99336095/>
  icebyte[m]: i can talk to them about it

- Gaurav

On Fri, Apr 27, 2018 at 12:23 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
> Thanks for selecting my proposal for GSoC, looking forward to
> contributing further to Xapian. I've posted this in the IRC but didn't
> receive any reply, so I'm presuming this must've been missed and thus
> posting it here. As proposed, I plan to use ClueWeb09 Category B
> dataset for evaluating diversification. A hosted copy is available
> (http://lemurproject.org/clueweb09.php/index.php#Services) which may
> be accessed but requires a license. The license is free and granted to
> an organisation by applying online
> (http://lemurproject.org/clueweb09/organization_agreement.clueweb09.worder.Mar30-18.pdf).
> If a maintainer could have a look at this, that would be great. It's
> mentioned on the website that it takes around 2 weeks to obtain the
> license, and as discussed in the interview, I might evaluate the
> GLS-MPT implementation before moving on to optimizations (C2-GLS).
>
> On Sat, Mar 10, 2018 at 12:08 AM, Uppinder Chugh
> <uppinderchugh at gmail.com> wrote:
> >
> > Hi, I'd like to share my proposal for GSoC and get feedback on it.
> >
> > https://docs.google.com/document/d/1A4HF2lZBnLh1TUY3Y2DDUfz-nzbIL1NNAo8Adl3gN-8/edit?usp=sharing
> >
> > Thanks,
> > Uppinder Chugh
> >
> > On Mon, Feb 26, 2018 at 2:14 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
> >>
> >> In particular, I have the following doubts:
> >>
> >> a) Is wrapping Xapian::Mset matcher::get_set(..) suitable in this
> >> scenario and with the api? Also, how can I allow the user to manually
> >> allow diversification while he configures his result set via Matcher API?
> >>
> >> b) Should I include the LC clustering algorithm in xapian-core/cluster
> >> (as there's the base class Cluster which can be inherited) or make it
> >> part of diversification implementation?
> >>
> >> c) Apart from the proposed methods, I'd be writing automated tests,
> >> examples and documenting the new feature. Some tips here are appreciated
> >> as I've never written tests for code. Also, for documenting, I believe
> >> only getting-started-with-xapian should be updated with examples for
> >> using the new feature.
> >>
> >> Apart from the above, if I'm missing something or didn't go into enough
> >> detail, please let me know. :)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.xapian.org/pipermail/xapian-devel/attachments/20180427/083fdeb6/attachment.html>
Hi Uppinder,

Congratulations on being accepted into GSoC 2018 with Xapian!

> as discussed in the interview, I might evaluate the
> GLS-MPT implementation before moving on to optimizations (C2-GLS).

We discussed this, and the decision was to perform the evaluation after the optimizations, as you had originally proposed. So let's stick to your original plan and complete the implementation of C2-GLS before going ahead with evaluation.

Best Regards,
Amanda
Hi Uppinder,

I noticed that you have not updated the journal [1] since May 14th, so I would appreciate it if you could provide an update on the current status of the project. Also, have you applied for the TREC ClueWeb09 dataset?

[1] https://trac.xapian.org/wiki/GSoC2018/Diversification/Journal

Best Regards,
Amanda
Ricchiey Thomas, Vivek Pal and Amanda Jayanetti (Sorry, I don't know your IRC nicks, so I'm sending this via mailing list):

Please review PR #198 (https://github.com/xapian/xapian/pull/198). I'd like to get it to a mergeable state and quickly move on to optimisation and then evaluation of diversification.

On Tue, Jun 5, 2018 at 9:18 PM, Amanda Jayanetti <amandajayanetti at gmail.com> wrote:
> Great! Thanks Uppinder.
>
> Best Regards,
> Amanda
>
> On Tue, Jun 5, 2018 at 5:18 PM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
>> Hi Amanda,
>>
>> I have updated the journal. Regarding the TREC ClueWeb09 dataset, I
>> have contacted Olly. I cannot directly apply for the dataset myself.
>>
>> Sincerely,
>> Uppinder