Hi James,

Thanks for clearing up the doubts I had earlier.

>>if we can introduce the variants using optional parameters that default to
>>(effectively) 'off' that might be better than distinct ones,

Yes, this will definitely be the better approach for introducing the variants of existing weighting functions. Thanks for the suggestion.

Next, I will try to come up with draft pseudo-code for each of those variants in the next few days. It would be helpful if you could review them before the coding period begins, as that will give me a clear picture of the implementation in advance.

>>you need to independently calculate, or independently
>>verify, the correct outputs for some test sets (you should be able to
>>use the existing test databases).

So, careful manual testing of the implemented code and automated testing through xapian-core/tests/api_weight.cc using the existing test databases is what I'd need to perform for complete testing of the implemented weighting functions. Please correct me if I am wrong or missing something here.

>>You should talk to Gaurav about that, in particular looking at the evaluation
>>work he did previously
>>(https://github.com/samuelharden/xapian-evaluation)

I've started exploring and trying to get this evaluation module running on my system. I'm facing some issues initially, so I'm trying to sort them out with help from Gaurav on IRC.

>>We may want to take the opportunity to discuss whether parts or all of
>>this evaluation framework can be moved into the main Xapian repo, and
>>if there are changes that will make it easier to use for evaluation in
>>future.

Yes, it'd be a huge plus for us as it would help to compare Xapian's performance based on the different weighting functions. I'll add this under "Additional tasks" in my project wiki and would like to work with Gaurav after completing my GSoC project.

>>If Nishad doesn't find time to take this forward,
>>it should be fine for you to pick up and complete this normalisation.

Sure, I'll do it as part of the Additional tasks after the GSoC period :)

>>Yes, that's a good idea. You might want, at the end of the project, to
>>transfer any remaining ideas and thoughts either into the bug tracker
>>or to somewhere on the wiki

I've got 3 ideas for this section so far after all discussions:

1. Implement the remaining SMART normalizations of the tf-idf weighting function.
2. Work with Gaurav to get parts of the evaluation module into the main repo to start with.

>>Good luck with them!

Thanks :)

Regards,
Vivek
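As an illustration of the "optional parameters that default to (effectively) 'off'" approach quoted in the message above, here is a minimal sketch. It does not use Xapian's real classes: `WeightParams`, `term_weight` and the `delta` parameter are hypothetical names, and the choice of a BM25-style formula with an additive lower bound (in the spirit of BM25+) is an assumption made only for the example, not something decided in this thread.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical parameter bundle for illustration; not Xapian's API.
struct WeightParams {
    double k1 = 1.2;     // term-frequency saturation
    double b = 0.75;     // document-length normalisation strength
    double delta = 0.0;  // variant switch: 0.0 means the variant is "off"
};

// BM25-style weight for a single term. With the default delta of 0.0 this
// is the plain formula; a positive delta adds a lower bound to the
// term-frequency component, which is how the variant gets enabled.
double term_weight(double tf, double doc_len, double avg_len, double idf,
                   const WeightParams& p = WeightParams()) {
    assert(avg_len > 0.0);
    double norm = p.k1 * (1.0 - p.b + p.b * doc_len / avg_len);
    double tf_part = (p.k1 + 1.0) * tf / (norm + tf);
    return idf * (tf_part + p.delta);
}
```

Callers that never mention `delta` keep getting the existing behaviour, so the variant needs no separate class; setting `delta` to a positive value opts in. Whether this fits each variant in the project would still need to be checked case by case.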
On Sun, May 08, 2016 at 04:36:16PM +0530, Vivek Pal wrote:
> >>you need to independently calculate, or independently
> >>verify, the correct outputs for some test sets (you should be able to
> >>use the existing test databases).
>
> So, careful manual testing of implemented code and automated testing
> through xapian-core/tests/api_weight.cc
> using the existing test databases is what I'd need to perform for complete
> testing of implemented weighting functions.

Almost -- the manual step should just be in calculating the correct outputs. All the actual testing, verifying that the weights come out correctly, should be automated.

> I've started exploring and trying to get this evaluation module
> running on my system. Facing some issues initially so trying to
> sort out those issues with the help from Gaurav on IRC.

Great -- I note that Olly has dropped something in IRC about this, so hopefully you're able to keep moving forward.

> >>We may want to take the opportunity to discuss whether parts or all of
> >>this evaluation framework can be moved into the main Xapian repo, and
> >>if there are changes that will make it easier to use for evaluation in
> >>future.
>
> Yes, it'd be a huge plus for us as it would help to compare
> Xapian's performance based on the different weighting functions.
> I'll add this under "Additional tasks" in my project wiki and would like to
> work with Gaurav after completing my GSoC project.

Perfect.

J

-- 
James Aylett, occasional trouble-maker
xapian.org
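The split described in the reply above (expected outputs worked out by hand, the comparison itself automated) might look roughly like the sketch below. It is self-contained and does not use Xapian's test harness or the real test databases: `tfidf_weight`, `Case`, `hand_calculated_cases` and `weights_match` are hypothetical names, and the simple tf * ln(N / df) formula is only an assumed stand-in for a weighting function under test.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy weighting function under test: tf * ln(N / df).
double tfidf_weight(double tf, double num_docs, double doc_freq) {
    assert(doc_freq > 0.0 && num_docs >= doc_freq);
    return tf * std::log(num_docs / doc_freq);
}

// One test case: the inputs plus the independently calculated weight.
struct Case {
    double tf, num_docs, doc_freq, expected;
};

// The manual step: expected values worked out by hand, once.
std::vector<Case> hand_calculated_cases() {
    return {
        {3.0, 6.0, 3.0, 2.0794415417},   // 3 * ln(2)
        {1.0, 5.0, 5.0, 0.0},            // ln(1) = 0
        {2.0, 10.0, 5.0, 1.3862943611},  // 2 * ln(2)
    };
}

// The automated step: compare every computed weight with its expected value.
bool weights_match(const std::vector<Case>& cases, double tol = 1e-9) {
    for (const Case& c : cases) {
        double got = tfidf_weight(c.tf, c.num_docs, c.doc_freq);
        if (std::fabs(got - c.expected) > tol) return false;
    }
    return true;
}
```

A test would then simply assert `weights_match(hand_calculated_cases())`, so nobody has to inspect weights by eye on each run. In the real project the equivalent checks would presumably live in xapian-core/tests/api_weight.cc and use the existing test databases, as discussed above.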
Hi Vivek,

I saw your comments on IRC, as noted by olly:

*<olly> vivekp: (if you check the logs) - you want: ./trec_index config*
*<olly> you want to run the compiled binary (no ".cc") not the source file...*

But I guess you are not able to compile the setup. I can write up the steps to compile it and send them across, along with sample files from config.

Most of the files in config are from a test collection taken from FIRE, so we would need to ask the FIRE team for permission to access those files (http://fire.irsi.res.in/fire/static/data). The access form needs to be signed by an organization before we are permitted to use the data.

@olly and @James: earlier, being part of the FIRE team, I was able to use this data. I'm not sure I still have it. Should we fill in the form and ask the FIRE team for permission to use this data?

Thanks,
Gaurav

On Mon, May 9, 2016 at 4:03 PM, James Aylett <james-xapian at tartarus.org> wrote:

> On Sun, May 08, 2016 at 04:36:16PM +0530, Vivek Pal wrote:
>
> > >>you need to independently calculate, or independently
> > >>verify, the correct outputs for some test sets (you should be able to
> > >>use the existing test databases).
> >
> > So, careful manual testing of implemented code and automated testing
> > through xapian-core/tests/api_weight.cc
> > using the existing test databases is what I'd need to perform for complete
> > testing of implemented weighting functions.
>
> Almost -- the manual step should just be in calculating the correct
> outputs. All the actual testing, verifying that the weights come out
> correctly, should be automated.
>
> > I've started exploring and trying to get this evaluation module
> > running on my system. Facing some issues initially so trying to
> > sort out those issues with the help from Gaurav on IRC.
>
> Great -- I note that Olly has dropped something in IRC about this, so
> hopefully you're able to keep moving forward.
>
> > >>We may want to take the opportunity to discuss whether parts or all of
> > >>this evaluation framework can be moved into the main Xapian repo, and
> > >>if there are changes that will make it easier to use for evaluation in
> > >>future.
> >
> > Yes, it'd be a huge plus for us as it would help to compare
> > Xapian's performance based on the different weighting functions.
> > I'll add this under "Additional tasks" in my project wiki and would like to
> > work with Gaurav after completing my GSoC project.
>
> Perfect.
>
> J
>
> --
> James Aylett, occasional trouble-maker
> xapian.org

-- 
Regards,
Gaurav Arora