Hi Parth,

I have a few questions:

1. Could you please provide me with the link to the query file and qrel file for the dataset available at http://www.mpi-inf.mpg.de/departments/d5/software/inex/ ?

2. I wanted to know how automated testing would be implemented. Will there be test cases along the lines of "a test query must match these particular N results, in this particular ranking"? Or will it be in terms of evaluation of the IR algorithm, something like "the MAP and NDCG scores must be >= a particular value"?

3. When are you available on IRC? It's easier to communicate through IRC (at least for me).

This last question is just out of curiosity: is there nothing like k-fold validation in Learning to Rank?
Hi Mayank,

> 1. Could you please provide me with the link to the query file and qrel
> file for the dataset available at
> http://www.mpi-inf.mpg.de/departments/d5/software/inex/ ?

Very recently I added a small section, "exercise to warm-up", for the Learning-to-Rank project on the project ideas page (http://trac.xapian.org/wiki/GSoCProjectIdeas#Project:LearningtoRank); you should do this exercise. You will find the information you asked for there.

> 2. I wanted to know how automated testing would be implemented. Will
> there be test cases along the lines of "a test query must match these
> particular N results, in this particular ranking"? Or will it be in
> terms of evaluation of the IR algorithm, something like "the MAP and
> NDCG scores must be >= a particular value"?

Yes, for the ranking algorithms, the automated test contains a set of queries, their relevance judgements and a document collection. For this part, you don't need a large dataset like INEX. For example, you can take 20 simple documents, prepare 5 queries over them, and judge each document for each query as relevant or non-relevant. During the test, run the algorithms on this dataset and check the values of MAP and/or NDCG and/or other evaluation metrics. Ideally, the values should be the same on every run, as long as neither the ranking algorithm/process nor the evaluation metric itself has changed. (A sketch of such a test is included after this message.)

> 3. When are you available on IRC? It's easier to communicate through
> IRC (at least for me).

These days I am not able to keep up with the discussions on IRC, so yes, if I am there and see you, I will ping you; otherwise just ask your questions there and someone from our side will respond (if a question isn't meant specifically for me).

> This last question is just out of curiosity: is there nothing like
> k-fold validation in Learning to Rank?

Well, k-fold validation is more on the evaluation side, and it is very much part of the learning-to-rank culture; in fact, whenever machine learning is involved, k-fold validation becomes inevitable. You usually see this feature when a library is meant especially for researchers. K-fold validation is yet to be included in xapian-letor, and it would be addressed together with the evaluation metric issues. (A sketch of query-level k-fold splitting also follows below.)

Cheers,
Parth.
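Below is a minimal sketch of the kind of deterministic test described in the answer to question 2. It is not xapian-letor's actual test harness; the tiny dataset, the function names and the expected values are all illustrative. The point is that on a fixed judged collection the metric values are exact, so any drift signals a change in the ranking code or in the metric implementation itself.

    # Sketch only (not xapian-letor's test code): a tiny judged collection,
    # fixed rankings, and assertions that MAP/NDCG never change unless the
    # ranker or the metric changes. All names here are illustrative.
    import math

    def average_precision(ranked_ids, relevant_ids):
        """AP for one query: mean of precision@k at each relevant hit."""
        hits, precisions = 0, []
        for k, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant_ids:
                hits += 1
                precisions.append(hits / k)
        return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

    def ndcg(ranked_ids, relevant_ids, k=10):
        """Binary-gain NDCG@k: DCG of the ranking / DCG of the ideal ranking."""
        def dcg(gains):
            return sum(g / math.log2(i + 2) for i, g in enumerate(gains))
        gains = [1.0 if d in relevant_ids else 0.0 for d in ranked_ids[:k]]
        ideal = sorted(gains, reverse=True)
        return dcg(gains) / dcg(ideal) if any(ideal) else 0.0

    # Toy "dataset": per-query relevance judgements (the qrels) and the
    # ranking that some fixed ranker produced for each query.
    qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
    rankings = {"q1": ["d1", "d2", "d3"], "q2": ["d3", "d2", "d1"]}

    ap_values = [average_precision(rankings[q], qrels[q]) for q in qrels]
    mean_ap = sum(ap_values) / len(ap_values)

    # Deterministic expectations: q1 has AP = (1/1 + 2/3)/2 = 5/6 and q2 has
    # AP = 1/2, so MAP = (5/6 + 1/2)/2 = 2/3. For q2, the single relevant
    # document sits at rank 2, so NDCG = 1/log2(3).
    assert abs(mean_ap - 2.0 / 3.0) < 1e-9
    assert abs(ndcg(rankings["q2"], qrels["q2"]) - 1.0 / math.log2(3)) < 1e-9
    print("metric regression test passed")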
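And here is a sketch of query-level k-fold cross-validation as it is usually done in learning-to-rank. Since xapian-letor does not yet provide this, `train_ranker` and `evaluate_map` are hypothetical stand-ins for whatever training and evaluation routines the library would expose.

    # Sketch (assumed API, not xapian-letor's): split the *queries* into k
    # folds, train on k-1 folds, evaluate on the held-out fold, and average
    # the per-fold metric values.
    import random

    def k_fold_queries(query_ids, k=5, seed=42):
        """Yield (train_queries, test_queries) pairs for each of the k folds."""
        ids = list(query_ids)
        random.Random(seed).shuffle(ids)       # fixed seed keeps folds reproducible
        folds = [ids[i::k] for i in range(k)]  # round-robin split into k folds
        for i in range(k):
            test = folds[i]
            train = [q for j, fold in enumerate(folds) if j != i for q in fold]
            yield train, test

    def cross_validate(query_ids, train_ranker, evaluate_map, k=5):
        """Average the evaluation metric over the k held-out folds."""
        scores = []
        for train, test in k_fold_queries(query_ids, k):
            model = train_ranker(train)                # fit on k-1 folds of queries
            scores.append(evaluate_map(model, test))   # score on the held-out fold
        return sum(scores) / len(scores)

Note that the split is by query rather than by document: all judged documents for a query must fall on the same side of the split, otherwise the model would be evaluated on queries it has partially trained on.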
Thanks.

On Fri, Mar 7, 2014 at 1:31 PM, Parth Gupta <pargup8 at gmail.com> wrote:
> [full quote of the message above snipped]