similar to: Test Dataset for performance and accuracy analysis

Displaying 20 results from an estimated 900 matches similar to: "Test Dataset for performance and accuracy analysis"

2014 Mar 09
2
[GSOC 2014] Some questions about Letor module
Thanks for your reply! For the third question: In https://inex.mmci.uni-saarland.de/data/documentcollection.jsp, I can find inex2010-article.qrels in the 2010 assessment, but can't find the query files. Could you send me the link? I have registered on the INEX website. And I also need to download ``INEX 2009 collection without annotation tags: (unofficial)`` on
2016 May 14
2
GSoC 2016 Letor dataset discussion
Hello, I wanted to decide on the dataset that should be used for the Letor stabilisation project. I think the 2009 INEX Wikipedia Collection <http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/software/inex/> should work fine. It's a collection of 2,666,190 XML articles, 115 topics <http://inex.mmci.uni-saarland.de/protected/adhoc/2009-topics.zip>, 50,275 qrel
2014 Mar 09
2
[GSOC 2014] Some questions about Letor module
Hi, I've read the code of the letor module, and I have some questions about it: 1. In https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal.cc#L299, there is a write_to_file method, which saves RankList into "train.txt". But the format for "train.txt" is different from the one mentioned in http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm. And in
2016 Jul 25
3
Weighting Schemes: Evaluation results
Hi James, > We probably don't want them committed in git where they're evaluation > runs (because we can recreate them); a gist might be more appropriate. Sorry, I have moved results files over to gist for each individual weighting scheme. Link: https://gist.github.com/ivmarkp/secret > I can't tell, but are some of those files from FIRE? If so, they > shouldn't be
2014 Mar 11
3
Proposal Outline
Hi, Before starting my proposal, I wanted to know what the expected output of the Letor module is. Is it for transfer learning (i.e. you learn from one dataset and leverage it to predict the rankings of another dataset) or is it for supervised learning? For instance - Xapian currently powers the Gmane search, which is by default based on the BM25 weighting scheme, and now suppose we want to use LETOR to rank
2014 Mar 04
4
Questions on letor module
Hi, I have several questions regarding the letor module. I looked at the framework of learning to rank in xapian http://rishabhmehrotra.com/gsoc/17.png, and I am a little confused. Why use deep learning to find unsupervised features in test data? In my understanding, a learning to rank model usually learns features from the training data and then applies the model to the test data. Why test set and
2014 Mar 22
2
[GSOC 2014] Indexing INEX dataset
For unsupervised approaches like BM25 this approach works well, but letor does not need special weighting for the title in this form, as it assigns weights to title features separately itself. But I see your concern: it would be a problem when BM25 is used on the index with this setup. Hence it's preferable to take a note of this uplift in title weight for xapian-letor and normalize it everywhere
2014 Mar 01
2
Complete GSOC idea
Hi everyone, I am thinking of working on the following ideas for my GSOC proposal based on my discussions with Olly and my own understanding. Rather than focusing on an entire perftest module, I have decided to focus on implementing performance tests for weighting schemes based on a Wikipedia dump and, in addition to that, build a framework to measure the
2014 Mar 05
2
Question regarding LETOR
Hi Parth, I have a few questions- 1. Could you please provide me with the link for query-file, qrel-file for the dataset available at http://www.mpi-inf.mpg.de/departments/d5/software/inex/ . 2. I wanted to know how automated testing would be implemented. Will there be test cases like a test query must match this particular N results and this particular ranking. Or will it be in terms of
2014 May 14
2
Starting work on Perf Test Module
Hello, I am beginning work on the perf test module. The initial steps that I aim to accomplish are: -> Download the Wikipedia dumps for multiple languages. -> Write Python scripts to tokenize the dump (will probably use something like nltk, which has powerful inbuilt tokenizers) -> Discuss and finalize the design of the search and query expansion perf tests as I want to complete them
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly, Wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1) with the line below automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); If it does not, then we should include that part in the patch too. I'd like to create a patch for xapian-letor for resolving common code of xapian.
2016 Jul 28
2
Weighting Schemes: Evaluation results
Ah. If FIRE doesn't have something that can show this suitably, then > maybe Parth can advise on access to TREC, as I know he's used some of > them in the past. > I can say FIRE is also a reliable source but INEX/TREC are better. INEX can give you free access and TREC is not freely available. I had used INEX for xapian in the past and some details are here:
2016 Mar 20
2
GSoC 2016 Letor Stabilisation
Hello, I'm Ayush from New Delhi, India. I am interested in Letor Stabilisation project for GSoC. I have a good background in machine learning. Sorry for getting in so late, university exams were holding me back. I'll try to cover as much as I can in the coming week. I am following the plan of attack suggested on the project page. Following are the things that I have completed: 1.
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
Hi Parth, I've implemented the SVMRanker class and also sorted out most of the current Letor APIs. Now I'm trying to use the INEX dataset to verify my implementation. But I'm stuck on the indexing part. You said in the documentation that we have to add a prefix when indexing. Also I notice that you set some metadata in omindex.cc of your version. But omindex.cc has changed since 2011. I think that's why my result
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement one in Xapian (with some frequently used normalizations), as it will also give me a good grasp of implementing a weighting scheme before I start working on implementing DFR schemes. I read the following as references and I think I've understood it well and can write the hack :- 1.)
2013 Sep 25
2
Is the project learning to rank need to be improved?
As Olly has already pointed out, the 2012 branch is not merged. I think there are some compilation errors in the branch. The code in the branch is better refactored. The Ranker and FeatureManager classes are well defined and implemented. Parth. On Wed, Sep 25, 2013 at 9:02 AM, Olly Betts <olly at survex.com> wrote: > On Tue, Sep 24, 2013 at 08:34:10PM +0800, jiangwen jiang wrote: >
2014 Oct 12
5
Help with xapian
Hi, I am unable to build the letor module. I am generating the configure file using autoconf. The configure file generated is throwing the error ./configure: line 2057: syntax error near unexpected token `1.10.1' ./configure: line 2057: `AM_INIT_AUTOMAKE(1.10.1 -Wportability tar-ustar)` I am not too sure what to do with this. Need help with this. Thank You Regards Karthik On Mon, Sep 29,
2014 Mar 11
2
[GSOC 2013] Question about indexing INEX dataset
Hi, I'm trying to use Omega to index the INEX dataset for Letor. But omindex told me these xml files are unknown. Olly told me I could tell omindex to handle them as HTML. (Thanks Olly :) ) Is it appropriate? Parth, could you give me some suggestions? Thank you! Jiarong Wei
2014 Sep 29
2
Help with xapian
Hi, I have started getting the hang of the xapian codebase. I think I would like to try my hand at the letor module of xapian. Could you please suggest some free data set for the training and testing of letor features? I am not able to get the INEX data set from anywhere (the one mentioned by Parth Gupta in his GSOC 2011 project). Regards Karthik On Mon, Sep 22, 2014 at 4:23 PM, Olly Betts
2016 Jun 27
2
xapian-letor: FeatureVector discussion
Hello James, Parth, Following our discussion on IRC and on code review, the way the FeatureVector class works needs some discussion. Presently, the FeatureVector class is defined as follows, with a fixed feature count (19): class FeatureVector::Internal : public Xapian::Internal::intrusive_base{ friend class FeatureVector; double label; double score;