Displaying 20 results from an estimated 1000 matches similar to: "[GSOC 2014] Indexing INEX dataset"
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote:
> >
> > On current trunk, we index the title with prefix "S" by default in
> > omindex, though with a wdf inc of 5 rather than 1:
> >
> > indexer.index_text(title, 5, "S");
> >
> > So I don't think you need that change to omindex now.
>
> Yes, but please
2014 Mar 11
2
[GSOC 2013] Question about indexing INEX dataset
Hi,
I'm trying to use Omega to index the INEX dataset for Letor. But omindex told me these XML files are unknown. Olly told me I could tell omindex to handle them as HTML. (Thanks Olly :) ) Is it appropriate? Parth, could you give me some suggestions?
Thank you!
Jiarong Wei
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly,
Wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1)
with the line below automatically adjust the wdfs and field lengths?
indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S");
If it does not, then we should include that part in the patch too. I'd like to
create a patch for xapian-letor for resolving common code of xapian.
2014 Mar 11
2
[GSOC 2014] Indexing INEX dataset
On Tue, Mar 11, 2014 at 12:02:15PM +0100, Parth Gupta wrote:
> During the indexing with omindex, the only thing you need to make sure of is
> indexing with prefix 'S' for title, as explained here in the Letor documentation:
> xapian-letor/docs/letor.rst
>
> Previously when I edited omindex.cc it was modified as can be seen
>
2014 Mar 09
2
[GSOC 2014] Some questions about Letor module
Thanks for your reply! For the third question: In https://inex.mmci.uni-saarland.de/data/documentcollection.jsp, I can find inex2010-article.qrels in 2010 assessment, but can't find query files. Could you send me the link? I have registered on INEX website. And I also need to download ``INEX 2009 collection without annotation tags: (unofficial)`` on
2014 Mar 09
2
[GSOC 2014] Some questions about Letor module
Hi,
I've read the code of letor module. And I have some questions about it:
1. In https://github.com/rishabhmehrotra/xapian/blob/master/xapian-letor/letor_internal.cc#L299, there is a write_to_file method, which saves RankList into "train.txt". But the format for "train.txt" is different from the one mentioned in http://trac.xapian.org/wiki/GSoC2011/LTR/Notes#QueryLevelNorm. And in
2016 Mar 20
2
GSoC 2016 Letor Stabilisation
Hello,
I'm Ayush from New Delhi, India. I am interested in Letor Stabilisation
project for GSoC. I have a good background in machine learning. Sorry for
getting in so late, university exams were holding me back. I'll try to
cover as much as I can in the coming week.
I am following the plan of attack suggested on the project page. Following
are the things that I have completed:
1.
2014 May 21
2
Some questions about Letor project
Hi all,
Thank you for giving me the opportunity to work with Xapian :) I am Jiarong
Wei, a third year undergraduate student in Zhejiang University, China. In
GSoC 2014, I will work on Letor module with Hanxiao Sun.
Here are some questions I encountered these days,
1. In letor.cc, we have two parts of functions: the training part and
the ranking part. I'll use SVMRanker as an example. The
2014 Mar 22
2
[GSOC 2014] Indexing INEX dataset
For unsupervised approaches like BM25 this approach works well, but letor
does not need special weighting for title in this form as it itself assigns
weights to title features separately.
But I see your concern: it would be a problem when BM25 is used on the index
with this setup. Hence it's preferable to take a note of this uplift in
title weight for xapian-letor and normalize it everywhere
2016 May 14
2
GSoC 2016 Letor dataset discussion
Hello,
I wanted to decide the dataset that should be used for Letor stabilisation
project.
I think 2009 INEX Wikipedia Collection
<http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/software/inex/>
should work fine. It's a collection of 2,666,190 XML articles, 115 topics
<http://inex.mmci.uni-saarland.de/protected/adhoc/2009-topics.zip>, 50,275
qrel
2014 Mar 11
3
Proposal Outline
Hi,
Before starting my proposal, I wanted to know what is the expected output
of Letor module. Is it for transfer learning (i.e. you learn from one
dataset and leverage it to predict the rankings of other dataset) or is it
for supervised learning?
For instance - Xapian currently powers the Gmane search which is by default
based on BM25 weighting scheme and now suppose we want to use LETOR to rank
2014 Oct 12
5
Help with xapian
Hi,
I am unable to build the letor module. I am generating the configure file
using autoconf. The configure file generated is throwing the error
./configure: line 2057: syntax error near unexpected token `1.10.1'
./configure: line 2057: `AM_INIT_AUTOMAKE(1.10.1 -Wportability tar-ustar)`
I am not too sure what to do with this. Need help with this.
Thank You
Regards
Karthik
On Mon, Sep 29,
2014 Mar 04
4
Questions on letor module
Hi,
I have several questions regarding the letor module. I looked at the
framework of learning to rank in xapian
http://rishabhmehrotra.com/gsoc/17.png, and I am a little confused. Why use
deep learning to find unsupervised features in test data? In my
understanding, a learning to rank model usually learns features from the
training data and is then applied to the test data. Why test set and
2014 Feb 26
2
GSOC 2014
Just to add on top of what Olly has already mentioned.
> > Now, I'm reading the resources provided on ideas' page. Do you have
> > any other suggestions of knowing more about the letor?
> > And I'd like to test the function of letor. But I can't find code
> > example. Can u give me some suggestions?
>
> Hopefully Parth can help here.
>
In order
2014 Feb 25
2
GSOC 2014
Hi,
I am Jiarong Wei (irc: VcamX). I'm a 3rd year computer science student at Zhejiang University, China. I'm very willing to contribute to Xapian as part of GSoC 2014. Now I'm at Simon Fraser University, Canada, as an exchange student. I'll go back to China at the end of April. I don't think it matters that I'll change time zones :)
From the list of project ideas, Learning to Rank interests me
2014 Sep 29
2
Help with xapian
Hi,
I have started getting the hang of the xapian codebase. I think I would like
to try my hand at the letor module of xapian. Could you please suggest
some free data set for the training and testing of letor features? I am not
able to get the INEX data set from anywhere (the one mentioned by Parth
Gupta in his GSOC 2011 project).
Regards
Karthik
On Mon, Sep 22, 2014 at 4:23 PM, Olly Betts
2014 Mar 05
2
Question regarding LETOR
Hi Parth,
I have a few questions-
1. Could you please provide me with the link for query-file, qrel-file for
the dataset available at
http://www.mpi-inf.mpg.de/departments/d5/software/inex/ .
2. I wanted to know how automated testing would be implemented. Will there
be test cases like a test query must match this particular N results and
this particular ranking. Or will it be in terms of
2014 Mar 04
2
Test Dataset for performance and accuracy analysis
Hi Parth,
I implemented DFR algorithms in Xapian as
a part of GSOC last year under the mentorship of Olly. This year, I want to
work on analyzing and optimizing the performance of the DFR algorithms and
comparing them with BM25. I also want to work on profiling the query
expansion schemes and test the relevance (precision and recall) / speed (time
taken) of the
2013 Sep 25
2
Is the project learning to rank need to be improved?
As Olly has already pointed out the 2012 branch is not merged.
I think there are some compilation errors in the branch.
The code in branch is better refactored. The Ranker and FeatureManager
classes are well defined and implemented.
Parth.
On Wed, Sep 25, 2013 at 9:02 AM, Olly Betts <olly at survex.com> wrote:
> On Tue, Sep 24, 2013 at 08:34:10PM +0800, jiangwen jiang wrote:
>
2012 Jul 27
1
A Little Help
Hi Rishabh,
I think it's better not to expose RankList to Letor.h and make it more
user friendly. So my suggestion is to convert RankList to the following
statement in this method.
std::map<Xapian::docid, double> letor_score(const Xapian::MSet & mset);
So just convert the RankList to std::map<Xapian::docid, double> format in
the methods where you need to return it.
Parth.
On
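
The conversion suggested in the last message above (returning a std::map<Xapian::docid, double> instead of exposing RankList) could be sketched roughly as follows. This is only an illustration under stated assumptions: the real RankList layout is not shown in the thread, so RankedDoc and to_score_map are hypothetical names, and docid is a plain stand-in for Xapian::docid so the sketch compiles without Xapian installed.

```cpp
#include <map>
#include <vector>

// Stand-in for Xapian::docid (assumption: an unsigned integer type).
using docid = unsigned;

// Hypothetical entry of an internal ranked list; the actual RankList
// class in xapian-letor may store its data differently.
struct RankedDoc {
    docid did;
    double score;
};

// Flatten the internal list into the map type proposed for the public API.
std::map<docid, double> to_score_map(const std::vector<RankedDoc>& ranked) {
    std::map<docid, double> scores;
    for (const RankedDoc& doc : ranked) {
        // If a docid appears twice, the later score overwrites the earlier one.
        scores[doc.did] = doc.score;
    }
    return scores;
}
```

Note that a map keyed by docid iterates in docid order, not in score order, so a caller that wants the ranking would still need to sort the entries by score.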