thr3ads.net - Xapian devel - [Xapian-devel] Learning to Rank : GSoC 2012 [Apr 2012]

Ashish Sadh

2012-Apr-01 05:00 UTC

[Xapian-devel] Learning to Rank : GSoC 2012

Hello all,

This is in reference to the "Learning to Rank" project idea. [I know I
have joined a bit late, but I hope you are still willing to help out.]

I am looking for suggestions to help me narrow down the choice of
algorithms. I have been surveying the referenced algorithms in order to
choose the right one. Below are some of my doubts; discussing them should
make the concepts clearer to me, so that I end up choosing the most
suitable algorithm. I am sure your input will help me draft my proposal
effectively.

ListNet looks computationally more complex than RankNet. So, is there any
big advantage (in terms of improvement in NDCG/MAP) in moving to ListNet?
Also, the optimization suggested in the paper, of looking only at the top
one, seems too simple. What will be its impact on accuracy, and is there
any way to speed up or optimize ListNet?

For AdaRank, I didn't understand how it is superior to linear regression.

I was also searching for an open-source package for training ListNet, to
save time and focus on more important aspects of the library enhancements,
but didn't find a suitable one. However, FANN is still on my to-check
list, and meanwhile I have been experimenting with training ListNet in
Octave by reusing some of my ml-class code (from an online course by
Professor Andrew Ng that I participated in).

What is the quickest way to understand the modularity involved in
implementing an algorithm to serve the current need?

Thanks,

Regards,
Ashish Sadh
B.Tech, final year student.
Indian Institute of Information Technology, Allahabad, India.

Rishabh Mehrotra

2012-Apr-01 12:47 UTC

Hi Ashish,

As your doubt about the algorithms is a general one, I would like to try
addressing it. RankNet is a pairwise approach while ListNet is a listwise
approach to ranking, so ListNet's advantages over RankNet would be the
same as those that other listwise algorithms have over pairwise ones.

The listwise approach addresses the ranking problem in the following way.
In learning, it takes ranked lists of objects as instances and trains a
ranking function by minimizing a listwise loss function defined on the
predicted list and the ground-truth list. The listwise approach captures
the ranking problem in a conceptually more natural way than the pairwise
approach, apart from the computational advantages (I am not sure of the
specifics here).
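
To make the listwise loss above concrete, here is a minimal Python sketch
of a ListNet-style top-one loss. It assumes that both the ground-truth
labels and the predicted scores are turned into top-one probability
distributions with a softmax and then compared by cross entropy. The
function names are illustrative only and are not part of Xapian or of any
existing library.

```python
import math

def softmax(values):
    """Turn a list of scores into a probability distribution."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def listnet_top_one_loss(scores, labels):
    """Cross entropy between the top-one distributions of labels and scores.

    `scores` are the model outputs for the documents of one query and
    `labels` are their relevance judgements (hypothetical inputs).
    """
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))

# One query with three documents, graded 2 (best) to 0 (worst).
labels = [2, 1, 0]
loss_good = listnet_top_one_loss([2.0, 1.0, 0.0], labels)  # agrees with labels
loss_bad = listnet_top_one_loss([0.0, 1.0, 2.0], labels)   # reversed order
```

The loss is computed once per query over the whole list, which is the
sense in which the instance is a list rather than a pair of documents.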

For your other doubt, on AdaRank: the inherent advantage of AdaRank (built
on the AdaBoost concept) is that it minimizes a loss function defined
directly on the performance measures with respect to "the training data".
It re-weights the training instances while constructing weak learners and
in the end forms an ensemble of these weak learners, aiming for the
overall performance to be "boosted". In linear regression, we don't give
different weights to different training tuples and build an ensemble in
the end: we work with just one model.

You could refer to the original paper here:
<http://research.microsoft.com/en-us/people/hangli/xu-sigir07.pdf>.
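
As a rough illustration of the re-weighting described above, the sketch
below follows one reading of the AdaRank procedure: each weak ranker is a
single feature, the chosen weak ranker is the one with the best weighted
performance, and queries that the current ensemble ranks badly receive
more weight in the next round. Using NDCG as the performance measure, the
small epsilon guard, and all function names and the toy data are
assumptions of this sketch, not something taken from Xapian.

```python
import math

def dcg(labels):
    # Discounted cumulative gain of labels listed in ranked order.
    return sum((2 ** l - 1) / math.log2(rank + 2) for rank, l in enumerate(labels))

def ndcg(scores, labels):
    # Rank documents by descending score and normalise by the ideal DCG.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0

def adarank(queries, n_features, rounds=5):
    """queries: list of (X, y); X holds one feature vector per document."""
    m = len(queries)
    weights = [1.0 / m] * m      # one weight per training query
    alphas = [0.0] * n_features  # ensemble weight of each single-feature ranker
    # Per-query performance of every weak ranker (does not change per round).
    perf = [[ndcg([x[k] for x in X], y) for X, y in queries]
            for k in range(n_features)]
    for _ in range(rounds):
        # Pick the weak ranker with the best weighted performance.
        best = max(range(n_features),
                   key=lambda k: sum(w * e for w, e in zip(weights, perf[k])))
        num = sum(w * (1 + e) for w, e in zip(weights, perf[best]))
        den = sum(w * (1 - e) for w, e in zip(weights, perf[best]))
        alphas[best] += 0.5 * math.log(num / max(den, 1e-12))
        # Evaluate the combined ranker and re-weight the queries:
        # queries with a low measure get a larger weight.
        combined = [ndcg([sum(a * v for a, v in zip(alphas, x)) for x in X], y)
                    for X, y in queries]
        z = sum(math.exp(-c) for c in combined)
        weights = [math.exp(-c) / z for c in combined]
    return alphas, weights

# Toy data: feature 0 ranks the first query well, feature 1 the second.
queries = [
    ([[3, 0], [2, 1], [1, 0]], [2, 1, 0]),
    ([[1, 1], [2, 0], [3, 0]], [2, 1, 0]),
]
alphas, weights = adarank(queries, n_features=2, rounds=3)
```

In contrast, a plain linear regression fit treats every training tuple
with the same weight and produces a single model in one step.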

Hope it helps! Do let me know if I have written anything incorrect above. :)

Regards,
Rishabh.

Parth Gupta

2012-Apr-02 10:36 UTC

Hello Ashish,


> This is in reference to the "Learning to Rank" project idea. [I know I
> have joined a bit late, but I hope you are still willing to help out.]
> I am looking for suggestions to help me narrow down the choice of
> algorithms. I have been surveying the referenced algorithms in order to
> choose the right one. Below are some of my doubts; discussing them should
> make the concepts clearer to me, so that I end up choosing the most
> suitable algorithm. I am sure your input will help me draft my proposal
> effectively.
> ListNet looks computationally more complex than RankNet. So, is there any
> big advantage (in terms of improvement in NDCG/MAP) in moving to ListNet?
> Also, the optimization suggested in the paper, of looking only at the top
> one, seems too simple. What will be its impact on accuracy, and is there
> any way to speed up or optimize ListNet?
>
Don't worry about the late entry. Okay, if your question is just between
RankNet and ListNet, then I would say that considering the top k ranks in
the optimization gives you more choices for identifying the relevant
documents than the top 2 in RankNet. Yes, the concept of ListNet is very
simple but still effective. Moreover, implementing ListNet automatically
implements RankNet (if I am not wrong, choosing k=2 makes it RankNet).
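
One way to read the k=2 remark: for a list of just two documents, a
top-one cross-entropy loss of the ListNet kind coincides with RankNet's
pairwise cross-entropy loss. The sketch below checks this numerically. It
assumes that RankNet's target probability for the pair is taken as the
sigmoid of the label difference; that choice, the function names, and the
example numbers are assumptions of this sketch.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ranknet_pair_loss(s_i, s_j, target_prob):
    """Cross entropy on the probability that document i beats document j."""
    p = sigmoid(s_i - s_j)
    return -target_prob * math.log(p) - (1 - target_prob) * math.log(1 - p)

def listnet_top_one_loss(scores, labels):
    """Cross entropy between softmax(labels) and softmax(scores)."""
    def softmax(values):
        exps = [math.exp(v) for v in values]
        total = sum(exps)
        return [e / total for e in exps]
    target, predicted = softmax(labels), softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))

# Two documents: model scores and relevance labels (made-up values).
scores = [1.3, 0.2]
labels = [1.0, 0.0]

pairwise = ranknet_pair_loss(scores[0], scores[1], sigmoid(labels[0] - labels[1]))
listwise = listnet_top_one_loss(scores, labels)
difference = abs(pairwise - listwise)  # agrees up to floating-point rounding
```

With more than two documents per query the two losses differ, since the
listwise loss then compares whole distributions rather than single pairs.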

> For AdaRank, I didn't understand how it is superior to linear regression.
> I was also searching for an open-source package for training ListNet, to
> save time and focus on more important aspects of the library enhancements,
> but didn't find a suitable one. However, FANN is still on my to-check
> list, and meanwhile I have been experimenting with training ListNet in
> Octave by reusing some of my ml-class code (from an online course by
> Professor Andrew Ng that I participated in). What is the quickest way to
> understand the modularity involved in implementing an algorithm to serve
> the current need?
>
Linear regression can also be a choice, but it fits the whole training
data as a least-squares problem (typically solved via singular value
decomposition, SVD) and gives you a single weight vector. In past
comparisons it has mostly performed worse than RankBoost and sometimes
worse than AdaRank. But boosting-based techniques have performed better on
the Yahoo dataset. Anyway, linear regression can be incorporated, but on
its own it would be insufficient for a GSoC project.
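
For comparison, a pointwise linear regression ranker can be sketched in a
few lines: fit one weight vector to all (feature vector, relevance label)
pairs by least squares, then rank documents by the resulting score. This
sketch solves the normal equations directly instead of using SVD, assumes
the features are not collinear, and uses made-up data and hypothetical
function names.

```python
def fit_linear(X, y):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)]
         + [sum(row[i] * t for row, t in zip(X, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes X^T X is invertible (no collinear features).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def rank(X, w):
    """Return document indices ordered by descending predicted score."""
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in X]
    return sorted(range(len(X)), key=lambda i: -scores[i])

# Made-up training data: every document of every query is one row.
X_train = [[1, 0], [0, 1], [1, 1], [2, 1]]
y_train = [2, 1, 3, 5]

w = fit_linear(X_train, y_train)
order = rank(X_train, w)
```

Every training row counts equally and the result is one model, which is
the contrast with the boosting-based rankers discussed in this thread.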

Hope this helps. Thanks, Rishabh for the comments on the same.

Regards,
Parth.
