Cong Ding
2019-Mar-26 17:37 UTC
Regarding GSoC 2019 "Clickstream Mining for Learning to Rank"
Hi, everyone!

I'm Cong Ding, a second-year undergraduate student at Peking University, Beijing (UTC+0800). I am interested in the project "Clickstream Mining for LeToR" for this year's GSoC.

I have taken courses in basic math (calculus, linear algebra, probability and statistics, and discrete mathematics), programming, and computer systems. Since January I have been working on gStore, an open source graph database system written in C++, in the Data Management Lab. My work there is to optimize the index structure, so I am familiar with C++ projects. I'm also taking a course in Python and data mining this semester.

Honestly, I haven't used Xapian before, but I have been working on the project these days. I followed Xapian's GSoC guide and read the papers listed on the website. You can see my learning notes on my blog (https://congding.info/blog/), and I now have a basic understanding of click models. I found myself more interested in the project the deeper I went into it. In fact, working on this project has inspired me to apply a click model to my course project. BTW, I'm a new blogger and wish to develop the habit of blogging, so I will try to keep updating this blog. :)

Besides, I browsed the previous GSoC 2017 project and looked into the relevant code. I'm now trying to implement MLE algorithm for the SDBN click model. I hope to complete my first version before Friday. For future work, if we want to implement other click models, I believe a test algorithm to compare the performance of different models is important. As for end-to-end use of LeToR with Omega, I haven't got a clue yet. Hope to discuss with you!

After implementing the MLE algorithm, I will focus on my GSoC proposal. Looking forward to coding with you this summer!

Cheers,
Cong Ding

-----
Peking University
Computer Science and Technology
E-mail: congding at pku.edu.cn
Mobile: +8615810029293
-----
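
For context on the SDBN step mentioned in the message above: the maximum-likelihood estimates for the Simplified DBN click model reduce to counting, which is why a first version can be written quickly. The following is only a minimal Python sketch, not the code from the GSoC 2017 project or from xapian-letor. The function name `sdbn_mle`, the `(query, docs, clicks)` session layout, and the choice to skip sessions with no clicks are all assumptions made for illustration.

```python
from collections import defaultdict


def sdbn_mle(sessions):
    """Estimate SDBN parameters by counting (hypothetical helper).

    sessions: iterable of (query, docs, clicks), where docs is the ranked
    list of document ids shown and clicks is a parallel list of 0/1 flags.
    Returns {(query, doc): (attractiveness, satisfaction)}.
    """
    attr_num = defaultdict(int)  # clicks on doc while it was examined
    attr_den = defaultdict(int)  # times doc was examined
    sat_num = defaultdict(int)   # times doc was the last click
    sat_den = defaultdict(int)   # times doc was clicked

    for query, docs, clicks in sessions:
        # Assumption: sessions without any click are skipped.
        if not any(clicks):
            continue
        last = max(i for i, c in enumerate(clicks) if c)
        # SDBN treats every result at or above the last click as examined.
        for rank in range(last + 1):
            key = (query, docs[rank])
            attr_den[key] += 1
            if clicks[rank]:
                attr_num[key] += 1
                sat_den[key] += 1
                if rank == last:
                    sat_num[key] += 1

    params = {}
    for key, examined in attr_den.items():
        attractiveness = attr_num[key] / examined
        clicked = sat_den.get(key, 0)
        satisfaction = sat_num.get(key, 0) / clicked if clicked else 0.0
        params[key] = (attractiveness, satisfaction)
    return params
```

The product of the two estimates for a (query, document) pair is commonly used as the relevance score, which is what would later be turned into relevance judgements for training.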
Olly Betts
2019-Mar-30 03:36 UTC
Regarding GSoC 2019 "Clickstream Mining for Learning to Rank"
On Wed, Mar 27, 2019 at 01:37:05AM +0800, Cong Ding wrote:
> Besides, I browsed the previous GSoC 2017 project and looked into the
> relevant code. I'm now trying to implement MLE algorithm for the SDBN
> click model. I hope to complete my first version before Friday. For
> future work, if we want to implement other click models, I believe a
> test algorithm to compare the performance of different models is
> important. As for end-to-end use of LeToR with Omega, I haven't got a
> clue yet. Hope to discuss with you!

If we're going to have multiple models, it sounds useful to provide a way for someone trying to deploy this to check which works best for their situation.

As for end-to-end use, it'll need some integration of xapian-letor in xapian-omega (so the results get re-ranked), and then it's mostly a matter of making sure there's a clear and fairly smooth process all the way through from logging clicks to turning the logs into relevance judgements to training the model to getting queries reranked. And writing some documentation so people can easily follow that process.

Cheers,
    Olly
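
One common way to check which click model works best for a given deployment is to compare click perplexity on held-out sessions, where lower is better and 2.0 corresponds to guessing click or no click with a fair coin. The sketch below is an illustration in Python under stated assumptions, not part of xapian-letor: `click_perplexity` and the `predict(query, docs)` callback are hypothetical names, and each model is assumed to expose its predicted per-rank click probabilities through such a callback.

```python
import math
from collections import defaultdict


def click_perplexity(sessions, predict, eps=1e-9):
    """Average per-rank click perplexity of a model on held-out sessions.

    sessions: iterable of (query, docs, clicks) with clicks as 0/1 flags.
    predict:  hypothetical callback returning one click probability per
              rank for the given query and ranked document list.
    """
    log_sums = defaultdict(float)
    counts = defaultdict(int)
    for query, docs, clicks in sessions:
        probs = predict(query, docs)
        for rank, (p, clicked) in enumerate(zip(probs, clicks)):
            # Clamp so a confident but wrong prediction does not hit log(0).
            p = min(max(p, eps), 1.0 - eps)
            log_sums[rank] += math.log2(p if clicked else 1.0 - p)
            counts[rank] += 1
    if not counts:
        raise ValueError("no sessions to evaluate")
    per_rank = [2 ** (-log_sums[r] / counts[r]) for r in sorted(counts)]
    return sum(per_rank) / len(per_rank)
```

Running this once per candidate model on the same held-out click log gives a single number per model, so someone deploying the system can pick the model with the lowest value for their own traffic.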