Displaying 20 results from an estimated 10000 matches similar to: "Doubt about GSOC proposal"
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement them in
Xapian (with some frequently used normalizations), as it will also give me a
good grasp of implementing a weighting scheme before I start working on
implementing DFR schemes.
I read the following as references and I think I've understood them well and
can write the hack:
1.)
2014 Mar 22
2
[GSOC 2014] Indexing INEX dataset
For unsupervised approaches like BM25 this approach works well but letor
does not need special weighting for title in this form as it itself assigns
weights to title features separately.
But I see your concern: it would be a problem when BM25 is used on the index
with this setup. Hence it's preferable to take note of this uplift in
title weight for xapian-letor and normalize it everywhere
2014 Mar 01
2
Complete GSOC idea
Hi everyone,
I am thinking of working on the
following ideas for my GSOC proposal based on my discussions with Olly and
my own understanding. Rather than focusing on an entire perftest module, I
have decided to focus on implementing performance tests for weighting
schemes based on a wikipedia dump and in addition to that, build a
framework to measure the
2014 Mar 04
2
Test Dataset for performance and accuracy analysis
Hi Parth,
I implemented DFR algorithms in Xapian as
a part of GSOC last year under the mentorship of Olly. This year, I want to
work on analyzing and optimizing the performance of the DFR algorithms and
comparing them with BM25. I also want to work on profiling the query
expansion schemes and test the relevance (precision and recall) / speed (time
taken) of the
2011 Mar 29
2
Participation in GSOC
Hi,
I'm Michael, I would like to participate in this year's Google Summer of
Code, and I picked Xapian as the project to code for.
Before writing a full proposal, I want to get in contact with the
community, as well as introduce myself and discuss my ideas for the
contribution to Xapian.
First of all I'd like to talk about my motivation.
I'm currently working on a webapp
2007 Mar 21
1
scoring question
Hi All
I have just realized that if I set a query like
'green jelly bean'
xapian will turn that query into
'green OR jelly OR bean'
This causes documents containing just one of the words to be considered
a 100% hit.
The behavior I would like to see is that each word gives a 33.3% hit, so
that a document containing all 3 words gets placed above a document with
only 1 or 2
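
The behaviour asked for above (each query word contributing an equal share, so a document matching all three words ranks above one matching one or two) can be sketched in plain Python. This is only an illustration of the desired scoring, not Xapian's API or its percentage calculation; the function name `match_fraction` and the sample documents are hypothetical.

```python
def match_fraction(query, document):
    """Return the fraction of distinct query words that occur in the document."""
    terms = set(query.lower().split())
    words = set(document.lower().split())
    if not terms:
        return 0.0
    return len(terms & words) / len(terms)


query = "green jelly bean"
docs = [
    "a green car",                 # 1 of 3 words
    "green jelly bean for sale",   # 3 of 3 words
    "a green bean salad",          # 2 of 3 words
]

# Rank documents so that those matching more query words come first.
ranked = sorted(docs, key=lambda d: match_fraction(query, d), reverse=True)
```

With this scoring a document containing one of the three words gets 1/3 rather than a full score, which is the ordering the post describes.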
2014 Nov 23
2
GSoc Project Idea Weighting Schemes (Ranking)
Hi,
I am Abhishek
Currently Xapian::Weight uses the BM25 scheme by default; many models, such as the
Divergence from Randomness (DfR) family of models, the Unigram Language Model
and the Bi-gram Language Model, were implemented two years ago in GSoC 2012 but
are not yet merged to master.
The new weighting schemes or improvements in implementing the previous models
to change the default scheme of BM25 from SMART with
2014 Mar 17
2
[GSOC 2014] Indexing INEX dataset
Hi Olly,
Wouldn't setting the weight of terms in title back to normal (e.g. 5 to 1)
by below line, automatically adjust the wdfs and field lengths?
indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S");
If it does not, then we should include that part in the patch too. I'd like to
create a patch for xapian-letor for resolving common code of Xapian.
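
As a rough illustration of why the second argument matters, here is a toy model in plain Python. It assumes that the second argument of index_text is a per-occurrence wdf increment and that document length is the sum of all wdf values; the helpers `toy_index_text` and `toy_doc_length` are hypothetical stand-ins, not Xapian calls.

```python
from collections import Counter


def toy_index_text(wdf, text, wdf_inc=1, prefix=""):
    """Toy stand-in: add wdf_inc to the wdf of each prefixed term in text."""
    for token in text.lower().split():
        wdf[prefix + token] += wdf_inc
    return wdf


def toy_doc_length(wdf):
    """Toy document length: the sum of all wdf values."""
    return sum(wdf.values())


title = "green jelly bean"
boosted = toy_index_text(Counter(), title, 5, "S")  # title terms weighted 5
plain = toy_index_text(Counter(), title, 1, "S")    # title terms weighted 1
```

In this model the boosted title contributes 15 to the document length and the plain one contributes 3, so changing the increment changes both the wdfs and the length for documents indexed afterwards.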
2013 Mar 11
1
Implementation of the PL2 weighting scheme of the DFR Framework
Hello guys. I am working on implementing the PL2 weighting scheme of the DFR
framework by Gianni Amati.
It uses the Poisson approximation of the Binomial as the probabilistic
model (P), the Laplace law of succession to calculate the after effect of
sampling or the risk gain (L) and within document frequency normalization
H2(2) (as proposed by Amati in his PhD thesis).
The formula for w(t,d) in
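
A minimal sketch of the three components named above, assuming the form of PL2 commonly given in the DFR literature. It is not necessarily the formula this post goes on to give or the one implemented in Xapian; the function name `pl2_weight`, its parameter names and the default c = 1 are illustrative assumptions.

```python
import math


def pl2_weight(tf, doc_len, avg_doc_len, coll_freq, num_docs, c=1.0):
    """Commonly cited PL2 weight of a term in a document (assumed form).

    tf         within-document frequency of the term
    coll_freq  total occurrences of the term in the collection
    num_docs   number of documents in the collection
    """
    if tf <= 0:
        return 0.0
    # Normalization 2 (H2): rescale tf by document length.
    tfn = tf * math.log2(1.0 + c * avg_doc_len / doc_len)
    # Mean of the Poisson model (P).
    lam = coll_freq / num_docs
    log2e = math.log2(math.e)
    poisson = (tfn * math.log2(tfn / lam)
               + (lam - tfn) * log2e
               + 0.5 * math.log2(2.0 * math.pi * tfn))
    # Laplace after-effect (L): divide by tfn + 1.
    return poisson / (tfn + 1.0)


w = pl2_weight(tf=2, doc_len=100, avg_doc_len=100, coll_freq=10, num_docs=100)
```

With the document at average length and c = 1, the normalized frequency equals tf, which makes the example easy to check by hand.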
2019 Mar 19
3
Project Proposal in GSoC 2019
Hi All,
I am interested in applying for the two projects listed in the Xapian GSoC
2019 project ideas list: "Learning to Rank Stabilisation" and "Weighting
Schemes". I have downloaded the codebase and am going through some of the
commits related to the Letor API, BM25, and DFR weighting schemes. Can anyone
tell me how to write the formal proposal for the above-mentioned
projects?
2012 Mar 31
1
Project: Posting list encoding improvements
Hi Xapianers:
My name is Weixian Zhou, a Computer Science student at the University at Buffalo,
State University of New York. I am interested in the projects on posting
list encoding improvements and weighting schemes. I have some questions
about them.
1) After reading the comments in brass_postlist.cc, I am still not very clear
about the detailed structure of the postings list. If you can provide some
simple
2017 Mar 16
2
GSoC-2017 Introduction and Project Discussion
Hello,
I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate
at the Institute of Engineering & Technology in Lucknow, India. This mail is an
expression of my interest in this year's Google Summer of Code program. I
want to apologize for getting in touch so late. I would have contacted you
earlier, but the sudden demise of my grandfather prevented me from doing so.
I am
2017 Mar 05
3
GSoc 2017 Introduction(Weighting Schemes)
Hello Everyone,
I am a second year graduate student at IIIT-Bangalore and my interest is in
the field of Information Retrieval. I have successfully compiled Xapian
from source and have implemented some examples. While going through the
project list, the Weighting Schemes project is the one I was looking to
contribute to. So I went through xapian-core/weight, where most of the
schemes are already
2012 Jul 17
1
Can not use custom weight scheme with python binding
Hi, I'm trying to use custom weight with python binding.
My test code is like this.
class TinkerWeight(xapian.Weight):
    def __init__(self):
        pass
    def name(self):
        return "Tinker"
    def serialize(self):
        return ""
    def get_sumpart(*args):
        return 1
    def get_maxpart(*args):
        return 1
    def get_sumextra(*args):
2016 Mar 10
2
Introduction and Doubts
Tf-idf is the most widely used weighting scheme; it is easy to understand and has
been used in other frameworks such as Lucene and in many other places.
Okapi BM25 (implemented in Xapian) is theoretically a better, improved measure
than tf-idf, and
I am looking into various other weighting schemes which are in Xapian
or can be implemented, like TF-ICF (term frequency inverse corpus
frequency), TF-RF (term
2013 Aug 25
2
Backend for Lucene format indexes-How to get doclength
On Tue, Aug 20, 2013 at 07:28:42PM +0800, jiangwen jiang wrote:
> I think norm(t, d) in Lucene can be used to calculate a number which is
> similar to doc length (see norm(t,d) in
> http://lucene.apache.org/core/3_5_0/api/all/org/apache/lucene/search/Similarity.html#formula_norm).
It sounds similar (especially if document and field boosts aren't in use),
though some places may rely on
2013 Mar 08
2
Gsoc-2013
Hi,
I am Chinmay Naik, an undergraduate in Computer Science at Bangalore
Institute of Technology, Bangalore.
I am an experienced programmer and good with C, C++, Python, Java and OpenGL, and
would love to participate in GSoC 2013.
From the ideas listed, I am interested in working on the project "posting list
encoding improvements".
I am a newbie to Xapian but would like to get involved and get a
2013 Mar 15
1
DFR framework as a GSOC project
Hey guys, hi. :) I've finished implementing the PL2 scheme. The bounds I
have implemented for it are as good as I could make them, given the nature of the
scheme and my mathematical skills. However, tight bounds for other named DFR
schemes will be easier to implement because their formulas are much
simpler compared to PL2. Will send in a pull request in a couple of days
once I'm done with the tests
2013 Apr 27
1
[LLVMdev] GSoC Proposal: Inter-Procedure Program Slicing in LLVM
Hi all,
This is a GSoC 2013 proposal for the LLVM project. Please see the formatted version here: http://pacman.cs.tsinghua.edu.cn/~liuml07/files/gsoc2013-proposal-program-slicing.pdf
Program slicing has been used in many applications; its slicing criterion is a pair of a statement and variables. I would like to write an inter-procedural program slicing pass in LLVM, which is able to calculate C