thr3ads.net - similar to: "scoring question"

Displaying 20 results from an estimated 6000 matches similar to: "scoring question"

Can not use custom weight scheme with python binding

2012 Jul 17

Hi, I'm trying to use a custom weight with the Python bindings. My test code is like this:

    class TinkerWeight(xapian.Weight):
        def __init__(self): pass
        def name(self): return "Tinker"
        def serialize(self): return ""
        def get_sumpart(*args): return 1
        def get_maxpart(*args): return 1
        def get_sumextra(*args):

Implementing tf-idf weighting scheme in Xapian

2013 Feb 19

Hello guys. I just read up about tf-idf schemes and want to implement them in Xapian (with some frequently used normalizations), as it will also give me a good grasp of implementing a weighting scheme before I start working on implementing DFR schemes. I read the following as references, and I think I've understood them well and can write the hack: 1.)

Doubt about GSOC proposal

2013 Apr 01

Hello guys. I have begun work on writing my proposal as discussed on IRC and will submit a draft in a couple of days so that I can make it detailed and refine it after getting feedback. I wanted to know about the number of weeks a proposal should cover, and also whether it is okay if I set aside a buffer week somewhere in the middle of the summer for something like cleaning the code, working on the

[GSOC 2014] Indexing INEX dataset

2014 Mar 22

For unsupervised approaches like BM25 this approach works well, but letor does not need special weighting for the title in this form, as it assigns weights to title features separately itself. But I see your concern: it would be a problem when BM25 is used on the index with this setup. Hence it's preferable to take note of this uplift in title weight for xapian-letor and normalize it everywhere

Project: Posting list encoding improvements

2012 Mar 31

Hi Xapianers: My name is Weixian Zhou, a Computer Science student at the University at Buffalo, State University of New York. I am interested in the posting list encoding improvements project and in weighting schemes. I have some questions about them. 1) After reading the comments in brass_postlist.cc, I am still not very clear about the detailed structure of the postings list. If you can provide some simple

Backend for Lucene format indexes-How to get doclength

2013 Aug 25

On Tue, Aug 20, 2013 at 07:28:42PM +0800, jiangwen jiang wrote: > I think norm(t, d) in Lucene can be used to calculate the number which is > similar to doc length (see norm(t,d) in > http://lucene.apache.org/core/3_5_0/api/all/org/apache/lucene/search/Similarity.html#formula_norm). It sounds similar (especially if document and field boosts aren't in use), though some places may rely on

Participation in GSOC

2011 Mar 29

Hi, I'm Michael. I would like to participate in this year's Google Summer of Code, and I picked Xapian as the project to code for. Before writing a full proposal, I want to get in contact with the community, introduce myself, and discuss my ideas for contributing to Xapian. First of all I'd like to talk about my motivation. I'm currently working on a webapp

Participation in GSOC

2011 Mar 29

mount: only root can do that

2000 Sep 08

I am having a problem letting users mount their Samba shares. The root user has no problem mounting, but regular users get the error "mount: only root can do that" when trying to execute the following: mount -rw -t smbfs //jelly-bean-iii/E /home/ca43887/E Any help is appreciated. Thanks, Charles

adaptive query scoring

2006 May 15

Hi all, is there a way to do adaptive query scoring (as in, popular results returned by a query should get more weight because they are clicked more often) in Xapian? Is this what the RSet class should be used for? I could write a PHP app to do adaptive results scoring for separate words (just recording the clicks and then having a cron-scheduled script add weight to the document_ids for the

[LLVMdev] CGO Tutorial on MCLinker and LLVM 2013 - Call for Participation

2012 Dec 14

Dear LLVM users and developers, We have a chance to give a tutorial on LLVM and MCLinker. The tutorial will be co-located with CGO 2013 on Feb. 24 (Sunday morning) in Shenzhen, China. If you are interested in these topics, you are welcome to join the tutorial! Here is the website of the tutorial: http://code.google.com/p/mclinker/ We're also looking for additional presenters to share a

How to get the serialise score returned in Xapian::KeyMaker->operator().

2017 Dec 15

Hi all, I am a user of Xapian, and I have a problem using it. After using boolean terms to get some candidate documents (still too many), we want to sort them with a user-defined function used in Xapian::KeyMaker->operator(). But how can I get the serialised score from the Xapian::MSetIterator object? The C++ code looks like this: class SortKeyMaker : public Xapian::KeyMaker { std::string

[GSOC 2014] Indexing INEX dataset

2014 Mar 17

Hi Olly, Wouldn't setting the weight of terms in the title back to normal (e.g. 5 to 1) with the line below automatically adjust the wdfs and field lengths? indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S"); If it does not, then we should include that part in the patch too. I would like to create a patch for xapian-letor for resolving common code of xapian.

Omega: Missing support for newer weighting schemes

2017 Apr 09

On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote: > > Each scheme already has a human-readable name, and Xapian::Registry > > can map that to an "exemplar" object of the right type, so we > > could take a string like "bm25 1 0.8", see the first word is "bm25" > > and get a BM25Weight object, then call parse_params("1 0.8") on

Backend for Lucene format indexes-How to get doclength

2013 Aug 26

On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote: > > For now, using weighting schemes which don't use document length is > > probably the simplest answer. > > There's tf-idf weighting scheme on svn master, is it suitable for lucene > backend? Yes - TfIdfWeight doesn't ever use the document length (at least with the normalisations currently

GSoc Project Idea Weighting Schemes (Ranking)

2014 Nov 23

Hi, I am Abhishek. Currently Xapian::Weight follows the BM25 scheme; many models, such as the Divergence from Randomness (DfR) family of models, the Unigram Language Model and the Bi-gram Language Model, were implemented two years ago in GSoC 2012 but are not yet merged to master. The new weighting schemes or improvement in implementing the previous models to change the default scheme of BM25 from SMART with

Omega: Missing support for newer weighting schemes

2017 Apr 08

On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote: > On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote: > > >> and the details of which weighting schemes were available in which version > >> isn't a key part of the $set command itself. > > > > Do you suggest dropping that piece of information out? Since the reason behind

Project Proposal in GSoC 2019

2019 Mar 19

Hi All, I am interested in applying for the two projects listed in the Xapian GSoC 2019 project ideas list: "Learning to Rank Stabilisation" and "Weighting Schemes". I have downloaded the codebase and am going through some of the commits related to the Letor API, BM25, and DFR weighting schemes. Can anyone tell me how to write the formal proposal for the above-mentioned projects?

Backend for Lucene format indexes-How to get doclength

2013 Jun 16

Hi, all: I have written a demo patch for the backend for Lucene format indexes; the Lucene version is 3.6.2. http://lucene.apache.org/core/3_6_2/fileformats.html For now, this demo patch supports only the basic features of Lucene. Compound files (.cfs/.cfe), term vectors (.tvx/.tvd/.tvf) and deleted documents (.del) are not supported; the skip list in .fdx is not supported either. example/quest.cc is used to test this demo.

Weighting the author of a doc when that term can also appear as a frequent term in other docs

2017 Sep 28

We have a corpus of academic papers. Sometimes it happens that there is an academic controversy and one paper is a response or rebuttal to another paper. The name of the author of the first paper may appear many times in the second paper. So in light of this, how should we set our weight on the author field? Here is an example: http://www.nber.org/papers/w11215 in which the term