thr3ads.net - similar to: "Weighting Schemes: Evaluation results"

Displaying 20 results from an estimated 500 matches similar to: "Weighting Schemes: Evaluation results"

Omega: Missing support for newer weighting schemes

2017 Apr 08

Hi, While exploring the Omega codebase, I found that Omega is currently missing support for the newer weighting schemes added in 1.4.1 (BM25+, PL2+, Dir+). I'd submit a PR addressing that, but I think I might be missing something, so I just wanted to check whether there's a particular reason for that. P.S. Finally back after a long week. Been eagerly waiting for a weekend since the

Weighting Schemes -- Project Progress

2016 Jun 10

Hello everyone, I have been working on adding support for the BM25+ weighting function for the last couple of weeks. Initially, I considered modifying bm25weight.cc to add support for the BM25+ function without disturbing the functionality of BM25. But that didn't work out very well. A day or two was spent trying to refactor and debug the same code. Later, I took another approach following the

Weighting Schemes: Evaluation results

2016 Aug 07

Hi, Evaluation of the pivoted normalization ("PPP") of the tf-idf weighting scheme is also complete now. I have also evaluated the default tf-idf normalization ("ntn") and other normalization combinations involving pivoted normalization in the wdfn, idfn and wtn components, as the "Pxx", "xPx" and "xxP" normalization strings respectively, to have a clear idea about

Omega: Missing support for newer weighting schemes

2017 Apr 08

> Hi, Vivek — there isn't any particular reason that I'm aware of. It's > probably worth pointing (in the omegascript documentation) to the part of > the getting started guide which talks about the different weighting schemes If there isn't any reason then I'd like to send in a patch adding support for those weighting schemes in weight.cc and I agree omegascript

Weighting Schemes: Evaluation results

2016 Jul 25

Hi James, > We probably don't want them committed in git where they're evaluation > runs (because we can recreate them); a gist might be more appropriate. Sorry, I have moved results files over to gist for each individual weighting scheme. Link: https://gist.github.com/ivmarkp/secret > I can't tell, but are some of those files from FIRE? If so, they > shouldn't be

Weighting Schemes: Evaluation results

2016 Jul 28

Ah. If FIRE doesn't have something that can show this suitably, then > maybe Parth can advise on access to TREC, as I know he's used some of > them in the past. > I can say FIRE is also a reliable source but INEX/TREC are better. INEX can give you free access and TREC is not freely available. I had used INEX for xapian in the past and some details are here:

Omega: Missing support for newer weighting schemes

2017 Apr 09

On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote: > > Each scheme already has a human-readable name, and Xapian::Registry > > can map that to an "exemplar" object of the right type, so we > > could take a string like "bm25 1 0.8", see the first word is "bm25" > > and get a BM25Weight object, then call parse_params("1 0.8") on

Omega: Missing support for newer weighting schemes

2017 Apr 08

On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote: > On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote: > > >> and the details of which weighting schemes were available in which version > >> isn't a key part of the $set command itself. > > > > Do you suggest dropping that piece of information out? Since the reason behind

GSoC-2017 Introduction and Project Discussion

2017 Mar 16

Hello, I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate at Institute of Engineering & Technology in Lucknow, India. This mail is an expression of my interest in this year's Google Summer of Code program. I want to apologize for getting in so late. Actually I would have contacted you earlier, but the sudden demise of my grandfather prevented me from doing so. I am

GSoc Project Idea Weighting Schemes (Ranking)

2014 Nov 23

Hi, I am Abhishek. Currently Xapian::Weight follows the BM25 scheme; many models, such as the Divergence from Randomness (DfR) family of models, the Unigram Language Model and the Bi-gram Language Model, were implemented two years ago in GSoC 2012 but are not yet merged to master. The new weighting schemes or improvement in implementing the previous models to change the default scheme of BM25 from SMART with

Omega: Missing support for newer weighting schemes

2017 Apr 08

> It may be worth splitting that part of the $set documentation out into its > own section somehow, because it's getting a bit long - Undoubtedly; the $set command has the longest section on the documentation page :) But it would be hard to split that up because the documentation is organised in a way that each command is really contained in its own specific section. > and the details

Weighting the author of a doc when that term can also appear as a frequent term in other docs

2017 Sep 28

We have a corpus of academic papers. Sometimes it happens that there is an academic controversy and one paper is a response or rebuttal to another paper. The name of the author of the first paper may appear many times in the second paper. So in light of this, how should we set our weight on the author field? Here is an example: http://www.nber.org/papers/w11215 in which the term

Omega: Missing support for newer weighting schemes

2017 Apr 12

> Each scheme already has a human-readable name, and Xapian::Registry > can map that to an "exemplar" object of the right type, so we > could take a string like "bm25 1 0.8", see the first word is "bm25" > and get a BM25Weight object, then call parse_params("1 0.8") on it to > create the correct Weight object (broadly similar to how
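
The lookup described in the quoted message can be sketched in plain Python as below. This is only an illustration of the idea, not Xapian's API: `BM25Sketch`, `REGISTRY` and `weight_from_string` are hypothetical names, and the scheme is simplified to two parameters (`k1` and `b`), with default values that are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BM25Sketch:
    """Hypothetical stand-in for a weighting scheme; not Xapian's BM25Weight."""
    k1: float = 1.0   # assumed default, for illustration only
    b: float = 0.5    # assumed default, for illustration only

    def parse_params(self, params: str) -> "BM25Sketch":
        # Build a new object from a space-separated parameter string,
        # keeping the defaults for any parameter that is not supplied.
        values = [float(p) for p in params.split()]
        if len(values) > 2:
            raise ValueError("too many parameters for bm25")
        defaults = [self.k1, self.b]
        merged = values + defaults[len(values):]
        return BM25Sketch(*merged)


# Hypothetical registry: scheme name -> exemplar object of the right type.
REGISTRY = {"bm25": BM25Sketch()}


def weight_from_string(spec: str) -> BM25Sketch:
    # The first word names the scheme; the rest is handed to the exemplar.
    name, _, params = spec.strip().partition(" ")
    exemplar = REGISTRY.get(name.lower())
    if exemplar is None:
        raise ValueError(f"unknown weighting scheme: {name!r}")
    return exemplar.parse_params(params)
```

With this sketch, `weight_from_string("bm25 1 0.8")` yields an object with `k1=1.0` and `b=0.8`, and a bare `"bm25"` falls back to the exemplar's defaults. The point of the design is that adding a scheme only means registering one more exemplar; the string-handling code does not change.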

Introduction and Doubts

2016 Mar 10

Tf-idf is the most widely used weighting scheme; it is easy to understand and has been used in other frameworks like Lucene and many other places. Okapi BM25 (implemented in Xapian) is theoretically a better/improved measure than tf-idf, and I am looking into various other weighting schemes which are in Xapian or can be implemented, like TF-ICF (term frequency inverse corpus frequency), TF-RF (term

Is it possible to reset the parameters in BM25 each time a new query enters?

2011 Feb 18

Hi guys, I'm trying to improve the search results of our collection by tuning the parameters in the BM25 weighting scheme. Since our collection includes several databases, such as for pictures, websites, etc., I would like to use different values of the same scheme to calculate the weights. Yet, rebuilding each time after changing the header file does not seem an optimal approach and

Relevance, weighting and searching by specifically weighted text

2011 Jun 01

Hi guys In our implementation of Xapian for one of our sites, we index the title, subtitle, summary and table of contents of around 200,000 products on ReportBuyer.com. When we create each Xapian doc to index this information, we apply a weighting to each of these 'fields' and add these to the doc using index_text with the second parameter passing in a weighting. I've been asked if

Can not use custom weight scheme with python binding

2012 Jul 17

Hi, I'm trying to use custom weight with python binding. My test code is like this. class TinkerWeight(xapian.Weight): def __init__(self): pass def name(self): return "Tinker" def serialize(self): return "" def get_sumpart(*args): return 1 def get_maxpart(*args): return 1 def get_sumextra(*args):

[GSOC 2014] Indexing INEX dataset

2014 Mar 22

For unsupervised approaches like BM25 this approach works well, but letor does not need special weighting for title in this form as it itself assigns weights to title features separately. But I see your concern: it would be a problem when BM25 is used on the index with this setup. Hence it's preferable to take note of this uplift in title weight for xapian-letor and normalize it everywhere

Weighting Schemes: Implementing Piv+ Normalization

2016 Jul 29

> `ptr` is, if I inferred correctly, a `const char *`. (I'm not sure, > because I don't know why you're incrementing it. Please push your code > to github if you need further help so people can see the entire > context of your changes.) I've pushed all the changes I made so far https://github.com/xapian/xapian/compare/master...ivmarkp:piv+?diff=split&name=piv%2B

Backend for Lucene format indexes-How to get doclength

2013 Aug 26

On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote: > > For now, using weighting schemes which don't use document length is > > probably the simplest answer. > > There's tf-idf weighting scheme on svn master, is it suitable for lucene > backend? Yes - TfIdfWeight doesn't ever use the document length (at least with the normalisations currently
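
To illustrate why a tf-idf weight can work without document lengths, here is a minimal sketch of the weight under the "ntn" normalisation string mentioned elsewhere in these threads. It assumes that the first "n" means the raw within-document frequency, "t" means idf = log(N / termfreq), and the final "n" means no further normalisation. The function and parameter names are hypothetical, and this is not Xapian's TfIdfWeight implementation.

```python
import math


def tfidf_ntn(wdf: int, termfreq: int, num_docs: int) -> float:
    """Sketch of a tf-idf weight under an assumed reading of "ntn"."""
    # "n": use the within-document frequency as-is.
    wdfn = float(wdf)
    # "t": inverse document frequency, log(N / termfreq).
    idfn = math.log(num_docs / termfreq)
    # "n": no normalisation of the product.
    return wdfn * idfn
```

Nothing in the signature refers to the length of the document, which is the property the reply above relies on. A normalisation that does depend on document length would need that statistic from the backend.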
