thr3ads.net - similar to: "Better parsing of BM25 parameters in Omega"

Displaying 20 results from an estimated 6000 matches similar to: "Better parsing of BM25 parameters in Omega"

2013 Apr 11

Added support for TfIdf to Omega

Hello guys, I have added code for tfidf to the weight.cc file in omega/. Here is the patch: https://github.com/aarshkshah1992/xapian/commit/5ff41a15f574e6780cc61e67e7f3da3d97ff4ec8 It compiles and I think it will work well. Here's the link to the documentation file omegascript.rst, where I've added tfidf.

2013 Mar 27

Need help as Pl2 tests not performing as expected

Hello guys. I just ran the updated tests for PL2 and they are not giving the MSet order I expect. The thing is, DFR's behavior is a bit hard to predict, so even if I expect a particular order, it may give another order and still be correct. So the only way to write correct tests for PL2 is to manually calculate the weights of the documents to decide the expected order. For that, I need to

2017 Apr 08

Omega: Missing support for newer weighting schemes

On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote:
> On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote:
>
> >> and the details of which weighting schemes were available in which version
> >> isn't a key part of the $set command itself.
> >
> > Do you suggest dropping that piece of information out? Since the reason behind

2017 Apr 09

Omega: Missing support for newer weighting schemes

On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote:
> > Each scheme already has a human-readable name, and Xapian::Registry
> > can map that to an "exemplar" object of the right type, so we
> > could take a string like "bm25 1 0.8", see the first word is "bm25"
> > and get a BM25Weight object, then call parse_params("1 0.8") on

2014 Mar 04

Test Dataset for performance and accuracy analysis

Hi Parth, I implemented DFR algorithms in Xapian as a part of GSoC last year under the mentorship of Olly. This year, I want to work on analyzing and optimizing the performance of the DFR algorithms and comparing them with BM25. I also want to work on profiling the query expansion schemes and test the relevance (precision and recall) and speed (time taken) of the

2017 Apr 12

Omega: Missing support for newer weighting schemes

> Each scheme already has a human-readable name, and Xapian::Registry
> can map that to an "exemplar" object of the right type, so we
> could take a string like "bm25 1 0.8", see the first word is "bm25"
> and get a BM25Weight object, then call parse_params("1 0.8") on it to
> create the correct Weight object (broadly similar to how
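
The idea quoted above (split off the scheme name, look it up in a registry, then hand the rest of the string to that scheme's own parameter parser) can be sketched roughly as follows. This is a minimal illustration in plain Python, not Xapian code: `REGISTRY`, `parse_floats` and `weight_from_string` are hypothetical names, and the returned tuple is only a stand-in for a real Weight object.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for a Weight object: (scheme name, parsed parameters).
Weight = Tuple[str, Tuple[float, ...]]


def parse_floats(params: str) -> Tuple[float, ...]:
    """Parse a whitespace-separated parameter string such as "1 0.8"."""
    return tuple(float(tok) for tok in params.split())


# Hypothetical registry: scheme name -> function that builds a Weight
# from the remainder of the specification string.
REGISTRY: Dict[str, Callable[[str], Weight]] = {
    "bm25": lambda params: ("bm25", parse_floats(params)),
    "trad": lambda params: ("trad", parse_floats(params)),
}


def weight_from_string(spec: str) -> Weight:
    """Turn a string like "bm25 1 0.8" into a (name, parameters) pair."""
    parts = spec.split(None, 1)  # first word is the scheme name
    if not parts:
        raise ValueError("empty weighting scheme specification")
    name = parts[0].lower()
    params = parts[1] if len(parts) > 1 else ""
    if name not in REGISTRY:
        raise ValueError(f"unknown weighting scheme: {name!r}")
    # Each scheme parses its own parameters, so callers need no per-scheme code.
    return REGISTRY[name](params)
```

The point of the design is that the caller never branches on the scheme name; supporting a new scheme only means adding a registry entry.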

2013 Mar 26

Merging of the TfIdf patch

Hello guys. I have updated the code, tests, documentation, makefile entries and the registry entry of the TfIdf patch as per the feedback. Please do let me know if any additional changes are required before the patch can be merged. Regards, Aarsh
On Sun, Mar 3, 2013 at 2:50 PM, aarsh shah <aarshkshah1992 at gmail.com> wrote:
> Hello guys. I have sent a pull request for the code and

2013 Feb 05

make error in xapian-application/omega (jiangwen jiang)

Hi jiangwen, hope you are doing fine :) You need some libraries and tools installed on your system before you build Xapian and Omega. The complete list can be found in the "Building from svn or git" section of this document: http://svn.xapian.org/trunk/xapian-core/HACKING?view=co Make sure you have all the required tools installed and it will work fine. Please let me know if you

2011 Feb 18

Is it possible to reset the parameters in BM25 each time a new query enters?

Hi guys, I'm trying to improve the search results of our collection by tuning the parameters of the BM25 weighting scheme. Since our collection includes several databases, such as for pictures, websites, etc., I would like to use different parameter values of the same scheme to calculate the weights. Yet rebuilding each time after a change is made to the header file does not seem an optimal approach and

2017 Apr 13

Omega: Missing support for newer weighting schemes

On Mon, Apr 10, 2017 at 11:47:36PM +0530, Vivek Pal wrote:
> > No, use Xapian::Registry to find the weighting scheme from the name
> > like how Weight::unserialise() does (otherwise every caller would need
> > code similar to that above).
>
> Okay, I looked into Xapian::Registry and it seems you are referring to using
> the get_weighting_scheme method? (which expects a

2013 Jan 27

Added a python example to the community page

Hey guys, I have added a Python indexer example to the SampleCode page of our wiki. Please do have a look. The code can also be found here: https://github.com/aarshkshah1992/xapian/blob/efcf443527b74326119bbc0935fc41a002ce60db/xapian-bindings/python/docs/examples/simpleindexgrep.py/ Thanks :) Regards, Aarsh

2013 Mar 03

Added code and tests for the tf-idf weighting scheme.

Hello guys. I have sent a pull request for the code and tests of the Tf-Idf weighting scheme. Please do let me know if any changes are required. Meanwhile, I'll begin working on implementing normalizations which require additional statistics and on the DFR schemes. https://github.com/xapian/xapian/pull/6
On Tue, Feb 26, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote:
>

2013 Feb 07

Ideas for allowing specification of weighing scheme for Eset

Hi guys :) I am working on a hack which will allow the user to specify a weighting scheme (along with its parameters, if they do not want to use the default values) to build the ESet, rather than using the hard-coded TradWeight scheme with default k=1, as Olly had suggested that we can probably get better terms (a more relevant ESet) for query expansion if we use, say, something

2013 Mar 03

Sent a pull request for testing TradWeight using an Rset.

Hello guys. As discussed on IRC, I have sent a pull request for a test of TradWeight with an RSet.
On Fri, Mar 1, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote:
> Send Xapian-devel mailing list submissions to
> xapian-devel at lists.xapian.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>

2010 Nov 01

floating-point issues with set_sort_by_relevance_then_value? (1.2.3, BM25 k1=0)

I am using BM25 with k1=0 and min_normlen=1 to get weights unaffected by document length and term frequency in the document (min_normlen=1 isn't necessary I guess) and am expecting single-term weights to be identical for all matches. I have added a document value to steer such general search queries and it works fine, except that for some search terms, I get results like:
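
To illustrate why k1=0 removes the influence of term frequency and document length, here is a minimal sketch using the textbook BM25 term formula. That formula is an assumption for illustration and not Xapian's exact implementation (Xapian's BM25Weight takes further parameters such as k2, k3 and min_normlen); the function name and arguments are hypothetical. With k1 = 0 the term-frequency factor reduces to tf/tf, so only the idf part remains.

```python
import math


def bm25_term_weight(tf, doc_len, avg_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 weight of one term in one document (illustrative only)."""
    # Inverse document frequency part; depends only on collection statistics.
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Document length relative to the collection average.
    norm_len = doc_len / avg_len
    # Term-frequency part. With k1 == 0 this is tf / tf, which is exactly 1.0
    # in floating point for any finite, non-zero tf.
    tf_part = (k1 + 1.0) * tf / (k1 * ((1.0 - b) + b * norm_len) + tf)
    return idf * tf_part


# Two very different documents matching the same term.
short_doc = bm25_term_weight(tf=1, doc_len=10, avg_len=100,
                             n_docs=1000, doc_freq=50, k1=0.0)
long_doc = bm25_term_weight(tf=7, doc_len=500, avg_len=100,
                            n_docs=1000, doc_freq=50, k1=0.0)
print(short_doc == long_doc)
```

In this sketch the term-frequency part is computed separately before multiplying by the idf. Grouping the arithmetic differently, for example `(idf * tf) / tf`, is not guaranteed to return exactly `idf` in floating point.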

2013 Feb 19

Implementing tf-idf weighting scheme in Xapian

Hello guys. I just read up about tf-idf schemes and want to implement them in Xapian (with some frequently used normalizations), as it will also give me a good feel for implementing a weighting scheme before I start working on the DFR schemes. I read the following references, and I think I've understood them well and can write the hack: 1.)
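
For readers unfamiliar with the scheme being discussed, a very small tf-idf scorer might look like the sketch below. It assumes the simplest variant (raw term frequency, natural-log idf of N/df, no length normalization), which is not necessarily the variant implemented in the patch; `tfidf_scores` and its arguments are hypothetical names.

```python
import math
from collections import Counter


def tfidf_scores(query_terms, docs):
    """Score each document (a list of terms) against the query terms."""
    n_docs = len(docs)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)  # raw within-document term frequency
        score = sum(
            tf[t] * math.log(n_docs / df[t])
            for t in query_terms
            if df[t] > 0  # skip terms that occur nowhere in the collection
        )
        scores.append(score)
    return scores


docs = [["xapian", "weight", "xapian"], ["weight", "omega"], ["omega"]]
print(tfidf_scores(["xapian"], docs))
```

A term that occurs in every document gets an idf of log(1) = 0 under this variant, so it contributes nothing to the score.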

2014 May 14

Starting work on Perf Test Module

Hello, I am beginning work on the perf test module. The initial steps that I aim to accomplish are:
-> Download the Wikipedia dumps for multiple languages.
-> Write Python scripts to tokenize the dump (will probably use something like NLTK, which has powerful built-in tokenizers).
-> Discuss and finalize the design of the search and query expansion perf tests as I want to complete them

2013 Mar 27

Major Mistake in pL2 tests in the pull request

Hello guys. I just realized that I've not set the weighting scheme to PL2 in the tests for PL2, so the default weighting scheme, BM25, is used. I am extremely sorry for this and am updating the tests to set the weighting scheme to PL2. Regards, Aarsh

2013 Jul 15

Xapian now has Divergence from Randomness schemes

Hello guys, you will be happy to know that the current codebase now includes the Divergence from Randomness weighting schemes, which are known to outperform many established weighting schemes such as BM25. Thanks to the amazing mentorship of Olly Betts and Dan Colish, our search results will now be better than before and Xapian will be more preferred in the research community than it was

2013 Jul 15

Xapian now has Divergence from Randomness schemes
