similar to: Merging of the TfIdf patch

Displaying 20 results from an estimated 600 matches similar to: "Merging of the TfIdf patch"

2013 Mar 03
0
Added code and tests for the tf-idf weighting scheme.
Hello guys. I have sent a pull request for the code and tests of the Tf-Idf weighting scheme. Please do let me know if any changes are required. Meanwhile, I'll begin working on implementing normalizations which require additional statistics and on the DFR schemes. https://github.com/xapian/xapian/pull/6
2013 Apr 11
1
Added support for TfIdf to Omega
Hello guys, I have added code for tfidf to the weight.cc file in omega/. Here is the patch: https://github.com/aarshkshah1992/xapian/commit/5ff41a15f574e6780cc61e67e7f3da3d97ff4ec8 It compiles well and I think it'll work well. Here's the link to the documentation file omegascript.rst where I've added tfidf.
2013 Mar 05
0
Please take a look at the TfIdf patch
Hello guys, :) Please do take a look at the pull request for the TfIdf patch I've sent, because I want to start working on writing DFR schemes for us and want to incorporate the feedback into making a good hack for the DFR schemes. The patch incorporates all normalizations possible with our current statistics and passed all the tests I wrote for it. I have also attached the tests to the pull request.
2013 Feb 25
0
Sent a pull request for the Tf-Idf Weighting scheme
Hello guys :) I have sent a pull request for the Tf-Idf weighting scheme incorporating as many normalizations as I could with the help of the statistics currently available from Xapian::Weight. Please let me know what you think about it. I used the weighting scheme in a simple searcher and it did a fine job with it. I have no experience with writing tests for features like this. Please give me
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement one in Xapian (with some frequently used normalizations), as it will also give me a good hang of implementing a weighting scheme before I start working on implementing DFR schemes. I read the following as references and I think I've understood them well and can write the hack :- 1.)
2016 Jul 27
2
Weighting Schemes: Implementing Piv+ Normalization
Hi, I have added support for Piv normalization in the Tf-Idf weighting scheme as an intermediate step to implementing support for Piv+ normalization. All tests pass. But I'm running into some issues with Piv+ normalization. In the Piv+ formula, there are two parameters (s and delta) that control the weight assigned. I think the way I'm serialising and unserialising these parameters has
2013 Aug 26
2
Backend for Lucene format indexes-How to get doclength
On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote: > > For now, using weighting schemes which don't use document length is > > probably the simplest answer. > > There's tf-idf weighting scheme on svn master, is it suitable for lucene > backend? Yes - TfIdfWeight doesn't ever use the document length (at least with the normalisations currently
2012 Mar 27
1
About the projects of "Ranking" for GSoC 2012
Hello, I am Mohiuddin Abdul Qader, a final year student from the dept of CSE at Bangladesh University of Engineering & Technology (BUET). My major was artificial intelligence and I finished my course on Machine Learning and Pattern Recognition this year. I am very keen to contribute to the open source community. I have just completed my thesis on 'Location Based Structured Web Search'. For the
2013 Mar 20
0
Registering a weighting scheme with Xapian
Hello guys, I've modified the TfIdf patch as per the feedback I got on it and have added the code to the pull request. Please do have a look and let me know what you think. https://github.com/xapian/xapian/pull/6 Also, I read somewhere that I need to register this weighting scheme with Xapian. Could you please shed some light on that? -Regards -Aarsh
2016 Mar 10
2
Introduction and Doubts
I was not sharing it on the mailing list because I thought that someone could use all the ideas I proposed in their GSoC proposal. Surely I will contribute to the Xapian project. Sorry if that was against the rules. The algorithm was not developed by me; after much research on various clustering techniques, I found that there is a new algorithm called CLUBS (Clustering Using Binary Splitting) which gives
2016 Mar 10
2
Introduction and Doubts
Tf-idf is the most widely used weighting scheme, is easy to understand, and has been used in other frameworks like Lucene and in many other places. Okapi BM25 (implemented in Xapian) is theoretically a better/improved measure than tf-idf, and I am looking into various other weighting schemes which are there in Xapian or can be implemented, like TF-ICF (term frequency inverse corpus frequency), TF-RF (term
2013 Mar 27
1
Need help as Pl2 tests not performing as expected
Hello guys. I just ran the updated tests for PL2 and they are not giving the mset order I expect. Now, the thing is, DFR's behavior is a bit hard to predict, so even if I expect a particular order, it may give another order and still be correct. So, the only way to write correct tests for PL2 is to manually calculate the weight of the documents to decide the expected order. For that, I need to
2016 Aug 07
2
Weighting Schemes: Evaluation results
Hi, Evaluation of pivoted normalization ("PPP") of the tf-idf weighting scheme is also complete now. I have also evaluated the default tf-idf normalization ("ntn") and other normalization combinations involving pivoted normalization in the wdfn, idfn and wtn components, as "Pxx", "xPx" and "xxP" normalization strings respectively, to have a clear idea about
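For readers unfamiliar with the three-letter normalization strings mentioned in this message, here is a minimal sketch of what the default "ntn" combination could compute. It assumes the usual SMART-style reading: 'n' leaves the within-document frequency (wdf) unchanged, 't' uses log(N / termfreq) as the idf, and the final 'n' applies no further weight normalization. The function name and argument names are hypothetical and this is not Xapian's code.

```python
import math


def tfidf_ntn(wdf: int, termfreq: int, collection_size: int) -> float:
    """Illustrative tf-idf weight for the "ntn" normalization string.

    Assumed meaning of each letter (not taken from the message itself):
      n -> wdfn is the raw within-document frequency
      t -> idfn is log(collection_size / termfreq)
      n -> no normalization of the final weight
    """
    if termfreq <= 0 or collection_size < termfreq:
        raise ValueError("need 0 < termfreq <= collection_size")
    wdfn = float(wdf)
    idfn = math.log(collection_size / termfreq)
    return wdfn * idfn
```

Under these assumptions, a term that occurs in every document gets an idf of log(1) = 0 and therefore contributes no weight, whatever its wdf.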
2013 Mar 03
0
Sent a pull request for testing TradWeight using an Rset.
Hello guys. As discussed on IRC, I have sent a pull request for a test for testing TradWeight with an Rset.
2013 Jan 27
1
Added a python example to the community page
Hey guys, I have added a Python indexer example to the SampleCode page of our wiki. Please do have a look. The code can also be found here: https://github.com/aarshkshah1992/xapian/blob/efcf443527b74326119bbc0935fc41a002ce60db/xapian-bindings/python/docs/examples/simpleindexgrep.py/ Thanks :) -Regards -Aarsh
2013 Sep 02
2
Backend for Lucene format indexes-How to get doclength
On Mon, Sep 02, 2013 at 09:21:48AM +0800, jiangwen jiang wrote: > TfIdfWeight and BM25(b=0) also need wdf_upper_bound, it is not exists in > Lucene backends. If you don't provide an implementation of wdf_upper_bound(), the default is to use the collection frequency of the term, so provided that information is available in the lucene files, the lack of wdf_upper_bound information
2013 May 15
0
Better parsing of BM25 parameters in Omega
Hello guys, as discussed on IRC, I have written some code for better parsing of BM25 parameters in Omega. If no parameters are specified, it uses the defaults for all of them. However, if some are specified and some are not, or if invalid values are given for any of them, it throws an error. https://github.com/aarshkshah1992/xapian/commit/ac0a11f5d8ff975fad1e96e63764eab9b04dfcfb -Regards -Aarsh
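The all-or-nothing behaviour described in this message can be sketched roughly as follows. This is only an illustration in Python, not the code from the linked commit: the function name, the whitespace-separated input format, the parameter names and the default values are all assumptions, and "invalid" is taken here to mean simply "not parseable as a number".

```python
# Hypothetical parameter names and defaults, for illustration only.
BM25_DEFAULTS = {"k1": 1.0, "k2": 0.0, "k3": 1.0, "b": 0.5, "min_normlen": 0.5}


def parse_bm25_params(spec: str) -> dict:
    """Parse a whitespace-separated BM25 parameter string.

    No parameters  -> every parameter takes its default.
    All parameters -> each one must parse as a number.
    Anything else  -> ValueError (a partial list or a non-numeric value).
    """
    fields = spec.split()
    if not fields:
        return dict(BM25_DEFAULTS)
    if len(fields) != len(BM25_DEFAULTS):
        raise ValueError(
            f"expected {len(BM25_DEFAULTS)} BM25 parameters, got {len(fields)}"
        )
    params = {}
    for name, text in zip(BM25_DEFAULTS, fields):
        try:
            params[name] = float(text)
        except ValueError:
            raise ValueError(f"invalid value for {name}: {text!r}") from None
    return params
```

Rejecting a partial list, rather than filling the gaps with defaults, keeps a mistyped setting from silently falling back to a value the user did not intend.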
2009 Jan 27
0
samba, ADS and privileges management
Hello list. I once had a samba server acting as a PDC, a mapping between my NT 'Domain admins' and Unix 'admins' groups, and everything worked perfectly. Now I got a new shiny samba server acting as a print server only, member of an AD domain, and I can't have the members of 'Domain admins' group manage printing drivers on the server, whereas the Administrator
2013 Mar 02
2
Getting Started
Hello all, I am Mohd Azeem. I want to contribute to Xapian but I am a newbie here. I wonder if anyone could help me get started with Xapian. I have some basic knowledge of IR and have implemented TF*IDF and PageRank schemes, as well as an inverted index and a web crawler. regards, Azeem
2013 Nov 12
0
Data Analyst and Coordinator
Dear R-Sig-Jobs members, For its Executive Office in Brussels, The International Diabetes Federation (IDF) is looking to hire a Data Analyst & Coordinator with significant R experience. This person will join the Epidemiology and Public Health unit that sits within the Policy & Programmes department. They will be responsible for the management of IDF's high-profile Diabetes Atlas. They