Displaying 20 results from an estimated 600 matches similar to: "Merging of the TfIdf patch"
2013 Mar 03
0
Added code and tests for the tf-idf weighting scheme.
Hello guys. I have sent a pull request for the code and tests of the Tf-Idf
weighting scheme.
Please do let me know if any changes are required. Meanwhile, I'll begin
working on implementing normalizations which require additional statistics
and on the DFR schemes.
https://github.com/xapian/xapian/pull/6
2013 Apr 11
1
Added support for TfIdf to Omega
Hello guys, I have added code for tfidf to the weight.cc file in omega/.
Here is the patch:
https://github.com/aarshkshah1992/xapian/commit/5ff41a15f574e6780cc61e67e7f3da3d97ff4ec8
It compiles well and I think it'll work well.
Here's the link to the documentation file omegascript.rst where I've added
tfidf.
2013 Mar 05
0
Please take a look at the TfIdf patch
Hello guys :) Please do take a look at the pull request for the TfIdf
patch I've sent, because I want to start working on writing DFR schemes for
us and want to incorporate the feedback into making a good hack for the DFR
schemes. The patch incorporates all normalizations possible with our current
statistics and passed all the tests I wrote for it. I have also attached the
tests with the pull request.
2013 Feb 25
0
Sent a pull request for the Tf-Idf Weighting scheme
Hello guys :) I have sent a pull request for the Tf-Idf Weighting scheme
incorporating as many normalizations as I could with the help of statistics
currently available from Xapian::Weight. Please let me know what you
think about it.
I used the weighting scheme in a simple searcher and it did a fine job with
it. I have no experience with writing tests for features like this. Please
give me
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement them in
Xapian (with some frequently used normalizations), as it will also give me a
good hang of implementing a weighting scheme before I start working on
implementing DFR schemes.
I read the following as references and I think I've understood them well and
can write the hack:
1.)
2016 Jul 27
2
Weighting Schemes: Implementing Piv+ Normalization
Hi,
I have added support for Piv normalization in the Tf-Idf weighting scheme as an
intermediate step to implementing the support for Piv+ normalization. All
tests pass.
But I'm running into some issues with Piv+ normalization. In the Piv+
formula, there are two parameters (s and delta) that control the weight
assigned. I think the way I'm serialising and unserialising these
parameters has
2013 Aug 26
2
Backend for Lucene format indexes-How to get doclength
On Mon, Aug 26, 2013 at 09:41:07AM +0800, jiangwen jiang wrote:
> > For now, using weighting schemes which don't use document length is
> > probably the simplest answer.
>
> There's tf-idf weighting scheme on svn master, is it suitable for lucene
> backend?
Yes - TfIdfWeight doesn't ever use the document length (at least with
the normalisations currently
2012 Mar 27
1
About the projects of "Ranking" for GSoC 2012
Hello,
I am Mohiuddin Abdul Qader, a final year student from the dept of CSE at
Bangladesh University of Engineering & Technology (BUET).
My major was artificial intelligence and I finished my course on Machine
Learning and Pattern Recognition this year. I am very keen to contribute to the
open source community. I have just completed my thesis on 'Location Based
Structured Web Search'. For the
2013 Mar 20
0
Registering a weighting scheme with Xapian
Hello guys, I've modified the TfIdf patch as per the feedback I got on it
and have added the code to the pull request. Please do have a look and let
me know what you think.
https://github.com/xapian/xapian/pull/6
Also, I read somewhere that I need to register this weighting scheme with
Xapian. Could you please shed some light on that?
-Regards
-Aarsh
2016 Mar 10
2
Introduction and Doubts
I was not sharing it on the mailing list because I thought that someone could use
all the ideas I proposed in their GSoC proposal.
I will surely contribute to the Xapian project.
Sorry if that was against the rules.
The algorithm was not developed by me; after much research on
various clustering techniques,
I found that there is a new algorithm called CLUBS (Clustering Using Binary
Splitting) which gives
2016 Mar 10
2
Introduction and Doubts
Tf-idf is the most widely used weighting scheme; it is easy to understand and has
been used in other frameworks like Lucene and many other places.
Okapi BM25 (implemented in Xapian) is theoretically a better/improved measure
than tf-idf, and
I am looking into various other weighting schemes which are in Xapian
or can be implemented, like TF-ICF (term frequency inverse corpus
frequency), TF-RF (term
2013 Mar 27
1
Need help as Pl2 tests not performing as expected
Hello guys. I just ran the updated tests for PL2 and they are not giving
the mset order I expect. Now, the thing is, DFR's behavior is a bit hard to
predict, so even if I expect a particular order, it may give another
order and still be correct. So the only way to write correct tests for PL2
is to manually calculate the weight of the documents to decide the expected
order. For that, I need to
2016 Aug 07
2
Weighting Schemes: Evaluation results
Hi,
Evaluation of pivoted normalization ("PPP") of the tf-idf weighting scheme is
also complete now. I have also evaluated the default tf-idf normalization
("ntn") and other normalization combinations involving pivoted
normalization in the wdfn, idfn and wtn components as "Pxx", "xPx" and "xxP"
normalization strings respectively to have a clear idea about
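
A minimal Python sketch of what a three-letter normalization string such as "ntn" is taken to select, one letter each for the wdfn, idfn and wtn components named above. It assumes that 'n' means "no normalization" for wdfn and wtn, and that 't' means idf = log(N / termfreq). The function name and arguments are hypothetical and this is not Xapian's implementation:

```python
import math


def tfidf_ntn(wdf, termfreq, collection_size):
    """Weight of one term in one document under the assumed "ntn" scheme.

    wdf             -- within-document frequency of the term
    termfreq        -- number of documents containing the term
    collection_size -- total number of documents (N)
    """
    if termfreq <= 0 or collection_size <= 0:
        raise ValueError("termfreq and collection_size must be positive")
    wdfn = float(wdf)                            # 'n': raw wdf, unnormalized
    idfn = math.log(collection_size / termfreq)  # 't': log(N / termfreq)
    return wdfn * idfn                           # 'n': no final normalization


# A term occurring 3 times in a document and present in 10 of 1000 documents:
weight = tfidf_ntn(3, 10, 1000)
```

Under these assumptions the document length never enters the calculation, and a term that occurs in every document gets a weight of zero.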
2013 Mar 03
0
Sent a pull request for testing TradWeight using an Rset.
Hello guys. As discussed on IRC, I have sent a pull request for a test for
testing TradWeight with an Rset.
2013 Jan 27
1
Added a python example to the community page
Hey guys, I have added a Python indexer example to the SampleCode page of
our wiki. Please do have a look. The code can also be found here:
https://github.com/aarshkshah1992/xapian/blob/efcf443527b74326119bbc0935fc41a002ce60db/xapian-bindings/python/docs/examples/simpleindexgrep.py/
Thanks :)
-Regards
-Aarsh
2013 Sep 02
2
Backend for Lucene format indexes-How to get doclength
On Mon, Sep 02, 2013 at 09:21:48AM +0800, jiangwen jiang wrote:
> TfIdfWeight and BM25(b=0) also need wdf_upper_bound, which does not exist in
> Lucene backends.
If you don't provide an implementation of wdf_upper_bound(), the default
is to use the collection frequency of the term, so provided that
information is available in the lucene files, the lack of
wdf_upper_bound information
2013 May 15
0
Better parsing of BM25 parameters in Omega
Hello guys, as discussed on IRC, I have written some code for better
parsing of BM25 parameters in Omega. If no parameters are specified, it
uses the defaults for all of them. However, if some are specified and some are
not, or if invalid values are given for any of them, it throws an error.
https://github.com/aarshkshah1992/xapian/commit/ac0a11f5d8ff975fad1e96e63764eab9b04dfcfb
-Regards
-Aarsh
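
The all-or-nothing rule described in this message can be sketched in Python as follows. This is not the Omega code from the linked commit. The function name, the parameter order (k1, k2, k3, b, min_normlen) and the default values are assumptions made for illustration:

```python
import math

# Assumed parameter order and defaults, for illustration only.
BM25_PARAM_NAMES = ("k1", "k2", "k3", "b", "min_normlen")
BM25_DEFAULTS = (1.0, 0.0, 1.0, 0.5, 0.5)


def parse_bm25_params(spec):
    """Parse a whitespace-separated BM25 parameter string.

    An empty string yields all defaults. Otherwise every parameter must be
    present and must be a finite number, or ValueError is raised.
    """
    fields = spec.split()
    if not fields:
        return dict(zip(BM25_PARAM_NAMES, BM25_DEFAULTS))
    if len(fields) != len(BM25_PARAM_NAMES):
        raise ValueError(
            "expected %d BM25 parameters, got %d"
            % (len(BM25_PARAM_NAMES), len(fields))
        )
    params = {}
    for name, text in zip(BM25_PARAM_NAMES, fields):
        try:
            value = float(text)
        except ValueError:
            raise ValueError("invalid value for %s: %r" % (name, text)) from None
        if not math.isfinite(value):
            raise ValueError("non-finite value for %s: %r" % (name, text))
        params[name] = value
    return params


defaults = parse_bm25_params("")                # all defaults
custom = parse_bm25_params("1.2 0 1 0.75 0.5")  # all five given explicitly
```

Rejecting a partial list, rather than filling in the missing values, means a typo that drops one parameter fails loudly instead of silently shifting the remaining values.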
2009 Jan 27
0
samba, ADS and privileges management
Hello list.
I once had a samba server acting as a PDC, a mapping between my NT
'Domain admins' and Unix 'admins' groups, and everything worked perfectly.
Now I got a new shiny samba server acting as a print server only, member
of an AD domain, and I can't have the members of 'Domain admins' group
manage printing drivers on the server, whereas the Administrator
2013 Mar 02
2
Getting Started
Hello all,
I am Mohd Azeem. I want to contribute to Xapian but I am a newbie here. I wonder if anyone could help me in getting started with Xapian. I have some basic knowledge of IR and have implemented TF*IDF and PageRank schemes, as well as an Inverted Index and a Web-Crawler.
regards,
Azeem
2013 Nov 12
0
Data Analyst and Coordinator
Dear R-Sig-Jobs members,
For its Executive Office in Brussels, The International Diabetes
Federation (IDF) is looking to hire a Data Analyst & Coordinator with
significant R experience. This person will join the Epidemiology and
Public Health unit that sits within the Policy & Programmes
department. They will be responsible for the management of IDF's
high-profile Diabetes Atlas. They