Displaying 20 results from an estimated 1400 matches similar to: "Added support for TfIdf to Omega"
2013 Mar 26
1
Merging of the TfIdf patch
Hello guys. I have updated the code, tests, documentation, makefile entries
and the registry entry of the TfIdf patch as per the feedback. Please do
let me know if any additional changes are required before the patch can be
merged.
-Regards
-Aarsh
On Sun, Mar 3, 2013 at 2:50 PM, aarsh shah <aarshkshah1992 at gmail.com> wrote:
> Hello guys. I have sent a pull request for the code and
2013 Mar 03
0
Added code and tests for the tf-idf weighting scheme.
Hello guys. I have sent a pull request for the code and tests of the Tf-Idf
weighting scheme.
Please do let me know if any changes are required. Meanwhile, I'll begin
working on implementing normalizations which require additional statistics
and on the DFR schemes.
https://github.com/xapian/xapian/pull/6
On Tue, Feb 26, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote:
>
2013 Mar 05
0
Please take a look at the TfIdf patch
Hello guys, :) Please do take a look at the pull request for the TfIdf
patch I've sent, because I want to start working on writing DFR schemes for
us and want to incorporate the feedback into making a good hack for the DFR
schemes. The patch incorporates all normalizations possible with our current
statistics and passed all the tests I wrote for it. I have also attached the
tests with the pull request.
2013 Jan 27
1
Added a python example to the community page
Hey guys, I have added a Python indexer example to the SampleCode page of
our wiki. Please do have a look. The code can also be found here:
https://github.com/aarshkshah1992/xapian/blob/efcf443527b74326119bbc0935fc41a002ce60db/xapian-bindings/python/docs/examples/simpleindexgrep.py/
Thanks :)
-Regards
-Aarsh
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement them in
Xapian (with some frequently used normalizations), as it will also give me
the hang of implementing a weighting scheme before I start working on
implementing DFR schemes.
I read the following as references, and I think I've understood them well and
can write the hack:
1.)
2013 Mar 04
2
Need Beginner Guide for Matcher Optimisations Project
Hi,
While searching for a project which matches my interest and skill level, I
found this project named Matcher Optimization. This project is really
challenging and exciting from my viewpoint and I would like to be a part of
this project.
Optimization techniques mentioned in the reference links provided will take
some time for me to understand well. But I am trying
to get my
2013 May 15
0
Better parsing of BM25 parameters in Omega
Hello guys, as discussed on IRC, I have written some code for better
parsing of BM25 parameters in Omega. If no parameters are specified, it
uses the default for all of them. However, if some are specified and some are
not, or if an invalid value is given for any of them, it throws an error.
https://github.com/aarshkshah1992/xapian/commit/ac0a11f5d8ff975fad1e96e63764eab9b04dfcfb
-Regards
-Aarsh
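
The parsing rule described in this message can be sketched roughly as follows. This is only an illustration in Python, not the code from the linked commit: the function name `parse_bm25_params`, the whitespace-separated input format, and the default values for (k1, k2, k3, b, min_normlen) are all assumptions made for the example.

```python
import math

# Hypothetical sketch of the rule described above; this is not Omega's code.
# Assumed default values for (k1, k2, k3, b, min_normlen).
BM25_DEFAULTS = (1.0, 0.0, 1.0, 0.5, 0.5)


def parse_bm25_params(spec):
    """Parse a whitespace-separated BM25 parameter string (assumed format).

    An empty string selects the defaults for every parameter. Otherwise all
    parameters must be present and must be finite numbers, or ValueError
    is raised.
    """
    fields = spec.split()
    if not fields:
        # No parameters given: default all of them.
        return BM25_DEFAULTS
    if len(fields) != len(BM25_DEFAULTS):
        # Some parameters given and some missing: report an error.
        raise ValueError(
            "expected %d BM25 parameters, got %d"
            % (len(BM25_DEFAULTS), len(fields))
        )
    values = []
    for field in fields:
        try:
            value = float(field)
        except ValueError:
            raise ValueError("invalid BM25 parameter: %r" % field)
        if not math.isfinite(value):
            raise ValueError("invalid BM25 parameter: %r" % field)
        values.append(value)
    return tuple(values)
```

Under these assumptions, `parse_bm25_params("")` returns the defaults, `parse_bm25_params("1.2 0 8 0.75 0.5")` returns the five values given, and a partial or non-numeric string such as `"1.2 0.75"` raises `ValueError` instead of silently mixing supplied values with defaults.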
2013 Mar 27
1
Need help as Pl2 tests not performing as expected
Hello guys. I just ran the updated tests for PL2 and they are not giving
the mset order I expect. Now, the thing is, DFR's behavior is a bit hard to
predict, so even if I expect a particular order, it may give another
order and still be correct. So, the only way to write correct tests for PL2
is to manually calculate the weight of the documents to decide the expected
order. For that, I need to
2014 May 14
2
Starting work on Perf Test Module
Hello,
I am beginning work on the perf test module. The initial steps that I aim
to accomplish are:
-> Download the Wikipedia dumps for multiple languages.
-> Write Python scripts to tokenize the dump (will probably use something
like NLTK, which has powerful built-in tokenizers)
-> Discuss and finalize the design of the search and query expansion perf
tests as I want to complete them
2014 Mar 04
2
Test Dataset for performance and accuracy analysis
Hi Parth,
I implemented DFR algorithms in Xapian as
a part of GSoC last year under the mentorship of Olly. This year, I want to
work on analyzing and optimizing the performance of the DFR algorithms and
comparing them with BM25. I also want to work on profiling the query
expansion schemes and test the relevance (precision and recall) / speed (time
taken) of the
2012 Dec 08
2
Want to contribute code to the Xapian project
Hey guys, I am a 3rd year Computer Science undergrad student. I am extremely
interested in contributing code to the Xapian project. The work you people
do sounds extremely fascinating and interesting. Can someone just give me a
brief overview of how to proceed? I can code in C, C++ and Python and
have experience in Natural Language Processing. Am also quite comfortable
with NLTK and using WordNet. Am
2017 Apr 08
2
Omega: Missing support for newer weighting schemes
> Hi, Vivek — there isn't any particular reason that I'm aware of. It's
> probably worth pointing (in the omegascript documentation) to the part of
> the getting started guide which talks about the different weighting schemes
If there isn't any reason then I'd like to send in a patch adding support for
those weighting schemes in weight.cc and I agree omegascript
2013 Jul 17
1
Base class for query expansion
Hello Dan and Olly, this is the code for the base class for query expansion
that I have written. The code will not compile as I have written only the
base class until now. Have yet to use it. Please do tell me what you think
of the base class and what changes you suggest I should make before I move
forward with the project.
https://github.com/xapian/xapian/pull/23
-Regards
-Aarsh
2013 Mar 03
0
Sent a pull request for testing TradWeight using an Rset.
Hello guys. As discussed on IRC, I have sent a pull request for a test for
testing TradWeight with an Rset.
On Fri, Mar 1, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote:
> Send Xapian-devel mailing list submissions to
> xapian-devel at lists.xapian.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>
2007 Jan 28
1
omega: $field{sample} clarification
I'm using omega for a sitesearch and currently having a problem trying
to filter $field{sample}. At the moment it returns text from the page
header and navigation within the sample; ideally I'd like it to return
only the page content. I've been trying various omegascript commands to
trim the output or separate the fields and also looking at scriptindex
to control how the xhtml is
2013 Feb 25
0
Sent a pull request for the Tf-Idf Weighting scheme
Hello guys :) I have sent a pull request for the Tf-Idf Weighting scheme
incorporating as many normalizations as I could with the help of statistics
currently available from Xapian::Weight. Please let me know what you
think about it.
I used the weighting scheme in a simple searcher and it did a fine job with
it. I have no experience with writing tests for features like this. Please
give me
2013 Mar 20
0
Registering a weighting scheme with Xapian
Hello guys, I've modified the TfIdf patch as per the feedback I got on it
and have added the code to the pull request. Please do have a look and let
me know what you think.
https://github.com/xapian/xapian/pull/6
Also, I read somewhere that I need to register this weighting scheme with
Xapian. Could you please shed some light on that?
-Regards
-Aarsh
2013 Jun 22
2
Dealing with negative weights
I was adding the calculations for a lower bound to get_sumpart() (DLH has
no term-independent component) when I realized that the same lower bound
will be calculated for each term-document pair that get_sumpart() is called
for, which reduces efficiency. How do I calculate the lower bound
for a term only once and then reuse it?
-Regards
-Aarsh
On Fri, Jun 21, 2013 at 4:41 PM, Olly Betts
2010 Mar 24
1
Omega: behavior msize when collapsing results
Hello list,
I have a problem with the value of the result size ($msize in
omegascript) when collapsing results. The index contains 151452
documents. I'm using Omega 1.0.18 on FreeBSD (I tried both the version
in ports and the latest one from xapian.org). This is my indexscript:
uniqueid: boolean=Q unique=Q field=uniqueid
objectid: field=objectid boolean=XID value=0
objecttype: field=type
2017 Apr 08
2
Omega: Missing support for newer weighting schemes
On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote:
> On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote:
>
> >> and the details of which weighting schemes were available in which version
> >> isn't a key part of the $set command itself.
> >
> > Do you suggest dropping that piece of information out? Since the reason behind