similar to: Interested in IR, Getting started with Xapian

Displaying 20 results from an estimated 600 matches similar to: "Interested in IR, Getting started with Xapian"

2013 Mar 02
3
How to add a custom weight to the relevancy value and sort by it.
Hello guys, I have a weight value that is calculated from some factor, and I need to add it to the relevancy value of a result and sort by the combined value. Is that possible in Xapian? Thanks, VishnuKumar
2018 Jan 22
2
How to get the serialise score returned in Xapian::KeyMaker->operator().
> A possible workaround (and perhaps a better approach) would be to set BoolWeight as the weighting scheme, then feed in your score as a weight using a PostingSource. Then it's available via get_weight() on the MSetIterator object: https://getting-started-with-xapian.readthedocs.io/en/latest/advanced/postingsource.html You may find that's faster because
2020 Feb 07
2
prioritizing aggregated DBs
Hey all, I've been using ->add_database for a few years to tie sharded DBs together and it works great. Now, I want to be able to search across several DBs which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB. I want to search for something across all of them, but prioritize results to favor one or some of those DBs over others. Is there a way to do that without reindexing? Or
2017 Dec 15
5
How to get the serialise score returned in Xapian::KeyMaker->operator().
Hi all, I am a Xapian user and I have a problem using it. After using boolean terms to get some candidate documents (still too many), we want to sort them by a self-defined function implemented in Xapian::KeyMaker->operator(). But how can I get the serialised score from the Xapian::MSetIterator object? The C++ code looks like this: class SortKeyMaker : public Xapian::KeyMaker { std::string
2016 Apr 22
2
Weighting recent results
I did some digging and found a thread from 2011 talking about how to subclass Xapian::PostingSource in order to incorporate the date or recency of a document in its weighting: http://thread.gmane.org/gmane.comp.search.xapian.general/8849/focus=8856 As in that thread, I want to be clear that I don't want to sort by date, but rather incorporate date information into the score by which I
2016 May 03
2
Weighting recent results
On 5/2/2016 9:03 PM, Olly Betts wrote: > On Fri, Apr 22, 2016 at 12:23:15PM -0400, Alex Aminoff wrote: >> I did some digging and found a thread from 2011 talking about how to >> subclass Xapian::PostingSource in order to incorporate the date or >> recency of a document in its weighting: >> >> http://thread.gmane.org/gmane.comp.search.xapian.general/8849/focus=8856
2011 May 23
1
More relevance for recent documents
Good afternoon, I would like to ask whether it is possible to give more relevance to recent documents in search results. I don't want to sort results by date; I still prefer relevance, but I would like recent documents to score better. I tried adding a search query using AND_MAYBE, which should use relevance from both subqueries, but it didn't add any benefit to the
2011 Aug 09
3
What is the fastest way to fetch results which are sorted by timestamp?
What is the fastest way to fetch results which are sorted by timestamp? I want to use Xapian as my search engine: I use add_boolean_term(something) and add_value(0, sortable_serialise(get_timestamp())) on a doc, then search with enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0, True) to ensure that the results are sorted by the timestamp. This method is OK, but
2010 Aug 27
1
Using relevance when sorting by generated key
Hi all, I am trying to implement a scheme where documents in an MSet will be sorted based on relevance as well as geographical distance from a given (non-fixed) point. I understand that this can be accomplished by using PostingSource in order to implement a custom weighting scheme that would combine BM25 weighting and distance. The problem is that I am using Perl and PostingSource is not
2008 Sep 10
1
mu-0.2, maildir indexer/searcher with xapian support
Hi all, [ Hopefully announcements like this are appropriate here... ] I've just released version 0.2 of my maildir scanner/search called 'mu': http://www.djcbsoftware.nl/code/mu/ It's written in C and a bit of C++, and released under the GPL. Thanks for the help I got here, it was quite easy to integrate Xapian, and it works really nicely -- a high quality product. Great job!
2010 Apr 16
2
best practices - combining sql database and xapian, size of database?
Newbie-alert: I'm just getting started on a new project involving a full text search requirement, and my initial investigation points to Xapian being the way to go. Two questions: - eventually I'll most likely be indexing towards 50 million documents - is this reasonable to expect or attempt with Xapian? - each of my documents comes with a set of attributes. These are easily stored
2010 Jun 24
1
Quickest way to retrieve data for a large match set?
We're using the Perl binding to access Xapian in a simple search of image metadata (title and keywords). Due to the specification for the search engine, by default we have to sort the results using a function of the search rank, age (well, newness) and popularity (rated by sales of the image). As a result, we have to fetch the complete result set and then calculate a new ranking based on
2017 Dec 18
2
How to get the serialise score returned in Xapian::KeyMaker->operator().
On Sat, Dec 16, 2017 at 10:11:40PM +0000, Olly Betts wrote: > Unfortunately the sort key isn't currently exposed via the public API. > It's available internally and it seems like it ought to be accessible > but there's no accessor method for it - I can add one but that won't > help for existing releases. I've added MSetIterator::get_sort_key() to master in
2012 Apr 02
0
GSoC, Xapian Project Weighting Schemes
Hello all, I am very sorry I did not include xapian-devel mailing list in my previous mail. Thanks for responding my mail. Mohd Azeem NIT UK ________________________________ From: Olly Betts <olly at survex.com> To: Mohd Azeem <azeem201001 at yahoo.in> Cc: Parth Gupta <parthg.88 at gmail.com> Sent: Saturday, 31 March 2012 11:40 AM Subject: Re: GSoC, Xapian Project Weighting
2008 Oct 09
3
Sorting results by a "sort expression"
Olly, We currently use Sphinx for our website search function, but we're planning on using Xapian instead for a few of the extra features it has. Our website is written in Ruby on Rails, so of course we're using Xapian with Ruby bindings. I don't know if you're familiar with Sphinx but Sphinx allows you to pass a sort expression when you execute the search that will be evaluated
2008 Dec 17
1
using ValueWeightPostingSource
Hi, I'm currently using PostingSource to add some weight to the results using a value. I didn't find any documentation on how to use it with the query, so I link a query constructed using the posting source and a query made using the query parser with an AND operator: Xapian.Query queryText = parser.ParseQuery("test:" + textBox1.Text + " DS:1 DS:2"); Xapian.Query
2010 Aug 09
2
File descriptor leak (?) in Python
Hi all, Recently I have upgraded a Python application from Xapian 1.0.7 to 1.2.2 in order to use the PostingSource class. It is a long-running process, and I am seeing the number of open file descriptors to the Xapian database steadily increase. I suspect what I am seeing is some kind of resource leak. I have no idea if it is a problem in our code or in the Xapian Python bindings. How do I debug
2013 Mar 18
2
Incremental indexing
Hi all, I am trying to implement an incremental indexing scheme. The problem is that usually the modified documents are large but the modifications are limited. Ideally, I would like to reindex only the modified parts of these documents. If I am not mistaken, Xapian cannot do that. Are there any other approaches? It would be nice if Xapian supported something like the SQL "group by".
2024 Apr 26
1
queries for a set of values
I probably should've used boolean terms in addition to numeric values when indexing, but currently I have a set of numeric values[1] and am trying to avoid having to reindex ~250GB DBs (and asking numerous users to do the same). Say I have a bunch of values which I want to filter a query against. If I had boolean terms, I could just OP_OR against the whole set. IOW, this is what notmuch does
2018 Jan 24
0
How to get the serialise score returned in Xapian::KeyMaker->operator().
On Tue, Jan 23, 2018 at 12:55:31AM +0800, 张少华 wrote: > We realise our score function using PostingSource instead of using > KeyMaker, we reference your python example and source code of xapian, > the simple demo is here. > https://github.com/xiangqianzsh/xapian_leaning/blob/master/postingsource/ExternalWeightPostingSource.h I'd just put the get_weight() and get_maxweight()