similar to: sorting large msets

Displaying 20 results from an estimated 800 matches similar to: "sorting large msets"

2018 Mar 31
2
sorting large msets
Olly Betts <olly at survex.com> wrote: > On Fri, Mar 30, 2018 at 05:21:43PM +0000, Eric Wong wrote: > > Hello, is there a way to optimize sorting by certain values > > for queries which return a huge amount of results? > [...] > > $enquire->set_sort_by_value_then_relevance(0, 1); > > If you're just wanting the 200 newest, it'll be faster not to
2011 Aug 11
3
Fwd: Re: what is the fastest way to fetch results which are sorted by timestamp ?
(Forwarded off-list message) -------- Original Message -------- Subject: Re: [Xapian-discuss] what is the fastest way to fetch results which are sorted by timestamp ? Date: Thu, 11 Aug 2011 01:06:36 +0800 From: ??? <panjunyong at gmail.com> To: Tim Brody <tdb2 at ecs.soton.ac.uk> On Wed, Aug 10, 2011 at 6:39 PM, Tim Brody <tdb2 at ecs.soton.ac.uk> wrote: > Hi, > > In
2005 Jun 29
2
Sort by docid
Hello, I wonder if there is a way to cause Xapian to order a result set purely by docid. In other words, once the result set has been determined, I'd like the results to be returned to me ordered by their docid, as opposed to by their match relevance. The problem at hand is that I'm building a search engine for a mailing list and I would like to return matches sorted by date; ordering by
2018 Mar 30
0
sorting large msets
On Fri, Mar 30, 2018 at 05:21:43PM +0000, Eric Wong wrote: > Hello, is there a way to optimize sorting by certain values > for queries which return a huge amount of results? [...] > $enquire->set_sort_by_value_then_relevance(0, 1); If you're just wanting the 200 newest, it'll be faster not to calculate weights, so: $enquire->set_sort_by_value(0, 1);
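The advice in this reply (sort on a stored value and skip weight calculation when only the newest 200 results matter) can be pictured without Xapian at all. The following is a minimal pure-Python sketch, not Xapian code: the list of (docid, timestamp) pairs is hypothetical data standing in for documents and their value in slot 0, and `newest_k` is an illustrative name rather than a Xapian API.

```python
import heapq

def newest_k(matches, k):
    """Return the k matches with the largest timestamp, newest first.

    matches: iterable of (docid, timestamp) pairs (hypothetical data standing
    in for documents and the value stored in slot 0).
    No relevance weight is computed anywhere; only the sort key is compared.
    """
    # heapq.nlargest tracks at most k candidates at a time,
    # so the full match set is never sorted.
    return heapq.nlargest(k, matches, key=lambda m: m[1])

matches = [(1, 1522400000), (2, 1522486400), (3, 1522300000), (4, 1522500000)]
top = newest_k(matches, 2)
print(top)
```

The point of the sketch is only that a top-k selection on a single key needs neither a per-document score nor a full sort; how Xapian does this internally is not shown here.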
2011 Aug 09
3
what is the fastest way to fetch results which are sorted by timestamp ?
What is the fastest way to fetch results which are sorted by timestamp? I want to use Xapian as my search engine, using add_boolean_term(something) and add_value(0,sortable_serialise(get_timestamp())) on a doc, then searching through enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0,True) to ensure that the results are sorted by the timestamp. This method is OK, but
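The post stores the timestamp through sortable_serialise so that comparing the stored values orders documents by time. As a rough illustration of why an order-preserving encoding is needed, here is a pure-Python sketch. It is an assumption-laden stand-in, not Xapian's actual encoding: `sortable_key` is a hypothetical helper that handles only non-negative integers, packing them as fixed-width big-endian bytes.

```python
import struct

def sortable_key(timestamp):
    """Encode a non-negative integer timestamp as 8 big-endian bytes.

    Hypothetical stand-in for an order-preserving serialisation:
    comparing the resulting byte strings gives the same order as
    comparing the original numbers.
    """
    return struct.pack(">Q", timestamp)

timestamps = [1312900000, 5, 1312800000, 70000]

# Sorting by the encoded bytes matches numeric order...
by_bytes = sorted(timestamps, key=sortable_key)

# ...whereas sorting by the plain decimal string does not.
by_text = sorted(timestamps, key=str)
print(by_bytes, by_text)
```

Plain decimal strings compare character by character, so "5" sorts after "70000"; a fixed-width big-endian encoding avoids that problem for the integer case shown here.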
2018 Apr 03
0
sorting large msets
On Sat, Mar 31, 2018 at 12:58:19AM +0000, Eric Wong wrote: > Olly Betts <olly at survex.com> wrote: > > If you're just wanting the 200 newest, it'll be faster not to calculate > > weights, so: > > > > $enquire->set_sort_by_value(0, 1); > > $enquire->set_weighting_scheme(new Xapian::BoolWeight()); > > > > For me, this drops the time
2018 Apr 06
1
sorting large msets
> > Olly Betts <olly at survex.com> wrote: > > > > > > The reverse order (ENQ_ASCENDING) is really fast - about 0.0001 seconds. > > > This is because in that case we can just stop once we've found 200 > > > matches. With a few million documents, that ENQ_ASCENDING sounds promising :) So, it looks like if I had ideal ordering, I could do
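The quoted explanation, that the fast case can "just stop once we've found 200 matches", amounts to early termination over documents that are already visited in the desired order. The following pure-Python sketch shows the idea under stated assumptions: `first_k_matches` and the `matches_query` predicate are hypothetical names, and the integer range stands in for documents already arranged in sort order. It is not how Xapian's matcher is implemented.

```python
def first_k_matches(docs, matches_query, k):
    """Scan docs in their existing order and stop after k matches.

    Returns (found, examined): the matching docs and how many docs
    were looked at before stopping.
    """
    found = []
    examined = 0
    if k <= 0:
        return found, examined
    for doc in docs:
        examined += 1
        if matches_query(doc):
            found.append(doc)
            if len(found) == k:
                break  # early termination: later docs cannot rank higher
    return found, examined

# One million hypothetical docids, already in the wanted order;
# every third one "matches" the query.
docs = range(1, 1_000_001)
found, examined = first_k_matches(docs, lambda d: d % 3 == 0, 200)
print(len(found), examined)
```

In this sketch only 600 of the million candidates are examined, which is why the order in which documents are visited matters so much for the "newest N" case.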
2010 Aug 23
1
Sort ordering
Using MultiValueSorter, I can sort by key1, key2, relevance; or relevance, key1, key2. But AFAIK, I can't sort by key1, relevance, key2, unless I spool out the entire result set or write some C++. I wonder if we need a new 'sort by' function that accepts any combination of keys and relevance in any order? The function would make its own optimisations (ie is relevance first or
2013 Oct 23
2
performance on document.get_data()
I have a performance issue with document.get_data() and enquire.get_mset(). It takes 35 seconds for matches = enquire.get_mset(0,200), and 3 seconds to iterate over all docs in matches calling get_data. Is that normal? My index contains 30 million documents. I use the Python bindings to operate Xapian. Below is my index structure # value: 0:date, 1:site # data: json message which contains: author,
2007 Apr 11
1
Deprecation Policy
When going through the xapian bindings yesterday, I noticed that several of the methods were not wrapped for Ruby because they were deprecated at the time the ruby bindings were created. I filed a bug (#126) saying that they should be removed entirely, which led to the suggestion from Olly that it would be good to make a semi-formal policy about deprecating features. I've written such a
2011 Aug 10
0
xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING so slow!
I have 300 million records and my search file looks like this. I want the newest 10 results that match my query, so I use boolean search and "enquire.set_docid_order(enquire.DESCENDING)", but this method seems a little slow. When I remove "enquire.set_docid_order(enquire.DESCENDING)" it runs much faster. How can I fetch the newest 10 results as fast as possible? search.py
2007 Oct 16
1
Matches estimate varies with sorting method
Hi all, I found that the figure returned by MSet::get_matches_estimated() varies depending on how results are to be sorted. For instance, in my index, value 4 contains date and time in the format "yyyymmddhhmmss". For the same query, the number of results will be estimated to 20000+ when results are first sorted by date and time with set_sort_by_value_then_relevance(4) and to only 100
2017 Apr 09
3
Omega: Missing support for newer weighting schemes
On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote: > > Each scheme already has a human-readable name, and Xapian::Registry > > can map that to an "exemplar" object of the right type, so we > > could take a string like "bm25 1 0.8", see the first word is "bm25" > > and get a BM25Weight object, then call parse_params("1 0.8") on
2017 Apr 08
2
Omega: Missing support for newer weighting schemes
On Sat, Apr 08, 2017 at 09:11:22PM +0100, James Aylett wrote: > On 8 Apr 2017, at 19:15, Vivek Pal <vivekpal.dtu at gmail.com> wrote: > > >> and the details of which weighting schemes were available in which version > >> isn't a key part of the $set command itself. > > > > Do you suggest dropping that piece of information out? Since the reason behind
2017 Apr 12
4
Omega: Missing support for newer weighting schemes
> Each scheme already has a human-readable name, and Xapian::Registry > can map that to an "exemplar" object of the right type, so we > could take a string like "bm25 1 0.8", see the first word is "bm25" > and get a BM25Weight object, then call parse_params("1 0.8") on it to > create the correct Weight object (broadly similar to how
2012 Apr 27
4
GSoC xapian node binding
Posting recent offline discussion... On Fri, Apr 27, 2012 at 10:55 AM, Marius Tibeica <mtibeica at gmail.com> wrote: > Hi Liam, > > I've added the Enquire class and designed a query spec structured as a JS > object. Hope you like it :) > I'll probably be off a few days (there is a national holiday Tuesday which > means i have a long weekend :D) but maybe I'll
2017 Apr 13
2
Omega: Missing support for newer weighting schemes
On Mon, Apr 10, 2017 at 11:47:36PM +0530, Vivek Pal wrote: > > No, use Xapian::Registry to find the weighting scheme from the name > > like how Weight::unserialise() does (otherwise every caller would need > > code similar to that above). > > Okay, I looked into Xapian::Registry and it seems you are referring to using > the get_weighting_scheme method? (which expects a
2023 May 03
1
manual flushing thresholds for deletes?
Olly Betts <olly at survex.com> wrote: > On Mon, Mar 27, 2023 at 11:22:09AM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > 10 seems too long. You want the mean word length weighted by frequency > > > of occurrence. For English that's typically around 5 characters, which > > > is 5 bytes. If we go for +1 that's:
2023 Aug 18
1
does Xapian::Enquire hold an MVCC revision?
On Thu, Aug 17, 2023 at 09:28:26PM +0000, Eric Wong wrote: > In other words, is it possible to avoid duplicates if new > documents are inserted into the DB by another process in-between > ->get_mset calls when reusing Xapian::Enquire objects? The Database object itself effectively does (it works in a snapshot of the state of the database when you open it, or last called reopen() which
2012 Dec 15
1
virt-resize Fatal error: exception Guestfs.Error("e2fsck_f
We've been seeing this a lot lately on generic CentOS 6 rpm installs: rpm -qa | grep libguestfs libguestfs-java-1.16.19-1.el6.x86_64 libguestfs-java-devel-1.16.19-1.el6.x86_64 libguestfs-1.16.19-1.el6.x86_64 libguestfs-tools-1.16.19-1.el6.x86_64 libguestfs-javadoc-1.16.19-1.el6.x86_64 libguestfs-devel-1.16.19-1.el6.x86_64 libguestfs-tools-c-1.16.19-1.el6.x86_64