thr3ads.net - similar to: "xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING so slow!"

Displaying 20 results from an estimated 600 matches similar to: "xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING so slow!"

Fwd: Re: what is the fastest way to fetch results which are sorted by timestamp ?

2011 Aug 11

(Forwarded off-list message) -------- Original Message -------- Subject: Re: [Xapian-discuss] what is the fastest way to fetch results which are sorted by timestamp ? Date: Thu, 11 Aug 2011 01:06:36 +0800 From: ??? <panjunyong at gmail.com> To: Tim Brody <tdb2 at ecs.soton.ac.uk> On Wed, Aug 10, 2011 at 6:39 PM, Tim Brody <tdb2 at ecs.soton.ac.uk> wrote: > Hi, > > In

sorting large msets

2018 Mar 31

Olly Betts <olly at survex.com> wrote: > On Fri, Mar 30, 2018 at 05:21:43PM +0000, Eric Wong wrote: > > Hello, is there a way to optimize sorting by certain values > > for queries which return a huge amount of results? > [...] > > $enquire->set_sort_by_value_then_relevance(0, 1); > > If you're just wanting the 200 newest, it'll be faster not to

sorting large msets

2018 Mar 30

On Fri, Mar 30, 2018 at 05:21:43PM +0000, Eric Wong wrote: > Hello, is there a way to optimize sorting by certain values > for queries which return a huge amount of results? [...] > $enquire->set_sort_by_value_then_relevance(0, 1); If you're just wanting the 200 newest, it'll be faster not to calculate weights, so: $enquire->set_sort_by_value(0, 1);

sorting large msets

2018 Apr 03

On Sat, Mar 31, 2018 at 12:58:19AM +0000, Eric Wong wrote: > Olly Betts <olly at survex.com> wrote: > > If you're just wanting the 200 newest, it'll be faster not to calculate > > weights, so: > > > > $enquire->set_sort_by_value(0, 1); > > $enquire->set_weighting_scheme(new Xapian::BoolWeight()); > > > > For me, this drops the time

what is the fastest way to fetch results which are sorted by timestamp ?

2011 Aug 09

What is the fastest way to fetch results which are sorted by timestamp? I want to use Xapian as my search engine: I call add_boolean_term(something) and add_value(0, sortable_serialise(get_timestamp())) on each doc, then search with enquire.set_weighting_scheme(xapian.BoolWeight()) and enquire.set_sort_by_value(0, True) to ensure that the results are sorted by timestamp. This method is OK, but

Sort by docid

2005 Jun 29

Hello, I wonder if there is a way to cause Xapian to order a result set purely by docid. In other words, once the result set has been determined, I'd like the results to be returned to me ordered by their docid, as opposed to by their match relevance. The problem at hand is that I'm building a search engine for a mailing list and I would like to return matches sorted by date; ordering by

sorting large msets

2018 Mar 30

Hello, is there a way to optimize sorting by certain values for queries which return a huge amount of results? For example, I just want a simple query that gives me the 200 most recent emails out of millions. The elapsed time for get_mset increases as the number of documents ($n * 2000) increases. I suppose I could store a pre-sorted set using SQLite or similar. Thanks in advance for any
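
The "pre-sorted set using SQLite" idea mentioned in this post could look roughly like the sketch below. It is only an illustration under stated assumptions, not the poster's code: the table name `messages`, the column names, the helper functions and the sample data are all hypothetical, and it covers only the "newest N overall" case, not combining the ordering with a Xapian query.

```python
import sqlite3


def build_index(rows):
    """Store (docid, timestamp) pairs with an index on the timestamp column.

    The schema is hypothetical and exists only for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (docid INTEGER PRIMARY KEY, ts INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX messages_ts ON messages (ts)")
    conn.executemany("INSERT INTO messages (docid, ts) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def newest_docids(conn, limit=200):
    """Return the docids of the `limit` most recent messages, newest first."""
    cur = conn.execute(
        "SELECT docid FROM messages ORDER BY ts DESC, docid DESC LIMIT ?",
        (limit,),
    )
    return [row[0] for row in cur]


# Made-up sample data: 1000 messages, one per minute, docid increasing with time.
rows = [(docid, 1_500_000_000 + docid * 60) for docid in range(1, 1001)]
conn = build_index(rows)
top = newest_docids(conn, 200)
```

With the timestamp indexed, the query only has to return the first `limit` rows in timestamp order instead of sorting every row. Whether this beats sorting inside Xapian for a given dataset would need measuring.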

Implementing the tf-idf weighting scheme

2012 Apr 20

Hi, all: This is a basic implementation of the tf-idf scheme (the basic scheme used in SMART) that can be used in Xapian. It might still need some further revision, but I believe it works anyway. :) I modified weight.h to define a subclass Tf_idfWeight and added a new file tf_idf.cc in ../weight in the repo to implement Tf_idfWeight. Here is the git diff patch: https://gist.github.com/2422049

having trouble with prefixes

2013 Sep 02

I've got a small test database setup with one record. $ delve -r 1 -V /tmp/1/ Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg The terms were added with lines like this: doc.add_term(string("P:") + path); Problem is, I can't seem to

performance on document.get_data()

2013 Oct 23

I have a performance issue with document.get_data() and enquire.get_mset(). It takes 35 seconds for matches = enquire.get_mset(0,200), and 3 seconds to iterate over all docs in matches calling get_data. Is this normal? My index contains 30 million documents. I use the Python bindings to operate Xapian. Below is my index structure: # value: 0:date, 1:site # data: json message which contains: author,

How to filter search result with query with has white space.

2013 Sep 22

Hello,
#include <iostream>
#include <string>
#include <xapian.h>
struct document {
    std::string title;
    std::string content;
    std::string url;
};
void indexData(document d) {
    try {
        Xapian::WritableDatabase db("/Users/ramesh/Desktop/xapian", Xapian::DB_CREATE_OR_OPEN);
        Xapian::TermGenerator indexer;
        Xapian::Stem

How to filter search result with query with has white space.

2013 Sep 22

Get term from document by position

2015 Jul 26

mple (see attachment). > > Attachments get stripped out by the mailing list, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>. > > Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to

MSet order

2011 Mar 08

Hello, I defined a weighting scheme to simulate a kind of "euclidean" distance. To test it, I used a database with 1000 documents. If I run: enquire.set_weighting_scheme(MyWeight()); Xapian::MSet matches = enquire.get_mset(0, 1000); I get a correct list of results. But if I run Xapian::MSet matches = enquire.get_mset(0, 10); I don't get the top-10 results. If I run Xapian::MSet

[PATCH] Replace hard-coded PKG_STATEDIR with state_dir setting

2012 Aug 13

Sharing an installed copy of dovecot between several users each running a daemon within their own account (or using the same binaries for a system daemon and a user daemon) is difficult because the compile-time directory PKG_STATEDIR (typically /var/lib/dovecot) is hard-coded as the location of things like the ssl-parameters.dat file and the replicator database. Replace all these uses of

Omega: Missing support for newer weighting schemes

2017 Apr 09

On Sun, Apr 09, 2017 at 11:34:07PM +0530, Vivek Pal wrote: > > Each scheme already has a human-readable name, and Xapian::Registry > > can map that to an "exemplar" object of the right type, so we > > could take a string like "bm25 1 0.8", see the first word is "bm25" > > and get a BM25Weight object, then call parse_params("1 0.8") on

Omega: Missing support for newer weighting schemes

2017 Apr 12

> Each scheme already has a human-readable name, and Xapian::Registry > can map that to an "exemplar" object of the right type, so we > could take a string like "bm25 1 0.8", see the first word is "bm25" > and get a BM25Weight object, then call parse_params("1 0.8") on it to > create the correct Weight object (broadly similar to how

Bug and patch for +terms with wildcards

2006 Dec 06

In current Xapian SVN HEAD, there is a bug in the query parser concerned with the handling of wildcard terms with a "+" prefix. Specifically, a query such as "+foo* bar" will be parsed by the query parser into Xapian::Query("bar") if there are no terms in the database which start "foo". Instead, since the "+" term cannot be matched, I believe

Omega: Missing support for newer weighting schemes

2017 Apr 13

On Mon, Apr 10, 2017 at 11:47:36PM +0530, Vivek Pal wrote: > > No, use Xapian::Registry to find the weighting scheme from the name > > like how Weight::unserialise() does (otherwise every caller would need > > code similar to that above). > > Okay, I looked into Xapian::Registry and it seems you are referring to using > the get_weighting_scheme method? (which expects a

seg fault on search

2014 Jan 21

I have written a very simple function to return the match count based on the simplesearch.cc code. It fails with a seg fault. The relevant code is: -------------------- int ftQuery(char* qs, const char* dbname,char* results, int msize) { long docid; char* op; char fullDB[1024]; string queryString;
