similar to: Xapian wiki: typo in docid to sub-db translation?

2004 Dec 21
1
Search::Xapian add_database'd search results are odd?
Sorry if this is the wrong forum to discuss Search::Xapian issues -- this just seems like the best place.. Anyways, I've been testing out using $db->add_database() when searching, and it seems like the docids I'm getting out of it are incorrect, almost as though they're "double" what they should be (numerically)... the docids that exist should be around 950,000 and
2010 Oct 22
1
overlapping docids when searching on multiple databases?
Just a quick question - it seems to me that it's entirely possible to get overlapping docids when searching on multiple databases? For instance: open database1, add database2 to database1, search db1+db2. If docid 10 exists in both databases, is there any way of telling which database to retrieve the document from? /Per Jessen, Zürich
2015 Mar 11
2
stub-file and get_doccount
Hello, I switched from one big index to a stub file with many indexes and am running into a problem. I have a tool to fetch a random document via: get_doccount, pick a random id up to get_doccount, get_document with that id. After changing to a stub file this fails. Is there a nice way to get a random document from a stub file? Kind regards, Felix Ostmann
2005 Jul 20
1
docid type redefine
Hello all. I need to redefine a docid type (and all dependent types) like this: typedef unsigned long long docid; I think it would be enough to edit "include/xapian/types.h", but it isn't so. 1) I've added : string om_tostring(unsigned long long val) { CONVERT_TO_STRING("%llu") } in common/utils.{h,cc} 2) In include/enquire.h (line 438) I've found the
2005 Jun 29
2
Sort by docid
Hello, I wonder if there is a way to cause Xapian to order a result set purely by docid. In other words, once the result set has been determined, I'd like the results to be returned to me ordered by their docid, as opposed to by their match relevance. The problem at hand is that I'm building a search engine for a mailing list and I would like to return matches sorted by date; ordering by
2011 May 30
1
How to check docid
I have a bit of code (Python) to delete a number of documents: for f in Flist: xapian_store.delete_document(f.pri_key) in which I am using a unique primary key from an SQL database as the docid for the Xapian database. The problem I have is that some of the documents may not have been created - so I get an error. Now I could just ignore the error (try-recover), but what would be the
2020 Feb 19
2
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote:
> On Sat, Feb 08, 2020 at 06:04:42PM +0000, Eric Wong wrote:
> > Olly Betts <olly at survex.com> wrote:
> > > On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote:
> > > > Or would I fiddle with wdf_inc for all ->index_text and ->add_term
> > > > calls on a per-DB basis?
> > > >
2020 Feb 19
0
prioritizing aggregated DBs
On Wed, Feb 19, 2020 at 10:23:09AM +0000, Eric Wong wrote:
> Btw, is there a way to quickly figure out which sub-DB a retrieved
> document or mset item belongs to?

Yes: https://trac.xapian.org/wiki/FAQ/MultiDatabaseDocumentID

1.4.12 added a Database::size() method which reports the number of shards - for older versions you have to keep track of that yourself (which needs a little care as
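The translation that the linked FAQ page covers can be sketched in plain arithmetic. This is a minimal sketch, assuming the interleaved numbering Xapian uses for combined databases, where document d of shard i (counting shards from 0) gets the combined docid (d - 1) * n + i + 1 for n shards. The function names `split_docid` and `join_docid` are hypothetical helpers, not part of the Xapian API; `n_shards` is the value that `Database::size()` reports in 1.4.12 and later.

```python
from typing import Tuple


def split_docid(combined_docid: int, n_shards: int) -> Tuple[int, int]:
    """Map a docid from a combined database to (shard_index, docid_in_shard).

    Hypothetical helper: assumes docids are interleaved across shards,
    with shard_index counted from 0 in the order the shards were added.
    """
    if combined_docid < 1 or n_shards < 1:
        raise ValueError("docids and shard counts start at 1")
    shard_index = (combined_docid - 1) % n_shards
    docid_in_shard = (combined_docid - 1) // n_shards + 1
    return shard_index, docid_in_shard


def join_docid(shard_index: int, docid_in_shard: int, n_shards: int) -> int:
    """Inverse of split_docid: the docid as seen through the combined database."""
    if not 0 <= shard_index < n_shards or docid_in_shard < 1:
        raise ValueError("shard_index or docid_in_shard out of range")
    return (docid_in_shard - 1) * n_shards + shard_index + 1


if __name__ == "__main__":
    # With two shards, combined docids alternate between the shards.
    for docid in range(1, 5):
        print(docid, split_docid(docid, 2))
```

Under this assumption, a document numbered around 950,000 in one of two shards shows up with a combined docid of roughly twice that, and the same shard-local docid in two different shards never collides in the combined database.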
2020 Feb 21
1
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote:
> On Wed, Feb 19, 2020 at 10:23:09AM +0000, Eric Wong wrote:
> > Btw, is there a way to quickly figure out which sub-DB a retrieved
> > document or mset item belongs to?
>
> Yes: https://trac.xapian.org/wiki/FAQ/MultiDatabaseDocumentID
>
> 1.4.12 added a Database::size() method which reports the number of
> shards - for
2018 Jan 03
2
Storing the documents text: data record or value ?
Hi, Following the Recoll snippets generation performance problem caused by the new positions list storage scheme in Xapian 1.4, I am experimenting with generating snippets from the complete document text stored in the index. This increases the index size much less than I would have expected (around 10-15% apparently with my home directory data), which is good news obviously. I have tried
2016 Jan 08
2
Strange index consistency issue
Hi, A Recoll user is reporting an index corruption problem. In general, index corruption happens from time to time with Recoll, because of crashes, reboots, misc Recoll bugs, etc. The strange thing here is that xapian-check does not seem to detect anything. In a nutshell, some document numbers seem to point to a data blackhole: the docids are returned when searching for the file/doc unique
2009 Nov 13
1
Using xapian for general indexed storage
Hello, Two questions about using Xapian as a gdbm stand-in for an auxiliary database: - I am currently using single-term documents having the key as a single term, and the (small) associated data chunk stored in the document data record. Is this still the right way to do it? - There was an answer on the mailing list two years ago, saying that storing a few megabytes in the document
2016 Apr 12
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Mon, Apr 11, 2016 at 09:54:36AM +0200, Jean-Francois Dockes wrote:
> > The question which remains for me is if I should run xapian-compact
> > after an initial indexing operation. I guess that this depends on the
> > amount of expected updates and that there is no easy answer ?
>
> I think it's not obvious whether it's a good plan
2016 Apr 11
2
Xapian 1.3.5 snapshot performance and index size
Olly Betts writes:
> On Sun, Apr 10, 2016 at 04:47:01PM +0200, Jean-Francois Dockes wrote:
> > Some might notice the 50% index size increase. Excessive index size is
> > already one relatively rare, but recurring complaint. Except if I did
> > something wrong: I'm actually quite surprised by it.
>
> Did you try compacting the resulting databases?
> >
2007 Jun 19
2
Deleted documents not deleted
I seem to be seeing cases where I call db.delete_document(somedocid) with no error, then flush() and delete the database object, but the document is still there after process exit. The write lock is normally deleted, so it appears that the database close finished normally. If I then call delete_document(somedocid) from another command/process, this time it goes away. I've been seeing
2018 Jan 04
0
Storing the documents text: data record or value ?
On Wed, Jan 03, 2018 at 04:18:18PM +0100, Jean-Francois Dockes wrote:
> Seen from the outside, it would appear to make sense to use values, so that
> code which needs to access the data record but not the full document text
> does not pay a performance penalty.
>
> I am wondering if there are other arguments for using either method ?

I wouldn't recommend using a value to store
2013 Apr 26
1
Compiling Xapian within a Cocoa project
Hi, folks. I don't know much about C++, so please excuse my newbie question: I just tried to include Xapian in a dummy Cocoa project, ie created a new project and added the libxapian.a file installed via MacPorts. When I include xapian.h in an Objective-C++ file (either mm or h), the compilation fails with the following message: ================= In file included from
2010 Apr 26
8
[LLVMdev] Proposal for a new LLVM concurrency memory model
Hi all, Chandler, Owen, and I have written up a proposal for a new memory model and atomic intrinsics in LLVM, which will make it possible to support Java and the upcoming C++0x standard. The proposed changes to the LangRef are at <http://docs.google.com/View?docID=ddb4mhxz_22dz5g98dd&revision=_latest>, and a rationale for some of the more surprising changes is at
2007 Feb 09
1
Fetching document content by Q term in Python
Hello, I'd like to be able to retrieve the index's stored copy of the document text and tried the following: terms = self.db.allterms() terms.skip_to('Q' + uri.encode('utf-8')) term = terms.next() doc = self.db.get_document(term[1]) print doc.get_data() I just wildly guessed that [1] was the docid, but of course it isn't. So the question is, how do I
2011 Sep 04
5
Ranking and term proximity
Hi, I was reading an article recently about how google ranks results (among many other things of course) based on the proximity of the search terms in the source documents. In addition, the position of the search terms in the search query string itself is also taken into consideration when determining how important each term is. Does Xapian do something similar - at least for the first part?