search for: docids

Displaying 20 results from an estimated 151 matches for "docids".

2004 Dec 21
1
Search::Xapian add_database'd search results are odd?
Sorry if this is the wrong forum to discuss Search::Xapian issues -- this just seems like the best place.. Anyways, I've been testing out using $db->add_database() when searching, and it seems like the docids I'm getting out of it are incorrect, almost as though they're "double" what they should be (numerically)... the docids that exist should be around 950,000 and 1000000 not around 1900000, etc... $xapiandirbase . '-11' and $xapiandirbase . '-10' both exist. qui...
2005 Jul 20
1
docid type redefine
Hello all. I need to redefine a docid type (and all dependent types) like this: typedef unsigned long long docid; I think it would be enough to edit "include/xapian/types.h", but it isn't so. 1) I've added : string om_tostring(unsigned long long val) { CONVERT_TO_STRING("%llu") } in common/utils.{h,cc} 2) In include/enquire.h (line 438) I've found the
2010 Oct 22
1
overlapping docids when searching on multiple databases?
Just a quick question - it seems to me that it's entirely possible to get overlapping docids when searching on multiple databases? For instance: open database1, add database2 to database1, search db1+db2. If docid 10 exists in both databases, is there any way of telling which database to retrieve the document from? /Per Jessen, Zürich
2005 Jun 29
2
Sort by docid
Hello, I wonder if there is a way to cause Xapian to order a result set purely by docid. In other words, once the result set has been determined, I'd like the results to be returned to me ordered by their docid, as opposed to by their match relevance. The problem at hand is that I'm building a search engine for a mailing list and I would like to return matches sorted by date; ordering by
2013 Apr 26
1
Compiling Xapian within a Cocoa project
Hi, folks. I don't know much about C++, so please excuse my newbie question: I just tried to include Xapian in a dummy Cocoa project, ie created a new project and added the libxapian.a file installed via MacPorts. When I include xapian.h in an Objective-C++ file (either mm or h), the compilation fails with the following message: ================= In file included from
2012 Mar 31
1
Project: Posting list encoding improvements
...ions toward them. 1) After read the comments in brass_postlist.cc, I am still not very clear about the detailed structure of postings list. If you can provide some simple examples/graphs will be very straightforward. 2) My instant idea to make list smaller: use gamma codes to encode the gap between docids instead of docids. Last question towards the project of weighting schemes: Do we need only to implement existing weighting scheme instead of coming up with new ideas? And our mission is to find a weighting scheme that could replace the default BM25 in Xapian? -- Weixian Zhou Department of Compu...
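The gap-encoding idea in this snippet can be sketched in a few lines. This is a minimal illustration of Elias gamma coding applied to docid gaps, not the encoding used by brass_postlist.cc or any other Xapian backend; the function names are hypothetical, and bit strings stand in for a real packed bit buffer. It assumes docids are positive and strictly increasing, so every gap is at least 1.

```python
def gamma_encode(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("gamma codes are defined for positive integers only")
    binary = bin(n)[2:]  # binary digits without the '0b' prefix
    # Prefix with (length - 1) zeros so the decoder knows how many bits follow.
    return "0" * (len(binary) - 1) + binary


def encode_postings(docids):
    """Gamma-encode the gaps between sorted, distinct, positive docids."""
    bits = []
    previous = 0
    for docid in docids:
        bits.append(gamma_encode(docid - previous))
        previous = docid
    return "".join(bits)


def decode_postings(bits):
    """Inverse of encode_postings: rebuild the docid list from the bit string."""
    docids = []
    pos = 0
    previous = 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # count the unary length prefix
            zeros += 1
            pos += 1
        gap = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        previous += gap
        docids.append(previous)
    return docids
```

Small gaps cost few bits (a gap of 1 is a single bit), which is why storing gaps rather than absolute docids helps when postings are dense.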
2023 May 03
1
manual flushing thresholds for deletes?
...ency for non-boolean terms and the term frequency for boolean terms, so that's: xapian-delve -avv1 .|tr -d A-Z|awk '{f = $3 ? $3 : $2; t += length($1)*f; n += f} END {print t/n}' > My Perl deletion code is something like: > > my $EST_LEN = 6; > ... > for my $docid (@docids) { > $TXN_BYTES -= $xdb->get_doclength($docid) * $EST_LEN; However you're using that estimate here, and the document length doesn't include boolean terms (it's sum(wdf) over the terms in the document), so including them in $EST_LEN seems wrong. For you doing so increases $EST_...
2010 Apr 26
8
[LLVMdev] Proposal for a new LLVM concurrency memory model
Hi all, Chandler, Owen, and I have written up a proposal for a new memory model and atomic intrinsics in LLVM, which will make it possible to support Java and the upcoming C++0x standard. The proposed changes to the LangRef are at <http://docs.google.com/View?docID=ddb4mhxz_22dz5g98dd&revision=_latest>, and a rationale for some of the more surprising changes is at
2007 Feb 09
1
Fetching document content by Q term in Python
Hello, I'd like to be able to retrieve the index's stored copy of the document text and tried the following: terms = self.db.allterms() terms.skip_to('Q' + uri.encode('utf-8')) term = terms.next() doc = self.db.get_document(term[1]) print doc.get_data() I just wildly guessed that [1] was the docid, but of course it isn't. So the question is, how do I
2011 May 30
1
How to check docid
I have a bit of code (Python) to delete a number of documents: for f in Flist: xapian_store.delete_document(f.pri_key) in which I am using a unique primary key from an SQL database as the docid for the Xapian database. The problem I have is that some of the documents may not have been created - so I get an error. Now I could just ignore the error (try-recover), but what would be the
2014 May 10
2
some trouble when devising skiplist
Hi, I was confronted with some trouble, I describe the trouble in my journal http://trac.xapian.org/wiki/GSoC2014/Posting%20list%20encoding%20improvements/Journal#May10 And corresponding code is in my git. Would you like to give me some help? ------------------ Shangtong Zhang,Second Year Undergraduate, School of Computer Science, Fudan University, China. -------------- next part
2013 Jan 18
2
Smartd Warning in logwatch report
I updated the machine that I use for a web/email server last night to 5.9, and I have this in the logwatch report this morning: QUOTE: --------------------- Smartd Begin ------------------------ Warnings: Device: /dev/sda [SAT], WARNING: There are known problems with these drives, - 1 Time(s) **Unmatched Entries** smartd 5.42 2011-10-20 r3458 [x86_64-linux-2.6.18-348.el5] (local
2023 May 03
1
manual flushing thresholds for deletes?
...: awk 'NR > 1 {t += length($1)*($3+1); n += ($3+1)} END {print t/n}' # (also added "NR > 1" to ignore the delve header line) Which gives me 6.00067, so rounding to 6 seems fine either way. My Perl deletion code is something like: my $EST_LEN = 6; ... for my $docid (@docids) { $TXN_BYTES -= $xdb->get_doclength($docid) * $EST_LEN; $xdb->delete_document($docid); if ($TXN_BYTES < 0) { # flush within txn $xdb->commit_transaction; $TXN_BYTES = 8000000; $xdb->begin_transaction; } } > > (that awk bit should be overflow-free) <snip&...
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2007 May 15
1
Document ID 0 is invalid... but not always...
Note: this is rather long and not very important and I don't want to prevent the team from releasing version 1.0, so go on reading only if you have too much free time !!! ;-) 0 is not a valid document ID, never, ever, but I just found a special case in which xapian will create a record and return 0 for the newly created record. In fact, I was "hacking", trying to store metadata
2013 Mar 26
1
Xapian wiki: typo in docid to sub-db translation?
On the Xapian wiki page: http://trac.xapian.org/wiki/FAQ/MultiDatabaseDocumentID It seems to me that: subdatabase_number = docid_combined % number_of_databases; Should read: subdatabase_number = (docid_combined - 1) % number_of_databases; Otherwise I'm seriously confused ... Cheers, jf
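The corrected formula proposed in this post can be checked with a small worked example. This is a sketch that assumes combined docids interleave the sub-databases (combined docid 1 is document 1 of sub-database 0, combined docid 2 is document 1 of sub-database 1, and so on). Only the modulo line comes from the post; the division step, the inverse mapping, and the function names are assumptions added for illustration.

```python
def split_combined_docid(docid_combined, number_of_databases):
    """Map a combined docid to (subdatabase_number, docid_in_subdatabase)."""
    if docid_combined < 1:
        raise ValueError("docids start at 1; 0 is not a valid docid")
    # Formula as proposed in the post (0-based sub-database index).
    subdatabase_number = (docid_combined - 1) % number_of_databases
    # Assumed companion formula for the docid inside that sub-database.
    docid_in_subdatabase = (docid_combined - 1) // number_of_databases + 1
    return subdatabase_number, docid_in_subdatabase


def combine_docid(subdatabase_number, docid_in_subdatabase, number_of_databases):
    """Assumed inverse mapping: rebuild the combined docid."""
    return (docid_in_subdatabase - 1) * number_of_databases + subdatabase_number + 1


if __name__ == "__main__":
    for combined in range(1, 7):
        print(combined, split_combined_docid(combined, 2))
```

Under this assumption, document 950000 in the second of two databases gets combined docid 1900000, which would be consistent with the "double" docids described in the add_database() result above.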
2003 Jan 08
7
ping from local to net
I am trying to ping between my local network and the Internet and I can't do it. In my policy I have: loc net ACCEPT info loc fw ACCEPT loc dmz ACCEPT info fw loc ACCEPT fw net ACCEPT info fw dmz
2011 Apr 08
2
[Weft QDA users] Shifty Markings - round 2
Good Afternoon, I'm new to the mailing list and was wondering if anyone could help me with my current headache. I saw in the previous posts that someone else has had the problem of shifty markings. I too am finding the text immediately above what I wrote appearing in the coding reports, even though the coding in the documents themselves remains as it should. The project involves
2017 Dec 18
2
How to get the serialise score returned in Xapian::KeyMaker->operator().
On Sat, Dec 16, 2017 at 10:11:40PM +0000, Olly Betts wrote: > Unfortunately the sort key isn't currently exposed via the public API. > It's available internally and it seems like it ought to be accessible > but there's no accessor method for it - I can add one but that won't > help for existing releases. I've added MSetIterator::get_sort_key() to master in
2013 Mar 08
2
Gsoc-2013
Hi, I am Chinmay Naik, an undergraduate in Computer Science at Bangalore Institute of Technology, Bangalore. I am an experienced programmer and good with C, C++, Python, Java, OpenGL and would love to participate in Gsoc-13. From the ideas listed, I am interested to work on the project "posting list encoding improvements". I am a newbie to Xapian but would like to get involved and get a