similar to: Test for the end of PostingIterator in perl?

Displaying 20 results from an estimated 400 matches similar to: "Test for the end of PostingIterator in perl?"

2023 Mar 28
1
uncaught C++ exception from Perl Search::Xapian XS?
On Mon, Mar 27, 2023 at 11:46:04AM +0000, Eric Wong wrote: > I'm using Search::Xapian XS from Debian stable and I'm getting what > appears to be an unwrapped C++ exception: > > terminate called after throwing an instance of 'Xapian::DatabaseModifiedError' > > Using `eval' from Perl5 doesn't seem effective in catching it. > > I'm using
2010 Jan 16
1
PHP XapianTermIterator/XapianPositionIterator usage
Hello again, thanks to Peter for the previous response. I've been digging around trying to find sample usage of XapianTermIterator/XapianPositionIterator in PHP. The idea is to code up a test case in PHP to perform snippet extraction (with a possible view to coding a pecl extension in C). I found a C++ sample, but that wasn't much help. I must be dense this morning though, since I
2023 Mar 27
1
uncaught C++ exception from Perl Search::Xapian XS?
I'm using Search::Xapian XS from Debian stable and I'm getting what appears to be an unwrapped C++ exception: terminate called after throwing an instance of 'Xapian::DatabaseModifiedError' Using `eval' from Perl5 doesn't seem effective in catching it. I'm using postlist_begin, postlist_end and ++ to iterate a PositionIterator, and reading XS/*Iterator.xs, I see the
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2016 May 09
1
Given a document, how do you get its ID? (perl bindings)
I am writing an indexer that will crawl our web site. Following the recommendation here: https://trac.xapian.org/wiki/FAQ/UniqueIds I'm using the URL as the unique ID for each document. I see how to get a document from the xapian database if I know its URL, but what I need is also to be able to find out the URL from the document. Does this mean I need to store the URL in a value in
2009 Jan 27
1
Segmentation fault in MSetIterator get_weight
Hi, I'm using Xapian with C# and Mono and I'm getting a segfault in get_weight. When I print the index variable, the value is clearly too high. I think something is writing over it. Do you have any idea how I could trace where the segmentation fault begins? Thanks, -- Yann
2016 May 16
2
Weighting recent results
I was thinking about this some more: Is there a reason I can't just weight by some function of recency at indexing time? $weight = get_weight_based_on_recency(...); $tg->index_text($txt,$weight); If I wanted to allow the user the option of searching either in recency-weighted mode or not, I could index each document into 2 different databases, one with and one without. This avoids
2016 May 03
2
Weighting recent results
On 5/2/2016 9:03 PM, Olly Betts wrote: > On Fri, Apr 22, 2016 at 12:23:15PM -0400, Alex Aminoff wrote: >> I did some digging and found a thread from 2011 talking about how to >> subclass Xapian::PostingSource in order to incorporate the date or >> recency of a document in its weighting: >> >> http://thread.gmane.org/gmane.comp.search.xapian.general/8849/focus=8856
2012 Jan 20
3
get_docid???
my $mset = $enq->get_mset($nstart,$nrecords); for(my $mit=$mset->begin(); $mit != $mset->end();$mit++) { my $doc = $mit->get_document(); my $dat = $doc->get_data(); my $id = $doc->get_docid(); } [Fri Jan 20 10:35:06 2012] newmail.cgi: Can't locate auto/Search/Xapian/Document/get_docid.al in @INC (@INC contains: /etc/perl
2006 Sep 26
1
extended ACLs and Samba
Hi, I have a Project share with many subfolders. In these subfolders many users have dedicated access rights to single files. Sometimes a project member changes and the new one should get the same rights as the old one. But here is my problem. With the following line I can easily change the owner of files: find /samba/project -user oldid -exec chown newid {} ";" But I don't know
2013 Jun 19
2
Compact databases and removing stale records at the same time
I'm trying to compact (or at least merge) multiple databases, while stripping search records which are no longer required. Backstory: I've inherited the Cyrus IMAPd xapian-based search code from Greg Banks when he left Opera. One of the unfinished parts was removing expunged emails from the search database. We moved from having a single search database to supporting multiple
2017 Sep 28
1
Weighting the author of a doc when that term can also appear as a frequent term in other docs
We have a corpus of academic papers. Sometimes it happens that there is an academic controversy and one paper is a response or rebuttal to another paper. The name of the author of the first paper may appear many times in the second paper. So in light of this, how should we set our weight on the author field? Here is an example: http://www.nber.org/papers/w11215  in which the term
2016 Apr 22
2
Weighting recent results
I did some digging and found a thread from 2011 talking about how to subclass Xapian::PostingSource in order to incorporate the date or recency of a document in its weighting: http://thread.gmane.org/gmane.comp.search.xapian.general/8849/focus=8856 As in that thread, I want to be clear that I don't want to sort by date, but rather incorporate date information into the score by which I
2013 Jan 18
2
A smart way to use "$" in data frame
Hello all, I have a data frame dataa: newdate newstate newid newbalance newaccounts 1 31DEC2001 AR 1 1170 61 2 31DEC2001 VA 2 4565 54 3 31DEC2001 WA 3 2726 35 4 31DEC2001 AR 3 2700 35 The following gives me the balance of state AR:
2007 Sep 30
1
Perl example of using termitrator?
I'm having trouble translating from C++ to perl objects. From the TermIterator class, it looks like to get the set of terms in a document you might have C++ code like: Enquire::TermIterator termIt = enquire->get_matching_terms_begin(id); for(; termIt != enquire->get_matching_terms_end(id); termIt++) { string term = *termIt; } Or something similar. However when I attempt to translate that
2013 Jan 17
1
FASTER Search
I am suffering from slow search performance with Xapian. I am using Xapian to index about 150,000,000 documents. It is implemented in C++, and searching is not that fast: e.g. a query of about 20 terms takes 2 secs on average. For searching, I follow these steps: 1. construct a QueryParser for a given string 2. parse the query to get a Xapian::Query
2017 Sep 12
2
perl bindings to Xapian::Query
QueryParser is great, but I would like to make a query myself, so I can filter results by a specified value (in this case restricting by epoch time after a certain value) My code looks like this, and compiles, and appears like it should work according to the perl source:     my $query = $qp->parse_query($querystr);     if ($datefilter) {         my $filterepoch = time() - ($datefilter
2010 Sep 26
5
Network booting FreeBSD with gpxelinx almost works (fwd)
We have been network booting FreeBSD for some time with pxeboot. But now we would like to have a menu of OSes to boot and got the idea somewhere that gpxelinux could do that for us. We copied gpxelinux.0 from the syslinux-4.02 distribution and replaced pxeboot with "gpxelinux" in the dhcpd.conf file. Indeed with a configuration file in pxelinux.cfg like this: default freebsd
2004 Jul 07
3
fast NA elimination ?
dear R wizards: an operation I execute often is the deletion of all observations (in a matrix or data set) that have at least one NA. (I now need this operation for kde2d, because its internal quantile call complains; could this be considered a buglet?) usually, my data sets are small enough for speed not to matter, and there I do not care whether my method is pretty inefficient (ok, I
2004 Jun 20
4
if syntax
I ran into an interesting oddity of R, if (0) { print(1); } else { print(2); } is a syntax error, while if (0) { print(1); } else { print(2); } or if (0) { print(1); } else { print(2); } is not. I presume it has to do with the duality of the newline functioning as an end of command (;) character, though it still seems a bit odd, and it took me a while to figure out