similar to: overlapping docids when searching on multiple databases?

Displaying 20 results from an estimated 700 matches similar to: "overlapping docids when searching on multiple databases?"

2010 Oct 14
1
xapian-check on "crashed" index?
Hi. Is xapian-check aware of the uncommitted data that could be sitting in a Xapian index if the indexer has crashed during indexing? Could errors be falsely reported by xapian-check in this situation? -- Jesper
2010 Dec 01
2
Are stub databases still supported in 1.0.21?
I have the following setup: Databases: /var/lib/xapian-omega/data/db1 /var/lib/xapian-omega/data/db2 /var/lib/xapian-omega/data/db3 Stub: /var/lib/xapian-omega/data/default The stub file "default" is a text file that contains the following: auto /var/lib/xapian-omega/data/db1 auto /var/lib/xapian-omega/data/db2 auto /var/lib/xapian-omega/data/db3 Using the following returns nothing:
2011 Mar 08
1
MSet order
Hello, I defined a weighting scheme to simulate a kind of "euclidean" distance. To test it, I used a database with 1000 documents. If I run: enquire.set_weighting_scheme(MyWeight()); Xapian::MSet matches = enquire.get_mset(0, 1000); I have a correct list of results. But if I run Xapian::MSet matches = enquire.get_mset(0, 10); I don't have the top-10 results. If I run Xapian::MSet
2011 Feb 18
1
Is it possible to reset the parameters in BM25 each time a new query enters?
Hi guys, I'm trying to improve the search results of our collection by tuning the parameters in the BM25 weighting schema. Since our collection includes several databases, such as for pictures, websites, etc., I would like to use different values of the same schema to calculate the weights. Yet, rebuilding each time after the change was done to the head file seems not an optimal approach and
2010 Oct 28
1
hyphens in words + NEAR + 3 terms + AND_MAYBE => crash
Probably an uncaught malformed query - the following form of search queries causes a crash for me (core 1.2.3, Perl API, 64bit Debian Lenny, self-compiled): x-y NEAR test NEAR test The first term can be anything with a hyphen in it but word characters at the beginning and end ("3--3" will do). The other 2 terms can be anything. "test NEAR x-y NEAR test" will not cause a
2009 Apr 29
2
if condition doesn't evaluate to True/False
Hi friends, Please help me with this bug. *Bug in my code:* In this variable sub_grp_whr_cls_data[sbgrp_no,1] I store the where clause. Every subgroup has a where condition linked with it. Database1 Where clause was not found for a particular subgroup, sub_grp_whr_cls_data[sbgrp_no,1] value was NULL So the condition (*sub_grp_whr_cls_data[sbgrp_no,1]=="NULL" ||
2004 Dec 21
1
Search::Xapian add_database'd search results are odd?
Sorry if this is the wrong forum to discuss Search::Xapian issues -- this just seems like the best place.. Anyways, I've been testing out using $db->add_database() when searching, and it seems like the docids I'm getting out of it are incorrect, almost as though they're "double" what they should be (numerically)... the docids that exist should be around 950,000 and
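The "doubled" docids described in this thread are consistent with how Xapian numbers documents when several databases are searched together: the docids of the sub-databases are interleaved. A minimal sketch of that arithmetic, assuming the interleaving scheme described in the Xapian FAQ on multi-database document IDs; `combined_docid` is a hypothetical helper name, not a Xapian API:

```python
def combined_docid(sub_docid, db_index, n_dbs):
    """Map a docid inside one sub-database to its docid in the combined view.

    db_index is 0-based (order of the add_database() calls); docids are 1-based.
    Hypothetical helper for illustration only, not part of Xapian.
    """
    if sub_docid < 1:
        raise ValueError("docids start at 1")
    if not 0 <= db_index < n_dbs:
        raise ValueError("db_index out of range")
    return (sub_docid - 1) * n_dbs + db_index + 1


# With two databases, a docid near 950,000 comes out at roughly double.
print(combined_docid(950_000, 0, 2))  # 1899999
print(combined_docid(950_000, 1, 2))  # 1900000
```

With a single database the mapping is the identity, which is why the effect only shows up once a second database is added.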
2015 Mar 11
2
stub-file and get_doccount
Hello, I switched from one big index to a stub file with many indexes and am running into a problem. I have a tool to fetch a random document via: get_doccount; random id up to get_doccount; get_document with that id. After changing to the stub file this fails. Is there a nice way to get a random document from a stub file? MfG Felix Ostmann
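One possible reason for the failure: in a combined database the docids of the sub-databases are interleaved, so valid docids can be larger than the document count and some ids at or below the count may not exist. A minimal sketch of one workaround, rejection sampling up to the highest docid; `random_existing_docid` and the `exists` callback are hypothetical names. With the real bindings, `last_docid` would presumably come from `get_lastdocid()` and `exists` would wrap `get_document()` and catch `DocNotFoundError`:

```python
import random


def random_existing_docid(last_docid, exists, rng=None, max_tries=10_000):
    """Pick a random docid in 1..last_docid for which exists(docid) is true.

    exists is a caller-supplied predicate (hypothetical; not a Xapian API).
    Raises LookupError if no existing docid is found within max_tries draws.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        candidate = rng.randint(1, last_docid)  # inclusive on both ends
        if exists(candidate):
            return candidate
    raise LookupError("no existing docid found; database may be very sparse")


# Simulated stub of two sub-databases holding 3 and 1 documents:
# the combined docids are 1, 3, 5 (first db) and 2 (second db).
live = {1, 2, 3, 5}  # 4 documents, but the highest docid is 5
picked = random_existing_docid(5, live.__contains__, random.Random(0))
print(picked in live)  # True
```

In this simulation, drawing an id up to the document count (4) could hit the missing docid 4 and could never return docid 5. Rejection sampling gets slow if the sub-databases differ greatly in size, since most candidate ids are then gaps.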
2013 Mar 26
1
Xapian wiki: typo in docid to sub-db translation?
On the Xapian wiki page: http://trac.xapian.org/wiki/FAQ/MultiDatabaseDocumentID It seems to me that: subdatabase_number = docid_combined % number_of_databases; Should read: subdatabase_number = (docid_combined - 1) % number_of_databases; Otherwise I'm seriously confused ... Cheers, jf
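The correction proposed in this post can be checked with a short sketch. It assumes 1-based docids interleaved across sub-databases, with sub-databases numbered from 0; `split_docid` is a hypothetical helper, not a Xapian function:

```python
from typing import Tuple


def split_docid(docid_combined: int, number_of_databases: int) -> Tuple[int, int]:
    """Return (subdatabase_number, docid_in_subdatabase) for a combined docid.

    Assumes 1-based docids and 0-based sub-database numbers.
    Hypothetical helper for illustration only.
    """
    if docid_combined < 1 or number_of_databases < 1:
        raise ValueError("docid and database count must be positive")
    subdatabase_number = (docid_combined - 1) % number_of_databases
    docid_in_subdatabase = (docid_combined - 1) // number_of_databases + 1
    return subdatabase_number, docid_in_subdatabase


# Two databases: combined docids 1, 2, 3, 4 alternate between db 0 and db 1.
for docid in (1, 2, 3, 4):
    print(docid, split_docid(docid, 2))
# 1 (0, 1)
# 2 (1, 1)
# 3 (0, 2)
# 4 (1, 2)
```

Without the `- 1`, combined docid 2 would give `2 % 2 == 0` and be attributed to the first database rather than the second, which is the off-by-one the post points out.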
2006 Nov 17
5
configure a rails app for multiple databases
Hello Rails community, I cannot seem to find via Google what I had hoped would be a simple issue. On a single DB system (currently, postgres 8.1.4), I have two databases, each containing multiple tables. I would like to configure my app and database.yml to recognize these two databases. What is the correct config for the database.yml? Is it something like: > production: > adapter:
2011 Feb 09
2
critical feature from version 1 not migrated to version 2 = authentication configuration database per IP
not possible make operation with dovecot version 2.x as was possible in version 1.x: requisites description: connect to dovecot service on IP1 - dovecot must serve users that related to domain1 located in database1 connect to dovecot service on IP2 - dovecot must serve users that related to domain2 located in database2 login must be with username that form not as "user at domain" but
2012 Mar 31
1
Project: Posting list encoding improvements
Hi Xapianers: My name is Weixian Zhou, Computer Science student of University at Buffalo, State University of New York. I am interested in the project of posting list encoding improvements and weighting schemes. I have some questions toward them. 1) After read the comments in brass_postlist.cc, I am still not very clear about the detailed structure of postings list. If you can provide some simple
2023 May 03
1
manual flushing thresholds for deletes?
On Wed, May 03, 2023 at 12:38:15PM +0000, Eric Wong wrote: > Olly Betts <olly at survex.com> wrote: > > This will also effectively ignore boolean terms, assuming you're giving > > them wdf of 0 (because $3 here is the collection frequency, which is > > sum(wdf(term)) over all documents). > > Should boolean terms be ignored when estimating flushing >
2014 May 10
2
some trouble when devising skiplist
Hi, I ran into some trouble, which I describe in my journal: http://trac.xapian.org/wiki/GSoC2014/Posting%20list%20encoding%20improvements/Journal#May10 The corresponding code is in my git. Could you give me some help? ------------------ Shangtong Zhang, Second Year Undergraduate, School of Computer Science, Fudan University, China.
2023 May 03
1
manual flushing thresholds for deletes?
Olly Betts <olly at survex.com> wrote: > On Mon, Mar 27, 2023 at 11:22:09AM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > 10 seems too long. You want the mean word length weighted by frequency > > > of occurrence. For English that's typically around 5 characters, which > > > is 5 bytes. If we go for +1 that's:
2009 May 17
2
Chow test(1960)/Structural change test
Hi, A question on something which normally should be easy! I perform a linear regression using the lm function: > reg1 <- lm(a ~ b+c+d, data = database1) Then I try to perform the Chow (1960) test (structural change test) on my regression. I know the breakpoint date. I try the following code as described in the “Examples” section of the “strucchange” package: > sctest(reg1,
2013 Jun 19
2
Compact databases and removing stale records at the same time
On Wed, Jun 19, 2013, at 03:49 PM, Olly Betts wrote: > On Wed, Jun 19, 2013 at 01:29:16PM +1000, Bron Gondwana wrote: > > The advantage of compact - it runs approximately 8 times as fast (we > > are CPU limited in each case - writing to tmpfs first, then rsyncing > > to the destination) and it takes approximately 75% of the space of a > > fresh database with maximum
2013 Jan 09
1
problem adding curve/abline
Hey, I'm stuck on something I already did before (just a different kind of database), and whatever I try, it doesn't work anymore. So thanks for your help. Here's how my data approximately looks like: year season replicate size freq weight 2000 summer ch1 6 1 45 2000 summer ch1
2020 Feb 19
2
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote: > On Sat, Feb 08, 2020 at 06:04:42PM +0000, Eric Wong wrote: > > Olly Betts <olly at survex.com> wrote: > > > On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote: > > > > Or would I fiddle with wdf_inc for all ->index_text and ->add_term > > > > calls on a per-DB basis? > > > >
2017 Jun 05
2
Logging the click data
Hi James, > ID: some identifier for each query > QUERY: text of the query (when the query is run) > URLs: every URL displayed (or alternatively, the Xapian docid — this > might be easier) > OFFSET: otherwise you'll have difficulty coping with result pages other > than the first page (when this happens, the query ID should probably > remain the same, and when you aggregate