thr3ads.net - similar to: "Question from a new user of xapian: query term weight"

Displaying 20 results from an estimated 9000 matches similar to: "Question from a new user of xapian: query term weight"

Boosted fields search in Python

2018 Aug 09

Hi, I'm using Xapian in Python2. I'm trying to replicate an analysis that somebody else performed in Lucene. To do that I need to do a search for a multi-word query in which particular fields are boosted - preferably at query time. That is, given a query like "the cat is lying on the mat" (with an OR operator, ignoring word positions but with stemming and stop words removed),

Xapian vs Lucene

2007 Jan 27

Hello, It's probably quite troll-risky to put a title like this, but did anyone take the trouble to compare Lucene to Xapian and make a list of differences? As I told the list at the end of last year, I'm going to have to integrate an indexing/search engine in the coming weeks or months. It will be integrated to Dokeos, an open-source e-learning application in PHP, and at the moment we

Participation in GSOC

2011 Mar 29

Hi, I'm Michael, I would like to participate in this year's Google Summer of Code, and I picked Xapian as the project to code for. Before writing a full proposal, I want to get in contact with the community, introduce myself, and discuss my ideas for contributing to Xapian. First of all I'd like to talk about my motivation. I'm currently working on a webapp

Lucene 3.6.2 backend for xapian (#25)

2013 Oct 30

[Replying to xapian-devel, as I think a wider audience would be useful]

On Mon, Oct 21, 2013 at 11:24:51PM +0800, jiangwen jiang wrote:
> yes, it's less efficient. Lucene database has multiple segments, each
> segment can treat as a independent database. The same term may exists in >=
> 1 segments.

Sorry for taking a while to respond - I've been both busy and mulling this

does xapian have these disadvantages?

2009 Apr 15

Hi all! I have read an article commenting on Lucene: http://www.jroller.com/melix/entry/why_lucene_isn_t_that I understand Lucene better after reading this article, especially its disadvantages and limitations. I would like to ask whether Xapian has similar disadvantages. Any advice would be appreciated. baijl

Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]

2015 Mar 05

Hello, My dovecot constantly runs into this error. I want to fix this one last time, I am tired of troubleshooting so please someone give me a lasting and proper solution for this error. I think it's a problem with the dovecot-solr module. Please tell me how to find the root of this problem with Dovecot. There is a problem with the body search text field. It always fails (with no result), other

Backend for Lucene format indexes-How to get doclength

2013 Jun 16

Hi, all: I have written a demo patch for Backend for Lucene format indexes, Lucene version is 3.6.2. http://lucene.apache.org/core/3_6_2/fileformats.html Now, this demo patch just supports the basic features in Lucene. Compound File (.cfs/.cfe), term vector (.tvx/.tvd/.tvf), and delete document (.del) are not supported; skip list in .fdx is not supported either. example/quest.cc is used to test this demo.

Chinese, Japanese, Korean Tokenizer.

2007 Jun 05

Hi, I am looking for a Chinese, Japanese and Korean tokenizer that can be used to tokenize terms for CJK languages. I am not very familiar with these languages, however I think that these languages contain one or more words in one symbol, which makes it more difficult to tokenize into searchable terms. Lucene has CJK Tokenizer ... and I am looking around if there is some open source that we

Scoring/similarity, biased towards small fields?

2006 Sep 26

Lucene, and perhaps most search engines, are biased towards small fields with little content (where thus the term frequency is higher). Lucene has the option to define a custom (Similarity) class to calculate the similarity between two fields (custom calculation of lengthNorm and tf) in different documents. But how do I do this in ferret? (I know to boost a field, but this is not what I

Solr -> Xapian ?

2019 Jan 12

Thank you. Now, for the results I see the member of fts_result is: ARRAY_TYPE(seq_range) definite_uids; I have the UID as an array of uint32_t *. How to put my UIDs into this "definite_uids"? Obviously this is not a simple array/pointer. How to say something similar to result->definite_uids[1]=my_uid ? On 2019-01-12 10:25, Timo Sirainen wrote: > On 11 Jan 2019, at 21.23,

Make Xapian accept all characters

2015 May 03

Hello everyone, I'm using Xapian at work (PHP bindings) and I have to make it accept '##' as a term to index. We have a layer on top of xapian, but as far as I can tell, Xapian's QueryParser is removing them from the query. So, if I search for just '##' I get an empty query, after Xapian parsed it. I've seen the flags this class accepts, but I can't do what I want

Positive experiences with Xapian

2011 Aug 02

Hi Guys, I just wanted to take a moment to give some positive feedback regarding my experiences with Xapian recently. I've been doing a fair amount of research into search engines recently, as we have some fairly specific requirements with what we're attempting to do with them. Long story short, after a few weeks of playing around with just about everything under the sun (or at least,

other than default labels in lattice plot

2005 Feb 24

Dear all I solved a problem of customised labels on strips and boxes in bwplot by this construction. > bbb <- bwplot(zavoj ~ typmleti | pu) > bbb$condlevels$pu <- c("Povrchová úprava", "Bez PU") > bbb$x.limits <- c("Mleto", "Mleto a sítováno", "Nemleto") > bbb but I wonder if some other easy option exist. Let say something

Solr -> Xapian ?

2019 Jan 12

I somehow fixed the folder issue. (seems some unix rights after too many tests) Getting back on the "fts_results" structure: I am trying: i_array_init(&(result->definite_uids),r->size); i_array_init(&(result->maybe_uids),0); uint32_t uid; for(i=0;i<r->size;i++) { try {

Inbound call from sip peer to internal webrtc peer fails while internal sip-webrtc calls work

2014 Dec 05

Hello, I'd appreciate your comments on the following problem I'm having, please forgive me if this is something obvious, I've been scratching my head on this for a while: I have Asterisk+Kamailio setup where I'm currently testing inbound calls from outside. I have both webrtc and sip clients, where webrtc peers are defined according to sip.js instructions (

Asterisk removes ice lines in sdp when calling between webrtc clients

2014 Sep 08

Hello, I have a problem with a call between 2 webrtc clients. Asterisk removes the ice-related lines from the sdp when it sends the INVITE out, and the called webrtc client rejects the INVITE due to the missing ice lines. Both webrtc clients are defined exactly the same way, same values in all fields except the number of the peer. There's probably something I've changed that causes this

QueryParser : some remarks

2007 Nov 08

Hi to all, First, I would like to say a big thank you for the work which was done on my 'wish bug' to allow mapping one field to multiple prefixes (http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=93). That's great! I have upgraded to 1.0.4 and I am revisiting my code, replacing the php query parser I wrote with Xapian's one. Everything works well, but I have some

Xapian Benchmark results

2018 Nov 30

Hi, I am currently trying to benchmark a multithreaded xapian implementation on a chameleon baremetal instance written in C++. My workload is a 3 Gig wikipedia xml dump consisting of ~286 files of different sizes. My results are showing me that indexing on xapian is an order of magnitude faster than my lucene and lucene plusplus implementations. This is a result that I did not expect. Just want to
