Displaying 20 results from an estimated 9000 matches similar to: "Question from a new user of xapian: query term weight"
2018 Aug 09
2
Boosted fields search in Python
Hi,
I'm using Xapian in Python2. I'm trying to replicate an analysis that
somebody else performed in Lucene. To do that I need to do a search for a
multi-word query in which particular fields are boosted - preferably at
query time. That is, given a query like "the cat is lying on the mat" (with
an OR operator, ignoring word positions but with stemming and stop words
removed),
2007 Jan 27
4
Xapian vs Lucene
Hello,
It's probably quite troll-risky to put a title like this, but did anyone
take the trouble to compare Lucene to Xapian and make a list of
differences?
As I told the list at the end of last year, I'm going to have to
integrate an indexing/search engine in the coming weeks or months. It
will be integrated into Dokeos, an open-source e-learning application in
PHP, and at the moment we
2011 Mar 29
2
Participation in GSOC
Hi,
I'm Michael, I would like to participate in this year's Google Summer of
Code, and I picked Xapian as the project to code for.
Before writing a full proposal, I want to get in contact with the
community, introduce myself, and discuss my ideas for the
contribution to Xapian.
First of all I'd like to talk about my motivation.
I'm currently working on a webapp
2013 Oct 30
2
Lucene 3.6.2 backend for xapian (#25)
[Replying to xapian-devel, as I think a wider audience would be useful]
On Mon, Oct 21, 2013 at 11:24:51PM +0800, jiangwen jiang wrote:
> yes, it's less efficient. A Lucene database has multiple segments, and each
> segment can be treated as an independent database. The same term may exist in >=
> 1 segments.
Sorry for taking a while to respond - I've been both busy and mulling
this
2009 Apr 15
2
does xapian have these disadvantages?
Hi all,
I have read an article commenting on Lucene.
http://www.jroller.com/melix/entry/why_lucene_isn_t_that
I understand Lucene better after reading this article, especially its disadvantages and limitations.
So I would like to ask: does Xapian have similar disadvantages?
Any advice would be appreciated.
baijl
2015 Mar 05
3
Dovecot Full Text Search results in SolrException: undefined field text [SERIOUS]
Hello,
My dovecot constantly runs into this error.
I want to fix this one last time. I am tired of troubleshooting, so
please, someone give me a lasting and proper solution for this error. I
think it's a problem with the dovecot-solr module.
Please tell me how to find the root of this problem with Dovecot.
There is a problem with the body search text field. It always
fails (with no result), other
2013 Jun 16
3
Backend for Lucene format indexes-How to get doclength
Hi, all:
I have written a demo patch for a backend for Lucene-format indexes. The Lucene
version is 3.6.2.
http://lucene.apache.org/core/3_6_2/fileformats.html
For now, this demo patch supports only the basic features of Lucene. Compound
files (.cfs/.cfe), term vectors (.tvx/.tvd/.tvf), and
deleted documents (.del) are not supported, and the skip list in .fdx is not supported
either.
example/quest.cc is used to test this demo.
2007 Jun 05
7
Chinese, Japanese, Korean Tokenizer.
Hi,
I am looking for a Chinese, Japanese and Korean tokenizer that can
be used to tokenize terms for CJK languages. I am not very familiar
with these languages, but I think that they can contain one
or more words in a single symbol, which makes it more difficult to tokenize
text into searchable terms.
Lucene has CJK Tokenizer ... and I am looking around if there is some
open source that we
2006 Sep 26
3
Scoring/similarity, biased towards small fields?
Lucene, and perhaps most search engines, are biased towards small fields
with little content (where thus the term frequency is higher). Lucene
has the option to define a custom (Similarity) class to calculate the
similarity between two fields (custom calculation of lengthNorm and tf)
in different documents. But how do I do this in ferret? (I know to boost
a field, but this is not what I
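The bias described in this excerpt can be illustrated with a small sketch. It assumes the defaults of Lucene's classic TF-IDF similarity as commonly documented (tf as the square root of the term frequency, lengthNorm as one over the square root of the field length); the function names and the example numbers are hypothetical and are not taken from the post or from ferret's API.

```python
import math

def tf(freq):
    # Assumed classic default: square root of the raw term frequency.
    return math.sqrt(freq)

def length_norm(num_terms):
    # Assumed classic default: shorter fields get a larger normalization factor.
    return 1.0 / math.sqrt(num_terms)

def field_score(freq, num_terms):
    # Per-field contribution, ignoring idf and boosts for simplicity.
    return tf(freq) * length_norm(num_terms)

# One match in a 4-term field versus four matches in a 400-term field.
short_field = field_score(freq=1, num_terms=4)
long_field = field_score(freq=4, num_terms=400)
print(short_field, long_field)
```

Under these assumptions the short field scores higher even though the long field contains the term more often, which is the effect a custom lengthNorm or tf would be used to counteract.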
2019 Jan 12
2
Solr -> Xapian ?
Thank you
Now, for the results
I see the member of fts_result is:
ARRAY_TYPE(seq_range) definite_uids;
I have the UIDs as an array of uint32_t *
How do I put my UIDs into this "definite_uids"? Obviously this is not a
simple array/pointer. How do I say something similar to
result->definite_uids[1]=my_uid ?
On 2019-01-12 10:25, Timo Sirainen wrote:
> On 11 Jan 2019, at 21.23,
2015 May 03
2
Make Xapian accept all characters
Hello everyone,
I'm using Xapian at work (PHP bindings) and I have to make it accept '##' as a term to index. We have a layer on top of xapian, but as far as I can tell, Xapian's QueryParser is removing them from the query. So, if I search for just '##' I get an empty query, after Xapian parsed it. I've seen the flags this class accepts, but I can't do what I want
2011 Aug 02
2
Positive experiences with Xapian
Hi Guys,
I just wanted to take a moment to give some positive feedback regarding my
experiences with Xapian recently.
I've been doing a fair amount of research into search engines recently, as
we have some fairly specific requirements with what we're attempting to do
with them. Long story short, after a few weeks of playing around with just
about everything under the sun (or at least,
2005 Feb 24
2
other than default labels in lattice plot
Dear all
I solved a problem of customised labels on strips and boxes in bwplot
by this construction.
> bbb <- bwplot(zavoj ~ typmleti | pu)
> bbb$condlevels$pu <- c("Povrchová úprava", "Bez PU")
> bbb$x.limits <- c("Mleto", "Mleto a sítováno", "Nemleto")
> bbb
but I wonder if some other easy option exists. Let's say something
2019 Jan 12
2
Solr -> Xapian ?
I somehow fixed the folder issue. (seems some unix rights after too many
tests)
Getting back on the "fts_results" structure:
I am trying:
i_array_init(&(result->definite_uids),r->size);
i_array_init(&(result->maybe_uids),0);
uint32_t uid;
for(i=0;i<r->size;i++)
{
try
{
2014 Dec 05
2
Inbound call from sip peer to internal webrtc peer fails while internal sip-webrtc calls work
Hello,
I'd appreciate your comments on the following problem I'm having, please
forgive me if this is something obvious, I've been scratching my head on
this for a while:
I have Asterisk+Kamailio setup where I'm currently testing inbound calls
from outside. I have both webrtc and sip clients, where webrtc peers are
defined according to sip.js instructions (
2014 Sep 08
1
Asterisk removes ice lines in sdp when calling between webrtc clients
Hello,
I have a problem with a call between 2 webrtc clients. Asterisk removes the
ice-related lines from the sdp when it sends the INVITE out, and the called
webrtc client rejects the INVITE due to the missing ice lines. Both webrtc
clients are defined exactly the same way, same values in all fields except
the number of the peer.
There's probably something I've changed that causes this
2007 Nov 08
1
QueryParser : some remarks
Hi to all,
First, I would like to say a big thank you for the work which was done
on my 'wish bug' to allow mapping one field to multiple prefixes
(http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=93).
That's great!
I have upgraded to 1.0.4 and I am revisiting my code, replacing the php
query parser I wrote with Xapian's one.
Everything works well, but I have some
2018 Nov 30
1
Xapian Benchmark results
Hi,
I am currently trying to benchmark a multithreaded xapian implementation on
a chameleon baremetal instance written in C++. My workload is a 3 Gig
Wikipedia XML dump consisting of ~286 files of different sizes. My results
are showing me that indexing on xapian is an order of magnitude faster than
my lucene and lucene plusplus implementations. This is a result that I did
not expect. Just want to