similar to: Moving indextext.cc into core.

Displaying 20 results from an estimated 8000 matches similar to: "Moving indextext.cc into core."

2004 Jun 28
2
[Fwd: Irix install of omega fails.]
OK, I'll try again. Thanks, Jim. -------------- next part -------------- An embedded message was scrubbed... From: Jim Lynch <jwl@sgi.com> Subject: Irix install of omega fails. Date: Mon, 28 Jun 2004 14:16:46 -0400 Size: 2057 Url: http://lists.tartarus.org/pipermail/xapian-discuss/attachments/20040628/212669c1/Irixinstallofomegafails.eml
2005 Dec 30
1
Query Parser, filenames and compound words
When I submit a filename to the query parser, it breaks it up. Example: /home/user/file_name.ext becomes Xapian::Query((home:(pos=1) PHRASE 5 user:(pos=2) PHRASE 5 file:(pos=3) PHRASE 5 name:(pos=4) PHRASE 5 ext:(pos=5))) which does not find the document. If I do a single-term query without the query parser, then I find the document. The Query Parser also breaks up hyphenated terms
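The splitting described in this post can be illustrated with a minimal pure-Python sketch. It is not Xapian's tokenizer: the function name `split_into_phrase_terms` and the rule "split on every non-alphanumeric character" are assumptions chosen only to reproduce the term/position pairs shown in the post.

```python
import re


def split_into_phrase_terms(text):
    """Split text on every non-alphanumeric character.

    Returns (term, position) pairs with 1-based positions, mirroring the
    home/user/file/name/ext breakdown quoted in the post. This is an
    illustrative assumption, not Xapian's actual parsing rule.
    """
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [(word, pos) for pos, word in enumerate(words, start=1)]


terms = split_into_phrase_terms("/home/user/file_name.ext")
print(terms)
# [('home', 1), ('user', 2), ('file', 3), ('name', 4), ('ext', 5)]
```

If the document was indexed with the whole path as one term, none of these five terms exists in the index, which would be consistent with the phrase query finding nothing.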
2006 May 26
1
Unicode troubles
Hi, I've tried to follow all the helpful tips I've found on the mailing list and I've applied these two UTF-8 patches: http://article.gmane.org/gmane.comp.search.xapian.general/2324 http://article.gmane.org/gmane.comp.search.xapian.general/1927 Now the QueryParser works as I want it to and creates the terms correctly. But sadly I can't find any documents. If I do this: $ quest
2009 Apr 06
2
omindex => Unknown extension
Hi all, I'm having a recurrent problem with Omega's indexing. When I run omindex, it sometimes fails to recognize the extension of some files (.doc, .pdf) and skips them. In the same run, omindex is otherwise perfectly able to index other files with the same extensions. The reason is not clear, but it should occur before it selects a content converter since, for example, if I manually run
2018 Aug 09
2
Boosted fields search in Python
Hi, I'm using Xapian in Python2. I'm trying to replicate an analysis that somebody else performed in Lucene. To do that I need to do a search for a multi-word query in which particular fields are boosted - preferably at query time. That is, given a query like "the cat is lying on the mat" (with an OR operator, ignoring word positions but with stemming and stop words removed),
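The idea of query-time field boosting with an OR query can be sketched in plain Python. This is only an illustration under stated assumptions: it is neither Xapian's nor Lucene's scoring, stemming is left out, and the names `boosted_or_score`, `STOPWORDS`, and the sample fields and boost values are hypothetical.

```python
def boosted_or_score(query_terms, doc_fields, boosts):
    """Score a document as sum over fields of boost * matching query terms.

    Word positions are ignored (OR semantics); a field without an explicit
    boost counts with weight 1.0.
    """
    score = 0.0
    unique_terms = set(query_terms)
    for field, text in doc_fields.items():
        field_terms = set(text.lower().split())
        matches = sum(1 for term in unique_terms if term in field_terms)
        score += boosts.get(field, 1.0) * matches
    return score


# Hypothetical stop-word list and document, for illustration only.
STOPWORDS = {"the", "is", "on"}
query = [w for w in "the cat is lying on the mat".split() if w not in STOPWORDS]
doc = {"title": "Cat naps", "body": "a cat lying on a mat"}

print(boosted_or_score(query, doc, {"title": 2.0, "body": 1.0}))  # 5.0
```

Here the title contributes 2.0 (one match, boost 2.0) and the body 3.0 (three matches, boost 1.0).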
2007 Jul 04
3
Stemming problem
Does anyone know if Xapian stemming supports the suffix -er? I tried -s and -ing and both work, but not -er.
2010 Nov 15
4
Stopword addition and stemming
Hi, Two questions which I'm unsure about: Stemming: I've turned on stemming, etc, but how can I confirm that it's being used in searches? What should I look/search for? Stopwords: I'm trying out xapian on a regional dataset (searching data from a *.co.us TLD, eg) . I've noticed that searching for [bob co.us] results in *very* slow search times (tens of seconds), since it
2014 Sep 20
2
Help with xapian
Hi, I am interested in doing some development work for xapian. I have built the xapian core library on my system and also tried my hand at a few features of xapian. For the past few days I have been going through the code base of xapian. I am trying to understand how the features are implemented (I tried looking into the code for Porter stemming and list encoding) but I've not had any
2016 Sep 19
2
Pull requests: CJK words and Snippet generator
Olly, sorry for my delayed reply. Am Mo, 12. Sep 2016, um 05:32, schrieb Olly Betts: > On Wed, Sep 07, 2016 at 02:30:16PM +0200, rsto at paranoia.at wrote: > > On Tue, Sep 6, 2016, at 09:16, Olly Betts wrote: > > > I think my main concerns are about efficiency [...] > > For the proposed term coverage, the implementation looks up and inserts > > terms into a map. That
2011 Sep 23
2
understanding stemming and synonyms
I am working with version 1.2.7 and want to use stemming and synonyms. I use the Perl bindings and have run into some problems. First of all: the Perl bindings don't allow the QueryParser a third argument when calling parse_query! So I cannot set a default prefix (which is perhaps the solution to my problem, but more on that later). I have a simple test case: 3 documents, every document only has one word:
2011 Apr 09
1
Pretty URLs for omega?
Hello :-) How can the default omega URL be prettified? http://<host_ID>/cgi-bin/omega is working fine, giving us all omega's default CGI parameters. Now we want multiple databases which could be accessed using http://<host_ID>/cgi-bin/omega?DB=<index_ID> but this is starting to get messy. It will get messier when we start to customise templates with
2010 Apr 11
1
A Hebrew stemmer based on libhspell
Hello. I'm interested in creating a Hebrew stemmer to use with Xapian. Hebrew is a complicated language to stem, as it uses the semitic "root" system, rather than prefixes and suffixes, and has many irregularities in accidence (morphology). Fortunately, two bright fellows from the Technion University in Israel have already created a Hebrew morphological analyzer as part of their
2008 Jul 29
1
xapian-omega runfilter.cc patch
Hi, The following patch for runfilter.cc is needed for building xapian-omega on FreeBSD:

--- runfilter.cc.orig 2008-07-03 21:16:54.000000000 +0200
+++ runfilter.cc 2008-07-03 21:18:48.000000000 +0200
@@ -25,6 +25,7 @@
 #include "safeerrno.h"
 #include <sys/types.h>
 #include <stdio.h>
+#include <signal.h>
 #include "safefcntl.h"
 #ifdef HAVE_SYS_TIME_H
2009 Jun 05
2
Blacklist stemming
Hi, I need to modify the stemming for a couple of words (a blacklist) and for all the other to use the usual snowball stemmer. The "natural" way of doing it would be to derive from Stem and override operator ()... but I am using *python-bindings*. Would this be possible? If not I have two other solutions in mind: - add a custom stemmer to Xapian - write custom index & search
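One way to get this behaviour without subclassing is to wrap the stemmer in a plain function, sketched below. The assumption here is that blacklisted words should be left unstemmed; `make_blacklist_stemmer` and `toy_stem` are hypothetical names, and `toy_stem` is only a stand-in for a real Snowball stemmer, not part of any Xapian binding.

```python
def make_blacklist_stemmer(base_stem, blacklist):
    """Return a stem function that bypasses base_stem for blacklisted words."""
    blocked = {word.lower() for word in blacklist}

    def stem(word):
        # Blacklisted words are returned unchanged; all others are stemmed.
        if word.lower() in blocked:
            return word
        return base_stem(word)

    return stem


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips one trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 1 else word


stem = make_blacklist_stemmer(toy_stem, ["news"])
print(stem("cats"), stem("news"))  # cat news
```

The same wrapper would have to be applied at both index time and search time, otherwise indexed terms and query terms would no longer match.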
2007 Mar 29
1
stemtest failing with romanian
On Tuesday, I replaced the romanian1 and romanian2 stemmers in Xapian-core with Martin's new romanian stemmer. At the time, I also updated the stemming test data (by re-generating the output file using snowball's "stemwords" utility), and I clearly remember re-running the testsuite at the time and checking that all tests passed. Now, when I run make check, stemtest fails
2006 May 17
3
QueryParser lowercase / uppercase and stemming
Hello. There are several problems I couldn't find a solution for. 1. QueryParser does not perform stemming. I am working with PHP5 and use the xapian wrapper written by Daniel Ménard. I build a query using parseQuery. The output of the parsed query shows that terms are not stemmed, although a stemmer is set (see code snippet). # create a XapianDatabase object to search in $db = new
2014 Sep 29
2
Help with xapian
Hi, I have started getting the hang of the xapian codebase. I think I would like to try my hand at the letor module of xapian. Could you please suggest some free data set for the training and testing of letor features? I am not able to get the INEX data set from anywhere (the one mentioned by Parth Gupta in his GSoC 2011 project). Regards Karthik On Mon, Sep 22, 2014 at 4:23 PM, Olly Betts
2010 Jul 13
1
Czech stemming
Hello, I just found the Xapian project when looking for an indexing engine in Ruby and was quite impressed. Is there any chance of Czech stemming? I found that it is already written in Java as part of Lucene here: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/cz/CzechStemmer.java?view=markup Sadly, I have no experience with C++, but
2013 Jan 10
1
Add an example to the community page and contribute more code
Hi guys. I've finished an example indexer which acts like a grep replacement for a file. It indexes each line containing a proper noun in a given text file. The line containing the proper noun will be displayed upon searching for that noun. I would like to add it to the community code examples. I'm planning to write more examples which demonstrate some advanced features of Xapian along similar
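The grep-like indexing idea in this post can be sketched with a small in-memory index. This is not the poster's code and does not use Xapian; the proper-noun heuristic (a capitalised word that is not the first word of its line) and the name `index_proper_nouns` are assumptions made for illustration.

```python
import re
from collections import defaultdict


def index_proper_nouns(lines):
    """Map each lower-cased proper noun to the 1-based line numbers it occurs on.

    Heuristic (an assumption): a proper noun is a capitalised word that is
    not the first word of its line.
    """
    index = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        words = re.findall(r"[A-Za-z]+", line)
        for word in words[1:]:  # the first word is capitalised regardless
            if word[0].isupper() and line_no not in index[word.lower()]:
                index[word.lower()].append(line_no)
    return index


lines = ["We met Alice in Paris.", "the weather was fine.", "Later Alice left."]
index = index_proper_nouns(lines)

# Look up a noun and print the matching lines, grep-style.
for line_no in index["alice"]:
    print(line_no, lines[line_no - 1])
```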
2006 Mar 29
1
Using boolean terms in PHP bindings
OK, I'm indexing my data with scriptindex. I want to be able to restrict the search by the category field. Do I need to do anything to the data itself? Like, literally prefix it with the characters "XC"? Below are my index script for scriptindex and my PHP code...

document_id : field=ref unique=Q boolean=Q
search_id : field=document_id index=S
document_title : field=title