similar to: QueryParser prefixing terms when stemming?

2008 Mar 27
2
Proper noun stemming
Hi All, I was wondering if anyone had a solution for the following problem. I use QueryParser to stem my documents before adding them to a database. During the stemming process I would like to find a way of keeping proper nouns that span two or more words together as a phrase. For example "New York" or "Gordon Brown" or "Prime Minister" get split up. I see
2008 Sep 16
0
[PATCH] Add set_max_wildcard_expansion method to the queryparser.
--- search-xapian/XS/QueryParser.xs | 6 ++++++ search-xapian/Xapian/QueryParser.pm | 7 +++++++ xapian-core/include/xapian/queryparser.h | 3 +++ xapian-core/queryparser/queryparser.cc | 6 ++++++ xapian-core/queryparser/queryparser.lemony | 9 +++++++++ xapian-core/queryparser/queryparser_internal.h | 4 +++- 6 files changed, 34
2008 Sep 16
1
Some Questions From the beginner of Xapian
Dear guys: I am a beginner with Xapian, and while reading the documentation I ran into the following questions. (1) I see that Xapian::Document has a method void add_value (Xapian::valueno valueno, const std::string &value). What is the purpose of this method? A Document is related to its terms, but what is this for? (2) The add_posting method adds a term to a document. void
2010 Nov 02
1
How to make QueryParser select entire word like "H.O.T"
Hi, I'm using xapian to build my search engine, but met with a problem. The code snippet is like: ----------------------Code begin------------------------------------------------------------- Xapian::QueryParser qp; qp.add_prefix("Singer", "S"); Xapian::Query query = qp.parse_query("Singer:s.h.e",
2006 May 17
3
QueryParser lowercase / uppercase and stemming
Hello. There are several problems I couldn't find a solution for. 1. QueryParser does not perform stemming. I am working with PHP5 and use the xapian wrapper written by Daniel Ménard. I build a query using parseQuery. Output of the parsed query shows that terms are not stemmed, although a stemmer is set (see code snippet) # create a XapianDatabase object to search in $db = new
2010 Nov 15
4
Stopword addition and stemming
Hi, Two questions which I'm unsure about: Stemming: I've turned on stemming, etc, but how can I confirm that it's being used in searches? What should I look/search for? Stopwords: I'm trying out xapian on a regional dataset (searching data from a *.co.us TLD, e.g.). I've noticed that searching for [bob co.us] results in *very* slow search times (tens of seconds), since it
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All, I'm using the TermGenerator::index_text() on version 1.0.4 with the FLAG_SPELLING turned on, because the new spelling suggestion stuff seems awesome, but I'm getting a segv. (gdb) bt #0 0xb7ae153c in Xapian::WritableDatabase::add_spelling (this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/base.h:154 #1 0xb7becf47 in
2008 Apr 02
1
Using special characters in query terms
Hi, I would like to search for filenames in a xapian database. For now my query for "foo-bar.po" turns into the following: Xapian::Query((foo:(pos=1) PHRASE 3 bar:(pos=2) PHRASE 3 po:(pos=3))) This query is successful, if I used the term generator to tokenize "foo-bar.po" during indexing. The problem is: this workaround makes it impossible to distinguish between
2013 Sep 02
2
having trouble with prefixes
I've got a small test database setup with one record. $ delve -r 1 -V /tmp/1/ Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg The terms were added with lines like this: doc.add_term(string("P:") + path); Problem is, I can't seem to
2017 Feb 08
1
searching for " in phrase and other special chars
Hello, I'm reading xapian-core/docs/queryparser.rst and haven't been able to find a way to escape " (double-quote) inside quoted phrases. Is this possible? I'm also wondering if searching for other special characters, such as a literal '*', is possible without triggering a wildcard match. It would be helpful for some source code searches. Thanks!
2011 Sep 14
1
Integrated Chinese tokenizer SCWS in xapian-core
Xapian is an excellent open source search engine library, but it has no native support for Chinese word segmentation in queryparser and termgenerator. I therefore modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and was developed by me. Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and
2015 Jul 26
1
Get term from document by position
mple (see attachment). > > Attachments get stripped out by the mailing list, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>. > > Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to
2007 Dec 17
1
Crashes with spelling enabled and perl.
Hi Guys, Here's a simple test case that causes a segfault with the perl bindings patched to enable spelling correction: use strict; use warnings; use Search::Xapian; my $db = Search::Xapian::WritableDatabase->new("test.db", Search::Xapian::DB_CREATE_OR_OPEN); if (!defined($db)) { die("Failed to open xapian_database: $!"); } my $indexer =
2011 May 27
1
Does OP_NEAR work with stemming?
Hi All, I used the OP_NEAR operator for queryparser, and when I searched for "apple store" in my own collection, the query was parsed as "Zappl:(pos=1) NEAR 11 Zstore:(pos=2)" but retrieved nothing. However, if I type in "Apple Store", the query is parsed as Xapian::Query((apple:(pos=1) NEAR 11 store:(pos=2))) and some results are shown. I'm not sure whether
2007 Jun 28
1
TermGenerator and SimpleStopper
Hi, I'm using SimpleStopper with TermGenerator in a Python indexing script, in an attempt to keep my index size down (currently 30K per doc, and I have 200 million docs to index, which I think implies 6TB.) However, unprefixed (positional?) terms are not affected by the stopper, though Z-prefixed terms are. I assume this is intentional for phrase queries, but I need to reduce my
2012 Nov 26
1
Word missing after stemmed with Norwegian in Search::Xapian::TermGenerator
Hi all Xapian-devel, Gist: https://gist.github.com/10d2222d8bffe8d7631d I'm using Xapian-TermGenerator to extract Norwegian sentences to vsm (vector space model) using TermGenerator. But when I test generating vsm from 'Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus likte svært dårlig kundens misfornøyde leserbrev.' it doesn't return an 'asus' result in vsm.
2007 Mar 28
2
Moving indextext.cc into core.
One of the items on the ToDo list for version 1.0 at http://wiki.xapian.org/TodoFor1_2e0#preview is: "Rework Omega's indextext.cc as a xapian-core "TextSplitter" class." I've been wondering about this for a while now. Currently, we have the Query Parser in Xapian core, but no text processing. Clearly, it makes sense to have a "text splitter" class in
2012 Jan 05
1
Enhance synonyms feature of the query parser (patch included)
Very few people seem to be using synonyms in Xapian, and I recently found some problems with them. Normally, I think the synonym table should not contain any prefix info except 'Z'. For example, I have the following synonyms and prefix info: db.add_synonym("search", "find"); db.add_synonym("Zsearch", "Zfind");
2007 Oct 19
1
Re: [Xapian-commits] 9476: trunk/xapian-core/ trunk/xapian-core/include/xapian/ trunk/xapian-core/queryparser/ trunk/xapian-core/tests/
olly wrote: > SVN root: svn://svn.xapian.org/xapian > Changes by: olly > Revision: 9476 > Date: 2007-10-19 03:47:11 +0100 (Fri, 19 Oct 2007) > > Log message (14 lines): > include/xapian/queryparser.h,queryparser/queryparser.cc, > queryparser/queryparser.lemony,queryparser/queryparser_internal.h, > tests/queryparsertest.cc: Since calling
2016 Dec 29
2
Formulating Advanced Queries with Xapian-Omega
To Olly Betts: Thank you very much for any feedback. I apologise for this belated reply and also for the fact that the text of the previous posting appeared fragmented, due to its fixed chars/line format. With reference to: > Can, or could, one construct a query so that Omega (Xapian) can handle > this ? > > ... perhaps with some type of Regex ? > > It would seem