Displaying 20 results from an estimated 1000 matches similar to: "QueryParser prefixing terms when stemming?"
2008 Mar 27
2
Proper noun stemming
Hi All
I was wondering if anyone had a solution for the following problem.
I use QueryParser to stem my documents before adding them to a
database. During the stemming process I would like to find a way of
keeping proper nouns that span two or more words together as a phrase.
For example "New York" or "Gordon Brown" or "Prime Minister" get split
up. I see
2008 Sep 16
0
[PATCH] Add set_max_wildcard_expansion method to the queryparser.
---
search-xapian/XS/QueryParser.xs | 6 ++++++
search-xapian/Xapian/QueryParser.pm | 7 +++++++
xapian-core/include/xapian/queryparser.h | 3 +++
xapian-core/queryparser/queryparser.cc | 6 ++++++
xapian-core/queryparser/queryparser.lemony | 9 +++++++++
xapian-core/queryparser/queryparser_internal.h | 4 +++-
6 files changed, 34
2008 Sep 16
1
Some Questions From the beginner of Xapian
Dear guys,
I am a beginner with Xapian. While reading the documentation, I ran into the following questions.
(1) I see the Xapian::Document has a method
void add_value (Xapian::valueno valueno, const std::string &value)
What's the purpose of this method? A document is related to its terms, but what is a value for?
(2) The add_posting method adds a term to a document.
void
2006 May 17
3
QueryParser lowercase / uppercase and stemming
Hello.
There are several problems I couldn't find a solution for.
1. QueryParser does not perform stemming
I am working with PHP5 and use the xapian wrapper written by Daniel Ménard
I build a query using parseQuery. Output of the parsed query shows that
terms are not stemmed, although a stemmer is set (see code snippet)
# create a XapianDatabase object to search in
$db = new
2010 Nov 15
4
Stopword addition and stemming
Hi,
Two questions which I'm unsure about:
Stemming: I've turned on stemming, etc, but how can I confirm that
it's being used in searches? What should I look/search for?
Stopwords: I'm trying out xapian on a regional dataset (e.g. searching
data from a *.co.us TLD). I've noticed that searching for [bob
co.us] results in *very* slow search times (tens of seconds), since it
2010 Nov 02
1
How to make QueryParser select entire word like "H.O.T"
Hi,
I'm using xapian to build my search engine, but I have run into a problem.
The code snippet is like:
----------------------Code begin-------------------------------------------------------------
Xapian::QueryParser qp;
qp.add_prefix("Singer", "S");
Xapian::Query query = qp.parse_query("Singer:s.h.e",
2013 Sep 02
2
having trouble with prefixes
I've got a small test database setup with one record.
$ delve -r 1 -V /tmp/1/
Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
The terms were added with lines like this:
doc.add_term(string("P:") + path);
Problem is, I can't seem to
2011 May 27
1
Does OP_NEAR works with stemming?
Hi All,
I used the OP_NEAR operator with QueryParser, and when I searched for "apple store" in my own collection, the query was parsed as "Zappl:(pos=1) NEAR 11 Zstore:(pos=2)" but retrieved nothing. However, if I type in "Apple Store", the query is parsed as Xapian::Query((apple:(pos=1) NEAR 11 store:(pos=2))) and some results are shown. I'm not sure whether
2015 Jul 26
1
Get term from document by position
mple (see attachment).
>
> Attachments get stripped out by the mailing list, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>.
>
> Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to
2007 Nov 14
1
Problem indexing text with spelling enabled in Perl
Hi All,
I'm using the TermGenerator::index_text() on version 1.0.4 with the
FLAG_SPELLING turned on, because the new spelling suggestion stuff
seems awesome, but I'm getting a segv.
(gdb) bt
#0 0xb7ae153c in Xapian::WritableDatabase::add_spelling
(this=0xa553988, word=@0xbff97724, freqinc=1) at ./include/xapian/
base.h:154
#1 0xb7becf47 in
2008 Apr 02
1
Using special characters in query terms
Hi,
I would like to search for filenames in a xapian database.
For now my query for "foo-bar.po" turns into the following:
Xapian::Query((foo:(pos=1) PHRASE 3 bar:(pos=2) PHRASE 3 po:(pos=3)))
This query is successful, if I used the term generator to tokenize "foo-bar.po"
during indexing.
The problem is: this workaround makes it impossible to distinguish between
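A rough way to see why "foo-bar.po" ends up as three positional terms is to split the string on non-word characters. The sketch below is plain Python and not Xapian's tokenizer; `split_terms` is a hypothetical helper, and the real QueryParser and TermGenerator apply further rules that this ignores.

```python
import re

def split_terms(text):
    """Split text into lowercase word-character runs with 1-based positions.

    Simplified stand-in for a search tokenizer (assumption: punctuation
    such as '-' and '.' always separates terms).
    """
    words = re.findall(r"\w+", text.lower())
    return [(word, pos) for pos, word in enumerate(words, start=1)]

terms = split_terms("foo-bar.po")
print(terms)
```

Because the separators are discarded, "foo-bar.po" and "foo bar po" produce the same term list under this simplification.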
2007 Mar 28
2
Moving indextext.cc into core.
One of the items on the ToDo list for version 1.0 at
http://wiki.xapian.org/TodoFor1_2e0#preview is:
"Rework Omega's indextext.cc as a xapian-core "TextSplitter" class."
I've been wondering about this for a while now. Currently, we have the
Query Parser in Xapian core, but no text processing. Clearly, it makes
sense to have a "text splitter" class in
2017 Feb 08
1
searching for " in phrase and other special chars
Hello,
I'm reading xapian-core/docs/queryparser.rst and haven't been
able to find a way to escape " (double-quote) inside quoted
phrases.
Is this possible?
I'm also wondering if searching for other special characters,
such as a literal '*', is possible without triggering a wildcard
match. It would be helpful for some source code searches.
Thanks!
2011 Sep 14
1
Integrated Chinese tokenizer SCWS in xapian-core
Xapian is an excellent open source search engine library, but there is no native support for Chinese word segmentation in queryparser and termgenerator.
Therefore, I modified a small amount of source code to integrate the SCWS tokenizer, which is also open source and developed by myself.
Anyone can obtain the patch from the URL below. After patching, Xapian::QueryParser::parse_query and
2007 Dec 17
1
Crashes with spelling enabled and perl.
Hi Guys,
Here's a simple test case that causes a segfault with the perl
bindings patched to enable spelling correction:
use strict;
use warnings;
use Search::Xapian;
my $db = Search::Xapian::WritableDatabase->new("test.db",
Search::Xapian::DB_CREATE_OR_OPEN);
if (!defined($db)) {
die("Failed to open xapian_database: $!");
}
my $indexer =
2007 Oct 19
1
Re: [Xapian-commits] 9476: trunk/xapian-core/ trunk/xapian-core/include/xapian/ trunk/xapian-core/queryparser/ trunk/xapian-core/tests/
olly wrote:
> SVN root: svn://svn.xapian.org/xapian
> Changes by: olly
> Revision: 9476
> Date: 2007-10-19 03:47:11 +0100 (Fri, 19 Oct 2007)
>
> Log message (14 lines):
> include/xapian/queryparser.h,queryparser/queryparser.cc,
> queryparser/queryparser.lemony,queryparser/queryparser_internal.h,
> tests/queryparsertest.cc: Since calling
2007 Jun 28
1
TermGenerator and SimpleStopper
Hi,
I'm using SimpleStopper with TermGenerator in a Python indexing
script, in an attempt to keep my index size down (currently 30K per
doc, and I have 200 million docs to index, which I think implies
6TB.) However, unprefixed (positional?) terms are not affected by
the stopper, though Z-prefixed terms are.
I assume this is intentional for phrase queries, but I need to reduce
my
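The size estimate in the message above can be checked with a few lines of arithmetic. This is a sketch that assumes "30K" means 30,000 bytes per document and that index size grows linearly with document count; `estimate_index_bytes` is a name made up for illustration.

```python
def estimate_index_bytes(num_docs, bytes_per_doc):
    """Return the projected index size in bytes, assuming linear growth."""
    return num_docs * bytes_per_doc

NUM_DOCS = 200_000_000   # 200 million documents (from the message)
BYTES_PER_DOC = 30_000   # "30K per doc", read as 30,000 bytes (assumption)

total_bytes = estimate_index_bytes(NUM_DOCS, BYTES_PER_DOC)
total_tb = total_bytes / 10**12  # decimal terabytes
print(f"{total_tb:.1f} TB")
```

Under those assumptions the result agrees with the 6TB figure quoted in the message.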
2008 Mar 12
1
how can i use stopwords?
Hi,
I do not understand the stopword function...
I've set the termgenerator like this:
$self->{'Stemmer'} = new Search::Xapian::Stem(german2);
$self->{'Stopper'} = new Search::Xapian::SimpleStopper();
$self->{'TermGenerator'} = new Search::Xapian::TermGenerator;
$self->{'TermGenerator'}->set_stemmer( $self->{'Stemmer'} );
2007 Jun 12
1
Empty results OMEGA with XAPIAN 1.0.1
Hi,
I configured XAPIAN 1.0.1 and OMEGA 1.0.1 on my development machine
(first removed the old ones). I recreated my databases (both quartz
and flint) and tried to run original queries against the databases
created by the new versions.
I'm getting empty result sets from OMEGA. If I use the delve tool I
actually see that the records are created fine. No log files are
written as far as I
2012 Nov 26
1
Word missing after stemmed with Norwegian in Search::Xapian::TermGenerator
Hi all Xapian-devel,
Gist: https://gist.github.com/10d2222d8bffe8d7631d
I'm using Xapian-TermGenerator to extract Norwegian sentences to vsm
(vector space model) using TermGenerator. But when I test generating vsm
from 'Truet med å stevne misfornøyd PC-kunde - PC-leverandøren Asus likte
svært dårlig kundens misfornøyde leserbrev.' It doesn't return an 'asus' result
in vsm.