Displaying 20 results from an estimated 30000 matches similar to: "add_term"
2020 Feb 19
2
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote:
> On Sat, Feb 08, 2020 at 06:04:42PM +0000, Eric Wong wrote:
> > Olly Betts <olly at survex.com> wrote:
> > > On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote:
> > > > Or would I fiddle with wdf_inc for all ->index_text and ->add_term
> > > > calls on a per-DB basis?
> > >
>
2008 Jan 15
7
PHP indexing, what's the PHP method for indexscript
Currently I have the following indexscript:
pid : unique=Q boolean=Q field=pid
postdate : field=startdate
author_name: unhtml boolean=XAUTHORNAME field=author
author_id: boolean=XAUTHORID field=authorid
url : field=url
sample : weight=1 index field=sample
How can I create the same indexing using PHP?
With this, I can get a searchable index, but I have no idea how to set the fields, so that I
2012 Nov 21
1
about index speed of xapian
hi,
I use Xapian to index a text file of 268 MB. I treat each line as a document, and each line has two fields, like 13445511 | 111115151. There are 10000000 records, and XAPIAN_FLUSH_THRESHOLD is set to 1000000. Indexing the file takes 1026544 ms, which is much slower than Lucene, which indexes about 40000 records per second.
code:
try
{
Xapian::WritableDatabase
2013 Sep 02
2
having trouble with prefixes
I've got a small test database setup with one record.
$ delve -r 1 -V /tmp/1/
Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
The terms were added with lines like this:
doc.add_term(string("P:") + path);
Problem is, I can't seem to
2013 Sep 22
2
How to filter search result with query with has white space.
Hello,
#include <iostream>
#include <string>
#include <xapian.h>
struct document {
std::string title;
std::string content;
std::string url;
};
void indexData(document d) {
try {
Xapian::WritableDatabase db("/Users/ramesh/Desktop/xapian",
Xapian::DB_CREATE_OR_OPEN);
Xapian::TermGenerator indexer;
Xapian::Stem
2010 Jan 28
3
Problem getting Xapian working with Burmese
On Fri, Aug 21, 2009 at 02:44:44PM +0200, emmanuel at engelhart.org wrote:
>> I want to update my request.
>> Is my question bad formulated? too trivial? ... or maybe pretty
>> complicated/unclear?
>
>I think nobody answered as it was hard to follow your example because
>the Burmese characters seem to have been mangled (at least the message I
>received wasn't
2016 Jul 24
3
Xapian 1.4.0 released
On Fri, Jul 22, 2016 at 07:19:43PM -0700, Kevin Duraj wrote:
> I would like to propose to change the following code while indexing a
> term that is larger than 245 characters and then crashing and aborting
> the entire index, we could rather truncate the term to 245 characters
> and continue with indexing.
Kevin -- I wonder what others are currently doing when this comes up
(or if
2020 Feb 08
2
prioritizing aggregated DBs
Olly Betts <olly at survex.com> wrote:
> On Fri, Feb 07, 2020 at 09:33:08PM +0000, Eric Wong wrote:
> > Hey all, I've been using ->add_database for a few years
> > to tie sharded DBs together and it works great.
> >
> > Now, I want to be able to search across several DBs
> > which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB.
> >
2011 Jul 20
1
Phrase search problem
Hi,
I'm experiencing problems when doing phrase searches with adjacent
repeated terms. Example:
If I search for 'curtain curtain' and there are documents that match
the query, they aren't returned. But if I search for 'curtain nice
curtain' and there are documents that match this query, it works OK.
Attached is a Python program that shows the problem. I tried
2010 Feb 02
1
How to use a custom stemmer from Python bindings?
Hi,
I'm using Xapian bindings for Python in my project. How could I use a
custom stemmer instead of the included one (Snowball)? The one I'm
looking at right now is Hunspell (http://hunspell.sourceforge.net/)
which has Python bindings (http://code.google.com/p/pyhunspell/).
Thanks in advance,
Eugene
2010 Dec 01
2
Are stub databases still supported in 1.0.21?
I have the following setup:
Databases:
/var/lib/xapian-omega/data/db1
/var/lib/xapian-omega/data/db2
/var/lib/xapian-omega/data/db3
Stub:
/var/lib/xapian-omega/data/default
The stub file "default" is a text file that contains the following:
auto /var/lib/xapian-omega/data/db1
auto /var/lib/xapian-omega/data/db2
auto /var/lib/xapian-omega/data/db3
Using the following returns nothing:
2020 Feb 07
2
prioritizing aggregated DBs
Hey all, I've been using ->add_database for a few years
to tie sharded DBs together and it works great.
Now, I want to be able to search across several DBs
which aren't sharded, say: linux-DB, glibc-DB, freebsd-DB.
I want to search for something across all of them, but
prioritize results to favor one or some of those DBs over
others. Is there a way to do that without reindexing?
Or
2012 Jun 04
1
Search not finding queries with stop words.
I have a search in perl that looks a bit like:
my $qp = new Search::Xapian::QueryParser();
$qp->set_stemmer(new Search::Xapian::Stem("english"));
$qp->set_stemming_strategy(STEM_SOME);
$qp->set_default_op($defaultop);
...
my $par = $qp->parse_query($query);
my $enq = $xDatabase->enquire( $par );
and in the db create script:
my $stopper =
2014 Mar 26
3
about sort_by_value
Hello, I have found that sort_by_value is very slow.
With 16800 results, returning the first 10 with sorting takes about 25ms.
Without sorting, returning 10 takes only about 0.3ms.
How can I make the sort faster?
2010 Jun 09
1
TermGenerator incorrectly tokenizes German text which contains special characters
Dear Xapian users,
I try to index some German text with Xapian using the xapian_php bindings. I
run Apache 2.2 on Windows using PHP 5.2.13 with the pre build xapian
bindings from Flax:
Xapian Support enabled Xapian
Compiled Version @PACKAGE_VERSION@
Xapian Linked Version 1.2.0
The problem is that after indexing text which contains special characters
like ?, ?, ? and ?, using
2004 Sep 15
1
Objects in PHP4
I've been looking at the PHP4 bindings, wondering why we don't have
"proper" objects.
Digging back through mail and CVS logs, it looks like I updated the
SWIG invocation for PHP from using "-shadow" to using "-noproxy" so
that it worked with newer versions of SWIG. But that's wrong as the
switches have opposite meanings (and what was "-shadow"
2011 Jul 05
2
Usage of DateRangeValueProcessor
Hi,
I have been playing with Djapian recently, and am running into a problem
where I have the following setup:
class SomeIndexer(Indexer):
fields = [
'actual_date_of_arrival',
]
tags = [
('ata', 'actual_date_of_arrival'),
]
Some model instances exist with dates like 2011/07/04 and 2011/07/04 and my
search should yield all model instances
2014 Nov 30
3
Contributing to Xapian
Hi Olly
I will try to work on :
http://trac.xapian.org/wiki/GSoCProjectIdeas#Project:LearningtoRank
I will be taking a Machine Learning class the next semester and I hope that
this project will help me supplement my learning in Machine Learning and
also gain a bit of knowledge in IR.
If you can give me ideas on how to get around with the code for LTR
project, it will be awesome. I can look at
2010 Jun 19
2
Xapian 1.0.21 released
I've uploaded Xapian 1.0.21 (including Search::Xapian 1.0.21.0), which
as usual you can download from:
http://xapian.org/download
The most notable changes in this release are:
Xapian-core API:
* Xapian::Stem now recognises "nb" and "nn" as additional codes for the
Norwegian stemmer.
* Xapian::QueryParser now correctly parses a wildcarded term in between two
other