Displaying 20 results from an estimated 300 matches similar to: "get the title from the document"
2012 Jun 04
1
Search not finding queries with stop words.
I have a search in perl that looks a bit like:
my $qp = new Search::Xapian::QueryParser();
$qp->set_stemmer(new Search::Xapian::Stem("english"));
$qp->set_stemming_strategy(STEM_SOME);
$qp->set_default_op($defaultop);
...
my $par = $qp->parse_query($query);
my $enq = $xDatabase->enquire( $par );
and in the db create script:
my $stopper =
2008 Jan 15
7
PHP indexing, what's the PHP method for indexscript
Currently I have the following indexscript:
pid : unique=Q boolean=Q field=pid
postdate : field=startdate
author_name: unhtml boolean=XAUTHORNAME field=author
author_id: boolean=XAUTHORID field=authorid
url : field=url
sample : weight=1 index field=sample
How can I create the same indexing using PHP?
With this, I can get a searchable index, but I have no idea how to set the fields, so that I
2018 Nov 30
1
Xapian Benchmark results
Hi,
I am currently trying to benchmark a multithreaded xapian implementation on
a chameleon baremetal instance written in C++. My workload is a 3 Gig
wikipedia xml dump consisting of ~286 files of different sizes. My results
are showing me that indexing on xapian is an order of magnitude faster than
my lucene and lucene plusplus implementations. This is a result that I did
not expect. Just want to
2018 Mar 30
2
sorting large msets
Hello, is there a way to optimize sorting by certain values
for queries which return a huge amount of results?
For example, I just want a simple query that gives me the 200
most recent emails out of millions. The elapsed time for
get_mset increases as the number of documents ($n * 2000)
increases.
I suppose I could store a pre-sorted set using SQLite or
similar. Thanks in advance for any
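The SQLite idea mentioned in this post can be sketched as follows. This is only an illustration of keeping a side table ordered by date, not anything Xapian provides; the table name `emails`, the columns `docid` and `sent_at`, and both helper functions are hypothetical names chosen for the example.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory table of (docid, sent_at) pairs with an index on the date.

    `rows` is an iterable of (docid, unix_timestamp) tuples. The schema is an
    assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emails (docid INTEGER PRIMARY KEY, sent_at INTEGER NOT NULL)"
    )
    # The index keeps rows ordered by date, so the newest rows can be
    # read from it directly rather than by sorting the whole table.
    conn.execute("CREATE INDEX emails_by_time ON emails (sent_at DESC)")
    conn.executemany("INSERT INTO emails (docid, sent_at) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def most_recent(conn, limit=200):
    """Return the docids of the `limit` newest emails, newest first."""
    cur = conn.execute(
        "SELECT docid FROM emails ORDER BY sent_at DESC LIMIT ?", (limit,)
    )
    return [row[0] for row in cur]
```

The returned docids could then be used to fetch the matching documents from the search index; how well this performs against millions of rows would need to be measured.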
2018 Jun 21
0
Welcome to the "Xapian-discuss" mailing list
Please keep replies on the mailing list — more people can help (and benefit) that way :)
So OP_NEAR looks for its terms close to each other (hence "near"). The window is how far away they can be. Probably the easiest way to play with this is using the NEAR syntax in the query parser. So if you had a plain text document:
I am walking, always walking.
And index it in a very simple
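The idea of a proximity window can be sketched in plain Python using the example sentence above. This is not Xapian's implementation: the tokenizer, the helper names, and the convention that two terms match when their positions differ by less than `window` are all assumptions made for illustration, and no stemming is applied.

```python
import re


def term_positions(text):
    """Map each lowercased word to the list of its 1-based positions."""
    positions = {}
    for pos, word in enumerate(re.findall(r"\w+", text.lower()), start=1):
        positions.setdefault(word, []).append(pos)
    return positions


def near(positions, a, b, window):
    """Return True if some occurrence of `a` and some occurrence of `b`
    have positions that differ by less than `window` (in either order).

    The "less than window" boundary is an assumption of this sketch.
    """
    return any(
        abs(pa - pb) < window
        for pa in positions.get(a, [])
        for pb in positions.get(b, [])
    )
```

For "I am walking, always walking." the positions are i=1, am=2, walking=3 and 5, always=4, so `near(p, "am", "always", 3)` is true (distance 2) while `near(p, "i", "always", 3)` is false (distance 3).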
2011 Sep 21
2
Xapian-discuss Digest, Vol 88, Issue 9
Thanks that helped :).
I am still trying to cover add_value some more though since I seem to not
understand it totally.
I guess it is because I am used to Lucene and Sphinx and Solr and it appears
that Xapian seems to attach the type of value stored more on add_value. Like
for example I am still a bit confused on how slotno actually works and what
it actually is.
I think the main thing is
2013 Sep 22
2
How to filter search result with query with has white space.
Hello,
#include <iostream>
#include <string>
#include <xapian.h>
struct document{
std::string title;
std::string content;
std::string url;};
void indexData(document d) {
try {
Xapian::WritableDatabase db("/Users/ramesh/Desktop/xapian",
Xapian::DB_CREATE_OR_OPEN);
Xapian::TermGenerator indexer;
Xapian::Stem
2013 Sep 22
2
How to filter search result with query with has white space.
Hello,
#include <iostream>
#include <string>
#include <xapian.h>
struct document{
std::string title;
std::string content;
std::string url;};
void indexData(document d) {
try {
Xapian::WritableDatabase db("/Users/ramesh/Desktop/xapian",
Xapian::DB_CREATE_OR_OPEN);
Xapian::TermGenerator indexer;
Xapian::Stem
2015 Jul 26
1
Get term from document by position
mple (see attachment).
>
> Attachments get stripped out by the mailing list, so I've made a private gist of the two files here: <https://gist.github.com/jaylett/ce8455b37e2b84422346>.
>
> Actually, when I run it I get 0 matches, which would explain why you're just getting the start of the document. However if I adjust things (match the stemming strategy for TermGenerator to
2017 Dec 07
0
xapian 1.4 performance issue
On Thu, Dec 07, 2017 at 10:29:09AM +0100, Jean-Francois Dockes wrote:
> Recoll builds snippets by partially reconstructing documents out of index
> contents.
>
[...]
>
> The specific operation which has become slow is opening many term position
> lists, each quite short.
The difference will actually be chert vs glass, rather than 1.2 vs 1.4
as such (glass is the new backend in
2007 Nov 09
1
Your favorite desktop wifi sip hardphone ?
Hi,
Which is your favorite desktop wifi sip hardphone ?
I'm looking for something like
http://www.mitel.com/DocController?documentId=19401 which could be easily
moved from one meeting room to another.
(In this specific case, finding an electrical plug to power a large desktop
phone is seen as more relevant than finding a PoE Ethernet plug or using a
mobile handset.)
Which product would you
2011 Jul 12
1
Possible leaking records in dashboard db
I'm using puppet-dashboard 1.1.0-1 on Ubuntu. I remove old reports using this command:
nice -n +1 rake RAILS_ENV=production reports:prune upto=1 unit=mon
This seems to work fine, and the number of reports returned by this mysql query seems to drop by the proper amount:
select count(*) from reports;
Right now it returns a value of 12591. So far so good.
The problem is the
2010 Oct 24
1
Cannot index with dynamic spelling data (Perl/Search::Xapian)
This is my test case, what am I doing wrong? It seems that the API is used
incorrectly, but I cannot find the problem...
--- 8< ---
#!/usr/bin/perl
use Search::Xapian qw(:all);
use strict;
my $xa = new Search::Xapian::WritableDatabase ("/tmp/xapian",
DB_CREATE_OR_OVERWRITE);
my $indexer = Search::Xapian::TermGenerator->new();
2011 Sep 20
1
Understanding API Documentation for PHP
Hey everyone,
I am brand new to Xapian so forgive me if I am just being a noob.
I looked over the sparse documentation for the Xapian library and its PHP
hooks and I am really confused how to complete my index.
I understand how to add documents etc etc etc and how to build queries but
how do I specify in add_value what field type xapian should take (i.e.
tokenized, unindexed, indexed)?
Is there
2010 Sep 17
0
Asterisk 1.8 and CEL logging
Is there the ability in the Asterisk 1.8 CEL logging to log the SIP
endpoint IP as well as the media endpoint's IDs?
Thanks
Bryant
2012 Jan 20
2
Perl version of sortable_serialize missing?
I attempted to use the sortable_serialize function from perl, however it
doesn't seem to exist. The only occurrence of the string "sortable" in
the /usr/local/perl/5.10.1/Search/ tree is in the pod in Xapian.pm.
What am I doing wrong?
use Search::Xapian;
...
$doc->add_value(4,sortable_serialize($recdate));
Undefined subroutine &main::sortable_serialize called
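For readers unfamiliar with what a function like sortable_serialize is for: document values are compared as byte strings, so numbers need an encoding whose byte order matches numeric order. The sketch below shows one well-known way to do that for doubles in plain Python. It is not Xapian's actual encoding, and the name `sortable_encode` is hypothetical.

```python
import struct


def sortable_encode(value):
    """Encode a float as 8 bytes whose bytewise order matches numeric order.

    Technique: take the IEEE 754 big-endian bit pattern, then
      * for negative numbers flip every bit (so larger magnitudes sort first),
      * for non-negative numbers set the sign bit (so they sort after negatives).
    NaN is not handled. This is an illustration only, not Xapian's format.
    """
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    if bits & (1 << 63):
        bits ^= 0xFFFFFFFFFFFFFFFF
    else:
        bits |= 1 << 63
    return struct.pack(">Q", bits)
```

Sorting a list of numbers with `key=sortable_encode` gives the same order as sorting the numbers themselves, which is the property a value used for sorting or range comparison needs.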
2015 Sep 25
2
issues with dev.new avoiding RStudio plot device on unix?
Hi R-devl,
I'm still unable to force opening an *interactive* non-Rstudio
platform-specific plot device on *unix* systems.
dev.new() gained a new argument 'noRStudioGD' in R 3.1.1. Thank you. It
works for me when using RStudio on Windows, but on the unix system it
opens a pdf device instead of an interactive device when using an
interactive RStudio session (with R_DEFAULT_DEVICE
2011 Jan 11
1
chert-update creates a db with some errors
I've some problems converting a xapian db, created with core 1.1.3 (using
chert), to the new chert format.
I'm using xapian-chert-update, compiled from the core-1.2.4.
The conversion seems to run without errors:
#./xapian-core-1.2.4/bin/xapian-chert-update old new
postlist: Reduced by 33.3333% 16K (48K -> 32K)
record: Size unchanged (8K)
termlist: doesn't exist
position: Size
2010 Jan 01
1
Document values vs data
In a recent post, someone asked about storing "metadata" in a
document. My guess would have been to use add_value. Olly's
recommendation was to use set_data.
What are the general guidelines for deciding whether to use values or
data in a document?
Garrett
2010 May 22
1
How to search documents with certain values
Hi all,
I am creating Xapian documents and adding a unix timestamp to each document
as a value using the doc.add_value method.
When I search my Xapian database, I want the option to only search documents
with a timestamp within the last year.
Is there a way to search across documents with a value greater than a
specified value string? Or is there a better way of doing something like
this?
Any