Kevin Duraj
2007-Feb-02 03:13 UTC
[Xapian-discuss] Working demo of search engine using boolean query.
Lately I have been reading many articles about using boolean queries in
search engines, but I haven't seen a complete working demo. So I put
together a very simple working demo of a search engine using a boolean
query. Feel free to point out any errors or suggest performance
improvements, while keeping it as simple as possible to understand.
Thanks,
-Kevin Duraj
http://myhealthcare.com
#------------------------------------------------------------------------------#
# Sample Data #
#------------------------------------------------------------------------------#
url=webmd.com
text=fitness health cancer
url=health.com
text=diseases health calorie disability
url=healthfinder.gov
text=medications diet food
url=myhealthcare.com
text=Nanotechnology nanoscale science
#------------------------------------------------------------------------------#
# scriptindex
#------------------------------------------------------------------------------#
url : index=S field
text : indexnopos truncate=1024 field
#------------------------------------------------------------------------------#
# Perl Search Script
#------------------------------------------------------------------------------#
#!/usr/bin/perl -w
use strict;
use ExtUtils::testlib;
use Search::Xapian qw/:all/;
#------------------------------------------------------------------------------#
my $primary_terms = "Swebmd.com Shealth.com Smyhealthcare.com Shealthfinder.gov";
my $secondary_terms = "fitness Diseases diet science ";
$secondary_terms =~ tr/A-Z/a-z/; # convert to lower case
#------------------------------------------------------------------------------#
my $db = Search::Xapian::Database->new( '/home/myhealthcare/xapian_index' );
my $enq = $db->enquire();
#------------------------------------------------------------------------------#
my @primary_terms = split ' ', $primary_terms;
my @secondary_terms = split ' ', $secondary_terms;
#------------------------------------------------------------------------------#
my $query1 = Search::Xapian::Query->new( OP_OR, @primary_terms );
my $query2 = Search::Xapian::Query->new( OP_OR, @secondary_terms );
my $boolean_query = Search::Xapian::Query->new( OP_AND, $query1, $query2 );
$enq->set_query( $boolean_query );
#------------------------------------------------------------------------------#
printf "Parsing query '%s'\n", $enq->get_query()->get_description();
my $total = $enq->matches(1, 100000000);
print "Total: $total results found.\n------------------------\n";
my @matches = $enq->matches(0, 15);
#------------------------------------------------------------------------------#
foreach my $match ( @matches )
{
    printf "ID %d %d%%", $match->get_docid(), $match->get_percent();
    my $doc = $match->get_document();
    printf " [ %s ]", $doc->get_data();
    print "\n";
}
#------------------------------------------------------------------------------#
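For readers without an index to hand, the logic of the query built above, (any primary term) AND (any secondary term), can be sketched in plain Perl over the sample data. This only illustrates the boolean semantics and is not how Xapian evaluates queries. The %docs hash and the matches_query helper are hypothetical names, and the S prefix is added by hand to mimic what the index script is meant to produce.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data from the post: url => text.
# (Hypothetical in-memory stand-in for the Xapian index.)
my %docs = (
    'webmd.com'        => 'fitness health cancer',
    'health.com'       => 'diseases health calorie disability',
    'healthfinder.gov' => 'medications diet food',
    'myhealthcare.com' => 'Nanotechnology nanoscale science',
);

my @primary_terms   = qw(Swebmd.com Shealth.com Smyhealthcare.com Shealthfinder.gov);
my @secondary_terms = map { lc($_) } qw(fitness Diseases diet science);

# True if the document has at least one primary term AND at least one
# secondary term, i.e. (p1 OR p2 OR ...) AND (s1 OR s2 OR ...).
sub matches_query {
    my ( $url, $text, $primary, $secondary ) = @_;

    # Terms for one document: the S-prefixed URL plus each lower-cased word.
    my %terms = map { $_ => 1 } ( "S$url", map { lc($_) } split ' ', $text );

    my $has_primary   = grep { $terms{$_} } @$primary;
    my $has_secondary = grep { $terms{$_} } @$secondary;
    return $has_primary && $has_secondary;
}

my @hits = grep {
    matches_query( $_, $docs{$_}, \@primary_terms, \@secondary_terms )
} sort keys %docs;

print "$_\n" for @hits;
```

With the sample data every document matches, because each one contains exactly one of the four secondary terms. Removing "diet" from the secondary terms would drop healthfinder.gov from the result.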
Olly Betts
2007-Feb-09 04:03 UTC
[Xapian-discuss] Working demo of search engine using boolean query.
On Thu, Feb 01, 2007 at 07:07:26PM -0800, Kevin Duraj wrote:
> url : index=S field

I think you mean "boolean=S" here.

> my $total = $enq->matches(1, 100000000);
> print "Total: $total results found.\n------------------------\n";
> my @matches = $enq->matches(0, 15);

And you're running the query twice here!

Cheers,
    Olly
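
A minimal sketch of how both points might be addressed, assuming the same Search::Xapian bindings as the original script. In the index script, the url line would read "url : boolean=S field". On the Perl side, a single get_mset() call returns the page of results together with an estimate of the total, so the query runs only once. The index path is the one from the original post; the -d check around it is an addition for illustration, and the total reported is an estimate, not an exact count.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Search::Xapian qw/:all/;

my @primary_terms   = qw(Swebmd.com Shealth.com Smyhealthcare.com Shealthfinder.gov);
my @secondary_terms = map { lc($_) } qw(fitness Diseases diet science);

# Path taken from the original post; adjust for your own index.
my $index_path = '/home/myhealthcare/xapian_index';

if ( -d $index_path ) {
    my $db  = Search::Xapian::Database->new($index_path);
    my $enq = $db->enquire();

    # (any primary term) AND (any secondary term), as in the original script.
    $enq->set_query(
        Search::Xapian::Query->new(
            OP_AND,
            Search::Xapian::Query->new( OP_OR, @primary_terms ),
            Search::Xapian::Query->new( OP_OR, @secondary_terms ),
        )
    );

    # Run the query once: up to 15 matches, starting at offset 0.
    my $mset = $enq->get_mset( 0, 15 );

    # The MSet carries an estimate of the total number of matches.
    printf "Total: about %d results found.\n", $mset->get_matches_estimated();
    print "------------------------\n";

    foreach my $match ( $mset->items() ) {
        printf "ID %d %d%% [ %s ]\n",
            $match->get_docid(),
            $match->get_percent(),
            $match->get_document()->get_data();
    }
}
else {
    print "No index found at $index_path\n";
}
```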