Kevin Duraj
2007-Feb-02 03:13 UTC
[Xapian-discuss] Working demo of search engine using boolean query.
Lately I have been reading many articles about using boolean queries for search engines, but I haven't seen any complete working demo. So I put together a very simple working demo of a search engine using a boolean query. Feel free to suggest performance improvements or point out errors, while keeping it as simple as possible to understand.

Thanks,
-Kevin Duraj
http://myhealthcare.com

#------------------------------------------------------------------------------#
# Sample Data
#------------------------------------------------------------------------------#
url=webmd.com
text=fitness health cancer

url=health.com
text=diseases health calorie disability

url=healthfinder.gov
text=medications diet food

url=myhealthcare.com
text=Nanotechnology nanoscale science

#------------------------------------------------------------------------------#
# scriptindex
#------------------------------------------------------------------------------#
url : index=S field
text : indexnopos truncate=1024 field

#------------------------------------------------------------------------------#
# Perl Search Script
#------------------------------------------------------------------------------#
#!/usr/bin/perl -w
use strict;
use ExtUtils::testlib;
use Search::Xapian qw/:all/;
#------------------------------------------------------------------------------#
my $primary_terms = "Swebmd.com Shealth.com Smyhealthcare.com Shealthfinder.gov";
my $secondary_terms = "fitness Diseases diet science ";
$secondary_terms =~ tr/A-Z/a-z/; # convert to lower case
#------------------------------------------------------------------------------#
my $db = Search::Xapian::Database->new( '/home/myhealthcare/xapian_index' );
my $enq = $db->enquire();
#------------------------------------------------------------------------------#
my @primary_terms = split ' ', $primary_terms;
my @secondary_terms = split ' ', $secondary_terms;
#------------------------------------------------------------------------------#
my $query1 = Search::Xapian::Query->new( OP_OR, @primary_terms );
my $query2 = Search::Xapian::Query->new( OP_OR, @secondary_terms );
my $boolean_query = Search::Xapian::Query->new( OP_AND, $query1, $query2 );
$enq->set_query( $boolean_query );
#------------------------------------------------------------------------------#
printf "Parsing query '%s'\n", $enq->get_query()->get_description();
my $total = $enq->matches(1, 100000000);
print "Total: $total results found.\n------------------------\n";
my @matches = $enq->matches(0, 15);
#------------------------------------------------------------------------------#
foreach my $match ( @matches ) {
    printf "ID %d %d%%", $match->get_docid(), $match->get_percent();
    my $doc = $match->get_document();
    printf " [ %s ]", $doc->get_data();
    print "\n";
}
#------------------------------------------------------------------------------#
Olly Betts
2007-Feb-09 04:03 UTC
[Xapian-discuss] Working demo of search engine using boolean query.
On Thu, Feb 01, 2007 at 07:07:26PM -0800, Kevin Duraj wrote:
> url : index=S field

I think you mean "boolean=S" here.

> my $total = $enq->matches(1, 100000000);
> print "Total: $total results found.\n------------------------\n";
> my @matches = $enq->matches(0, 15);

And you're running the query twice here!

Cheers,
    Olly
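
Taken together, the two corrections above could be applied roughly as follows. This is a sketch rather than code posted in the thread: it assumes the index was rebuilt with "boolean=S" in the index script, reuses the database path and terms from the original demo, and replaces the two matches() calls with a single get_mset() call whose result supplies both the count and the hits. The count reported this way is Xapian's estimate, not necessarily an exact total.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Search::Xapian qw/:all/;

# Index script with the suggested change (scriptindex syntax, shown for reference):
#   url : boolean=S field
#   text : indexnopos truncate=1024 field

# Same terms as the original demo; the secondary terms are lower-cased.
my @primary_terms   = qw(Swebmd.com Shealth.com Smyhealthcare.com Shealthfinder.gov);
my @secondary_terms = map { lc } qw(fitness Diseases diet science);

# Path copied from the original post; adjust it for your own index.
my $db  = Search::Xapian::Database->new('/home/myhealthcare/xapian_index');
my $enq = $db->enquire();

# (site1 OR site2 OR ...) AND (word1 OR word2 OR ...)
my $query = Search::Xapian::Query->new(
    OP_AND,
    Search::Xapian::Query->new( OP_OR, @primary_terms ),
    Search::Xapian::Query->new( OP_OR, @secondary_terms ),
);
$enq->set_query($query);

# Run the match once and keep the MSet around for both the count and the hits.
my $mset = $enq->get_mset( 0, 15 );

printf "Parsing query '%s'\n", $query->get_description();
printf "About %d results found.\n------------------------\n",
    $mset->get_matches_estimated();

foreach my $match ( $mset->items() ) {
    printf "ID %d %d%% [ %s ]\n",
        $match->get_docid(),
        $match->get_percent(),
        $match->get_document()->get_data();
}
```

Holding on to the MSet avoids the second pass over the index, and starting at offset 0 means the first match is not skipped.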