Frank Chang
2013-Sep-02 13:01 UTC
[Xapian-discuss] Test Harness for testing the Xapian algorithm and alternative algorithm
Mr. Olly Betts,

My management team is interested in using Xapian and would like me to create a C/C++ test harness for measuring the processing speed of the Xapian algorithm. Please let me know the specifications for such a test harness.

> From: xapian-discuss-request at lists.xapian.org
> Subject: Xapian-discuss Digest, Vol 112, Issue 1
> To: xapian-discuss at lists.xapian.org
> Date: Mon, 2 Sep 2013 12:00:06 +0100
>
> Today's Topics:
>
>    1. having trouble with prefixes (Christopher Harvey)
>    2. Re: having trouble with prefixes (Olly Betts)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 01 Sep 2013 22:37:59 -0400
> From: Christopher Harvey <chris at basementcode.com>
> To: xapian-discuss at lists.xapian.org
> Subject: [Xapian-discuss] having trouble with prefixes
> Message-ID: <871u58osmw.fsf at basementcode.com>
> Content-Type: text/plain
>
> I've got a small test database setup with one record.
> $ delve -r 1 -V /tmp/1/
> Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
> Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
>
> The terms were added with lines like this:
> doc.add_term(string("P:") + path);
>
> Problem is, I can't seem to run a query that returns the document using
> any of the terms.
> Here is the outline of the code that runs the queries
> I'm trying to run:
>
> Database db(db_path.string());
> QueryParser queryparser;
> Stem stemmer("english");
> //queryparser.set_stemmer(stemmer);
> queryparser.set_database(db);
> queryparser.add_prefix("type", "T");
> queryparser.add_prefix("md5sum", "Q");
> queryparser.add_prefix("path", "P");
> queryparser.add_prefix("extension", "E");
> //maybe set stemming strategy here (in query parser)?
> queryparser.set_stemming_strategy(QueryParser::STEM_NONE);
> Query query(queryparser.parse_query(full_string));
> cout<<"Query is '"<<full_string<<"'"<<endl;
> Enquire enquire(db);
> enquire.set_query(query);
> MSet match_set(enquire.get_mset(0, 10));
> for_each(match_set.begin(), match_set.end(),
>          [&db](docid id) {
>              print_doc_info(db.get_document(id));
>          });
>
> I expected the following query to work,
> md5sum:DD4F2162FFFF0E43741A4A1C2B8EC0E7
> but it returns nothing. Same for all the other terms and prefixes. Terms
> without prefixes seem to be working normally. I set stemming to NONE on
> everything.
>
> All I want is a way to ask xapian to return a list of all documents with
> specific paths and/or md5sums.
>
> thanks for any tips,
> Chris
>
> ------------------------------
>
> Message: 2
> Date: Mon, 2 Sep 2013 11:09:27 +0100
> From: Olly Betts <olly at survex.com>
> To: Christopher Harvey <chris at basementcode.com>
> Cc: xapian-discuss at lists.xapian.org
> Subject: Re: [Xapian-discuss] having trouble with prefixes
> Message-ID: <20130902100926.GG19292 at survex.com>
> Content-Type: text/plain; charset=us-ascii
>
> On Sun, Sep 01, 2013 at 10:37:59PM -0400, Christopher Harvey wrote:
> > I've got a small test database setup with one record.
> > $ delve -r 1 -V /tmp/1/
> > Values for record #1: 0:DD4F2162FFFF0E43741A4A1C2B8EC0E7 1:./Text_page_scan_2.jpg 2:jpg 3:.jpg
> > Term List for record #1: E:.jpg P:./Text_page_scan_2.jpg Q:DD4F2162FFFF0E43741A4A1C2B8EC0E7 T:jpg
> >
> > The terms were added with lines like this:
> > doc.add_term(string("P:") + path);
>
> Just add the prefix "P" here.
>
> > Problem is, I can't seem to run a query that returns the document using
> > any of the terms. Here is the outline of the code that runs the queries
> > I'm trying to run:
> >
> > Database db(db_path.string());
> > QueryParser queryparser;
> > Stem stemmer("english");
> > //queryparser.set_stemmer(stemmer);
> > queryparser.set_database(db);
> > queryparser.add_prefix("type", "T");
> > queryparser.add_prefix("md5sum", "Q");
> > queryparser.add_prefix("path", "P");
>
> Or if you really want that colon in there, add the prefix as "P:" here.
>
> > queryparser.add_prefix("extension", "E");
> > //maybe set stemming strategy here (in query parser)?
> > queryparser.set_stemming_strategy(QueryParser::STEM_NONE);
> > Query query(queryparser.parse_query(full_string));
> > cout<<"Query is '"<<full_string<<"'"<<endl;
>
> If you print out query.get_description() it should be clearer what's
> going on.
>
> Cheers,
> Olly
>
> ------------------------------
>
> End of Xapian-discuss Digest, Vol 112, Issue 1
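
On the test harness asked about at the top of this message: no specification for one appears in this thread, so what follows is only a rough sketch of how the timing part of such a harness might look in C++. The names TimingResult and time_workload are hypothetical and are not part of Xapian. The sketch times an arbitrary callable, on the assumption that the operation to be measured can be wrapped in one.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical summary of one benchmark; not part of the Xapian API.
struct TimingResult {
    std::size_t runs;  // number of timed calls
    double min_ms;     // fastest call, in milliseconds
    double mean_ms;    // arithmetic mean, in milliseconds
    double max_ms;     // slowest call, in milliseconds
};

// Calls `work` exactly `runs` times and records the wall-clock time of
// each call. std::chrono::steady_clock is monotonic, so system clock
// adjustments cannot produce negative durations.
TimingResult time_workload(const std::function<void()>& work, std::size_t runs) {
    std::vector<double> samples_ms;
    samples_ms.reserve(runs);

    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples_ms.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }

    // With zero runs there are no samples, so all timings stay at 0.0.
    TimingResult result{runs, 0.0, 0.0, 0.0};
    if (!samples_ms.empty()) {
        result.min_ms = *std::min_element(samples_ms.begin(), samples_ms.end());
        result.max_ms = *std::max_element(samples_ms.begin(), samples_ms.end());
        result.mean_ms =
            std::accumulate(samples_ms.begin(), samples_ms.end(), 0.0) /
            static_cast<double>(samples_ms.size());
    }
    return result;
}
```

To time a Xapian search, the callable passed to time_workload would wrap the call being measured, for instance the enquire.get_mset(0, 10) call from the quoted code. An alternative algorithm could be passed as a second callable and run over the same queries, so that the two sets of figures are comparable.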