Hi guys,

I'm trying to figure out how to use probabilistic searching on a given field within a document. I've written to the list about this before but haven't quite figured out what's required. Following a little research, I think I understand what I need to do, but I'd like clarification.

 o We have a database of documents with the fields title, subtitle, summary and table of contents
 o By default, we pass these fields into TermGenerator::index_text to generate terms and add them to a Xapian::Document, applying a weighting where required
 o We then search these fields using Xapian::QueryParser::parse_query
 o This gives a result which searches all of the fields for the required string

I'd like to add the ability to search JUST one of the fields (title, in this case). According to the API documentation, here's what I understand I need to do:

 o When creating the index, call TermGenerator::index_text with the prefix 'S', i.e. index_text('some text', 150, 'S')
 o When querying the index, call QueryParser.add_prefix('', 'S') before calling parse_query with the string I want to use

However, the documentation is a little unclear as to how this actually works, specifically how I search for multiple words in just the title. For example, I have a title: "Research into Cheese in China". With the changes I have made to the indexer, I will have terms for this both without the S prefix and WITH the S prefix, to allow title-only searching.

When it comes to searching, I want to take the string "Cheese in China" as user input, pass it to the QueryParser, and have it perform the search with the 'S' prefix added internally somehow. From the documentation, it looks like I do this:

    Xapian::QueryParser qp;
    qp.add_prefix("", "S");
    Xapian::Query query = qp.parse_query("Cheese in China");

Is this correct?
Thanks,
Justin

-- 
Redwire Design Limited
54 Maltings Place
169 Tower Bridge Road
London SE1 3LJ
www.redwiredesign.com
[ 020 7403 1444 ] - voice
[ 020 7378 8711 ] - fax
On Wed, Jul 27, 2011 at 01:01:18PM +0100, Justin Finkelstein wrote:
> However, the documentation is a little unclear as to how this actually
> works - specifically, how I do a search for multiple words in just the
> title. For example, I have a title: "Research into Cheese in China".
> With the changes I have made to the indexer, I will have terms for this
> both without the S prefix and also WITH the S prefix to allow title-only
> searching.
>
> When it comes to searching, I want to be able to take the string "Cheese
> in China" as user input and pass this into the QueryParser and have it
> perform the search with the 'S' prefix added internally somehow. From
> the documentation, it looks like I do this:
>
> Xapian::QueryParser qp;
> qp.add_prefix("", "S");
> Xapian::Query query = qp.parse_query("Cheese in China");

Yes, that should work. You can print out the query description to check what you got:

    cout << query.get_description() << endl;

(If you want to see what terms have been generated by indexing to compare, see the delve utility which is in xapian-core/examples.)

Any suggestions on how this could be made clearer in the docs?

Cheers,
Olly
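As a rough mental model of why the prefix gives title-only searching, here is a standalone toy in plain C++. It is not Xapian code and does not call the Xapian API: `prefixed_terms`, `ToyDoc` and `matches_any` are hypothetical names invented for this sketch, and it ignores stemming, positions, weights and query operators. It assumes that both the indexer and the parser lowercase each word and prepend the prefix, and that the words of a query are combined with OR.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in, not Xapian: split on whitespace, lowercase each word and
// prepend the term prefix ("" gives unprefixed terms).
std::vector<std::string> prefixed_terms(const std::string& text,
                                        const std::string& prefix) {
    std::vector<std::string> terms;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        for (char& c : word) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        terms.push_back(prefix + word);
    }
    return terms;
}

// Hypothetical toy document: just the set of terms it was indexed with.
struct ToyDoc {
    std::set<std::string> terms;

    void index_text(const std::string& text, const std::string& prefix = "") {
        for (const std::string& t : prefixed_terms(text, prefix)) {
            terms.insert(t);
        }
    }
};

// True if the document contains at least one of the query's terms.
// An OR between the query words is assumed here.
bool matches_any(const ToyDoc& doc, const std::string& query,
                 const std::string& default_prefix) {
    for (const std::string& t : prefixed_terms(query, default_prefix)) {
        if (doc.terms.count(t) != 0) return true;
    }
    return false;
}

// Self-checking usage: "china" is in the title of doc1 but only in the
// summary of doc2, so the S-prefixed query finds doc1 alone.
void demo_title_only_search() {
    ToyDoc doc1;
    doc1.index_text("Research into Cheese in China");       // all-fields terms
    doc1.index_text("Research into Cheese in China", "S");  // title-only terms

    ToyDoc doc2;
    doc2.index_text("Dairy exports");                       // title, unprefixed
    doc2.index_text("Dairy exports", "S");                  // title, prefixed
    doc2.index_text("A survey covering China and India");   // summary only

    assert(matches_any(doc1, "China", "S"));   // title hit
    assert(!matches_any(doc2, "China", "S"));  // summary-only, no title hit
    assert(matches_any(doc2, "China", ""));    // still found across all fields
}
```

In this model the title terms and the query terms meet in the same "S" namespace, so a word that appears only in the summary cannot match a title-only query. The real terms can be compared against this picture using `get_description()` and delve as described above.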
Bruce Zhang
2011-Jul-27 13:08 UTC
[Xapian-discuss] Does Xapian support value in [value1, value2, value3...]?
Hi guys,

I wonder whether Xapian supports an operation like value in [value1, value2, value3...]? From the Xapian documentation, the supported query operations are index, boolean... but we need to check whether a value is in a list of values. Is this already supported, or do we need to add it ourselves?

Thanks,
Bruce
Bruce Zhang
2011-Jul-28 05:11 UTC
[Xapian-discuss] Re: Does Xapian support value in [value1, value2, value3...]?
I am still new to Xapian and haven't managed to get this working. For example, my data is:

  document 1:
    name=xxxx,xxx
    desc=...
    country=us, fr

  document 2:
    name=...
    desc=...
    country=jp,cn

I want to be able to search by country, e.g. find the documents whose country is us or jp or cn.

I use scriptindex to build the index database. How should I write my index script?

  country : ???

I use omega to query. How should I build the command line?

  ./omega DB=default B=XCOUNTRY???

Thanks a lot for the help,
Bruce

From: Matt Goodall [mailto:matt.goodall at gmail.com]
Sent: Wednesday, July 27, 2011 9:43 PM
To: Bruce Zhang
Subject: Re: [Xapian-discuss] Does Xapian support value in [value1, value2, value3...]?

On 27 July 2011 14:08, Bruce Zhang <bruce.zhang at trustgo.com> wrote:

> Hi guys,
>
> I wonder if Xapian support the operations like value in [value1, value2,
> value3...]? from Xapian document, for query, the supported operations are
> index, boolean... but we have requirements to query see if a value is in a
> list of values. it is already supported or we need to add ourselves?

You can achieve this by adding an exact (unstemmed, etc.) term for each value and then querying for any one of the values in the usual way.

For example, if you have a document that has some general, indexable content and a list of tags, ["foo", "bar"], you could add "Xtag:foo" and "Xtag:bar" terms for the tags. Then configure a QueryParser with a "tag"->"Xtag:" prefix and you can search for "tag:foo", "tag:bar", "tag:foo AND tag:bar", etc.

- Matt
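The one-term-per-value idea can be sketched with the country data from this thread. This is a standalone toy in plain C++ and not Xapian, scriptindex or omega code: `value_terms` and `value_in` are hypothetical helpers invented for illustration, and the XCOUNTRY prefix is taken from the omega command line quoted above. It only shows that "value in [v1, v2, ...]" reduces to an OR over exact, prefixed terms.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not Xapian: turn "us, fr" into one exact term per
// value, e.g. "XCOUNTRYus" and "XCOUNTRYfr". Values are trimmed, not stemmed.
std::set<std::string> value_terms(const std::string& field,
                                  const std::string& prefix) {
    std::set<std::string> terms;
    std::istringstream in(field);
    std::string value;
    while (std::getline(in, value, ',')) {
        const std::string::size_type first = value.find_first_not_of(" \t");
        if (first == std::string::npos) continue;  // skip empty entries
        const std::string::size_type last = value.find_last_not_of(" \t");
        terms.insert(prefix + value.substr(first, last - first + 1));
    }
    return terms;
}

// "value in [v1, v2, ...]" is an OR over the per-value terms: the document
// matches if it carries at least one of them.
bool value_in(const std::set<std::string>& doc_terms,
              const std::vector<std::string>& wanted,
              const std::string& prefix) {
    for (const std::string& v : wanted) {
        if (doc_terms.count(prefix + v) != 0) return true;
    }
    return false;
}

// Self-checking usage with the two example documents from the thread.
void demo_country_filter() {
    const std::string prefix = "XCOUNTRY";
    const std::set<std::string> doc1 = value_terms("us, fr", prefix);
    const std::set<std::string> doc2 = value_terms("jp,cn", prefix);

    assert(doc1.count("XCOUNTRYus") == 1 && doc1.count("XCOUNTRYfr") == 1);
    assert(value_in(doc1, {"us", "jp", "cn"}, prefix));  // matches via "us"
    assert(value_in(doc2, {"us", "jp", "cn"}, prefix));  // matches via "jp"
    assert(!value_in(doc2, {"us", "fr"}, prefix));       // no shared value
}
```

Because each value becomes its own exact term, no special "in list" operator is needed in this model: the list is expressed as an OR of the candidate terms, and an AND of two terms would instead require both values on the same document.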