Bira
2008-Dec-04 13:24 UTC
[Xapian-discuss] Newbie Question: Indexing and searching multiple fields
I've recently heard of Xapian and have been taking a look at it in my spare time. I have some experience with Ferret, a Ruby IR library, and I have some questions about Xapian.

Ferret's indexes are similar to Lucene's, and support multiple fields per document. From reading Xapian's documentation, I know it can do something similar through values. However, I wasn't able to find clear instructions for how to search by these values.

For example, let's say I'm indexing documents with authors, titles and potentially long bodies. In Ferret, each one would be a separate field, and if I wanted to find all documents written by author "A", I could pass it an "author:A" query, and it would return those to me.

I know this is possible in Xapian, because I've googled up a few messages that allude to it, but I haven't found clear instructions on how to do it. I imagine title and author would be saved as values in a Xapian document, and the body would be the document's data. But how should I set up the index and Enquires so that it's possible to use queries like "author:A"?

If this is documented somewhere, I probably missed it. I would appreciate a pointer to the right documentation :).

--
Bira
http://compexplicita.wordpress.com
http://compexplicita.tumblr.com
Charlie Hull
2008-Dec-04 13:30 UTC
[Xapian-discuss] Newbie Question: Indexing and searching multiple fields
Bira wrote:
> Ferret's indexes are similar to Lucene's,

Actually Ferret's indexes were Lucene's - Ferret is Lucene ported to Ruby. Then it seems the Ferret author rewrote things and used a different file format...
http://rubyforge.org/forum/forum.php?forum_id=9058

Charlie
Bira
2008-Dec-04 13:44 UTC
[Xapian-discuss] Newbie Question: Indexing and searching multiple fields
On Thu, Dec 4, 2008 at 11:30 AM, Charlie Hull <charlie at juggler.net> wrote:
> Bira wrote:
>
>> Ferret's indexes are similar to Lucene's,
>
> Actually Ferret's indexes were Lucene's - Ferret is Lucene ported to
> Ruby. Then it seems the Ferret author rewrote things and used a
> different file format...
> http://rubyforge.org/forum/forum.php?forum_id=9058

I'm aware of that :). I just mentioned that in passing because of their ability to contain multiple fields for each document (I think Lucene does that too). Xapian seems to handle this a little differently, with values, which gave rise to my question.

--
Bira
http://compexplicita.wordpress.com
http://compexplicita.tumblr.com
Henry
2008-Dec-04 14:15 UTC
[Xapian-discuss] Newbie Question: Indexing and searching multiple fields
Quoting Bira <u.alberton at gmail.com>:
> If this is documented somewhere, I probably missed it. I would
> appreciate a pointer to the right documentation :).

I asked similar questions and got some good answers:
http://lists.xapian.org/pipermail/xapian-discuss/2008-November/006067.html

Also, don't gloss over the Glossary :), it's full of preciousss nuggets which tend to put things into context:
http://www.xapian.org/docs/glossary.html

Cheers
Henry
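For readers landing on this thread: as far as I understand it, "author:A" style searches in Xapian are usually built on prefixed terms rather than on values (values are more commonly used for things like sorting and range filtering). Each field's words are indexed with a short prefix, for example "A" for author and "S" for title by convention, and the query side maps the user-facing field name back to that prefix. In Xapian itself this is done with the prefix argument of TermGenerator's index_text and with QueryParser's add_prefix. The sketch below is not Xapian code. It is a toy in-memory index in plain Python, with hypothetical names (ToyIndex, FIELD_PREFIXES), meant only to show the mechanism.

```python
from collections import defaultdict

# Hypothetical mapping from user-facing field names to term prefixes.
# "A" (author) and "S" (title/subject) follow Xapian's usual conventions.
FIELD_PREFIXES = {"author": "A", "title": "S"}


class ToyIndex:
    """A tiny in-memory inverted index; it illustrates prefixed terms only."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.data = {}                    # document id -> stored body text

    def add_document(self, doc_id, author, title, body):
        self.data[doc_id] = body
        self._index(doc_id, author, FIELD_PREFIXES["author"])
        self._index(doc_id, title, FIELD_PREFIXES["title"])
        self._index(doc_id, body, "")  # body terms carry no prefix

    def _index(self, doc_id, text, prefix):
        # Lowercased words plus an uppercase prefix keep fields apart:
        # author "alice" becomes "Aalice", body "alice" stays "alice".
        for word in text.lower().split():
            self.postings[prefix + word].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents matching every clause (AND)."""
        result = None
        for clause in query.lower().split():
            field, sep, word = clause.partition(":")
            if sep and field in FIELD_PREFIXES:
                term = FIELD_PREFIXES[field] + word  # "author:a" -> "Aa"
            else:
                term = clause                        # plain body term
            matches = self.postings.get(term, set())
            result = matches if result is None else result & matches
        return sorted(result or [])


index = ToyIndex()
index.add_document(1, "alice", "Searching with Xapian", "notes on ferret and lucene")
index.add_document(2, "bob", "Ferret tips", "alice reviewed this text")

print(index.search("author:alice"))  # [1]
print(index.search("alice"))         # [2]
```

Because the author and body terms are stored under different strings, "author:alice" finds only the document written by alice, while a bare "alice" finds only the document whose body mentions her.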