Hi!

I'm facing the following issue: I need to index a relational table with Xapian. The table has a few enumerated fields and two text fields:

    E1  E2  E3          T1  T2
    ------------------  ----------
    Enumerated          Text

I want to index the table so that I can narrow a search by specific fields (e.g. for a given text in T1, filtered by the enumerated fields E2 and E3). For example, if E2 is the author's code, E3 is the theme's code and T1 is the chapter text, I want to issue:

    "there and back again author:tolkien theme:fantasy"

I was playing with Lucene, and it has API calls such as Lucene::Keyword to index by term and Lucene::Text to index text (contents), so I can do the following:

    addField( Lucene::Keyword, 'author:tolkien' )
    addField( Lucene::Keyword, 'theme:fantasy' )
    addField( Lucene::Text, <textofthebook> )

I think it is possible to index term names and term data in Xapian, but I haven't found the right way... Could you give an example or a little sample?

Thanks in advance!

Sebastián.
Michael Schlenker
2006-Jul-24 15:32 UTC
[Xapian-discuss] Best way to index relational table
Sebastian Araya wrote:
> Hi!
>
> I'm facing the following issue: I need to index a relational table with
> Xapian. The table has a few enumerated fields and two text fields:
>
>     E1  E2  E3          T1  T2
>     ------------------  ----------
>     Enumerated          Text
>
> I want to index the table so that I can narrow a search by specific fields
> (e.g. for a given text in T1, filtered by the enumerated fields E2 and E3).
> For example, if E2 is the author's code, E3 is the theme's code and T1 is
> the chapter text, I want to issue:
>
>     "there and back again author:tolkien theme:fantasy"
>
> I was playing with Lucene, and it has API calls such as Lucene::Keyword to
> index by term and Lucene::Text to index text (contents), so I can do the
> following:
>
>     addField( Lucene::Keyword, 'author:tolkien' )
>     addField( Lucene::Keyword, 'theme:fantasy' )
>     addField( Lucene::Text, <textofthebook> )
>
> I think it is possible to index term names and term data in Xapian, but I
> haven't found the right way... Could you give an example or a little sample?

Yes, you can do that easily with Xapian. Depending on your needs, a combination of a standard relational table for the keywords and Xapian for the full-text fields could be the best solution.

With the Tcl binding, for example, that translates to:

    xapian::Document doc
    doc add_term author:Tolkien
    doc add_term theme:fantasy
    # Not sure if Lucene::Text only stores the text or actually indexes
    # the text and breaks it down into terms.
    # This simply stores the full text, but does not break it down into terms;
    # the examples dir has a proc to do indexing.
    doc set_data $textOfTheBook

You simply give your categories a unique prefix that is not used in normal terms (in this example you would disallow ":" in normal terms; uppercase prefixes are also a possibility if you lowercase all your terms while indexing).

Michael
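
The prefixing scheme described above can be sketched independently of any particular binding. The following is only a minimal illustration in plain Python, not Xapian code: the `PREFIXES` mapping, the `row_to_terms` helper and the tokenising rule are all assumptions made for the example. It shows how one table row might be turned into prefixed filter terms plus free-text terms that can never contain ":".

```python
import re

# Hypothetical mapping from enumerated column names to term prefixes.
# The names are illustrative; any prefix that cannot occur in a normal
# term would do.
PREFIXES = {"author": "author:", "theme": "theme:"}

# Free-text terms are runs of lowercase letters and digits only, so a
# ':' can never appear inside one and clash with a prefix.
WORD_RE = re.compile(r"[a-z0-9]+")


def row_to_terms(enumerated, text):
    """Return (filter_terms, text_terms) for one table row.

    enumerated: dict of column name -> value for the enumerated fields
    text:       contents of a text field
    """
    filter_terms = [
        PREFIXES[field] + str(value).lower()
        for field, value in sorted(enumerated.items())
    ]
    text_terms = WORD_RE.findall(text.lower())
    return filter_terms, text_terms


filters, words = row_to_terms(
    {"author": "Tolkien", "theme": "fantasy"},
    "There and Back Again: a hobbit's tale",
)
print(filters)
print(words)
```

Each resulting term would then be passed to the document's add_term call, as in the Tcl snippet above. Lowercasing the enumerated values is a choice made here so that a query such as author:tolkien matches; it is not something the original example does.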