Gilles Polart-Donat
2005-Nov-01 17:19 UTC
[Xapian-discuss] Understanding omindex and scriptindex
Hello,

I am continuing to investigate Xapian, and it is giving me a headache! ;-)

I understand that a database is a collection of documents and terms, with a relation between them (a document contains terms). A document can have one field of arbitrary data, meaningful to the program that put it there.

To understand how Xapian is used, I have been looking at the omega search engine, and it is not as clear to me as I would like!

I see there are some differences in the work done by omindex and scriptindex. I think scriptindex adds terms with a prefix to constrain the query, but how to do it is up to the user: he can add as many fields as he needs, as long as he uses them in the query parameters. Omindex is more closed, with no parameters. Right?

I saw there are two ways to add information to a doc:
- put a new line in set_data
- add a new term in the document

How do we choose between these two ways?

I saw a set_value method, but I don't understand (if | where) it is used in a query.

Can omindex and scriptindex share the same database? For example, to add some fields with scriptindex to a document created by omindex. I don't think so, but ...

Best regards
Gilles Polart-Donat
On Tue, Nov 01, 2005 at 06:28:10PM +0100, Gilles Polart-Donat wrote:
> I see there are some differences in the work done by omindex and
> scriptindex. I think scriptindex adds terms with a prefix to constrain the
> query, but how to do it is up to the user: he can add as many fields
> as he needs, as long as he uses them in the query parameters. Omindex is
> more closed, with no parameters. Right?

Omindex doesn't currently index anything suitable for fielded searching. It could index document titles and other metadata as separate fields but doesn't currently.

> I saw there are two ways to add information to a doc:
> - put a new line in set_data
> - add a new term in the document
>
> How do we choose between these two ways?

Do you want to be able to locate documents using this information? If so, make it a term.

Or do you want to be able to display (or otherwise use) this information given a particular document? If so, add it to the document data.

Sometimes you want to do both. For example, you might want to be able to filter searches on mime-type, and also use it to display a "filetype" icon next to each search result.

> I saw a set_value method, but I don't understand (if | where) it is used in
> a query.

Values allow you to sort results, perform non-boolean filtering, and collapse multiple matches into one.

> Can omindex and scriptindex share the same database?

Not usefully, in general. But if you know what you're doing, I can see special cases where using both on the same database could be useful. For example, you could use scriptindex to quickly delete particular documents from a large omindex-built database (say you have a court order to remove certain material from a site...)

Cheers,
Olly
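
As an illustration of the term / data / value distinction described in the reply above, here is a minimal toy model in plain Python. It is not the Xapian API and does not reflect how Xapian stores anything internally; the class ToyIndex, its method names, the "T" prefix on the mime-type terms, and the value slot numbers are all assumptions made up for this sketch. It only aims to show the three roles: terms locate documents, data is handed back for display, and values are used to sort or collapse results.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model of terms, document data and values (NOT the Xapian API)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.data = {}                    # doc id -> opaque data, used for display
        self.values = {}                  # doc id -> {slot number: value}

    def add(self, doc_id, terms, data, values=None):
        # Terms are what a search can match on.
        for term in terms:
            self.postings[term].add(doc_id)
        # Data is only retrieved once a document has been found.
        self.data[doc_id] = data
        # Values are per-document keys used while ordering/collapsing matches.
        self.values[doc_id] = dict(values or {})

    def search(self, term, sort_slot=None, collapse_slot=None):
        hits = sorted(self.postings.get(term, set()))
        if sort_slot is not None:
            # Order matches by the value stored in the chosen slot.
            hits.sort(key=lambda d: self.values[d].get(sort_slot, ""))
        if collapse_slot is not None:
            # Keep only the first match for each distinct value in the slot.
            seen, kept = set(), []
            for d in hits:
                key = self.values[d].get(collapse_slot)
                if key is not None and key in seen:
                    continue
                seen.add(key)
                kept.append(d)
            hits = kept
        return [(d, self.data[d]) for d in hits]


# Hypothetical documents: slot 0 holds a date, slot 1 a site name.
idx = ToyIndex()
idx.add(1, ["xapian", "Tapplication/pdf"], "title=Manual\ntype=pdf",
        {0: "2005-11-01", 1: "site-a"})
idx.add(2, ["xapian", "Ttext/html"], "title=FAQ\ntype=html",
        {0: "2005-10-15", 1: "site-a"})
idx.add(3, ["omega", "Ttext/html"], "title=Omega\ntype=html",
        {0: "2005-09-01", 1: "site-b"})

# The mime-type is stored twice: as a term (to filter on) and in the data
# (to pick a "filetype" icon when displaying a result).
for doc_id, data in idx.search("Ttext/html"):
    print(doc_id, data.splitlines())
```

In this sketch, idx.search("xapian", sort_slot=0) orders the matches by the date value, and idx.search("xapian", collapse_slot=1) keeps a single match per site, which mirrors the sorting and collapsing uses of values mentioned in the reply.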