On Thu, Mar 31, 2005 at 09:29:20AM -0500, info at bannershift.com wrote:
> While indexing text with omindex.cc, the position of terms is saved with a gap.
> This is not happening with scriptindex.cc.
> Why is this happening?

The gaps are added to avoid a phrase (or NEAR) matching across the last word(s) of one field and the first word(s) of the next. It's a bug that scriptindex doesn't add them too, really. I'll fix that once I've finished the move from CVS to SVN.

> Another question is why in omindex.cc the term position starts with 0 while
> in scriptindex it starts from 1?

Because it isn't important where positions start - only the relative positions really matter. Actually, you have the situation reversed: it's omindex which starts from 1, and scriptindex from 0.

Cheers,
Olly
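To illustrate the two points in the reply, here is a minimal, self-contained sketch. It is not Xapian code: `PositionIndex`, `toy_index_text`, `toy_phrase_match`, `build_toy_index` and `check_toy_examples` are hypothetical names invented for this example, and the toy indexer simply splits on whitespace. It only shows why a gap between fields stops a two-word phrase from matching across a field boundary, and why the starting position (0 or 1) makes no difference.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Toy positional index: term -> positions at which it occurs.
// This is an illustration only, NOT the Xapian API.
using PositionIndex = std::map<std::string, std::vector<unsigned>>;

// Hypothetical stand-in for an index_text()-style helper: stores each
// whitespace-separated word at successive positions starting at `pos`
// and returns the next unused position.
unsigned toy_index_text(const std::string& text, PositionIndex& index,
                        unsigned pos) {
    std::istringstream in(text);
    std::string word;
    while (in >> word) index[word].push_back(pos++);
    return pos;
}

// True if `second` occurs at the position immediately after `first`,
// i.e. the two words form an exact two-word phrase.
bool toy_phrase_match(const PositionIndex& index, const std::string& first,
                      const std::string& second) {
    auto a = index.find(first);
    auto b = index.find(second);
    if (a == index.end() || b == index.end()) return false;
    for (unsigned p : a->second)
        for (unsigned q : b->second)
            if (q == p + 1) return true;
    return false;
}

// Index a two-word "title" and a two-word "body", starting at `start`
// and leaving `gap` extra positions between the two fields.
PositionIndex build_toy_index(unsigned start, unsigned gap) {
    PositionIndex index;
    unsigned pos = toy_index_text("open source", index, start);
    toy_index_text("search engine", index, pos + gap);
    return index;
}

// Checks the behaviour described in the reply.
void check_toy_examples() {
    // No gap: "source search" wrongly matches across the field boundary.
    assert(toy_phrase_match(build_toy_index(0, 0), "source", "search"));
    // With a gap of 100 the cross-field phrase no longer matches...
    assert(!toy_phrase_match(build_toy_index(0, 100), "source", "search"));
    // ...while phrases inside a single field still do.
    assert(toy_phrase_match(build_toy_index(0, 100), "open", "source"));
    // Starting at 1 instead of 0 gives the same results.
    assert(!toy_phrase_match(build_toy_index(1, 100), "source", "search"));
    assert(toy_phrase_match(build_toy_index(1, 100), "search", "engine"));
}
```

Calling `check_toy_examples()` from a `main()` exercises all three layouts. The gap value of 100 is taken from the omindex snippet quoted in the original message; any gap larger than the widest phrase or NEAR window would serve the same purpose in this toy model.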
info at bannershift.com
2005-Mar-31 14:29 UTC
[Xapian-devel] omindex and scriptindex question
Hi,

I was researching the indexing of text in omindex and scriptindex. While indexing text with omindex.cc, the position of terms is saved with a gap. This is not happening with scriptindex.cc. Why is this happening?

Another question is why in omindex.cc the term position starts with 0, while in scriptindex it starts from 1?

Code snippet from omindex.cc:

    // Add postings for terms to the document
    Xapian::termpos pos = 1;
    pos = index_text(title, newdocument, stemmer, pos);
    pos = index_text(dump, newdocument, stemmer, pos + 100);
    pos = index_text(keywords, newdocument, stemmer, pos + 100);

Code snippet from scriptindex.cc:

    Xapian::termpos wordcount = 0;
    ...........
    for (i = v.begin(); i != v.end(); ++i) {
        ......................
        case Action::INDEX:
            wordcount = index_text(value, doc, stemmer, weight,
                                   i->get_string_arg(), wordcount);
            break;