I have installed and tested xapian/omega 1.02. I am using the omega.cgi for searches and have indexed 1000+ plain text files using omindex. Fantastic! Thanks, I am able to quickly and sensibly order these documents by text criteria.

These text files contain various numbers of dates in the form of YYYY, mostly in free-text rather than formatted fields. I was hoping to be able to use the cgi to limit searches of these documents by date/number ranges e.g. "catalogue 1890..1910", but I am not getting what I expected.

Using godmode to inspect the indexing, only the 'file date modified' is stored as Y2007. Fair enough, the text is not in the YYYYMMDD format, but none of the other numbers are stored as values either. In fact the document values all appear to contain garbage e.g.

Document Values
Value#  Value
0       F??
1       X?s?1??z ,:;??*

The various years are picked out as plain terms, but I do not seem to be able to do number range searches. I have scoured the documentation and mailing lists and have now confused myself.

Can you help me with 2 questions:
1) Should the omega cgi interface support number range searches with/without additional configuration?
2) From my outline does it appear that omindex has indexed my documents as expected - with particular reference to the document values?
Richard Boulton
2007-Jul-24 04:58 UTC
[Xapian-discuss] omega number range searches - query
Eike wrote:
> These text files contain various numbers of dates in the form of YYYY, mostly in
> free-text rather than formatted fields. I was hoping to be able to use the cgi
> to limit searches of these documents by date/number ranges e.g. "catalogue
> 1890..1910", but I not getting what I expected.
>
> Using godmode to inspect the indexing, only the 'file date modified' is stored
> as Y2007, fair enough, the text is not in the YYYYMMDD format, but none of the
> other numbers are stored as values either. In fact the document values all
> appear to contain garbage e.g.
>
> Document Values
> Value#  Value
> 0       F??
> 1       X?s?1??z,:;??*
>
> The various years are picked out as plain terms, but I do not seem to be able to
> do number range searches. I have scoured the documentation and mailing lists and
> have now confused myself.
>
> Can you help me with 2 questions:
> 1) Should the omega cgi interface support number range searches with/without
> additional configuration?

If you use omindex, only the last modified date is stored. It _should_ be possible to do a date range search using this value, by setting the START and END cgi parameters. I don't believe sorting or range restriction is possible currently with any other numeric value.

> 2) From my outline does it appear that omindex has indexed my documents as
> expected - with particular reference to the document values?

Yes - those two values look vaguely plausible: value 0 is the last_mod timestamp, as a 4 byte integer (i.e. binary data, not ascii; hence the odd characters). Value 1 is an MD5 sum of the document, again as binary data, not ascii.
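
To make the point about binary values concrete, here is a minimal Python sketch of how the two raw values could be turned into something readable. It is an illustration, not part of omega or omindex: the function names are hypothetical, and it assumes the 4 byte timestamp is an unsigned, big-endian count of seconds since the Unix epoch (the thread only says "4 byte integer", so the byte order is a guess) and that the MD5 value is the raw 16 byte digest.

```python
import hashlib
import struct
from datetime import datetime, timezone


def decode_last_mod(raw: bytes) -> datetime:
    """Decode a 4 byte binary timestamp into a UTC datetime.

    ASSUMPTION: unsigned, big-endian, seconds since the Unix epoch.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes, got %d" % len(raw))
    (seconds,) = struct.unpack(">I", raw)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def decode_md5(raw: bytes) -> str:
    """Render a raw 16 byte MD5 digest as the usual 32 character hex string."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes, got %d" % len(raw))
    return raw.hex()


# Made-up sample data standing in for the two document values.
sample_value0 = struct.pack(">I", 1185253080)
sample_value1 = hashlib.md5(b"example document text").digest()

print(decode_last_mod(sample_value0).isoformat())
print(decode_md5(sample_value1))
```

Printed directly, either raw value contains mostly non-printable bytes, which is why godmode shows them as strings like "F??".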