On Tue, Feb 20, 2007 at 07:07:45PM +0100, Markus W?rle wrote:
> * I use sorting by_value_then_relevance in some cases. In this
> condition it happens that if both value and relevance of documents
> are equal, the sorting between those documents becomes unstable, that
> is, the order of those documents may differ from query to query.
If so, that is a bug. The final ordering is by docid (or reverse docid)
if all else is equal. Can you produce a small self-contained example to
demonstrate the problem?
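To illustrate the intended behaviour: with docid as the final sort key, documents that tie on both value and relevance always come out in the same order. A minimal sketch in plain Python, not Xapian's API; the (docid, value, relevance) tuples and the order_hits helper are made up for illustration, and the directions (value ascending, relevance descending) are an assumption:

```python
import random

def order_hits(hits):
    """Order hypothetical (docid, value, relevance) tuples deterministically."""
    def sort_key(hit):
        docid, value, relevance = hit
        # value ascending, then relevance descending,
        # then docid ascending as the final tiebreak.
        return (value, -relevance, docid)
    return sorted(hits, key=sort_key)

hits = [(7, "b", 0.5), (3, "a", 0.9), (5, "a", 0.9), (1, "a", 0.9)]
ordered = order_hits(hits)

# Shuffling the input does not change the result, because no two
# hits share the same docid and so no two sort keys are equal.
shuffled = hits[:]
random.shuffle(shuffled)
same_after_shuffle = order_hits(shuffled) == ordered
```

If a real query returns tied documents in a different order from run to run, the tiebreak is not being applied, and that is what a self-contained test case should show.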
> It would be nice if it would be possible to set more than one value
> to sort the results by, and to choose a particular order (ascending/
> descending) for each value. Beyond, it would be practical to have a
> possibility to freely mix up several values together with the
> relevance. E.g.:
>
> "sort by value 0 ascending, then relevance ascending, then value 1
> descending"
> (where value 1 could be a token that makes the sorting unique, to
> solve my initial problem)
There's already a bug open for "custom sort orders", which has a working
patch attached:
http://www.xapian.org/cgi-bin/bugzilla/show_bug.cgi?id=100
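Until something like that is available, a mixed ordering such as "value 0 ascending, then relevance ascending, then value 1 descending" can be applied on the client side. The following is only a sketch in plain Python and says nothing about how the patch works; the dictionary keys and the multi_sort helper are hypothetical. It relies on Python's sort being stable, sorting by the least significant key first:

```python
from operator import itemgetter

def multi_sort(hits, spec):
    """Sort dicts by several keys, each with its own direction.

    spec is a list of (key_name, descending) pairs, most significant first.
    """
    out = list(hits)
    # Stable sort: apply the least significant key first so that
    # later (more significant) passes preserve its ordering among ties.
    for key_name, descending in reversed(spec):
        out.sort(key=itemgetter(key_name), reverse=descending)
    return out

hits = [
    {"docid": 1, "value0": "x", "relevance": 0.2, "value1": "k1"},
    {"docid": 2, "value0": "x", "relevance": 0.2, "value1": "k2"},
    {"docid": 3, "value0": "a", "relevance": 0.9, "value1": "k3"},
    {"docid": 4, "value0": "x", "relevance": 0.1, "value1": "k4"},
]
spec = [("value0", False), ("relevance", False), ("value1", True)]
ordered_ids = [h["docid"] for h in multi_sort(hits, spec)]
```

Here documents 1 and 2 tie on value 0 and relevance, and the unique value 1 decides between them, which is the role suggested for it above.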
> * The second thing is also something that would be easy to build into
> the client-software, but might be also a proper feature for xapian:
> Indexed terms can have positional information. It would be handy to
> regain the information on which (positional indexed) terms a hit
> occurred within the result, e.g. to generate previews that actually
> contain the requested tokens.
A "dynamic sample" feature would certainly be useful.
The positional information is only scanned for phrase and near queries,
and to the minimal extent possible to determine if they match or not.
So it's probably not worth trying to store this explicitly as you'll
need to scan the rest of the relevant positional information to find out
where all the matches are. Any positional information already read
should still be in the disk cache when you come to generate the dynamic
sample.
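For a rough idea of what generating such a sample on the client side involves, here is a sketch in plain Python. It assumes the client has the document text to hand and rescans it for the query terms rather than using any positional data from Xapian; the dynamic_sample function and its width parameter are hypothetical:

```python
import re

def dynamic_sample(text, query_terms, width=5):
    """Return a window of up to 2*width+1 words containing the most query terms."""
    words = text.split()
    wanted = {t.lower() for t in query_terms}
    # Word positions whose text (minus punctuation) matches a query term.
    hits = [i for i, w in enumerate(words)
            if re.sub(r"\W", "", w).lower() in wanted]
    size = 2 * width + 1
    if not hits:
        # No match found: fall back to the start of the document.
        return " ".join(words[:size])
    best_start, best_count = 0, -1
    for pos in hits:
        start = max(0, pos - width)
        count = sum(1 for h in hits if start <= h < start + size)
        if count > best_count:
            best_start, best_count = start, count
    return " ".join(words[best_start:best_start + size])

text = ("Xapian is a search library. It supports phrase queries "
        "and ranked retrieval of documents by relevance.")
sample = dynamic_sample(text, ["phrase", "relevance"], width=3)
```

This matches exact words only, so stemmed query terms would need the same normalisation applied to both sides.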
Cheers,
Olly