On Tue, Aug 08, 2006 at 12:06:17PM +0000, Chris Good wrote:
> We are using scriptindex and omega to match against a list of place
> names from a text file. The problem that we have is that we get a
> 100% match back for a partial hit, i.e. for the example query
> "centre"
> we get back hits for CHEQUERS CENTRE, EVERSLEY CENTRE, TOWN CENTRE -
> all with 100% relevance.
>
> What we would like to be able to do is to differentiate between an
> exact phrase match and a partial one, in this case if all those
> cases either didn't match or came back out as say 50% relevant then
> that would be fine. Is this possible with scriptindex or would we
> have to write our own indexer?

This is tricky, and more to do with querying than indexing. The
trouble is that the percentage is a normalised relevance, not an
absolute relevance. You can fetch the relevance and use that in some
way. For instance, with two documents:

  1 Cheques Centre
  2 Town Centre

searching on the term 'Centre' (I haven't done any stemming), I get
back two matches at 100%, but with relevance 0.095310179804324935. I
suspect you'll see something similar.
How you want to use that, I'm not sure.
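
As a rough illustration of the difference, here is a small Python sketch.
It is not Xapian's actual percentage formula: it assumes, for simplicity,
that the percentage is just the raw weight scaled so that the best match
gets 100. The functions normalise() and coverage_adjusted() are
hypothetical names made up for this example, and the second one is only
one possible way to mark down partial matches, by the fraction of the
place name's words that the query covers.

```python
def normalise(weights):
    """Scale raw weights so the best match scores 100.

    Assumption: a simplified stand-in for a normalised percentage,
    not Xapian's real calculation.
    """
    top = max(weights.values())
    return {doc: round(100 * w / top) for doc, w in weights.items()}


def coverage_adjusted(percent, query, name):
    """Scale a percentage by the fraction of the name's words in the query.

    Hypothetical post-processing step: an exact match keeps its score,
    a one-word hit on a two-word name is halved.
    """
    query_words = set(query.lower().split())
    name_words = name.lower().split()
    matched = sum(1 for word in name_words if word in query_words)
    return round(percent * matched / len(name_words))


# The two example documents and the raw weight reported for each.
names = {1: "Cheques Centre", 2: "Town Centre"}
raw = {1: 0.095310179804324935, 2: 0.095310179804324935}

# Equal raw weights both normalise to 100, however small they are.
percents = normalise(raw)

# Partial matches on "centre" are marked down after the fact.
adjusted = {
    doc: coverage_adjusted(percents[doc], "centre", names[doc])
    for doc in names
}

print(percents)
print(adjusted)
```

Under those assumptions both documents still normalise to 100%, and it is
only the extra post-processing step that separates a partial match from
an exact one.
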
James
--
/--------------------------------------------------------------------------\
James Aylett xapian.org
james@tartarus.org uncertaintydivision.org