Hi!
On 19.05.2008, at 22:46, Radu Spineanu wrote:
> Hi,
>
> Can ferret search for a combination of words and return the distance
> between them in a text?
It won't return the distance directly, but since Ferret stores term
positions it should be possible to work out the distance between
terms manually. You can also issue phrase queries that only return
hits where the terms are separated by at most n other terms. The
QueryParser API docs and the Ferret book have examples of this.
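To illustrate the manual approach, here is a minimal sketch in plain Ruby. It does not use Ferret's API at all: the helper names (token_positions, min_term_distance) are made up for this example, and the tokenizer is a naive stand-in for whatever analyzer you index with, so the positions it produces are only an approximation of what Ferret would store.

```ruby
# Naive stand-in for an analyzer: lowercase the text and split it into
# word tokens, recording the position of every occurrence of each term.
# (Assumption: a real Ferret analyzer may tokenize differently.)
def token_positions(text)
  positions = Hash.new { |hash, term| hash[term] = [] }
  text.downcase.scan(/\w+/).each_with_index do |term, pos|
    positions[term] << pos
  end
  positions
end

# Smallest number of tokens lying between any occurrence of term_a and
# any occurrence of term_b. Adjacent terms have distance 0.
# Returns nil if either term does not occur in the text.
def min_term_distance(text, term_a, term_b)
  positions = token_positions(text)
  a = positions.fetch(term_a.downcase, [])
  b = positions.fetch(term_b.downcase, [])
  return nil if a.empty? || b.empty?

  a.product(b).map { |pa, pb| (pa - pb).abs - 1 }.min
end

p min_term_distance("red quick brown fox", "red", "fox")
```

The distance is counted the same way as the "at most n other terms" rule above, so two adjacent terms are 0 apart.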
> If it exists is there a way you can improve on this by looking if
> they are separated by a certain character(like . for different
> sentences)?
Usually you don't index characters like '.' at all (they are removed
during analysis, when the text is split up into tokens), but if you
changed that so sentence endings end up in the index as a kind of
special term, this might be possible, too.
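A rough sketch of that idea, again in plain Ruby rather than as a real Ferret analyzer: the marker term "__eos__" and both helper names are hypothetical, and treating every '.', '!' or '?' as a sentence end is a simplification (abbreviations like "e.g." would be split incorrectly).

```ruby
# Hypothetical marker term that stands in for a sentence ending.
SENTENCE_END = "__eos__"

# Tokenize like a simple analyzer, but keep sentence-ending punctuation
# and replace each run of it with the marker term instead of dropping it.
def tokens_with_sentence_marks(text)
  text.downcase.scan(/\w+|[.!?]+/).map do |tok|
    tok.match?(/\A[.!?]+\z/) ? SENTENCE_END : tok
  end
end

# True if both terms occur together in at least one sentence, i.e. with
# no marker term between them.
def same_sentence?(text, term_a, term_b)
  tokens = tokens_with_sentence_marks(text)
  # Start a new group right after every marker term.
  sentences = tokens.slice_when { |prev, _next| prev == SENTENCE_END }.to_a
  sentences.any? do |sentence|
    sentence.include?(term_a.downcase) && sentence.include?(term_b.downcase)
  end
end

p same_sentence?("The red fox ran. The dog slept.", "red", "fox")
p same_sentence?("The red dog ran. The fox slept.", "red", "fox")
```

With the marker in the token stream, a positional check only has to make sure no marker lies between the two terms.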
I don't know your use case, but keep in mind that you can get the
effect of ranking terms that are closer together higher by chaining
phrase queries with different slop values and assigning them
different boosts:
("red fox")^15 OR ("red fox"~4)^10 OR ("red fox"~10)^5 OR ("red fox"~100)
This will boost the exact match the most, and assign lower boosts to
matches where the terms are further apart.
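If you want to generate such a query string for arbitrary phrases, something along these lines might do. It is only a sketch: proximity_query is a made-up helper, it just builds the string shown above and does not escape quotes or other special characters inside the phrase, and the resulting string would still have to be handed to your index's query parser.

```ruby
# Build a chained phrase query from [slop, boost] tiers.
# A slop of 0 (or nil) means an exact phrase; a nil boost adds no boost.
# Note: the phrase is inserted as-is, without any escaping.
def proximity_query(phrase, tiers)
  clauses = tiers.map do |slop, boost|
    clause = %("#{phrase}")
    clause += "~#{slop}" if slop && slop > 0
    clause = "(#{clause})"
    clause += "^#{boost}" if boost
    clause
  end
  clauses.join(" OR ")
end

# Tiers taken from the query above: tighter matches get larger boosts.
puts proximity_query("red fox", [[0, 15], [4, 10], [10, 5], [100, nil]])
```

Keeping the tiers in a plain array makes it easy to tune slop and boost values without touching the string building.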
Maybe something like this will already be a 'good enough' solution to
your problem?
cheers,
Jens
--
Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database