thr3ads.net - Xapian discuss - [Xapian-discuss] Spelling and term prefixes [Jun 2009]

Simon Roe

2009-Jun-11 21:01 UTC

[Xapian-discuss] Spelling and term prefixes

Hi,

According to http://xapian.org/docs/spelling.html, "Currently spelling
correction ignores prefixed terms.".

I'd like to know why this is, and if there is a workaround for it.

For example, if I index 2 distinct fields, with two prefixes ('title'
and 'body'), I'd like to be able to do (in Python):

query = qp.parse_query(query_string, DEFAULT_SEARCH_FLAGS, 'title')

And have spelling work on it.  From the end user's point of view, this
makes sense, but I understand it might be tricky tracking the
spellings for each prefix.

-- 
Help save the economy:
http://seriouschange.org.uk/

E: simon.roe at talusdesign.co.uk
M: 07742079314

Olly Betts

2009-Jun-12 01:08 UTC

[Xapian-discuss] Spelling and term prefixes

On Thu, Jun 11, 2009 at 10:01:52PM +0100, Simon Roe wrote:
> According to http://xapian.org/docs/spelling.html, "Currently spelling
> correction ignores prefixed terms.".
> 
> I'd like to know why this is, and if there is a workaround for it.

It wasn't totally obvious how it should operate, so we punted on it at
the time.  I guess someone needs to write a coherent spec for how this
should work and then implement it.

> For example, if I index 2 distinct fields, with two prefixes ('title'
> and 'body'), I'd like to be able to do (in Python):
> 
> query = qp.parse_query(query_string, DEFAULT_SEARCH_FLAGS, 'title')
> 
> And have spelling work on it.  From the end user's point of view, this
> makes sense, but I understand it might be tricky tracking the
> spellings for each prefix.

That's not hard actually, but you seem to be assuming that the spellings
should just be tracked separately per prefix.

In some cases, that doesn't make much sense - you'll probably get better
results by using the same dictionary for two free-text fields in the
same language.  Certainly if you're loading in a static dictionary, it's
really unhelpful if you are forced to load it once for each prefix you
want to use it for.

Conversely, for a field in a different language, or a field with a
distinctly different vocabulary (e.g. "author name"), then whether
tracking spelling dynamically or from a static dictionary you would want
a per-prefix dictionary.

Also, some fields may not be appropriate for spelling correction.

Cheers,
    Olly
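
To make the trade-offs in the reply above concrete, here is a minimal
sketch in plain Python of one way a per-field spelling policy could be
modelled: several prefixes can share one dictionary, a field with its own
vocabulary gets a separate one, and some fields opt out entirely. This is
not Xapian's API or design. The class PrefixSpelling, its methods, the
dictionary names and the 0.75 similarity cutoff are all assumptions made
for illustration, and difflib stands in for a real edit-distance matcher.

```python
import difflib


class PrefixSpelling:
    """Hypothetical helper: maps field prefixes to spelling dictionaries."""

    def __init__(self):
        self._dicts = {}    # dictionary name -> set of known words
        self._routing = {}  # field prefix -> dictionary name (None = no correction)

    def add_words(self, dict_name, words):
        # Words are loaded once per dictionary, not once per prefix.
        self._dicts.setdefault(dict_name, set()).update(w.lower() for w in words)

    def route(self, prefix, dict_name):
        # Several prefixes may point at the same dictionary name.
        self._routing[prefix] = dict_name

    def suggest(self, prefix, word):
        """Return a suggested correction, or None if there is none."""
        dict_name = self._routing.get(prefix)
        if dict_name is None:
            # Unrouted or explicitly disabled field: never correct.
            return None
        known = self._dicts.get(dict_name, set())
        word = word.lower()
        if word in known:
            return None  # already spelled correctly
        # The cutoff of 0.75 is an arbitrary illustrative threshold.
        matches = difflib.get_close_matches(word, sorted(known), n=1, cutoff=0.75)
        return matches[0] if matches else None


sp = PrefixSpelling()
sp.add_words("english", ["search", "engine", "spelling", "correction"])
sp.add_words("authors", ["betts", "roe"])

sp.route("title", "english")   # two free-text fields share one dictionary
sp.route("body", "english")
sp.route("author", "authors")  # distinct vocabulary gets its own dictionary
sp.route("id", None)           # not appropriate for spelling correction

print(sp.suggest("title", "serch"))   # search
print(sp.suggest("author", "bets"))   # betts
print(sp.suggest("id", "serch"))      # None
```

Keeping the routing table separate from the dictionaries is what avoids
loading a static word list once per prefix; whether that is the right
shape for Xapian itself is exactly the open spec question raised above.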
