thr3ads.net - Xapian devel - [Xapian-devel] [GSOC 2011] Improving Spell Checker [Mar 2011]


Prasad Prabhu

2011-Mar-23 23:42 UTC

[Xapian-devel] [GSOC 2011] Improving Spell Checker

Hey Xapian guyz,

This is my list of ideas, along with an overview of the analysis I have done
on some other ideas that I felt should be discussed. Please give me comments
and suggestions for improving it before the application process starts.
Here is the link: Idea Log <http://goo.gl/GjCcA>
Thank you,


Prasad Prabhu
Masters Student
SUNY STONY BROOK
NY

Follow Me on:  Twitter <http://www.twitter.com/prap19>

* If Microsoft ever does applications for Linux it means I've won.
Linus Torvalds*

Olly Betts

2011-Mar-24 14:35 UTC


On Wed, Mar 23, 2011 at 07:42:08PM -0400, Prasad Prabhu wrote:
> This is my list of ideas, along with an overview of the analysis I have done
> on some other ideas that I felt should be discussed. Please give me comments
> and suggestions for improving it before the application process starts.
> Here is the link: Idea Log <http://goo.gl/GjCcA>

I think trying to actually parse queries as sentences isn't likely to
work well.  People usually search for a few words without the proper
grammar, or for a sentence fragment.  So for being context sensitive,
I think a statistical approach is more likely to work (e.g. something
like tracking how likely is this word to appear near that one, and
then comparing that for words within edit distance X of the word we
are considering for correction).
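
The statistical approach described above could be sketched roughly as
follows. This is only an illustration in Python, not Xapian code: the
function names (edit_distance, cooccurrence_counts, suggest), the tiny
corpus, the window size and the scoring rule are all assumptions made for
the example.

```python
from collections import Counter


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def cooccurrence_counts(documents, window=2):
    """Count how often two words appear within `window` words of each other."""
    counts = Counter()
    for doc in documents:
        words = doc.lower().split()
        for i, word in enumerate(words):
            for other in words[i + 1:i + 1 + window]:
                counts[frozenset((word, other))] += 1
    return counts


def suggest(query, vocabulary, counts, max_distance=2):
    """Replace unknown query words with the nearby word that fits the context.

    Candidates are vocabulary words within `max_distance` edits. They are
    ranked by co-occurrence with the other query words, and edit distance
    is only used to break ties.
    """
    words = query.lower().split()
    corrected = []
    for i, word in enumerate(words):
        if word in vocabulary:
            corrected.append(word)
            continue
        context = words[:i] + words[i + 1:]
        candidates = [v for v in vocabulary
                      if edit_distance(word, v) <= max_distance]
        if not candidates:
            corrected.append(word)
            continue

        def score(candidate):
            near = sum(counts[frozenset((candidate, c))] for c in context)
            return (near, -edit_distance(word, candidate))

        corrected.append(max(candidates, key=score))
    return " ".join(corrected)


# Hypothetical toy corpus, purely for demonstration.
documents = [
    "peace treaty signed",
    "a piece of cake",
    "piece of cake recipe",
    "world peace talks",
]
counts = cooccurrence_counts(documents)
vocabulary = {w for doc in documents for w in doc.lower().split()}

print(suggest("peice of cake", vocabulary, counts))  # prints "piece of cake"
```

In this toy data "peace" is one edit away from "peice" and "piece" is two,
so edit distance alone would pick "peace". The co-occurrence counts with
"of" and "cake" are what tip the choice to "piece".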

I'm not clear how stemming helps here - perhaps you could elaborate
on how it would be used?

And soundex is really a non-starter.  It's only intended to be used
on surnames common in the USA, and it's not even much good for those.
Metaphone (and metaphone 2) are better alternatives.
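
To illustrate one reason Soundex fits spelling correction poorly, here is a
minimal sketch of classic American Soundex in Python. It is not Xapian
code, and the names CODES and soundex are chosen for this example. The code
always keeps the first letter of the word, so a misspelling that changes
the first letter can never get the same code as the intended word.

```python
# Letter groups of classic American Soundex. Vowels, H, W and Y get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of `word` ("" if it has no A-Z)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (first + "".join(digits) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))     # prints "R163 R163"
print(soundex("phonetic"), soundex("fonetic"))  # prints "P532 F532"
```

"Robert" and "Rupert" collapse to the same code, while "fonetic" does not
match "phonetic" because only the first letters differ.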

Cheers,
    Olly
