We decided to use the xapian-devel list for GSoC discussions, since GSoC
generates a lot of extra list traffic and we don't want to fill up the
mailboxes of everyone on xapian-discuss. We've let them know that they
can subscribe to xapian-devel if they want to follow GSoC, but this way
they can easily choose not to.
So I'm replying to xapian-devel - please keep replies there too.
On Tue, Mar 22, 2011 at 08:01:37PM -0400, Prasad Prabhu wrote:
> I am interested in participating in open source project through Google
> Summer of Code 2011. I went through the idea list and I am keen on
> *improving spelling correction* as I have projects in algorithms and
> linguistics. Can anyone suggest me some reading material?
You could have a look at the current code. Here's where we find a
spelling suggestion for a given word:
http://trac.xapian.org/browser/trunk/xapian-core/api/omdatabase.cc#L535
This is the edit distance algorithm we currently use:
http://berghel.net/publications/asm/asm.php
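
If you haven't met edit distance before, here's a rough sketch of the
textbook dynamic-programming Levenshtein distance in C++ as a starting
point. To be clear, this is not the algorithm from the link above, nor
the code in xapian-core, just a simple baseline to compare against. The
function name levenshtein is made up for illustration, and it compares
bytes, so it isn't UTF-8 aware.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Xapian): textbook Levenshtein distance,
// i.e. the minimum number of single-byte insertions, deletions and
// substitutions needed to turn a into b.
std::size_t levenshtein(const std::string& a, const std::string& b) {
    // Only two rows of the DP table are kept. At the start of iteration i,
    // prev[j] is the distance between the first i-1 bytes of a and the
    // first j bytes of b.
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;  // Deleting the first i bytes of a gives the empty string.
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            std::size_t del = prev[j] + 1;     // Drop a[i-1].
            std::size_t ins = cur[j - 1] + 1;  // Insert b[j-1].
            cur[j] = std::min({subst, del, ins});
        }
        std::swap(prev, cur);
    }
    // After the final swap, prev holds the last completed row.
    return prev[b.size()];
}
```

This takes time proportional to the product of the two lengths for every
candidate word, which is why faster approaches are worth reading about.
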
Here's an interesting algorithm someone (Dan I think) pointed out
recently, which might be useful:
http://blog.notdot.net/2010/07/Damn-Cool-Algorithms-Levenshtein-Automata
Other than that, I'd suggest trying to find resources on the internet.
Cheers,
Olly