Hi everyone,
I have been working on the Paice-Husk Stemmer, which is a Bite Size Project
for Xapian, and I have written both a C++ and a Snowball version of it.
I read the algorithm, and picked the rules from here:
http://www.comp.lancs.ac.uk/computing/research/stemming/paice/descript.htm
The C++ code reads the rules from a file and generates the stem of a given
word, whereas the Snowball version has the rules hard-coded. This is because
file handling is not possible in Snowball, so I have written a C++ program
that generates the Snowball code (Code-ception :P).
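
For anyone who has not looked at the rule format, here is a minimal sketch of
how a single rule can be parsed and applied. It is not the code from my
repository: the names Rule, parse_rule, apply_rule and acceptable are made up
for this example, and the rule layout (reversed ending, optional '*' intact
flag, one digit for the number of letters to remove, optional letters to
append, then '>' to continue or '.' to stop) and the acceptability check are
my reading of the description linked above, so treat them as assumptions.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// One rule, e.g. "nois4j>": ending "sion", remove 4, append "j", continue.
struct Rule {
    std::string ending;        // suffix to match, stored un-reversed
    bool intact_only = false;  // '*': apply only to a word not yet stemmed
    std::size_t remove = 0;    // number of trailing letters to delete
    std::string append;        // letters added after deletion (may be empty)
    bool keep_going = false;   // '>' = keep stemming, '.' = stop
};

static bool is_letter(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

// Parses one rule string into `out`; returns false if it is malformed.
bool parse_rule(const std::string& text, Rule& out) {
    Rule r;
    std::size_t i = 0;
    while (i < text.size() && is_letter(text[i])) ++i;
    if (i == 0) return false;  // a rule needs an ending
    r.ending = text.substr(0, i);
    std::reverse(r.ending.begin(), r.ending.end());
    if (i < text.size() && text[i] == '*') { r.intact_only = true; ++i; }
    if (i >= text.size() || !std::isdigit(static_cast<unsigned char>(text[i])))
        return false;
    r.remove = static_cast<std::size_t>(text[i] - '0');
    ++i;
    const std::size_t start = i;
    while (i < text.size() && is_letter(text[i])) ++i;
    r.append = text.substr(start, i - start);
    // Exactly one terminator character must remain.
    if (i + 1 != text.size() || (text[i] != '>' && text[i] != '.')) return false;
    r.keep_going = (text[i] == '>');
    out = r;
    return true;
}

// Assumed acceptability check: a stem starting with a vowel needs at least
// 2 letters; otherwise it needs at least 3 letters and a vowel or 'y'.
bool acceptable(const std::string& stem) {
    if (stem.empty()) return false;
    if (std::string("aeiou").find(stem[0]) != std::string::npos)
        return stem.size() >= 2;
    return stem.size() >= 3 &&
           stem.find_first_of("aeiouy") != std::string::npos;
}

// Applies `r` to `word` in place; returns true if the word was changed.
// `intact` is true while no rule has modified the word yet.
bool apply_rule(const Rule& r, std::string& word, bool intact) {
    if (r.intact_only && !intact) return false;
    if (word.size() < r.ending.size() || r.remove > word.size()) return false;
    if (word.compare(word.size() - r.ending.size(), r.ending.size(),
                     r.ending) != 0)
        return false;
    // Assumption: acceptability is checked on the stem after appending.
    const std::string candidate =
        word.substr(0, word.size() - r.remove) + r.append;
    if (!acceptable(candidate)) return false;
    word = candidate;
    return true;
}
```

With this sketch, parsing "nois4j>" and applying it to "provision" gives
"provij", and the rule's keep_going flag tells the caller to look for another
matching rule. A full stemmer would loop over the rule table until a rule
ending in '.' fires or no rule matches.
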
Since the algorithm has many steps, my code might contain some mistakes.
This is where they are located: https://github.com/satwantrana/codes
I will be integrating this into my Xapian fork, and will release a patch soon.
Meanwhile, if someone finds a bug/mistake in this, please respond.
Also, I hope this implementation helps my GSoC application.
Thanks,
Satwant Rana