Olly Betts
2014-Dec-01 04:41 UTC
[Xapian-devel] Adding Support for Krovetz Stemmer Algo in Xapian
> On 30 Nov 2014, at 17:51, Abhishek Singh Kushwah <abhishek18kushwah at gmail.com> wrote:
>
> > Two of the implementation of algorithms has already been rejected
> > previously due to licenses both being the implementation of porter
> > but our xapian use implementation in snowball which i assume is
> > under GPL.

The only cases I can think you might be referring to are two different submissions of patches based on Andy Stark's implementation of the Paice/Husk stemmer (not the Porter stemmer).

The problem there is that this implementation has no explicit licence, which means we simply can't use it:

http://www.gnu.org/licenses/license-list.html#NoLicense

Cheers,
    Olly
Abhishek Singh Kushwah
2014-Dec-01 07:33 UTC
[Xapian-devel] Adding Support for Krovetz Stemmer Algo in Xapian
Yes, you're right, it was the Paice/Husk stemmer implementation (Andy Stark's).

Now, if I code a Krovetz implementation from scratch, then possibly I have to use the API and backend calls from the Xapian API, rather than making it an independent module similar to stem.h in namespace Xapian.

Regards,
Abhishek Singh Kushwah
Olly Betts
2014-Dec-03 10:56 UTC
[Xapian-devel] Adding Support for Krovetz Stemmer Algo in Xapian
On Mon, Dec 01, 2014 at 01:03:13PM +0530, Abhishek Singh Kushwah wrote:
> Now that if i code a Krovetz Implementation from scratch, than
> possibly i have to use the api and backend calls from xapian api
> rather than making it as independent module and similar to stem.h in
> namespace xapian.

I don't think the backends would be useful for a stemmer. Even for a dictionary-based stemmer you'd probably want to have the dictionary in memory while indexing, since you have to look up every word in every document.

The only part of the Xapian API you're likely to find useful is the Unicode support (and if you need such functionality it would be better to use this than duplicate it).

Cheers,
    Olly
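
To illustrate the point about keeping the dictionary in memory, here is a minimal sketch of a dictionary-backed stemmer as a standalone C++ class. It is not the Krovetz algorithm and not Xapian code: the class name DictStemmer, the suffix list and the "strip a suffix, accept the result only if it is a dictionary word" rule are all assumptions made for illustration. It handles ASCII only; anything needing case folding or UTF-8 handling would be better served by Xapian's Unicode support than by code of its own.

```cpp
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical toy stemmer, for illustration only. This is NOT the real
// Krovetz algorithm: it just shows a per-word lookup against a dictionary
// that is held entirely in memory.
class DictStemmer {
    // Loaded once, before indexing starts; every word is looked up here.
    std::unordered_set<std::string> dict_;

  public:
    explicit DictStemmer(std::unordered_set<std::string> dict)
        : dict_(std::move(dict)) {}

    // Return the stem of `word`, or `word` unchanged if no rule applies.
    std::string operator()(const std::string& word) const {
        // A word already in the dictionary is its own stem.
        if (dict_.count(word)) return word;

        // Assumed suffix list, longest first. The real algorithm has far
        // more careful rules than this.
        static const char* const suffixes[] = {"ing", "ed", "es", "s"};
        for (const char* s : suffixes) {
            const std::string suffix(s);
            if (word.size() > suffix.size() + 1 &&
                word.compare(word.size() - suffix.size(), suffix.size(),
                             suffix) == 0) {
                std::string stem =
                    word.substr(0, word.size() - suffix.size());
                // Accept the candidate only if it is a known word.
                if (dict_.count(stem)) return stem;
            }
        }
        return word;
    }
};
```

With a dictionary containing "walk", "index" and "search", calling the object on "walking", "indexes" and "searched" gives back those three dictionary words, and an unknown word such as "xapian" is returned unchanged. Each lookup is a hash-table probe on average, which is why an in-memory set is a better fit here than a database backend.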