I've noticed some strange results from the stemmer in the Ruby port:

irb(main):003:0> @stem.stem_word("anybody")
=> "anybodi"
irb(main):004:0> @stem.stem_word("swimmingly")
=> "swim"
irb(main):005:0> @stem.stem_word("fiercely")
=> "fierc"
irb(main):006:0> @stem.stem_word("fraudulently")
=> "fraudul"

Is it supposed to behave like this, or is this a bug in my Ruby wrapper?

Best,
Paul

-- 
Paul Legato, Senior Software Engineer
Networked Knowledge Systems
P.O. Box 20772 Tampa, FL. 33622-0772
(813)594-0064 Voice (813)594-0045 FAX
plegato at nks.net
This email bound by the following: http://www.nks.net/email_disclaimer.html
On Thu, Apr 20, 2006 at 08:45:11PM -0300, Paul Legato wrote:
> I've noticed some strange results from the stemmer in the Ruby port:
>
> irb(main):003:0> @stem.stem_word("anybody")
> => "anybodi"
> irb(main):004:0> @stem.stem_word("swimmingly")
> => "swim"
> irb(main):005:0> @stem.stem_word("fiercely")
> => "fierc"
> irb(main):006:0> @stem.stem_word("fraudulently")
> => "fraudul"
>
> Is it supposed to behave like this, or is this a bug in my Ruby wrapper?

Those are the results that are expected. The stem isn't necessarily a word itself, though it generally looks mostly like one.

In this case "swimmingly" -> "swim" is an example of overstemming, since "swim" also stems to "swim" and the two words don't really share anything in meaning (although I imagine that they share a linguistic root, but that's not what we care about here).

Cheers,
Olly
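To make the overstemming point concrete, here is a minimal Ruby sketch that groups words by their stem and reports any stem shared by more than one word. It does not call the stemmer: the `STEMS` table simply hard-codes the results quoted in this thread, and the `conflations` helper is a hypothetical name introduced only for illustration.

```ruby
# Stems as reported in this thread; hard-coded here, not computed.
STEMS = {
  "anybody"      => "anybodi",
  "swimmingly"   => "swim",
  "swim"         => "swim",
  "fiercely"     => "fierc",
  "fraudulently" => "fraudul"
}.freeze

# Hypothetical helper: returns a hash of stem => words for every stem
# that more than one input word maps to.
def conflations(stems)
  stems.group_by { |_word, stem| stem }
       .transform_values { |pairs| pairs.map(&:first) }
       .select { |_stem, words| words.size > 1 }
end

conflations(STEMS).each do |stem, words|
  puts "#{stem}: #{words.join(', ')}"
end
```

Only "swim" is shared by two words in this table. Whether a shared stem counts as overstemming still depends on whether the words are related in meaning, which a check like this cannot decide.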