On Tuesday, I replaced the romanian1 and romanian2 stemmers in
Xapian-core with Martin's new Romanian stemmer. At the same time, I
updated the stemming test data (by regenerating the output file using
Snowball's "stemwords" utility), and I clearly remember re-running the
test suite and checking that all tests passed.
Now, when I run make check, stemtest fails with the Romanian stemmer on
the word "acela?I". According to the output file generated by Snowball's
stemwords utility, this should stem to "acel", but Xapian stems it to
"acela?i". The only change to the stemming algorithms that I can see
since then is a change to the Snowball code generator made on Wednesday
morning by Olly, but reverting this change doesn't seem to fix the
problem. So, does anyone else see this error, or is it caused by
something that has changed on my local machine (possibly a character
set issue, or something like that)?
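
To illustrate the character set theory, here is a minimal Python sketch.
It assumes (I have not confirmed this) that the word behind "acela?i" is
"acelaşi", written with U+015F, and that the two encodings involved are
UTF-8 and ISO-8859-2. Both are guesses for illustration only. The point
is that the same word is a different byte sequence in each encoding, so a
stemmer built for one encoding would not recognise the suffix in the
other.

```python
# Assumption: the test word is "acelaşi" (s with cedilla, U+015F).
word = "acela\u015fi"

# The same word as bytes in two candidate encodings (both assumed).
utf8_bytes = word.encode("utf-8")
latin2_bytes = word.encode("iso8859_2")

print("UTF-8:     ", utf8_bytes.hex(" "))
print("ISO-8859-2:", latin2_bytes.hex(" "))

# Reading the ISO-8859-2 bytes as UTF-8 fails on the accented letter;
# with errors="replace" it becomes U+FFFD, which many terminals show
# as "?".
mangled = latin2_bytes.decode("utf-8", errors="replace")
print("Misread:   ", ascii(mangled))
```

If the test data and the stemmer disagree about the encoding in this way,
that would be consistent with the word coming back unstemmed.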
For reference, the output from stemtest is as follows:
$ ./tests/runtest tests/stemtest stemdict -l romanian -v
Running test 'tests/stemtest stemdict -l romanian -v' under valgrind
The random seed is 42
Please report the seed when reporting a test failure.
Running tests with romanian stemmer...
Running test: stemdict...
Testing romanian with fixed dictionary...
/home/richard/private/Working/xapian/xapian-core/tests/stemtest.cc:146: ((stem) == (expect))
Expected `stem' and `expect' to be equal: were acela?i and acel
FAILED
/home/richard/private/Working/xapian/build/xapian-core/tests/.libs/lt-stemtest completed test run: 0 tests passed, 1 failed.
--
Richard