On 3/9/07, Linus <lajanus at o2.pl> wrote:
> I have a problem: I use UTF-8 all over a Rails site, but the search fails
> to find characters with accents...
>
> I tried to debug it and ran the unit tests for Ferret. Could these
> failures cause problems?
>
> <pre>
> /usr/lib/ruby/gems/1.8/gems/ferret-0.11.3/test/ ruby test_all.rb
> Loading once
> Loaded suite test_all
> Started
> ................F.............................................FF...........................................................................................FF...FF
> Finished in 4.729505 seconds.
>
> 1) Failure:
> test_custom_filter(CustomAnalyzerTest)
> [./unit/../unit/index/../../unit/store/../../unit/analysis/tc_analyzer.rb:516]:
> <token["dbalm??n at gmail.com":0:18:1]> expected but was
> <token["BAD_DATA":0:18:1]>.
>
> 2) Failure:
> test_letter_analyzer(LetterAnalyzerTest)
> [./unit/../unit/index/../../unit/store/../../unit/analysis/tc_analyzer.rb:100]:
> <token["??
> G?":55:62:1]> expected but was
> <token["??
> G":55:60:1]>.
>
> 3) Failure:
> test_letter_tokenizer(LetterTokenizerTest)
> [./unit/../unit/index/../../unit/store/../../unit/analysis/tc_token_stream.rb:73]:
> <token["??
> G?":55:62:1]> expected but was
> <token["??
> G":55:60:1]>.
>
> 4) Failure:
> test_standard_analyzer(StandardAnalyzerTest)
> [./unit/../unit/index/../../unit/store/../../unit/analysis/tc_analyzer.rb:275]:
> <token["dbalm??n at gmail.com":0:18:1]> expected but was
> <token["BAD_DATA":0:18:1]>.
>
> 5) Failure:
> test_standard_tokenizer(StandardTokenizerTest)
> [./unit/../unit/index/../../unit/
> <token["??
> G?":117:124:1]> expected but was
> <token["??
> G":117:122:1]>.
>
> 6) Failure:
> test_white_space_analyzer(WhiteSpaceAnalyzerTest)
> [./unit/../unit/index/../../un
> <token["?? ":55:86:1]> expected but was
> <token["??G????????????????
> G":55:60:1]>.
>
> 7) Failure:
> test_whitespace_tokenizer(WhiteSpaceTokenizerTest)
> [./unit/../unit/index/../../u
> <token["?? ":55:86:1]> expected but was
> <token["??G????????????????
> G":55:60:1]>.
>
> 162 tests, 12082 assertions, 7 failures, 0 errors
> </pre>
You need to have a UTF-8 locale installed; otherwise Ferret doesn't know
how to deal with UTF-8 characters. Try typing locale at the command line
to see what locale you have installed.
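
If you would rather check this from Ruby (for example from the Rails
process itself, whose environment may differ from your shell), here is a
minimal sketch. The helper names effective_ctype_locale and utf8_locale?
are made up for illustration and are not part of Ferret. It assumes the
usual POSIX precedence of LC_ALL, then LC_CTYPE, then LANG, and it only
checks that the environment names a UTF-8 locale, not that the locale is
actually installed (locale -a lists the installed ones).

```ruby
# Hypothetical helpers, not part of Ferret: inspect the locale variables
# that a process would inherit.

# Return the locale name that governs character handling, using the usual
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG. Empty values are
# treated as unset. Returns nil when none of them is set.
def effective_ctype_locale(env = ENV)
  %w[LC_ALL LC_CTYPE LANG].each do |name|
    value = env[name]
    return value unless value.nil? || value.empty?
  end
  nil
end

# True when the locale name advertises a UTF-8 codeset, such as
# "en_US.UTF-8" or "pl_PL.utf8" (optionally followed by an @modifier).
def utf8_locale?(env = ENV)
  locale = effective_ctype_locale(env)
  return false if locale.nil?
  !!(locale =~ /\.utf-?8(@.*)?\z/i)
end

puts "effective locale: #{effective_ctype_locale.inspect}"
puts "UTF-8 locale set: #{utf8_locale?}"
```

If the second line prints false, the process is probably running under
the C/POSIX locale or a non-UTF-8 codeset, which would be consistent with
the failures above.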
Cheers,
Dave
--
Dave Balmain
http://www.davebalmain.com/