hi all!

after hours of trying to find contents with German umlauts i stumbled upon a post where someone said ferret won't work with UTF-8 on Windows???

is that really true?

do i really have to iconv everything to ISO-8859-15 before indexing, and do the same with the query, to get it working?

i'm running ruby 1.8.5, ferret 0.10.9-mswin32, and rails 1.2.2, and just reinstalled aaf from svn yesterday (can't find any version info, and forgot to note the svn revision)

*banging*head*against*wall*
On 2/26/07, neongrau __ <neongrau at gmail.com> wrote:
> hi all!
>
> after hours of trying to find contents with german umlauts i stumbled
> upon a post where someone said ferret won't work with utf-8 on
> windows???
>
> is that really true?
>
> do i really have to iconv everything to iso-8859-15 before indexing and
> do the same with the query to get it working?

The StandardAnalyzer uses your current locale settings to determine what a letter is when tokenizing your data. As far as I was able to determine, Windows doesn't have support for UTF-8 locales in C and the win32 libraries. (I'd love for someone to correct me on this.)

What you can do is write a custom analyzer, and UTF-8 should then be fine. There has been plenty of discussion on creating your own analyzer in the past:

http://www.ruby-forum.com/search?query=ferret+analyzer&submit=Search

You can also look in the unit tests.

--
Dave Balmain
http://www.davebalmain.com/
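
To illustrate the kind of locale-independent tokenizing a custom analyzer would need to do, here is a minimal plain-Ruby sketch. `utf8_words` and `WORD_PATTERN` are hypothetical names, not part of Ferret, and wiring the result into Ferret's analyzer classes is not shown. It assumes a Ruby whose regexp engine understands Unicode properties such as `\p{L}` (1.9 or later), so it would need adapting for 1.8.5.

```ruby
# Hypothetical helper (not part of Ferret): split UTF-8 text into words
# without asking the C locale what counts as a letter.

# One or more Unicode letters or digits. \p{L} and \p{N} are Unicode
# character properties, so umlauts count as letters on any platform.
WORD_PATTERN = /[\p{L}\p{N}]+/

def utf8_words(text)
  # String#scan returns every non-overlapping match as an array of strings.
  text.scan(WORD_PATTERN)
end

puts utf8_words("Grüße aus München, 2007!").inspect
```

The same function would have to be applied to both the indexed text and the query string, so that both sides are split into identical terms.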