Hello,

I'm trying to set up the Ferret search engine. It already works for searching English words and phrases, but it doesn't work for UTF-8 characters.

Here is what I have tried so far:

1) I added the following lines to environment.rb:

    ENV['LANG'] = 'de_DE.UTF-8@euro'
    ENV['LC_TIME'] = 'C'
    require 'acts_as_ferret'

2) Using the scaffold, I added a few posts. My model is indexed this way:

    acts_as_ferret :fields => [ :name, :post ]

Searching for English phrases works well, but for German text I get an empty result array.

I don't know what else to try, so I'm asking the people who do this professionally.

Thanks for your replies!
--
Posted via http://www.ruby-forum.com/.
Hey,

I just set the following:

    ENV['LC_CTYPE'] = 'en_US.UTF-8'
    Ferret.locale = "en_US.UTF-8"

See http://bugs.omdb.org/browser/trunk/config/environment.rb

Searching for UTF-8 works great. I've indexed characters in German, English, Asian languages, Hebrew and others.

Ben

On 2007-09-02, at 12:03 PM, Igor K. wrote:
> I'm trying to setup ferret search engine (what i did successfully for
> searching english phrases, words)
>
> But it doesn't work for UTF-8 symbols!!!
Benjamin Krause wrote:
> I just set the following:
>
>     ENV['LC_CTYPE'] = 'en_US.UTF-8'
>     Ferret.locale = "en_US.UTF-8"
>
> Searching for UTF-8 works great. I've indexed characters
> in German, English, Asian languages, Hebrew and others.

Thank you for your reply, but even this doesn't help. Please check my sources in the attachment. There are two controllers: add (for adding new posts) and search.

Thanks

Attachments:
http://www.ruby-forum.com/attachment/212/ferrettest2.rar
On Sun, Sep 02, 2007 at 12:03:59PM +0200, Igor K. wrote:
> 1) i add to enviroment.rb following lines:
>     ENV['LANG'] = 'de_DE.UTF-8@euro'
>     ENV['LC_TIME'] = 'C'
>     require 'acts_as_ferret'
> 2) next i using scaffold added few posts (my model, that i index this way):
>     acts_as_ferret :fields => [ :name, :post ]
>
> English phrases searching well, but for german it get empty result
> array.

Does this happen when searching via a web form, or in unit tests? Maybe the non-ASCII characters in your queries get garbled before Ferret even gets to see them? The same applies to the content you save.

Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
kraemer at webit.de | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
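One way to test whether a query or a saved value was garbled before it reached Ferret is to look at its raw bytes. The following is only a minimal sketch, assuming Ruby 1.9 or later (where strings carry an encoding); `utf8_report` is a hypothetical helper and not part of Ferret or acts_as_ferret.

```ruby
# Hypothetical helper: reports whether the raw bytes of a string form
# valid UTF-8, plus the byte count and a hex dump for manual inspection.
# Assumes Ruby 1.9+ (String#force_encoding, String#valid_encoding?).
def utf8_report(text)
  bytes   = text.dup.force_encoding(Encoding::BINARY)
  as_utf8 = bytes.dup.force_encoding(Encoding::UTF_8)
  {
    :valid_utf8  => as_utf8.valid_encoding?,
    :byte_length => bytes.bytesize,
    :hex         => bytes.unpack("H*").first
  }
end

# "ü" is two bytes (c3 bc) in UTF-8 but a single byte (fc) in Latin-1.
puts utf8_report("M\u00FCller").inspect
```

Calling this on the search parameter in the controller and on the value read back from the database would show whether both are really UTF-8; a lone `fc` byte for "ü" would point to Latin-1 data.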
Jens Kraemer wrote:
> Does this apply for searching via a web form, or in unit tests? Maybe
> non-ascii characters in your queries get garbled before ferret even gets
> to see them?

No, I use UTF-8 in the database, the controllers and environment.rb.
When I search with

    Post.find(:all, :conditions => ['name like ?', 'SOME-UTF8-TEXT'])

the records are found.

Has anybody looked at my test project in the attachment?
On Tue, Sep 04, 2007 at 11:28:11AM +0200, Igor K. wrote:
> No, i use UTF8 in database, controllers, environment.rb
> When i do search like Post.find(:all, :conditions => ['name like ?',
> 'SOME-UTF8-TEXT']) it founds.
>
> Have anybody looked at my test project in attachment?

Yes, I did. After changing the line that sets ENV['LANG'] to 'en_US.UTF-8' in config/environment.rb (because I don't have the de_DE.UTF-8@euro locale installed on my system), it works perfectly.

Before that it didn't, and I couldn't even enter German umlauts on the Rails console.

So you should make sure the locale you set LANG to also exists on your system. On Debian/Ubuntu, 'dpkg-reconfigure locales' can be used to check which locales you have and to enable more if you like.

cheers,
Jens
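The advice above (the locale named in LANG must exist on the system) can be checked from Ruby on a Unix-like machine. This is a rough sketch under some assumptions: `locale -a` is available, and it may spell the codeset differently from the name you set (for example `en_US.utf8` versus `en_US.UTF-8`), so names are normalized before comparing. `normalize_locale` and `locale_installed?` are hypothetical helpers.

```ruby
# Hypothetical helpers: compare a wanted locale name against a list of
# installed ones, ignoring case and the dash in "UTF-8".
def normalize_locale(name)
  name.downcase.delete("-")
end

def locale_installed?(wanted, installed)
  installed.map { |l| normalize_locale(l) }.include?(normalize_locale(wanted))
end

# On Unix-like systems the list can be read from `locale -a`.
# If the command is missing, fall back to an empty list.
installed = begin
  `locale -a 2>/dev/null`.split("\n")
rescue StandardError
  []
end

%w[en_US.UTF-8 de_DE.UTF-8@euro].each do |name|
  puts "#{name}: #{locale_installed?(name, installed) ? 'installed' : 'not found'}"
end
```

This does not apply to Windows, where locale names follow a different scheme.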
> So you should make sure the locale you set LANG to also exists on your
> system. 'dpkg-reconfigure locales' can be used on Debian/Ubuntu to check
> which locales you have and enable more if you like.

Thank you for your replies,

I will check it!

Maybe it depends on the operating system? I have Windows XP Home SP2, Russian edition.
Where can I get a list of locale names for other countries (like 'en_US.UTF-8' or 'de_DE.UTF-8')?

Thanks
On Tue, Sep 04, 2007 at 04:40:15PM +0200, Igor K. wrote:
> Maybe it depends on operating system? I have Win XP Home SP2 russian.
> Where i can get a list of locations for other countries ('en_US.UTF-8',
> de_DE.UTF-8)???

IMHO you should be in a UTF-8 environment by default then. However, I don't have a Windows machine to check this. Maybe for starters just start irb and have a look at ENV['LANG']?

Jens
> imho you should be in a utf8 environment by default then - however I
> don't have a windows to check this. Maybe for starters just start irb
> and have a look at ENV['LANG'] ?

Hello,

Thanks for the replies :) You are really helping me.

I have tried changing:
- the Windows locale to Russia
- environment.rb:

      ENV['LC_CTYPE'] = 'ru_RU.UTF-8'
      Ferret.locale = "ru_RU.UTF-8"

- and, what was important, the database locale from UTF8-unicode to UTF8-general.

But, as I expected, this was not the end of the problems:
- The data is indexed incorrectly; for example, only some of the data ends up in the index.
- Some objects can be found and some cannot (the Russian ones). When I search for '*' I get all objects, but with '**' I get only part of them.

Which versions of ferret-server and acts_as_ferret, and which locale in the MySQL database, do you have?

Thanks
On Wed, Sep 05, 2007 at 08:57:43AM +0200, Igor K. wrote:
> But as i espect it was no finish of problems:
> - because it index data incorrect, for example some data came to index
> db. The other problem is when - some objects it can find and some
> not (russian objects),

Have you checked with the Ferret browser to make sure it really indexed incorrect values? Have you rebuilt your index after changing these locale settings?

> when i search by '*' i can get all objects, but with '**' i get only
> part of them.

I'm not sure what Ferret does with a query like '**'...

> what version of ferret-server, acts-as-ferret, locale in MySQL database
> do you have?

I used the aaf from inside your app when checking it out, and ferret 0.11.4.

Here's a small checklist I use for making sure everything is UTF-8:

HTTP/HTML
  content-type delivered by the web server
    should be 'Content-Type: text/html; charset=utf-8'
  html content-type meta tag
    should be '<meta http-equiv="content-type" content="text/html;charset=utf-8" />'

MySQL settings:
  In your application's console, execute the following:

    r = Post.connection.execute "SHOW VARIABLES LIKE 'character%'"
    r.each { |row| puts row }

  This should result in something like this:

    character_set_client
    utf8
    character_set_connection
    utf8
    character_set_database
    utf8

  If your output differs, try the following:

  - Set 'encoding: utf8' in environment.rb (this affects the character_set_connection and character_set_client values).

  - Set 'default-character-set=utf8' in the mysqld section of the MySQL configuration (/etc/mysql/my.cnf on Linux). After a restart of the server, newly created databases will be utf8 by default, and new tables in these databases will inherit this setting. Maybe it's possible to change the character set of existing databases/tables, too, but then your data will have to be converted as well. The per-database setting IMHO is only a default applied to new tables.

With all these settings in place, everything should be fine on the UTF-8 front :-)

Cheers,
Jens
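The MySQL part of the checklist can be turned into a small check on the rows returned by SHOW VARIABLES. This is only a sketch: `non_utf8_settings` is a hypothetical helper, it expects each row as a `[name, value]` pair, and the sample rows below are illustrative data rather than output from a real server.

```ruby
# The three variables the checklist expects to be utf8.
CHECKED_VARIABLES = %w[
  character_set_client
  character_set_connection
  character_set_database
].freeze

# Hypothetical helper: returns the [name, value] rows among the checked
# variables whose value does not start with "utf8".
def non_utf8_settings(rows)
  rows.select do |name, value|
    CHECKED_VARIABLES.include?(name) && value !~ /\Autf8/
  end
end

# Illustrative rows only. In a Rails console they might come from:
#   Post.connection.execute("SHOW VARIABLES LIKE 'character%'")
sample_rows = [
  ["character_set_client",     "latin1"],
  ["character_set_connection", "latin1"],
  ["character_set_database",   "utf8"],
  ["character_set_filesystem", "binary"]
]

non_utf8_settings(sample_rows).each do |name, value|
  puts "#{name} is #{value}, expected utf8"
end
```

Variables outside the checked list, such as character_set_filesystem, are ignored on purpose.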
Thank you very much for the answers, but I still have problems.

> HTTP/HTML
>   content-type delivered by web server
>     should be 'Content-Type: text/html; charset=utf-8'
>   html content-type meta tag
>     should be '<meta http-equiv="content-type"
>     content="text/html;charset=utf-8" />'

For this I did the following.

application.rhtml:

    <meta http-equiv="content-type" content="text/html;charset=utf-8" />

application.rb:

    before_filter :set_charset

    def set_charset
      if request.xhr?
        headers["Content-Type"] = "text/javascript; charset=utf-8"
      else
        headers["Content-Type"] = "text/html; charset=utf-8"
      end
    end

> Mysql settings:
> In your application's console, execute the following:
>     r = Post.connection.execute "SHOW VARIABLES LIKE 'character%'"
>     r.each {|r|puts r}
> This should result in something like this:
>
>     character_set_client
>     utf8
>     character_set_connection
>     utf8
>     character_set_database
>     utf8

Here are the results I get:

    ["character_set_client", "latin1"]
    ["character_set_connection", "latin1"]
    ["character_set_database", "utf8"]
    ["character_set_filesystem", "binary"]
    ["character_set_results", "latin1"]
    ["character_set_server", "latin1"]
    ["character_set_system", "utf8"]
    ["character_sets_dir", "C:\\InstantRails\\mysql\\share\\charsets\\"]

> If your output differs, try the following:
>
> set 'encoding: utf8' in environment.rb
> (this affects the character_set_connection and character_set_client
> values)

In environment.rb I added:

    ENV['LC_CTYPE'] = 'ru_RU.UTF8'
    ENV['LANG'] = 'ru_RU.UTF8'
    $KCODE = 'u'
    require 'acts_as_ferret'

> set 'default-character-set=utf8' in the mysqld section of the mysql
> configuration (/etc/mysql/my.cnf on linux). After restart of the
> server, newly created databases will be utf8 by default, and new
> tables in these databases will inherit this setting. Maybe it's
> possible to change the character set of existing databases/tables,
> too, however your data will have to be converted, too.
> The per database setting imho is only a default setting applied to
> new tables.

What else should I do?

Thanks
On Wed, Sep 05, 2007 at 11:16:20AM +0200, Igor K. wrote:
> Here i gen results:
> ["character_set_client", "latin1"]
> ["character_set_connection", "latin1"]
> ["character_set_database", "utf8"]
> ["character_set_filesystem", "binary"]
> ["character_set_results", "latin1"]
> ["character_set_server", "latin1"]
> ["character_set_system", "utf8"]
> ["character_sets_dir", "C:\\InstantRails\\mysql\\share\\charsets\\"]
>
> > If your output differs, try the following:
> >
> > set 'encoding: utf8' in environment.rb
> > (this affects the character_set_connection and character_set_client
> > values)

Sorry, my mistake. You need to set

    encoding: utf8

in database.yml for your db connections, not in environment.rb.

Looking at the output above, this should fix your problem.

Jens
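To illustrate the correction, here is a sketch of what a database.yml with the `encoding: utf8` key might look like, together with a small check that flags environments where the key is missing. The adapter, database names and user in the sample are placeholders (assumptions, not taken from the attached project), and `environments_missing_utf8` is a hypothetical helper. It assumes Ruby 2.3 or later for the `<<~` heredoc.

```ruby
require "yaml"

# Placeholder database.yml content; names and credentials are made up.
SAMPLE_DATABASE_YML = <<~CONFIG
  development:
    adapter: mysql
    database: ferrettest_development
    username: root
    encoding: utf8
  test:
    adapter: mysql
    database: ferrettest_test
    username: root
CONFIG

# Hypothetical helper: returns the names of environments whose settings
# have no encoding key starting with "utf8".
def environments_missing_utf8(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  config.reject do |_env, settings|
    settings.is_a?(Hash) && settings["encoding"].to_s.start_with?("utf8")
  end.keys
end

puts environments_missing_utf8(SAMPLE_DATABASE_YML).inspect
# prints ["test"]
```

In a real application the text would be read from config/database.yml instead of the sample string.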
> Sorry, my mistake. You need to set
>     encoding: utf8
> in database.yml for your db connections.

No problem, you are really helping me anyway :)

Now I get:

    ["character_set_client", "utf8"]
    ["character_set_connection", "utf8"]
    ["character_set_database", "utf8"]
    ["character_set_filesystem", "binary"]
    ["character_set_results", "utf8"]
    ["character_set_server", "latin1"]
    ["character_set_system", "utf8"]
    ["character_sets_dir", "C:\\InstantRails\\mysql\\share\\charsets\\"]

Is this one okay: ["character_set_server", "latin1"]?

> Looking at the output above this should fix your problem.
And how do I start the Ferret browser?
On Wed, Sep 05, 2007 at 11:32:35AM +0200, Igor K. wrote:
> now i got
> ["character_set_client", "utf8"]
> ["character_set_connection", "utf8"]
> ["character_set_database", "utf8"]
> ["character_set_filesystem", "binary"]
> ["character_set_results", "utf8"]
> ["character_set_server", "latin1"]
> ["character_set_system", "utf8"]
> ["character_sets_dir", "C:\\InstantRails\\mysql\\share\\charsets\\"]
>
> is it okay -> ["character_set_server", "latin1"]???

Yeah, as long as your app's database is set to utf8.

ferret-browser is started with

    ferret-browser path/to/index

Jens
> ferret-browser is started with
>
>     ferret-browser path/to/index

It seems that Ferret is working incorrectly.
Please check my sources and the screenshot 22.gif.

PS: when I start the server I get a warning:

    C:/ruby/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:123: warning: parenthesize argument(s) for future version
> Seems to be that ferret works incorrect
> please check my sources and screenshot 22.gif.

Here is the attachment.

Attachments:
http://www.ruby-forum.com/attachment/228/ferrettest3.rar
On Wed, Sep 05, 2007 at 12:00:03PM +0200, Igor K. wrote:
> Seems to be that ferret works incorrect

In general I don't think so :-)

> please check my sources and screenshot 22.gif.

Where?

> ps: when i start server i get an warning
> C:/ruby/lib/ruby/gems/1.8/gems/acts_as_ferret-0.4.1/lib/ferret_server.rb:123:
> warning: parenthesize argument(s) for future version

Nothing to worry about...

Jens
On Wed, Sep 05, 2007 at 12:06:50PM +0200, Igor K. wrote:
> here ia attachment
>
> Attachments:
> http://www.ruby-forum.com/attachment/228/ferrettest3.rar

I can't see what's wrong in the screenshot (besides the missing images for yes/no).

Did you check the index contents?

Jens
> can't see what's wrong with the screenshot (besides the missing images
> for yes/no).
>
> did you check the index contents?

What do you mean?

Please check another attachment. Why can I see only the id field?

Attachments:
http://www.ruby-forum.com/attachment/229/333.GIF
On Wed, Sep 05, 2007 at 12:28:53PM +0200, Igor K. wrote:
> > did you check the index contents?
>
> What do you mean?
>
> please check another attachment, why i can see only id field?

Ah, of course. By default aaf stores only the id in the index. The rest of the data is just indexed, so records can be found, but you cannot retrieve their original contents from the index.

Use something like

    acts_as_ferret :fields => {
      :name => { :store => :yes },
      :body => { :store => :yes }
    }

and then you'll see the data in ferret-browser.

Jens
Ohhhh :), it is still not working. Please check the attachment.

In the console, as you can see, I search for the letter that one of the words in the database starts with.

I don't know what to do.

The system locale is Russian.

Attachments:
http://www.ruby-forum.com/attachment/230/1a.rar