Hi all,

I have a question: how do I use setlocale() correctly? My system is FreeBSD 6.1-STABLE. Here is a simple test program:

    #include <ctype.h>
    #include <locale.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        const char* loc="ru_RU.KOI8-R";
        int i;
        char *loc_ret;
        char buf[]="???????? ?????? abcdef";

        loc_ret = setlocale(LC_CTYPE, loc);
        if (!loc_ret)
            return (-1);
        printf("original string = %s\n", buf);
        for(i = 0; i < sizeof(buf); i++)
            buf[i] = (char)toupper(buf[i]);
        printf("toupper string = %s\n", buf);
        return (0);
    }

This program doesn't work correctly: the Cyrillic characters are not converted to upper case. At the same time, the equivalent Perl program works fine:

    use locale;
    use POSIX qw(locale_h);

    my $str = "???????? ?????? abcdef";
    setlocale(LC_CTYPE, "ru_RU.KOI8-R");
    print uc ($str);

What is wrong?

--
WBR, Andrey V. Elsukov
On Sat, Aug 12, 2006 at 05:51:01PM +0400, Andrey V. Elsukov wrote:
> for(i = 0; i < sizeof(buf); i++)
>     buf[i] = (char)toupper(buf[i]);

    buf[i] = (char)toupper((unsigned char)buf[i]);

Because plain char is signed here, standard integer promotion sign-extends KOI8-R char codes like 0xd4 into 0xffffffd4. Since such codepoints are not defined for KOI8-R, toupper returns them unchanged, as specified in the documentation.
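The cast suggested above can be sketched as a small helper, together with two functions that show what value toupper() actually receives with and without it. This is only an illustration: the names upcase_in_place, promoted_without_cast and promoted_with_cast are hypothetical and do not come from the thread, and signed char is used explicitly so the behaviour does not depend on how the platform defines plain char.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Upper-case a NUL-terminated string in place, using whatever LC_CTYPE
 * locale is currently set. Hypothetical helper, not from the thread. */
void upcase_in_place(char *s)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Cast through unsigned char so that bytes >= 0x80 reach
         * toupper() as 128..255 rather than as negative ints. */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}

/* Value seen by toupper() WITHOUT the cast when the char type is signed:
 * the byte 0xd4 is stored as -44 and is promoted to the int -44. */
int promoted_without_cast(signed char c)
{
    return c;
}

/* Value seen by toupper() WITH the cast: always in the range 0..255. */
int promoted_with_cast(signed char c)
{
    return (unsigned char)c;
}
```

With the ru_RU.KOI8-R locale installed and selected through setlocale(LC_CTYPE, ...), calling upcase_in_place(buf) in place of the original loop should upper-case the Cyrillic bytes as well. Unlike the loop bounded by sizeof(buf), the helper stops at the terminating NUL instead of passing it to toupper().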
> On Sat, Aug 12, 2006 at 05:51:01PM +0400, Andrey V. Elsukov wrote:
>> for(i = 0; i < sizeof(buf); i++)
>>     buf[i] = (char)toupper(buf[i]);
>
>     buf[i] = (char)toupper((unsigned char)buf[i]);
>
> Standard integer promotion promotes KOI8-R char codes like 0xd4 into 0xffffffd4.
> Since such codepoints are not defined for KOI8-R, toupper returns them
> unchanged, as specified in documentation.

Thanks, this works! But why does this example work on Linux without the cast?

--
WBR, Andrey V. Elsukov
On Sat, 12 Aug 2006, Andrey V. Elsukov wrote:

AVE> > On Sat, Aug 12, 2006 at 05:51:01PM +0400, Andrey V. Elsukov wrote:
AVE> >> for(i = 0; i < sizeof(buf); i++)
AVE> >>     buf[i] = (char)toupper(buf[i]);
AVE> >
AVE> >     buf[i] = (char)toupper((unsigned char)buf[i]);
AVE> >
AVE> > Standard integer promotion promotes KOI8-R char codes like 0xd4 into 0xffffffd4.
AVE> > Since such codepoints are not defined for KOI8-R, toupper returns them
AVE> > unchanged, as specified in documentation.
AVE>
AVE> Thanks, this works! But why does this example work on Linux without the cast?

Linux has unsigned chars by default, while FreeBSD (and the other current *BSDs) has signed chars.

Even large projects like PostgreSQL have stepped into this trap at least once ;-)

Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
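Whether plain char is signed is implementation-defined in C and can differ between compilers, target architectures and compiler options, so it may be worth checking on the machine in question rather than assuming it from the operating system. A minimal sketch of such a check, using CHAR_MIN from <limits.h> (the function name plain_char_is_signed is hypothetical):

```c
#include <limits.h>

/* Returns 1 if plain char is a signed type for this compiler and target,
 * 0 if it is unsigned. CHAR_MIN is 0 exactly when char is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Whatever this returns, passing the byte through (unsigned char) before calling toupper() keeps the code portable across both cases.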