Hi, All.
I have a question: how do I use setlocale() correctly?
My system is FreeBSD 6.1-STABLE.
A simple test program:
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    const char *loc = "ru_RU.KOI8-R";
    int i;
    char *loc_ret;
    char buf[] = "???????? ?????? abcdef";

    loc_ret = setlocale(LC_CTYPE, loc);
    if (!loc_ret)
        return (-1);
    printf("original string = %s\n", buf);
    for (i = 0; i < sizeof(buf); i++)
        buf[i] = (char)toupper(buf[i]);
    printf("toupper string = %s\n", buf);
    return (0);
}
This program doesn't work correctly: the Cyrillic characters are not
converted to upper case.
At the same time, the equivalent Perl program works fine:
use locale;
use POSIX qw(locale_h);
my $str = "???????? ?????? abcdef";
setlocale(LC_CTYPE, "ru_RU.KOI8-R");
print uc ($str);
What is wrong?
--
WBR, Andrey V. Elsukov
On Sat, Aug 12, 2006 at 05:51:01PM +0400, Andrey V. Elsukov wrote:
> for(i = 0; i < sizeof(buf); i++)
> buf[i] = (char)toupper(buf[i]);

buf[i] = (char)toupper((unsigned char)buf[i]);

When char is signed, standard integer promotion sign-extends KOI8-R char
codes like 0xd4 into 0xffffffd4. Since such codepoints are not defined for
KOI8-R, toupper returns them unchanged, as specified in the documentation.
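To make the suggested fix concrete, here is a minimal sketch of the conversion loop with the cast applied. The helper name upcase_bytes is hypothetical and not part of the original program; the sketch assumes a single-byte encoding such as KOI8-R, where each char is one character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (name invented for illustration): upper-case a
 * NUL-terminated string of single-byte characters in place, using the
 * LC_CTYPE locale currently in effect.
 */
void upcase_bytes(char *s)
{
    size_t i, n;

    assert(s != NULL);
    n = strlen(s);
    for (i = 0; i < n; i++) {
        /*
         * Cast to unsigned char first, so that bytes >= 0x80 reach
         * toupper() as values in 0..255 instead of negative ints.
         */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}
```

In the test program above, the loop over buf could be replaced by a call to upcase_bytes(buf) after setlocale() has succeeded. Using strlen() instead of sizeof() only skips the terminating NUL, which toupper() would have left unchanged anyway.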
>On Sat, Aug 12, 2006 at 05:51:01PM +0400, Andrey V. Elsukov wrote:
>> for(i = 0; i < sizeof(buf); i++)
>> buf[i] = (char)toupper(buf[i]);
>
> buf[i] = (char)toupper((unsigned char)buf[i]);
>
>Standard integer promotion promotes KOI8-R char codes like 0xd4 into 0xffffffd4.
>Since such codepoints are not defined for KOI8-R, toupper returns them
>unchanged, as specified in documentation.

Thanks, this works! But why does this example work on Linux without the type conversion?
--
WBR, Andrey V. Elsukov
On Sat, 12 Aug 2006, Andrey V. Elsukov wrote:

AVE> Thanks, this works! But why does this example work on Linux without the type conversion?

Linux has unsigned chars by default, while FreeBSD (and the other current
*BSDs) has signed ones. Even large projects like PostgreSQL have stepped
into this trap at least once ;-)

Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
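Whether plain char is signed is implementation-defined in C and depends on the compiler and platform ABI, so it may be safer to check it on the system at hand than to assume it per operating system. A minimal sketch of such a check, together with the value that toupper() would actually receive for a byte like 0xd4 with and without the cast. All three function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: nonzero if plain char is a signed type here. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The int that toupper(c) receives when a plain char is passed as-is. */
int widen_plain(char c)
{
    return c;   /* negative for bytes >= 0x80 if char is signed */
}

/* The int that toupper((unsigned char)c) receives. */
int widen_unsigned(char c)
{
    int v = (unsigned char)c;

    /* Always within the range the <ctype.h> functions are defined for. */
    assert(v >= 0 && v <= UCHAR_MAX);
    return v;
}
```

On a platform where plain_char_is_signed() returns nonzero, widen_plain() yields a negative value for the byte 0xd4 while widen_unsigned() yields 0xd4, which is the difference the cast makes.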