Maybe it is the wrong list, but maybe someone can quickly
help me out.
I have a very stupid problem. I cannot convert to upper
or lower case using a manually set locale (setlocale(..)).
A very simple program:
#include <locale.h>
#include <errno.h>
#include <ctype.h>
main(){
char *b=setlocale(LC_CTYPE, "ru_RU.CP1251");
if (!b){
printf("FAILED! %d\n",errno);
}
else {
printf("OK: %s %d\n",b,errno);
printf("IS UPPER ?: %d\n",isupper('?'));
printf("IS UPPER ?: %d\n",isupper('?'));
printf("IS LOWER ?: %d\n",islower('?'));
printf("IS LOWER ?: %d\n",islower('?'));
printf("LOCALE %s\n",setlocale(LC_CTYPE,NULL));
printf("1: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
printf("1-0: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
printf("2: TO UPPER %c TO LOWER %c\n",toupper('r'),tolower('R'));
}
}
Output is always:
OK: ru_RU.CP1251 0
IS UPPER ?: 0
IS UPPER ?: 0
IS LOWER ?: 0
IS LOWER ?: 0
LOCALE ru_RU.CP1251
1: TO UPPER ? TO LOWER ?
1-0: TO UPPER ? TO LOWER ?
2: TO UPPER R TO LOWER r
?,?,? are lower case letters and
?,?,? are upper case letters.
As you can see, it simply does not work at all.
It seems like the locale is "C", but as you can see
setlocale returned ru_RU.CP1251.
Tested on FreeBSD 6.2, 5.4 and 4.10 - all the same.
What am I doing wrong?
(except posting in the wrong list ;)
--
Regards,
Artem
From the setlocale(3) manual page:
... A locale argument of NULL causes setlocale() to return the
current locale. ...
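As a minimal sketch of the query form quoted from the manual page: the helper name current_ctype_locale is hypothetical and only wraps the standard setlocale() call with a NULL locale argument, which reports the active locale without changing it.

```c
#include <locale.h>

/* Hypothetical helper: return the name of the active LC_CTYPE locale.
 * Passing NULL as the locale argument queries without modifying anything. */
const char *current_ctype_locale(void)
{
    return setlocale(LC_CTYPE, NULL);
}
```

A program that has not called setlocale() yet starts in the "C" locale, so the helper would report "C" at that point. The name it reports says nothing about whether later <ctype.h> calls are handed valid arguments.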
--
-max
2007/5/24, Artem Kuchin <matrix@itlegion.ru>:
> [...]
Artem Kuchin wrote:
> I have a very stupid problem. I cannot convert to upper
> or lower case using manually set locale (setlocale(..)).
>
> A very simple program:
> [...]
> printf("IS UPPER ?: %d\n",isupper('?'));
> printf("IS UPPER ?: %d\n",isupper('?'));
> printf("IS LOWER ?: %d\n",islower('?'));
> printf("IS LOWER ?: %d\n",islower('?'));
> printf("LOCALE %s\n",setlocale(LC_CTYPE,NULL));
> printf("1: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
> printf("1-0: TO UPPER %c TO LOWER %c\n",toupper('?'),tolower('?'));
> printf("2: TO UPPER %c TO LOWER %c\n",toupper('r'),tolower('R'));
> [...]
> IS UPPER ?: 0
> IS UPPER ?: 0
> IS LOWER ?: 0
> IS LOWER ?: 0
> LOCALE ru_RU.CP1251
> 1: TO UPPER ? TO LOWER ?
> 1-0: TO UPPER ? TO LOWER ?
> 2: TO UPPER R TO LOWER r
That's a common pitfall. Plain chars are signed by default on
FreeBSD (at least on i386 and amd64), and isupper() and the
other <ctype.h> functions take an int argument that must be
either EOF or a value representable as unsigned char.
Characters >= 128 end up as negative numbers, so in practice
they fail all isupper() and islower() checks, and
toupper()/tolower() don't touch them at all.
The solution is to cast the constants to unsigned char
explicitly, like this: isupper((unsigned char) '?') etc.
Your program will work fine then.
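To illustrate the cast described above, here is a minimal sketch. The helper names (is_upper_ch and friends) are hypothetical and not part of the original program; they only wrap the standard <ctype.h> calls so that the conversion to unsigned char happens in one place.

```c
#include <ctype.h>

/* Hypothetical wrappers: convert the plain char to unsigned char first,
 * so the int handed to <ctype.h> is never negative. */
int is_upper_ch(char c)
{
    return isupper((unsigned char)c) != 0;
}

int is_lower_ch(char c)
{
    return islower((unsigned char)c) != 0;
}

char to_upper_ch(char c)
{
    return (char)toupper((unsigned char)c);
}

char to_lower_ch(char c)
{
    return (char)tolower((unsigned char)c);
}
```

With these, a call such as to_upper_ch(c) for a byte >= 128 passes a value in the range 0..255 to toupper(). Whether that byte is then classified or mapped still depends on the LC_CTYPE locale in effect, for example ru_RU.CP1251 as set in the original program.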
Best regards
Oliver
PS: You should also #include <stdio.h>
PPS: This is not a FreeBSD-specific pitfall. The ISO-C
standard does not specify the signedness of chars, and
most implementations (but not all) seem to prefer to
have chars signed by default. So, in order to write
portable programs, you always need to typecast if the
difference between signed and unsigned matters in your
application.
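As a rough sketch of the portability point: the function names below are hypothetical, and the byte 0xE0 in the comments is only an illustrative value >= 128. CHAR_MIN from <limits.h> is negative exactly when plain char is signed, and the two widening helpers show what an uncast call would pass to a <ctype.h> function versus what a cast call passes.

```c
#include <limits.h>

/* Returns 1 where plain char is signed, 0 where it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widen a plain char to int the way an uncast call such as isupper(c) does.
 * Where char is signed, a byte like 0xE0 becomes a negative int here. */
int widen_plain(char c)
{
    return c;
}

/* Widen through unsigned char: the result is always in 0..UCHAR_MAX,
 * whichever signedness the implementation chose for plain char. */
int widen_unsigned(char c)
{
    return (unsigned char)c;
}
```

Only widen_unsigned() gives the same answer on every implementation for bytes >= 128, which is why the cast is needed in portable code.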
PPPS: I think follow-ups should go to the -standards
mailing list.
--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht München,
HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd
"With sufficient thrust, pigs fly just fine. However, this
is not necessarily a good idea. It is hard to be sure where
they are going to land, and it could be dangerous sitting
under them as they fly overhead." -- RFC 1925