On 17/08/2018 17:15, Ingo Schwarze wrote:
> Hi Darren,
>
> Darren Tucker wrote on Fri, Aug 17, 2018 at 07:16:03AM -0700:
>> On 13 August 2018 at 15:06, Val Baranov <val.baranov at duke.edu> wrote:
>>> test_utf8: ........................
>>> regress/unittests/utf8/tests.c:48 test #25 "c_esc"
>>> ASSERT_INT_EQ(len, wantlen) failed:
>>> len = -1
>>> wantlen = 5
>> This boils down to meaning OpenSSH's snmprintf call failed for the
>> string "\033x" instead of returning the expected escaped version
>> "\\033x". The code is in utf8.c but I am not sure why it failed.
> Actually, it is *supposed* to fail unless the locale is either
> UTF-8 or the POSIX (ASCII) locale, because '\033' is not a
> printable character and attempting to escape invalid stuff
> is unsafe in arbitrary locales.
>
>> What's your locale set to?

OK. Double checked.
AIX defaults:
environment:
LANG=en_US
root@x064:[/usr/lib/nls/loc]ls -l /usr/lib/nls/loc/en_US
lrwxrwxrwx    1 bin      bin              32 Aug 02 06:40 /usr/lib/nls/loc/en_US -> /usr/lib/nls/loc/en_US.ISO8859-1

And, after installing the UTF-8 fileset
(/usr/lib/nls/loc/en_US.UTF-8    bos.loc.utf.EN_US    File)

The test is attempted, and fails.

Question #1 - how can I run only this test? Then it is easier to look
for potential resolutions.

> It doesn't matter on OpenBSD, but maybe you should consider setting
> LC_CTYPE=en_US.UTF-8 by default in TEST_ENV in the portable version
> of the test suite? Of course, it would do no harm on OpenBSD either.

While I wait for the answer - I'll just run the tests prefixed with
export LC_CTYPE=en_US.UTF-8 - maybe that is all that is needed.

Reminds me of Question #2: how is your definition of POSIX different
from ISO8859-1 (and/or ISO8859-15, the "UK" or EN_US variant)?

> If you worry that some target system might not have a en_US.UTF-8
> locale installed, you can look at
>
> http://mandoc.bsd.lv/cgi-bin/cvsweb/configure?rev=HEAD
>
> for a way to autodetect a suitable UTF-8 locale - look for UTF8_LOCALE
> in that script.
>
> But that may be overkill for OpenSSH. Just recklessly forcing
> LC_CTYPE=en_US.UTF-8 may be good enough for OpenSSH's purposes.
> If the target system doesn't provide it, setlocale(3) will fall
> back to POSIX, which should be good enough for the tests.
>
> Yours,
> Ingo
> _______________________________________________
> openssh-unix-dev mailing list
> openssh-unix-dev at mindrot.org
> https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev

On 20/08/2018 10:33, Michael Felt wrote:
> On 17/08/2018 17:15, Ingo Schwarze wrote:
>> Hi Darren,
>>
>> Darren Tucker wrote on Fri, Aug 17, 2018 at 07:16:03AM -0700:
>>> On 13 August 2018 at 15:06, Val Baranov <val.baranov at duke.edu> wrote:
>>>> test_utf8: ........................
>>>> regress/unittests/utf8/tests.c:48 test #25 "c_esc"
>>>> ASSERT_INT_EQ(len, wantlen) failed:
>>>> len = -1
>>>> wantlen = 5
>>> This boils down to meaning OpenSSH's snmprintf call failed for the
>>> string "\033x" instead of returning the expected escaped version
>>> "\\033x". The code is in utf8.c but I am not sure why it failed.
>> Actually, it is *supposed* to fail unless the locale is either
>> UTF-8 or the POSIX (ASCII) locale, because '\033' is not a
>> printable character and attempting to escape invalid stuff
>> is unsafe in arbitrary locales.
>>
>>> What's your locale set to?
> OK. Double checked.
> AIX defaults:
> environment:
> LANG=en_US
> root@x064:[/usr/lib/nls/loc]ls -l /usr/lib/nls/loc/en_US
> lrwxrwxrwx    1 bin      bin              32 Aug 02 06:40
> /usr/lib/nls/loc/en_US -> /usr/lib/nls/loc/en_US.ISO8859-1
>
> And, after installing the UTF-8 fileset
> (/usr/lib/nls/loc/en_US.UTF-8    bos.loc.utf.EN_US    File)
>
> The test is attempted, and fails.
>
> Question #1 - how can I run only this test? Then it is easier to look
> for potential resolutions.
>
>> It doesn't matter on OpenBSD, but maybe you should consider setting
>> LC_CTYPE=en_US.UTF-8 by default in TEST_ENV in the portable version
>> of the test suite? Of course, it would do no harm on OpenBSD either.
> While I wait for the answer - I'll just run the tests prefixed with
> export LC_CTYPE=en_US.UTF-8 - maybe that is all that is needed.

Test fails.

Use smitty to change primary "cultural", "language" and "keyboard":

  Primary CULTURAL convention    UTF-8   English (United States) [EN_US]     +
  Primary LANGUAGE translation   UTF-8   English (UTF-8) [en.UTF-8]          +
  Primary KEYBOARD               UTF-8   English(POSIX) KBD ID 103P [EN_US]  +

This initially fails, as these are also required:

Error:  The selected settings for cultural convention, language,
        and keyboard require the installation of additional filesets
        that are not currently installed.  Select an installation device
        that contains the following filesets:
 bos.msg.EN_US.net.tcp.client
 bos.msg.EN_US.rte

After that is added, and set:

root@x064:[/data/prj/openbsd/mindrot/openssh-7.8.0.20]grep LANG /etc/environment
LANG=EN_US
root@x064:[/data/prj/openbsd/mindrot/openssh-7.8.0.20]grep LC /etc/environment
LC__FASTMSG=true
LC_MESSAGES=C@lft
root@x064:[/usr/lib/nls/loc]ls -l EN_US
lrwxrwxrwx    1 bin      bin              28 Aug 03 12:28 EN_US -> /usr/lib/nls/loc/EN_US.UTF-8

unset LC_CTYPE
export LANG=EN_US

Still same error sequence as above.

regress/unittests/utf8/tests.c:48 test #25 "c_esc"
ASSERT_INT_EQ(len, wantlen) failed:
         len = -1
     wantlen = 5
make: 1254-059 The signal code from the last command is 6.

(Back to question #2: en_US is what I would expect to be enough to
satisfy "posix". Traditionally EN_US has meant iso8859-15 while
en_US has meant iso8859-1. Not clear on what the key differences are.)
So, what is needed for the OpenBSD definition of "POSIX"?

>
> Reminds me of Question #2: how is your definition of POSIX different
> from ISO8859-1 (and/or ISO8859-15, the "UK" or EN_US variant)?
>> If you worry that some target system might not have a en_US.UTF-8
>> locale installed, you can look at
>>
>> http://mandoc.bsd.lv/cgi-bin/cvsweb/configure?rev=HEAD
>>
>> for a way to autodetect a suitable UTF-8 locale - look for UTF8_LOCALE
>> in that script.
>>
>> But that may be overkill for OpenSSH. Just recklessly forcing
>> LC_CTYPE=en_US.UTF-8 may be good enough for OpenSSH's purposes.
>> If the target system doesn't provide it, setlocale(3) will fall
>> back to POSIX, which should be good enough for the tests.
>>
>> Yours,
>> Ingo

Hi,

Michael Felt wrote on Mon, Aug 20, 2018 at 11:28:26AM +0200:
> On 20/08/2018 10:33, Michael Felt wrote:
>> On 17/08/2018 17:15, Ingo Schwarze wrote:
>>> Darren Tucker wrote on Fri, Aug 17, 2018 at 07:16:03AM -0700:
>>>> On 13 August 2018 at 15:06, Val Baranov <val.baranov at duke.edu> wrote:

>>>>> test_utf8: ........................
>>>>> regress/unittests/utf8/tests.c:48 test #25 "c_esc"
>>>>> ASSERT_INT_EQ(len, wantlen) failed:
>>>>> len = -1
>>>>> wantlen = 5

>>>> This boils down to meaning OpenSSH's snmprintf call failed for the
>>>> string "\033x" instead of returning the expected escaped version
>>>> "\\033x". The code is in utf8.c but I am not sure why it failed.

>>> Actually, it is *supposed* to fail unless the locale is either
>>> UTF-8 or the POSIX (ASCII) locale, because '\033' is not a
>>> printable character and attempting to escape invalid stuff
>>> is unsafe in arbitrary locales.

Sorry, i spoke too soon, i didn't correctly remember how the tests
work. It is completely irrelevant what the user sets their locale
to, as it should be in a test suite. The tests themselves make sure
the locale is set correctly for the tests. The "c_esc" in the test
output above means that it is testing the "C" locale at that point.

So the problem is somewhere else, likely in what your nl_langinfo(3)
function does in the POSIX locale. Could you please run the following
simple test program on your system and show us the output, for
further diagnosis?

OpenBSD:

   $ make nl_langinfo
   cc -O2 -pipe -o nl_langinfo nl_langinfo.c
   $ ./nl_langinfo
   setlocale -> "C"
   nl_langinfo -> "US-ASCII"

Linux:

   $ make nl_langinfo
   cc nl_langinfo.c -o nl_langinfo
   $ ./nl_langinfo
   setlocale -> "C"
   nl_langinfo -> "ANSI_X3.4-1968"

AIX:

   ?

Thank you,
  Ingo


 $ cat nl_langinfo.c
#include <err.h>
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

int
main(void)
{
	char *res;

	res = setlocale(LC_CTYPE, "C");
	if (res == NULL)
		err(1, "setlocale");
	printf("setlocale -> \"%s\"\n", res);

	res = nl_langinfo(CODESET);
	if (res == NULL)
		err(1, "nl_langinfo");
	printf("nl_langinfo -> \"%s\"\n", res);

	return 0;
}