On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
>
>
> On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
>> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
> ...
>> The test case in the LLVM tree is invalid and should be discarded. It
>> erroneously assumes that the encoding of wchar_t is independent of the
>> locale.
>>
> That makes no sense. These values are compile-time constants and cannot possibly depend on the locale.

I believe Richard is wrong here. There are a number of similar compile-time constant macros in the C spec. I believe the clue as to the correct reading of the spec is in the name of the macro: __STDC_MB_MIGHT_NEQ_WC__

Note the word *might*. It means that it is not safe for code to assume that a cast will give the corresponding char value if one exists. i.e. that assumption is not true for all locales.

As dim says, the test is wrong. It would be a valid test for it to fail if __STDC_MB_MIGHT_NEQ_WC__ is *not* defined and not all characters in the basic set have the same encoding as wide chars, but it is not correct to fail if it is set unless they have the same encoding in *all* locales, *and* the vendor is willing to guarantee that they will have the same encoding for all locales in all future binary-compatible versions of the system (which an automated test can't check).

In summary: the test is nonsense and should be removed.

David
On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
>>
>>
>> On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
>>> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
>> ...
>>> The test case in the LLVM tree is invalid and should be discarded. It
>>> erroneously assumes that the encoding of wchar_t is independent of the
>>> locale.
>>>
>> That makes no sense. These values are compile-time constants and cannot possibly depend on the locale.
>
> I believe Richard is wrong here. There are a number of similar compile-time constant macros in the C spec. I believe the clue as to the correct reading of the spec is in the name of the macro: __STDC_MB_MIGHT_NEQ_WC__
>
> Note the word *might*. It means that it is not safe for code to assume that a cast will give the corresponding char value if one exists. i.e. that assumption is not true for all locales.
>
> As dim says, the test is wrong. It would be a valid test for it to fail if __STDC_MB_MIGHT_NEQ_WC__ is *not* defined and not all characters in the basic set have the same encoding as wide chars, but it is not correct to fail if it is set unless they have the same encoding in *all* locales, *and* the vendor is willing to guarantee that they will have the same encoding for all locales in all future binary-compatible versions of the system (which an automated test can't check).
>
> In summary: the test is nonsense and should be removed.

I couldn't find an existing PR for this, so I filed http://llvm.org/PR22208 with folks on this thread cc'd. It would be great if we could get it resolved soon.

Thanks,
Hans
On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
> >
> >
> > On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
> >> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
> > ...
> >> The test case in the LLVM tree is invalid and should be discarded. It
> >> erroneously assumes that the encoding of wchar_t is independent of the
> >> locale.
> >>
> > That makes no sense. These values are compile-time constants and cannot
> > possibly depend on the locale.
>
> I believe Richard is wrong here. There are a number of similar
> compile-time constant macros in the C spec. I believe the clue as to the
> correct reading of the spec is in the name of the macro:
> __STDC_MB_MIGHT_NEQ_WC__
>
> Note the word *might*. It means that it is not safe for code to assume
> that a cast will give the corresponding char value if one exists. i.e.
> that assumption is not true for all locales.
>

You should read the definitions in the relevant standards rather than trying to guess what the macro means from its name. Here is the definition:

"The integer constant 1, intended to indicate that, in the encoding for wchar_t, a member of the basic character set need not have a code value equal to its value when used as the lone character in an integer character constant."

So the value 1 indicates that 'x' might not equal L'x' for some character x in the basic character set. (Note that the 'might' means that there might exist some *character* where this happens, not that there might exist some *locale* where this happens.) Since the values of 'x' and L'x' are determined at translation time, this property obviously cannot depend in any way on the current locale in the execution environment.

Note that the above property is *exactly* what the test is testing for.

However... the FreeBSD folks don't seem interested in fixing their bug, and it's technically conforming for an implementation to define this macro to 1 in any situation -- a member of the basic source character set "need not" have the same value as a narrow or wide character, even though they all actually do -- making this a quality-of-implementation issue, and I'm tired of discussing this, so I've relaxed the test for FreeBSD in r225751.
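The property described above, that 'x' and L'x' are both fixed at translation time and can simply be compared, can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the test from the LLVM tree: it assumes a C99-or-later compiler, samples just a handful of basic characters rather than the whole set, and the names PAIR, samples, count_basic_mismatches, mb_might_neq_wc and check_macro_consistency are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pairs a narrow character constant with its wide
 * counterpart. Both values are fixed by the compiler at translation time. */
#define PAIR(c) { c, L##c }

struct char_pair {
    long narrow; /* value of 'x'  */
    long wide;   /* value of L'x' */
};

/* A small sample of the basic character set, not the full set. */
static const struct char_pair samples[] = {
    PAIR('a'), PAIR('z'), PAIR('A'), PAIR('Z'),
    PAIR('0'), PAIR('9'), PAIR(' '), PAIR('{'),
    PAIR('~'), PAIR('\n')
};

/* Counts sampled characters whose narrow and wide values differ.
 * No locale function is called; the table is constant data. */
int count_basic_mismatches(void)
{
    int mismatches = 0;
    size_t n = sizeof samples / sizeof samples[0];
    for (size_t i = 0; i < n; ++i) {
        if (samples[i].narrow != samples[i].wide)
            ++mismatches;
    }
    return mismatches;
}

/* Returns 1 if the implementation defines the macro, 0 otherwise. */
int mb_might_neq_wc(void)
{
#ifdef __STDC_MB_MIGHT_NEQ_WC__
    return 1;
#else
    return 0;
#endif
}

/* If the macro is absent, every basic character must match. If it is
 * present, a mismatch is permitted but not required, so nothing is
 * asserted about the count in that case. */
void check_macro_consistency(void)
{
    assert(mb_might_neq_wc() || count_basic_mismatches() == 0);
}
```

Calling check_macro_consistency() only enforces the direction that holds for any conforming implementation. An implementation that defines the macro while every sampled character still matches passes this check, which is the quality-of-implementation case mentioned above.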
Count me in for testing Fedora and OpenSUSE.

On Tue, Jan 13, 2015 at 1:01 PM, Richard Smith <richard at metafoo.co.uk> wrote:
> On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
>
>> On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
>> >
>> >
>> > On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
>> >> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
>> > ...
>> >> The test case in the LLVM tree is invalid and should be discarded. It
>> >> erroneously assumes that the encoding of wchar_t is independent of the
>> >> locale.
>> >>
>> > That makes no sense. These values are compile-time constants and cannot
>> > possibly depend on the locale.
>>
>> I believe Richard is wrong here. There are a number of similar
>> compile-time constant macros in the C spec. I believe the clue as to the
>> correct reading of the spec is in the name of the macro:
>> __STDC_MB_MIGHT_NEQ_WC__
>>
>> Note the word *might*. It means that it is not safe for code to assume
>> that a cast will give the corresponding char value if one exists. i.e.
>> that assumption is not true for all locales.
>>
>
> You should read the definitions in the relevant standards rather than
> trying to guess what the macro means from its name. Here is the definition:
>
> "The integer constant 1, intended to indicate that, in the encoding for
> wchar_t, a member of the basic character set need not have a code value
> equal to its value when used as the lone character in an integer character
> constant."
>
> So the value 1 indicates that 'x' might not equal L'x' for some character
> x in the basic character set. (Note that the 'might' means that there might
> exist some *character* where this happens, not that there might exist some
> *locale* where this happens.) Since the values of 'x' and L'x' are
> determined at translation time, this property obviously cannot depend in
> any way on the current locale in the execution environment.
>
> Note that the above property is *exactly* what the test is testing for.
>
> However... the FreeBSD folks don't seem interested in fixing their bug,
> and it's technically conforming for an implementation to define this macro
> to 1 in any situation -- a member of the basic source character set "need
> not" have the same value as a narrow or wide character, even though they
> all actually do -- making this a quality-of-implementation issue, and I'm
> tired of discussing this, so I've relaxed the test for FreeBSD in r225751.
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev