thr3ads.net - llvm dev - [LLVMdev] [cfe-dev] Reminder: 3.6 branch is coming [Jan 2015]

David Chisnall

2015-Jan-12 09:26 UTC

[LLVMdev] [cfe-dev] Reminder: 3.6 branch is coming

On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
>
> On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
>> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
> ...
>> The test case in the LLVM tree is invalid and should be discarded. It
>> erroneously assumes that the encoding of wchar_t is independent of the
>> locale.
>>
> That makes no sense. These values are compile-time constants and cannot
> possibly depend on the locale.

I believe Richard is wrong here.  There are a number of similar compile-time
constant macros in the C spec.  I believe the clue as to the correct reading of
the spec is in the name of the macro: __STDC_MB_MIGHT_NEQ_WC__

Note the word *might*.  It means that it is not safe for code to assume that a
cast will give the corresponding char value if one exists.  i.e. that assumption
is not true for all locales.
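
To make the point about casts concrete, here is a minimal sketch of how portable code might consult the macro before widening a character. The helper name widen_basic_char and the L'?' fallback are hypothetical and chosen only for illustration; btowc() and WEOF are the standard <wchar.h> facilities.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical helper: widen one character from the basic character set. */
static wchar_t widen_basic_char(char c)
{
#ifdef __STDC_MB_MIGHT_NEQ_WC__
    /* The implementation does not promise 'x' == L'x', so ask the
       library to do the conversion instead of casting. */
    wint_t w = btowc((unsigned char)c);
    return w == WEOF ? L'?' : (wchar_t)w;   /* L'?' is an arbitrary fallback */
#else
    /* Macro absent: a basic character has the same value as a wide char,
       so a plain cast is enough. */
    return (wchar_t)c;
#endif
}
```

Either branch should yield L'a' for 'a' on common systems; the only difference is whether the code is allowed to rely on the cast.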

As dim says, the test is wrong.  It would be valid for the test to fail if
__STDC_MB_MIGHT_NEQ_WC__ is *not* defined and not all characters in the basic
set have the same encoding as wide chars, but it is not correct to fail if it is
set unless they have the same encoding in *all* locales, *and* the vendor is
willing to guarantee that they will have the same encoding for all locales in
all future binary-compatible versions of the system (which an automated test
can't check).

In summary: the test is nonsense and should be removed.

David

Hans Wennborg

2015-Jan-12 17:31 UTC

On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall
<David.Chisnall at cl.cam.ac.uk> wrote:
> On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
>>
>> On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
>>> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
>> ...
>>> The test case in the LLVM tree is invalid and should be discarded. It
>>> erroneously assumes that the encoding of wchar_t is independent of the
>>> locale.
>>>
>> That makes no sense. These values are compile-time constants and cannot
>> possibly depend on the locale.
>
> I believe Richard is wrong here.  There are a number of similar
> compile-time constant macros in the C spec.  I believe the clue as to the
> correct reading of the spec is in the name of the macro:
> __STDC_MB_MIGHT_NEQ_WC__
>
> Note the word *might*.  It means that it is not safe for code to assume
> that a cast will give the corresponding char value if one exists.  i.e. that
> assumption is not true for all locales.
>
> As dim says, the test is wrong.  It would be valid for the test to fail if
> __STDC_MB_MIGHT_NEQ_WC__ is *not* defined and not all characters in the
> basic set have the same encoding as wide chars, but it is not correct to
> fail if it is set unless they have the same encoding in *all* locales, *and*
> the vendor is willing to guarantee that they will have the same encoding for
> all locales in all future binary-compatible versions of the system (which an
> automated test can't check).
>
> In summary: the test is nonsense and should be removed.

I couldn't find an existing PR for this, so I filed
http://llvm.org/PR22208 with folks on this thread cc'd. It would be
great if we could get it resolved soon.

Thanks,
Hans

Richard Smith

2015-Jan-13 02:01 UTC

On Mon, Jan 12, 2015 at 1:26 AM, David Chisnall
<David.Chisnall at cl.cam.ac.uk> wrote:
> On 12 Jan 2015, at 08:07, Dimitry Andric <dimitry at andric.com> wrote:
> >
> > On 15 Oct 2014, at 19:42, Richard Smith <richard at metafoo.co.uk> wrote:
> >> On 15 Oct 2014 05:12, "Ed Schouten" <ed at 80386.nl> wrote:
> > ...
> >> The test case in the LLVM tree is invalid and should be discarded. It
> >> erroneously assumes that the encoding of wchar_t is independent of the
> >> locale.
> >>
> > That makes no sense. These values are compile-time constants and cannot
> > possibly depend on the locale.
>
> I believe Richard is wrong here.  There are a number of similar
> compile-time constant macros in the C spec.  I believe the clue as to the
> correct reading of the spec is in the name of the macro:
> __STDC_MB_MIGHT_NEQ_WC__
>
> Note the word *might*.  It means that it is not safe for code to assume
> that a cast will give the corresponding char value if one exists.  i.e.
> that assumption is not true for all locales.

You should read the definitions in the relevant standards rather than
trying to guess what the macro means from its name. Here is the definition:

"The integer constant 1, intended to indicate that, in the encoding for
wchar_t, a member of the basic character set need not have a code value
equal to its value when used as the lone character in an integer character
constant."

So the value 1 indicates that 'x' might not equal L'x' for some character x
in the basic character set. (Note that the 'might' means that there might
exist some *character* where this happens, not that there might exist some
*locale* where this happens.) Since the value of 'x' and L'x' are
determined at translation time, this property obviously cannot depend in
any way on the current locale in the execution environment.

Note that the above property is *exactly* what the test is testing for.
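
For illustration only, a check of that property could look roughly like the sketch below. This is not the test from the LLVM tree; the function name basic_chars_match_wide and the sampled characters are assumptions made up for this example. The static assertions are evaluated at translation time, and the loop compares narrow and wide string literals whose element values are likewise fixed by the compiler.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __STDC_MB_MIGHT_NEQ_WC__
/* Macro absent: the implementation claims 'x' == L'x' for every member of
   the basic character set, so these must hold at translation time. */
_Static_assert('a' == L'a', "basic character differs from wide character");
_Static_assert('0' == L'0', "basic character differs from wide character");
_Static_assert(' ' == L' ', "basic character differs from wide character");
#endif

/* Hypothetical helper: returns 1 if every sampled basic character has the
   same value in the narrow and wide literals, 0 otherwise. */
static int basic_chars_match_wide(void)
{
    static const char    narrow[] = "abcXYZ019 !#%&";
    static const wchar_t wide[]   = L"abcXYZ019 !#%&";

    for (size_t i = 0; i < sizeof narrow - 1; ++i)
        if (narrow[i] != wide[i])
            return 0;
    return 1;
}
```

No locale call appears anywhere in the sketch, which is the point: the values being compared are fixed before the program runs.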

However... the FreeBSD folks don't seem interested in fixing their bug, and
it's technically conforming for an implementation to define this macro to 1
in any situation -- a member of the basic source character set "need not"
have the same value as a narrow or wide character, even though they all
actually do -- making this a quality-of-implementation issue, and I'm tired
of discussing this, so I've relaxed the test for FreeBSD in r225751.

Nikola Smiljanic

2015-Jan-15 00:03 UTC

Count me in for testing Fedora and OpenSUSE.

