Juneyoung Lee via llvm-dev
2020-Oct-09 03:08 UTC
[llvm-dev] Undef and Poison round table follow-up & a plan
Poison raises UB only when it is passed to an operation that is undefined
on poison, such as dividing by poison, dereferencing a poison pointer, or
branching on a poison condition.
Otherwise, poison is simply propagated and no UB is raised.
Copying poison bytes is okay:
// Members are initialized to poison at object creation.
p = alloca {i8, i32} // p[0], p[4~7] are poison
q = alloca {i8, i32} // we want to copy p to q
v = load i8* p[0] // v is poison
store i8 v, i8* q[0] // poison is simply copied; no UB happened
Similarly, passing/returning poison is allowed as well.
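
A rough C-level analogue of the copy above, as a sketch only: the names `struct pair`, `copy_pair` and `copy_then_read_a` are hypothetical, and uninitialized C storage only approximates IR poison. The struct is copied with `memcpy` while one member is still uninitialized, and only the initialized member is read afterwards.

```c
#include <string.h>

struct pair {
    char a;
    int b;
};

/* Copy the object representation byte by byte, as memcpy does.
   Copying bytes whose contents are indeterminate is not an error by itself. */
static void copy_pair(struct pair *dst, const struct pair *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Copies a struct whose member b was never written, then reads only
   the member that was initialized before the copy. */
static char copy_then_read_a(char a) {
    struct pair p;      /* p.b and any padding are left uninitialized */
    struct pair q;
    p.a = a;
    copy_pair(&q, &p);  /* the copy does not interpret the uninitialized bytes */
    return q.a;         /* only the initialized member is read */
}
```

Calling `copy_then_read_a('x')` returns `'x'`. Reading `q.b` as an `int` instead would be the C-level counterpart of using the poison value.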
Juneyoung
On Fri, Oct 9, 2020 at 10:45 AM Hubert Tong <
hubert.reinterpretcast at gmail.com> wrote:
> On Thu, Oct 8, 2020 at 7:13 PM Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr>
> wrote:
>
>> > It is important to note that this applies to trap representations and
>> > not to unspecified values. A structure or union never has a trap
>> > representation.
>> Yes, nondeterministic bits would work for padding of struct/union, as
>> described in "(3) The third case is the value of struct/union padding".
>> The members of a struct/union are allowed to have trap representations,
>> so poison can be used for them.
>>
> At what point are the members considered poison? For
> copying/passing/returning a struct or union, there is no UB even if some
> members are uninitialized.
>
>
>>
>> Juneyoung
>>
>> On Fri, Oct 9, 2020 at 5:37 AM Hubert Tong <
>> hubert.reinterpretcast at gmail.com> wrote:
>>
>>> On Thu, Oct 8, 2020 at 12:12 PM Juneyoung Lee via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Thanks to everyone who participated in the (impromptu) round table
>>>> discussion on Tuesday.
>>>> For those who are interested, here is a summary of the discussion,
>>>> along with a short-term plan regarding this issue and the relevant
>>>> patches.
>>>>
>>>>
>>>> *Fixing Miscompilations using Freeze*
>>>> -----------------------------------
>>>>
>>>> To reduce the cost of fixing miscompilations using the freeze
>>>> instruction, we need to optimize freeze away whenever possible.
>>>> Using the no-undef/poison assumption from the source language (C/C++
>>>> in this context) can play a significant role.
>>>> To make use of these assumptions, here are the short-term goals:
>>>>
>>>> *1. Preserve no-undef/poison assumption of function arguments from
>>>> C/C++ when valid.*
>>>>
>>>> There is an ongoing relevant patch (that is written by others):
>>>> https://reviews.llvm.org/D81678
>>>>
>>>> *2. Preserve no-undef/poison assumption of lvalue reads in C/C++
>>>> when valid.*
>>>>
>>>> Reading an indeterminate value from an lvalue that does not have char
>>>> or std::byte type is UB [1].
>>>> Since reading an lvalue is lowered to `load` in IR, we suggest
>>>> attaching a new !noundef metadata to such `load`s.
>>>> The IR-side change is here: https://reviews.llvm.org/D89050
>>>> The clang-side change is going to be made after D81678 is reviewed,
>>>> because it is likely that this patch will have a lot of changes in
>>>> clang tests.
>>>>
>>>>
>>>> *Replacing Undef with Poison*
>>>> ---------------------------
>>>>
>>>> Since undef is known to be the source of many miscompilations due to
>>>> its complexity, we'd like to suggest gradually moving towards using
>>>> poison only.
>>>> To achieve this, (1) a `poison` constant should be introduced into
>>>> LLVM IR first, and (2) transformations that introduce `undef` should
>>>> be updated to introduce `poison` instead.
>>>>
>>>> For step (2), we need an experimental result showing that it does not
>>>> cause performance degradation. This relies on better support for
>>>> freeze (the no-undef/poison analysis patches).
>>>>
>>>> *1. Introduce a new `poison` constant into IR*:
>>>> https://reviews.llvm.org/D71126
>>>>
>>>> Note that `poison` constant can be used as a true placeholder value
>>>> as well.
>>>> Undef cannot be used in general because it is less undefined than
>>>> poison.
>>>>
>>>> *2. Update transformations that introduce `undef` to introduce
>>>> `poison` instead*
>>>>
>>>> (1) There are transformations that introduce `undef` as a placeholder
>>>> (e.g. phi operand from an unreachable block).
>>>> For these, `poison` can be used instead.
>>>>
>>>> (2) The value of an uninitialized object (automatic or dynamic).
>>>> They are indeterminate values in C/C++, so okay to use poison instead.
>>>> A tricky case is a bitfield access, and we have two possible solutions:
>>>>
>>>> - i. Introduce a very-packed struct type
>>>> ```
>>>> <C>
>>>> struct {
>>>> int a:2, b:6;
>>>> } s;
>>>>
>>>> v = s.a;
>>>>
>>>> =>
>>>>
>>>> <IR>
>>>>
>>>> s = alloca
>>>>
>>>> tmp = load {{i2, i6}}* s ; load as a very packed struct type
>>>> v = extractvalue tmp, 0
>>>> ```
>>>> * Pros: Can be used to precisely lower C/C++'s struct typed function
>>>> argument into IR
>>>> (currently clang coerces a struct into int if small enough; I'll
>>>> explain about this detail if anyone requests)
>>>> * Cons: Since optimizations aren’t aware of the new type, they should
>>>> be updated
>>>>
>>>> - ii. Use load-freeze
>>>> ```
>>>> <C>
>>>> struct {
>>>> int a:2, b:6;
>>>> } s;
>>>>
>>>> v = s.a;
>>>>
>>>> =>
>>>>
>>>> <IR>
>>>> s = alloca
>>>>
>>>> // Poison bits are frozen and returned
>>>> tmp = load freeze i8* s
>>>> v = tmp & 3
>>>> ```
>>>> * Pros: The change is simpler
>>>> * Cons: Store forwarding isn’t free; needs insertion of freeze
>>>> (store x, p; v = load freeze p => store x, p; v = freeze x)
>>>>
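
To see why the bitfield case is tricky, here is a small C model of the `v = tmp & 3` step, as a sketch only. The layout (field `a` in bits 0-1, field `b` in bits 2-7 of one byte) and the helper names `pack_fields`, `read_a` and `read_b` are assumptions made for illustration, and the fields are modeled as unsigned. The point is that reading `a` loads the whole byte, including the bits that belong to `b`.

```c
#include <stdint.h>

/* Assumed layout for illustration: a occupies bits 0-1, b occupies bits 2-7. */
static uint8_t pack_fields(unsigned a, unsigned b) {
    return (uint8_t)((a & 0x3u) | ((b & 0x3Fu) << 2));
}

/* Mirrors `v = tmp & 3`: load the whole byte, then mask out field a.
   The load also touches the six bits that hold b. */
static unsigned read_a(const uint8_t *storage) {
    uint8_t tmp = *storage;
    return tmp & 0x3u;
}

/* Field b is read from the same byte by shifting and masking. */
static unsigned read_b(const uint8_t *storage) {
    uint8_t tmp = *storage;
    return (tmp >> 2) & 0x3Fu;
}
```

With `uint8_t s = pack_fields(2, 45);`, `read_a(&s)` yields 2 and `read_b(&s)` yields 45. In the IR setting, if only `a` had been stored, the byte-wide load would also pick up the uninitialized bits of `b`, which is what the very-packed type or the freeze is meant to handle.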
>>>>
>>>> (3) The third case is the value of struct/union padding.
>>>> Padding is filled with unspecified values in C, so poison is too
>>>> undefined to use for it.
>>>> We can fill it with defined bits nondeterministically chosen at
>>>> allocation time (freeze poison).
>>>>
>>>> ```
>>>> <C>
>>>> struct {
>>>> char a; // 3 bytes padding
>>>> int b;
>>>> } s;
>>>>
>>>> v = s.b;
>>>>
>>>> =>
>>>>
>>>> <IR>
>>>> s = alloca {i8, i32} // alloca initializes bytes in a type-dependent manner
>>>> // s[0], s[4~7]: poison
>>>> // s[1~3]: let's fill these bytes with nondet. bits
>>>>
>>>> s2 = gep (bitcast s to i8*), 4
>>>> v = load i32 s2
>>>> ```
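
The byte offsets used in the IR above can be checked from C with `offsetof`, as a small sketch. The names `struct padded`, `padding_after_a` and `offset_of_b` are hypothetical, and the actual numbers depend on the ABI: on common targets with a 4-byte, 4-aligned `int` they are 3 padding bytes and offset 4, matching the `gep ..., 4` above, but the C standard does not guarantee this.

```c
#include <stddef.h>

struct padded {
    char a;   /* followed by padding so that b is suitably aligned */
    int b;
};

/* Number of padding bytes between a and b on the current ABI. */
static size_t padding_after_a(void) {
    return offsetof(struct padded, b)
         - (offsetof(struct padded, a) + sizeof(char));
}

/* Byte offset at which b is loaded. */
static size_t offset_of_b(void) {
    return offsetof(struct padded, b);
}
```

The bytes counted by `padding_after_a()` are the ones that would be filled with nondeterministic bits, while the bytes of `a` and `b` themselves would start out as poison.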
>>>>
>>>>
>>>> Thanks,
>>>> Juneyoung
>>>>
>>>>
>>>>
>>>> [1]
>>>> C11 6.2.6.1.5: If the stored value of an object has such a
>>>> representation and is read by an lvalue expression that does not have
>>>> character type, the behavior is undefined.
>>>> (Similarly, C17 6.2.6.1.5)
>>>>
>>> It is important to note that this applies to trap representations and
>>> not to unspecified values. A structure or union never has a trap
>>> representation.
>>>
>>>
>>>> C++14 8.5.12: If an indeterminate value is produced by an evaluation,
>>>> the behavior is undefined except in the following cases: If an
>>>> indeterminate value of unsigned narrow character type ...
>>>> (Similarly, C++17 11.6.12 , C++11 4.1.1)
>>>>
>>> While loading undef for the unsigned character type case merely
>>> produces undef, for C++, operations such as sign-extend or zero-extend
>>> on an undef i8 are also undefined behaviour.
>>>
>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>
>>
>> --
>>
>> Juneyoung Lee
>> Software Foundation Lab, Seoul National University
>>
>
--
Juneyoung Lee
Software Foundation Lab, Seoul National University
Hubert Tong via llvm-dev
2020-Oct-09 03:19 UTC
[llvm-dev] Undef and Poison round table follow-up & a plan
On Thu, Oct 8, 2020 at 11:09 PM Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr> wrote:

> It is UB when a poison is passed to certain operations that raise UB on
> poison, such as division by poison/dereferencing poison pointer/branching
> on poison condition/etc.

Got it. Thanks.

> Otherwise, poison is simply propagated, but it does not raise UB
> Copying poison bytes is okay:
>
> // Members are initialized to poison at object creation.
> p = alloca {i8, i32} // p[0], p[4~7] are poison

p[0] is an i8, so it shouldn't be poison?

> q = alloca {i8, i32} // we want to copy p to q
> v = load i8* p[0] // v is poison
> store i8 v, i8* q[0] // poison is simply copied; no UB happened
>
> Similarly, passing/returning poison is allowed as well.
>
> Juneyoung
Juneyoung Lee via llvm-dev
2020-Oct-09 03:54 UTC
[llvm-dev] Undef and Poison round table follow-up & a plan
>> // Members are initialized to poison at object creation.
>> p = alloca {i8, i32} // p[0], p[4~7] are poison
>
> p[0] is an i8, so it shouldn't be poison?

My interpretation of the standard is that reading an uninitialized char can
also yield a trap representation. If uninitialized, a char variable has an
indeterminate value, and C/C++ does not seem to forbid reading a trap
representation from it.
C++14 explicitly has an example at 3.3.2.1 showing that the value is
indeterminate:

```
The point of declaration for a name is immediately after its complete
declarator (Clause 8) and before its initializer (if any), except as noted
below. [Example:
  unsigned char x = 12;
  { unsigned char x = x; }
Here the second x is initialized with its own (*indeterminate*) value.
—end example]
```

It seems that an earlier C++14 draft had a phrase saying that reading an
indeterminate value as an unsigned char should yield an unspecified value,
but it was removed:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1787
The removed phrase did not exist in C++11, so I believe it is fine to use
poison for uninitialized char types.

Juneyoung

--
Juneyoung Lee
Software Foundation Lab, Seoul National University