Juneyoung Lee via llvm-dev
2020-Oct-09 03:08 UTC
[llvm-dev] Undef and Poison round table follow-up & a plan
It is UB when poison is passed to an operation that raises UB on poison, such as division by poison, dereferencing a poison pointer, or branching on a poison condition. Otherwise, poison is simply propagated and does not raise UB.

Copying poison bytes is okay:

    // Members are initialized to poison at object creation.
    p = alloca {i8, i32}  // p[0], p[4~7] are poison
    q = alloca {i8, i32}  // we want to copy p to q
    v = load i8* p[0]     // v is poison
    store i8 v, i8* q[0]  // poison is simply copied; no UB happened

Similarly, passing/returning poison is allowed as well.

Juneyoung

On Fri, Oct 9, 2020 at 10:45 AM Hubert Tong <hubert.reinterpretcast at gmail.com> wrote:

> On Thu, Oct 8, 2020 at 7:13 PM Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr> wrote:
>
>> > It is important to note that this applies to trap representations and
>> > not to unspecified values. A structure or union never has a trap
>> > representation.
>>
>> Yes, nondeterministic bits would work for padding of struct/union, as
>> described in "(3) The third case is the value of struct/union padding."
>> Members of a struct/union are allowed to have a trap representation,
>> so poison can be used for them.
>
> At what point are the members considered poison? For
> copying/passing/returning a struct or union, there is no UB even if some
> members are uninitialized.
>
>> Juneyoung
>>
>> On Fri, Oct 9, 2020 at 5:37 AM Hubert Tong <hubert.reinterpretcast at gmail.com> wrote:
>>
>>> On Thu, Oct 8, 2020 at 12:12 PM Juneyoung Lee via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Thanks to everyone who participated in the (impromptu) round table
>>>> discussion on Tuesday.
>>>> For those who are interested, here is a summary of the discussion,
>>>> along with a short-term plan for this issue and the relevant patches.
>>>>
>>>>
>>>> *Fixing Miscompilations using Freeze*
>>>> -----------------------------------
>>>>
>>>> To reduce the cost of fixing miscompilations with the freeze
>>>> instruction, we need to optimize freeze away whenever possible.
>>>> Using the no-undef/poison assumptions from the source language (C/C++
>>>> in this context) can play a significant role.
>>>> To make use of these assumptions, here are the short-term goals:
>>>>
>>>> *1. Preserve no-undef/poison assumption of function arguments from
>>>> C/C++ when valid.*
>>>>
>>>> There is an ongoing relevant patch (written by others):
>>>> https://reviews.llvm.org/D81678
>>>>
>>>> *2. Preserve no-undef/poison assumption of lvalue reads in C/C++ when
>>>> valid.*
>>>>
>>>> Reading an indeterminate value from an lvalue that does not have char or
>>>> std::byte type is UB [1].
>>>> Since reading an lvalue is lowered to `load` in IR, we suggest
>>>> attaching a new !noundef metadata to such `load`s.
>>>> The IR-side change is here: https://reviews.llvm.org/D89050
>>>> The clang-side change will be made after D81678 is reviewed,
>>>> because this patch is likely to involve a lot of changes in clang tests.
>>>>
>>>>
>>>> *Replacing Undef with Poison*
>>>> ---------------------------
>>>>
>>>> Since undef is known to be the source of many miscompilations due to
>>>> its complexity, we'd like to suggest gradually moving towards using
>>>> poison only.
>>>> To do so, (1) a `poison` constant should be introduced into LLVM IR
>>>> first, and (2) transformations that introduce `undef` should be
>>>> updated to introduce `poison` instead.
>>>>
>>>> For step (2), we need experimental results showing that it does not
>>>> cause performance degradation. This relies on better support for
>>>> freeze (the no-undef/poison analysis patches).
>>>>
>>>> *1. Introduce a new `poison` constant into IR*:
>>>> https://reviews.llvm.org/D71126
>>>>
>>>> Note that the `poison` constant can be used as a true placeholder
>>>> value as well.
>>>> Undef cannot be used this way in general because it is less undefined
>>>> than poison.
>>>>
>>>> *2. Update transformations that introduce `undef` to introduce `poison`
>>>> instead*
>>>>
>>>> (1) There are transformations that introduce `undef` as a placeholder
>>>> (e.g. a phi operand from an unreachable block).
>>>> For these, `poison` can be used instead.
>>>>
>>>> (2) The value of an uninitialized object (automatic or dynamic).
>>>> These are indeterminate values in C/C++, so it is okay to use poison
>>>> instead.
>>>> A tricky case is a bitfield access, and we have two possible solutions:
>>>>
>>>> - i. Introduce a very-packed struct type
>>>> ```
>>>> <C>
>>>> struct {
>>>>   int a:2, b:6;
>>>> } s;
>>>>
>>>> v = s.a;
>>>>
>>>> =>
>>>>
>>>> <IR>
>>>>
>>>> s = alloca
>>>>
>>>> tmp = load {{i2, i6}}* s ; load as a very packed struct type
>>>> v = extractvalue tmp, 0
>>>> ```
>>>> * Pros: Can be used to precisely lower C/C++'s struct typed function
>>>> argument into IR
>>>> (currently clang coerces a struct into int if small enough; I'll
>>>> explain this detail if anyone requests it)
>>>> * Cons: Since optimizations aren’t aware of the new type, they would
>>>> need to be updated
>>>>
>>>> - ii. Use load-freeze
>>>> ```
>>>> <C>
>>>> struct {
>>>>   int a:2, b:6;
>>>> } s;
>>>>
>>>> v = s.a;
>>>>
>>>> =>
>>>>
>>>> <IR>
>>>> s = alloca
>>>>
>>>> // Poison bits are frozen and returned
>>>> tmp = load freeze i8* s
>>>> v = tmp & 3
>>>> ```
>>>> * Pros: The change is simpler
>>>> * Cons: Store forwarding isn’t free; it needs insertion of freeze
>>>> (store x, p; v = load freeze p => store x, p; v = freeze x)
>>>>
>>>>
>>>> (3) The third case is the value of struct/union padding.
>>>> Padding is filled with unspecified values in C, so poison is too
>>>> undefined to use for it.
>>>> We can fill it with defined bits nondeterministically chosen at
>>>> allocation time (freeze poison).
>>>>
>>>> ```
>>>> <C>
>>>> struct {
>>>>   char a; // 3 bytes padding
>>>>   int b;
>>>> } s;
>>>>
>>>> v = s.b;
>>>>
>>>> =>
>>>>
>>>> <IR>
>>>> s = alloca {i8, i32} // alloca initializes bytes in a type-dependent manner
>>>> // s[0], s[4~7]: poison
>>>> // s[1~3]: let's fill these bytes with nondet. bits
>>>>
>>>> s2 = gep (bitcast s to i8*), 4
>>>> v = load i32 s2
>>>> ```
>>>>
>>>>
>>>> Thanks,
>>>> Juneyoung
>>>>
>>>>
>>>> [1]
>>>> C11 6.2.6.1.5: If the stored value of an object has such a
>>>> representation and is read by an lvalue expression that does not have
>>>> character type, the behavior is undefined.
>>>> (Similarly, C17 6.2.6.1.5)
>>>
>>> It is important to note that this applies to trap representations and
>>> not to unspecified values. A structure or union never has a trap
>>> representation.
>>>
>>>> C++14 8.5.12: If an indeterminate value is produced by an evaluation,
>>>> the behavior is undefined except in the following cases: If an
>>>> indeterminate value of unsigned narrow character type ...
>>>> (Similarly, C++17 11.6.12, C++11 4.1.1)
>>>
>>> While loading undef for the unsigned character type case merely produces
>>> undef, for C++, operations such as sign-extend or zero-extend on an undef
>>> i8 are also undefined behaviour.
>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Juneyoung Lee
Software Foundation Lab, Seoul National University
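The rule stated at the top of this message (poison propagates through ordinary operations, raises UB only in certain operations, and can be pinned down with freeze) can be sketched as a toy model in C. This is only an illustration under stated assumptions, not LLVM code: the `Value` type, the `ub_raised` flag, and all of the `*_value` helpers are hypothetical names, and the `chosen` argument of `freeze_value` stands in for the nondeterministic choice that a real freeze would make.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an i8 value in IR: either defined bits or poison.
   All names here are hypothetical; this is not LLVM code. */
typedef struct {
    bool is_poison;
    uint8_t bits; /* meaningful only when is_poison is false */
} Value;

/* Set when an operation in this model would be undefined behavior. */
bool ub_raised = false;

Value make_defined(uint8_t bits) {
    Value v = { false, bits };
    return v;
}

Value make_poison(void) {
    Value v = { true, 0 };
    return v;
}

/* Copying (a load followed by a store) moves poison along without UB. */
Value copy_value(Value v) {
    return v;
}

/* Bitwise and: a poison operand makes the result poison, but no UB is raised. */
Value and_value(Value a, Value b) {
    if (a.is_poison || b.is_poison) {
        return make_poison();
    }
    return make_defined((uint8_t)(a.bits & b.bits));
}

/* Unsigned division: a poison divisor is UB; a zero divisor is UB as well. */
Value udiv_value(Value a, Value b) {
    if (b.is_poison || b.bits == 0) {
        ub_raised = true;
        return make_poison();
    }
    if (a.is_poison) {
        return make_poison();
    }
    return make_defined((uint8_t)(a.bits / b.bits));
}

/* Freeze: poison becomes some defined value. The real choice is
   nondeterministic; `chosen` stands in for it so the model is testable. */
Value freeze_value(Value v, uint8_t chosen) {
    return v.is_poison ? make_defined(chosen) : v;
}
```

In this model, `and_value(freeze_value(x, c), make_defined(3))` mirrors the load-freeze bitfield lowering quoted above (`tmp = load freeze i8* s; v = tmp & 3`): the result is always defined, whereas masking an unfrozen poison byte stays poison.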
Hubert Tong via llvm-dev
2020-Oct-09 03:19 UTC
[llvm-dev] Undef and Poison round table follow-up & a plan
On Thu, Oct 8, 2020 at 11:09 PM Juneyoung Lee <juneyoung.lee at sf.snu.ac.kr> wrote:

> It is UB when a poison is passed to certain operations that raise UB on
> poison, such as division by poison/dereferencing poison pointer/branching
> on poison condition/etc.

Got it. Thanks.

> Otherwise, poison is simply propagated, but it does not raise UB
> Copying poison bytes is okay:
>
> // Members are initialized to poison at object creation.
> p = alloca {i8, i32} // p[0], p[4~7] are poison

p[0] is an i8, so it shouldn't be poison?

> q = alloca {i8, i32} // we want to copy p to q
> v = load i8* p[0] // v is poison
> store i8 v, i8* q[0] // poison is simply copied; no UB happened
>
> Similarly, passing/returning poison is allowed as well.
>
> Juneyoung
Juneyoung Lee via llvm-dev
2020-Oct-09 03:54 UTC
[llvm-dev] Undef and Poison round table follow-up & a plan
>> // Members are initialized to poison at object creation.
>> p = alloca {i8, i32} // p[0], p[4~7] are poison
>
> p[0] is an i8, so it shouldn't be poison?

My interpretation of the standard is that reading an uninitialized char can also yield a trap representation. If uninitialized, a char variable has an indeterminate value, and C/C++ does not seem to forbid reading a trap representation from it.

C++14 explicitly has an example showing that it is an indeterminate value, at 3.3.2.1:

```
The point of declaration for a name is immediately after its complete
declarator (Clause 8) and before its initializer (if any), except as noted
below. [Example:
  unsigned char x = 12;
  { unsigned char x = x; }
Here the second x is initialized with its own (*indeterminate*) value.
—end example]
```

It seems that a past C++14 draft contained a phrase saying that reading an indeterminate value as an unsigned char should yield an unspecified value, but it was removed:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1787

The removed phrase did not exist in C++11, so I believe it is fine to use poison for uninitialized char types.

Juneyoung
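The situation under discussion, copying an object whose char member was never initialized, can be sketched at the C source level. This is a minimal sketch, assuming the C rule that an object representation may be read through `unsigned char`; `struct Pair`, `copy_bytes`, and `roundtrip_b` are hypothetical names introduced only for illustration. The uninitialized byte is copied but its value is never inspected, which corresponds to the "poison is simply copied" case in the thread.

```c
#include <stddef.h>

/* Hypothetical struct matching the {i8, i32} layout used in the thread. */
struct Pair {
    char a;
    int b;
};

/* Copy an object representation byte by byte through unsigned char. */
void copy_bytes(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
}

/* Member `a` is deliberately left uninitialized; only `b` is read back,
   so the copied-but-uninitialized byte is never examined. */
int roundtrip_b(int value) {
    struct Pair p;
    struct Pair q;
    p.b = value;
    copy_bytes(&q, &p, sizeof p);
    return q.b;
}
```

Calling `roundtrip_b(42)` returns the value stored in `b`; whether the copied byte for `a` is modeled as poison or as an unspecified value makes no observable difference here, because nothing branches on it or otherwise uses it.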