Nuno Lopes via llvm-dev
2016-Oct-18 19:25 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> On 10/18/2016 5:06 AM, Nuno Lopes via llvm-dev wrote: >> Another use is, for example, to implement bit-fields: >> a.x = 2 >> becomes: >> %v = load %a >> %v2 = freeze %v ; %v could be uninitialized data (poison) >> %v3 = ... bitmasking... >> store %a, %v3 > > It seems like you're saying that an integer load which touches any > uninitialized byte of memory results in poison. Therefore, load %a > simplifies to "poison", and your proposed lowering of a bitfield write > throws away the value of every other adjacent bitfield. Am I missing > something?Right, a load touching a single uninitialized bit results in poison. The trick is that on the first bitfield write, all the remaining untouched fields become initialized (with an arbitrary value, though). Essentially we are making the adjacent bitfields undef. So if you have: struct foo a; a.x = 2; a.y = 3; IR becomes: %a = alloca foo %x = load %a %x2 = freeze %v ; %x2 not poison %x3 = bitmasking ; %x3 not poison store %a, %x3 %y = load %a %y2 = freeze %y ; not needed; %y is not poison etc.. Nuno
Krzysztof Parzyszek via llvm-dev
2016-Oct-18 20:07 UTC
[llvm-dev] RFC: Killing undef and spreading poison
On 10/18/2016 2:25 PM, Nuno Lopes via llvm-dev wrote:> Right, a load touching a single uninitialized bit results in poison. > The trick is that on the first bitfield write, all the remaining > untouched fields become initialized (with an arbitrary value, though). > Essentially we are making the adjacent bitfields undef.So "undef" still exists, except now it's obtainable via "freeze(poison)"? -Krzysztof -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Sanjoy Das via llvm-dev
2016-Oct-18 20:12 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Hi Krzysztof, freeze(poison) is different from undef today, in the sense that it is an instruction that produces some random, but fixed bit pattern. E.g. today in %x = undef %y = xor %x, %x we can fold %y to undef since each use of %x can independently see some arbitrary (up to the compiler / environment) bit pattern. But in the new proposal, in: %x = freeze(poison) %y = xor %x, %x that is no longer allowed (%y _has_ to be 0) -- all uses of %x will see some garbage, but fixed bit pattern. -- Sanjoy Krzysztof Parzyszek via llvm-dev wrote:> On 10/18/2016 2:25 PM, Nuno Lopes via llvm-dev wrote: >> Right, a load touching a single uninitialized bit results in poison. >> The trick is that on the first bitfield write, all the remaining >> untouched fields become initialized (with an arbitrary value, though). >> Essentially we are making the adjacent bitfields undef. > > So "undef" still exists, except now it's obtainable via "freeze(poison)"? > > -Krzysztof >
Friedman, Eli via llvm-dev
2016-Oct-18 20:39 UTC
[llvm-dev] RFC: Killing undef and spreading poison
On 10/18/2016 12:25 PM, Nuno Lopes wrote:>> On 10/18/2016 5:06 AM, Nuno Lopes via llvm-dev wrote: >>> Another use is, for example, to implement bit-fields: >>> a.x = 2 >>> becomes: >>> %v = load %a >>> %v2 = freeze %v ; %v could be uninitialized data (poison) >>> %v3 = ... bitmasking... >>> store %a, %v3 >> >> It seems like you're saying that an integer load which touches any >> uninitialized byte of memory results in poison. Therefore, load %a >> simplifies to "poison", and your proposed lowering of a bitfield >> write throws away the value of every other adjacent bitfield. Am I >> missing something? > > Right, a load touching a single uninitialized bit results in poison. > The trick is that on the first bitfield write, all the remaining > untouched fields become initialized (with an arbitrary value, though). > Essentially we are making the adjacent bitfields undef. > > So if you have: > struct foo a; > a.x = 2; > a.y = 3; > > IR becomes: > %a = alloca foo > %x = load %a > %x2 = freeze %v ; %x2 not poison > %x3 = bitmasking ; %x3 not poison > store %a, %x3 > > %y = load %a > %y2 = freeze %y ; not needed; %y is not poison > etc.. > > NunoOh, okay, that works. I should have thought that through a bit more carefully. Sort of related testcase: struct S { int x : 24; }; S s = { 2 }; Gives: @s = local_unnamed_addr global { i8, i8, i8, i8 } { i8 2, i8 0, i8 0, i8 undef }, align 4 When undef goes away, clang will need to be patched to generate zero here? More generally, will struct padding in globals be poison, or some arbitrary value? A couple of related topics: Instcombine currently transforms "memcpy(dst, src, 4)" to "load i32 src; store i32 dst". I guess that's illegal under this model? How about the related transform "memcpy(dst, src, 4)" to "load <4 x i8> src; store <4 x i8> dst"? (SROA also performs similar transforms.) Have you given any thought to how auto-upgrade for bitcode files would work? I guess in general, you can upgrade an old "load i32" to "bitcast (freeze (load <4 x i8>)) to i32", but that's really awkward. -Eli -- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Nuno Lopes via llvm-dev
2016-Oct-18 21:23 UTC
[llvm-dev] RFC: Killing undef and spreading poison
>> Right, a load touching a single uninitialized bit results in poison. >> The trick is that on the first bitfield write, all the remaining >> untouched fields become initialized (with an arbitrary value, though). >> Essentially we are making the adjacent bitfields undef. >> >> So if you have: >> struct foo a; >> a.x = 2; >> a.y = 3; >> >> IR becomes: >> %a = alloca foo >> %x = load %a >> %x2 = freeze %v ; %x2 not poison >> %x3 = bitmasking ; %x3 not poison >> store %a, %x3 >> >> %y = load %a >> %y2 = freeze %y ; not needed; %y is not poison >> etc.. > > Oh, okay, that works. I should have thought that through a bit more > carefully.I must say I got scared about bitfields a few weeks ago as well :)> Sort of related testcase: > > struct S { int x : 24; }; > S s = { 2 }; > > Gives: > > @s = local_unnamed_addr global { i8, i8, i8, i8 } { i8 2, i8 0, i8 0, i8 > undef }, align 4 > > When undef goes away, clang will need to be patched to generate zero here? > More generally, will struct padding in globals be poison, or some > arbitrary value?Padding can safely become poison.> Instcombine currently transforms "memcpy(dst, src, 4)" to "load i32 src; > store i32 dst". I guess that's illegal under this model?How about the > related transform "memcpy(dst, src, 4)" to "load <4 x i8> src; store <4 x > i8> dst"? (SROA also performs similar transforms.)The first is illegal, you're right (unless you know that they aren't poison, of course). The second version with vectors is correct. The problem is that codegen for vector loads/stores doesn't seem optimal ATM. We've seen really bad cases (e.g., http://llvm.org/PR30615)> Have you given any thought to how auto-upgrade for bitcode files would > work? I guess in general, you can upgrade an old "load i32" to "bitcast > (freeze (load <4 x i8>)) to i32", but that's really awkward.Ah, good point, I forgot about that. I was assuming auto-upgrade only needed to change undef into "freeze poison", but you're right that load semantics are also changing and would need patching as well. It actually needs to be "load <32 x i1>" because undef is per bit and one bit being poison cannot taint the whole byte. Of course we could have a few optimizations for common patterns such as (load (alloca)) -> (freeze (load (alloca)), but ATM I don't see any way around the pattern you suggest. Thanks, Nuno
Chandler Carruth via llvm-dev
2016-Oct-19 10:30 UTC
[llvm-dev] RFC: Killing undef and spreading poison
I'm still digesting the proposal in its entirety, but one point I want to call out here: On Tue, Oct 18, 2016 at 1:39 PM Friedman, Eli via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Instcombine currently transforms "memcpy(dst, src, 4)" to "load i32 src; > store i32 dst". I guess that's illegal under this model? How about the > related transform "memcpy(dst, src, 4)" to "load <4 x i8> src; store <4 > x i8> dst"? (SROA also performs similar transforms.) >I think it is really important that memcpy and load/store pair be equivalent. If this is in fact true about the proposal, I would want to really dig into why and whether anything could be done to make them equivalent. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161019/bd8ea53e/attachment.html>