thr3ads.net - llvm dev - [llvm-dev] RFC: Killing undef and spreading poison [Oct 2016]

If this information is useful, please help other people find it:
Share via:

Nuno Lopes via llvm-dev

2016-Oct-18 15:10 UTC

[llvm-dev] RFC: Killing undef and spreading poison

>> A use of freeze is to enable speculative execution.  For example, loop 
>> switching does the following transformation:
>> while (C) {
>>   if (C2) {
>>    A
>>   } else {
>>    B
>>   }
>> }
>> =>
>> if (C2) {
>>    while (C)
>>       A
>> } else {
>>     while (C)
>>        B
>> }
>> 
>> Here we are evaluating C2 before C.  If the original loop never 
>> executed then we had never evaluated C2, while now we do.  So we need 
>> to make sure there's no UB for branching on C2.  Freeze ensures
that
>> so we would actually have 'if (freeze(C2))' instead.
>> Note that having branch on poison not trigger UB has its own problems.
> 
> Can you please elaborate on these problems?
Sure! For example, the following transformation would be wrong if branch on
poison was a non-deterministic branch instead of UB:

    %a = add i32 %x, 1
    %c = icmp eq i32 %a, %poison
    br i1 %c, label %bb1, label %bb2
bb1:
    %b = add i32 %x, 1
    %d = call i32 @bar(i32 %b)
->
bb1:
    %d = call i32 @bar(i32 % poison)

GVN will perform this kind of transformation: it concludes that %a, %b, and
%poison are all equal and picks one representative value. However, these values
are equal only when they are not poison.  If %poison is indeed poison, then the
transformation is wrong because before it was passing an ok value to bar() and
after the transformation it is passing poison.
On the other hand, if branching on poison is UB, the original program was
executing UB already because %poison (and therefore %c) were poison. So the
transformation is ok.

>> Discussion on select
>> ===================>> Select is a tricky instruction to define
semantics for.  There are
>> multiple views of select's purpose that are contradictory:
>>  1) Select should be equivalent to arithmetic (e.g., allow  'select
>> %c, true, %x' -> 'or %c, %x' and arithmetic ->
select)
>>  2) br + phi -> select should be allowed (simplifyCFG)
>>  3) Select -> br + phi should be allowed
>> 
>> Since we really want to have 2) since that's simplifyCFG, we
propose
>> to make 'select %c, %a, %b' poison if any of the following
holds:
>>  - %c is poison
>>  - %c = true and %a is poison
>>  - %c = false and %b is poison
>> 
>> This disallows 3) and some transformations of 1). Since 3) is only 
>> performed in CodeGenPrepare, we can safely ignore it (no aggressive 
>> optimizations are run afterwards).
>
> This strikes me as a dangerous game to play. CGP's transformations need
to be semantically sound (even if they are anti-canonical). In part, this is
because our IR-level analyses are used by CGP. Moreover, targets after CGP run
IR-level transformations, and having different semantic rules there is going to
make code reuse hard in subtle ways. Which brings up another issue: We currently
have undef at the MI level; do you propose changing that, and if so, how?
It is sound to do the transformation at IR level if you freeze the condition.
My suggestion was to avoid introducing overhead in CGP.  This is the current
state of affairs and it seems to work right now.  I'm not shocked to see the
IR changing the semantics throughout the pipeline, since while the compiler
proceeds, the type of transformations done also change.
But I have to say that my understanding of LLVM after CGP is very limited. I was
not aware of the code reuse from IR-level analyses.  I agree this needs to be
scrutinized carefully.  Alternatively, we can just introduce freeze and pay the
price (likely low anyway).

Regarding undef at MI level, we are not proposing any change.  I don't see
any reason right now to change it since it seems that optimizations at MI level
are fine with just having undef. There's no nsw/nuw stuff there and undef
seems very useful.
Poison is lowered into undef. And freeze is lowered into this pair of copy nodes
that seems to stop propagation of undef (to ensure all uses see the same value).
I don't know enough about MI to know if there's a better way to lower
freeze, though.

Thanks,
Nuno

Hal Finkel via llvm-dev

2016-Oct-18 18:01 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

----- Original Message -----> From: "Nuno Lopes" <nuno.lopes at ist.utl.pt>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Sanjoy Das" <sanjoy at playingwithpointers.com>,
"David Majnemer" <david.majnemer at gmail.com>, "John
Regehr"
> <regehr at cs.utah.edu>, "Chung-Kil Hur" <gil.hur at
sf.snu.ac.kr>, "Juneyoung Lee" <juneyoung.lee at
sf.snu.ac.kr>,
> "Yoonseung Kim" <yoonseung.kim at sf.snu.ac.kr>,
"Youngju Song" <youngju.song at sf.snu.ac.kr>, "Dan
Gohman"
> <dan433584 at gmail.com>, nlewycky at google.com, mbraun at
apple.com, qcolombet at apple.com, "t p northover"
> <t.p.northover at gmail.com>, llvm-dev at lists.llvm.org
> Sent: Tuesday, October 18, 2016 10:10:41 AM
> Subject: RE: RFC: Killing undef and spreading poison
> 
> >> A use of freeze is to enable speculative execution.  For example,
> >> loop
> >> switching does the following transformation:
> >> while (C) {
> >>   if (C2) {
> >>    A
> >>   } else {
> >>    B
> >>   }
> >> }
> >> =>
> >> if (C2) {
> >>    while (C)
> >>       A
> >> } else {
> >>     while (C)
> >>        B
> >> }
> >> 
> >> Here we are evaluating C2 before C.  If the original loop never
> >> executed then we had never evaluated C2, while now we do.  So we
> >> need
> >> to make sure there's no UB for branching on C2.  Freeze
ensures
> >> that
> >> so we would actually have 'if (freeze(C2))' instead.
> >> Note that having branch on poison not trigger UB has its own
> >> problems.
> > 
> > Can you please elaborate on these problems?
> 
> Sure! For example, the following transformation would be wrong if
> branch on poison was a non-deterministic branch instead of UB:
> 
>     %a = add i32 %x, 1
>     %c = icmp eq i32 %a, %poison
>     br i1 %c, label %bb1, label %bb2
> bb1:
>     %b = add i32 %x, 1
>     %d = call i32 @bar(i32 %b)
> ->
> bb1:
>     %d = call i32 @bar(i32 % poison)
> 
> GVN will perform this kind of transformation: it concludes that %a,
> %b, and %poison are all equal and picks one representative value.
> However, these values are equal only when they are not poison. 
Okay; so the problem is that an instruction that is value-equivalent to a poison
value is not itself necessarily poison?
> If
> %poison is indeed poison, then the transformation is wrong because
> before it was passing an ok value to bar() and after the
> transformation it is passing poison.
> On the other hand, if branching on poison is UB, the original program
> was executing UB already because %poison (and therefore %c) were
> poison. So the transformation is ok.
> 
> 
> >> Discussion on select
> >> ===================> >> Select is a tricky instruction to
define semantics for.  There are
> >> multiple views of select's purpose that are contradictory:
> >>  1) Select should be equivalent to arithmetic (e.g., allow
> >>   'select
> >> %c, true, %x' -> 'or %c, %x' and arithmetic ->
select)
> >>  2) br + phi -> select should be allowed (simplifyCFG)
> >>  3) Select -> br + phi should be allowed
> >> 
> >> Since we really want to have 2) since that's simplifyCFG, we
> >> propose
> >> to make 'select %c, %a, %b' poison if any of the following
holds:
> >>  - %c is poison
> >>  - %c = true and %a is poison
> >>  - %c = false and %b is poison
> >> 
> >> This disallows 3) and some transformations of 1). Since 3) is only
> >> performed in CodeGenPrepare, we can safely ignore it (no
> >> aggressive
> >> optimizations are run afterwards).
> >
> > This strikes me as a dangerous game to play. CGP's transformations
> > need to be semantically sound (even if they are anti-canonical).
> > In part, this is because our IR-level analyses are used by CGP.
> > Moreover, targets after CGP run IR-level transformations, and
> > having different semantic rules there is going to make code reuse
> > hard in subtle ways. Which brings up another issue: We currently
> > have undef at the MI level; do you propose changing that, and if
> > so, how?
> 
> It is sound to do the transformation at IR level if you freeze the
> condition.
> My suggestion was to avoid introducing overhead in CGP.  This is the
> current state of affairs and it seems to work right now.  I'm not
> shocked to see the IR changing the semantics throughout the
> pipeline, since while the compiler proceeds, the type of
> transformations done also change.
> But I have to say that my understanding of LLVM after CGP is very
> limited. I was not aware of the code reuse from IR-level analyses.
>  I agree this needs to be scrutinized carefully.  Alternatively, we
> can just introduce freeze and pay the price (likely low anyway).
I think we'd need to do this to ensure consistency; I don't think that
the price would be high, especially because we soon drop into different
representations and the IR is used only for analysis (for pointer aliasing,
etc.).
> 
> Regarding undef at MI level, we are not proposing any change.  I
> don't see any reason right now to change it since it seems that
> optimizations at MI level are fine with just having undef. There's
> no nsw/nuw stuff there and undef seems very useful.
We've already started propagating nsw/nuw into the SDAG, and as we move
toward GlobalISel, we'll have them at the MI layer as well. I think
we'll need to consider how to propagate this information to the MI layer
directly.

 -Hal
> Poison is lowered into undef. And freeze is lowered into this pair of
> copy nodes that seems to stop propagation of undef (to ensure all
> uses see the same value).  I don't know enough about MI to know if
> there's a better way to lower freeze, though.
> 
> Thanks,
> Nuno
> 
> 
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Nuno Lopes via llvm-dev

2016-Oct-18 19:35 UTC

head link

[llvm-dev] Killing undef and spreading poison

>> >> Here we are evaluating C2 before C.  If the original loop
never
>> >> executed then we had never evaluated C2, while now we do.  So
we
>> >> need
>> >> to make sure there's no UB for branching on C2.  Freeze
ensures
>> >> that
>> >> so we would actually have 'if (freeze(C2))' instead.
>> >> Note that having branch on poison not trigger UB has its own
>> >> problems.
>> >
>> > Can you please elaborate on these problems?
>>
>> Sure! For example, the following transformation would be wrong if
>> branch on poison was a non-deterministic branch instead of UB:
>>
>>     %a = add i32 %x, 1
>>     %c = icmp eq i32 %a, %poison
>>     br i1 %c, label %bb1, label %bb2
>> bb1:
>>     %b = add i32 %x, 1
>>     %d = call i32 @bar(i32 %b)
>> ->
>> bb1:
>>     %d = call i32 @bar(i32 % poison)
>>
>> GVN will perform this kind of transformation: it concludes that %a,
>> %b, and %poison are all equal and picks one representative value.
>> However, these values are equal only when they are not poison.
>
> Okay; so the problem is that an instruction that is value-equivalent to a 
> poison value is not itself necessarily poison?
Right.
I think there were other examples, but I don't have them here handy.  But at
least this one is very problematic for GVN.

>> >> Discussion on select
>> >> ===================>> >> Select is a tricky
instruction to define semantics for.  There are
>> >> multiple views of select's purpose that are contradictory:
>> >>  1) Select should be equivalent to arithmetic (e.g., allow
>> >>   'select
>> >> %c, true, %x' -> 'or %c, %x' and arithmetic
-> select)
>> >>  2) br + phi -> select should be allowed (simplifyCFG)
>> >>  3) Select -> br + phi should be allowed
>> >>
>> >> Since we really want to have 2) since that's simplifyCFG,
we
>> >> propose
>> >> to make 'select %c, %a, %b' poison if any of the
following holds:
>> >>  - %c is poison
>> >>  - %c = true and %a is poison
>> >>  - %c = false and %b is poison
>> >>
>> >> This disallows 3) and some transformations of 1). Since 3) is
only
>> >> performed in CodeGenPrepare, we can safely ignore it (no
>> >> aggressive
>> >> optimizations are run afterwards).
>> >
>> > This strikes me as a dangerous game to play. CGP's
transformations
>> > need to be semantically sound (even if they are anti-canonical).
>> > In part, this is because our IR-level analyses are used by CGP.
>> > Moreover, targets after CGP run IR-level transformations, and
>> > having different semantic rules there is going to make code reuse
>> > hard in subtle ways. Which brings up another issue: We currently
>> > have undef at the MI level; do you propose changing that, and if
>> > so, how?
>>
>> It is sound to do the transformation at IR level if you freeze the
>> condition.
>> My suggestion was to avoid introducing overhead in CGP.  This is the
>> current state of affairs and it seems to work right now.  I'm not
>> shocked to see the IR changing the semantics throughout the
>> pipeline, since while the compiler proceeds, the type of
>> transformations done also change.
>> But I have to say that my understanding of LLVM after CGP is very
>> limited. I was not aware of the code reuse from IR-level analyses.
>>  I agree this needs to be scrutinized carefully.  Alternatively, we
>> can just introduce freeze and pay the price (likely low anyway).
>
> I think we'd need to do this to ensure consistency; I don't think
that the
> price would be high, especially because we soon drop into different 
> representations and the IR is used only for analysis (for pointer 
> aliasing, etc.).
Ok, makes sense; thanks!

>> Regarding undef at MI level, we are not proposing any change.  I
>> don't see any reason right now to change it since it seems that
>> optimizations at MI level are fine with just having undef. There's
>> no nsw/nuw stuff there and undef seems very useful.
>
> We've already started propagating nsw/nuw into the SDAG, and as we move
> toward GlobalISel, we'll have them at the MI layer as well. I think
we'll
> need to consider how to propagate this information to the MI layer 
> directly.
I see.  Then it might make sense to introduce poison in MI then.  But only 
if there's a plan to start doing optimizations that leverage nsw/nuw at MI 
level as well.  Otherwise undef is just fine.
I guess we need to do this in phases, though, to avoid breaking everything 
at the same time :)

Nuno

llvm dev - Oct 2016 - RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] Killing undef and spreading poison