thr3ads.net - llvm dev - [llvm-dev] RFC: Killing undef and spreading poison [Oct 2016]

If this information is useful, please help other people find it:
Share via:

Michael Kuperstein via llvm-dev

2016-Oct-20 21:05 UTC

[llvm-dev] RFC: Killing undef and spreading poison

The fact two IR values are defined the same way doesn't necessarily imply
they are actually the same value.
e.g.

%a = load volatile i32, i32* %p
%b = load volatile i32, i32* %p

As Sanjoy said, though, it should always be legal to optimize all uses of
different "freeze(%x)" values to use the same def - this is equivalent
to
choosing the same value for all freezes. It's just not necessary to do so.

On Thu, Oct 20, 2016 at 1:58 PM, Krzysztof Parzyszek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> In both of these cases, the expression tree in the IR is going to look like
>   == (freeze(%x), freeze(%x))
>
> The %a and %b are just labels on values, which are defined in the exact
> same way. How do you differentiate these two?
>
> If %a = freeze(%x), is %a+1 == %a+1?
>
> -Krzysztof
>
>
>
> On 10/20/2016 3:36 PM, Sanjoy Das wrote:
>
>> Hi Krzysztof,
>>
>> Krzysztof Parzyszek wrote:
>>
>>> On 10/18/2016 4:29 PM, Nuno Lopes wrote:
>>>
>>>> Even %a and %b might not be the same in "%a = freeze(%x),
%b >>>> freeze(%x)" (each freeze returns an arbitrary, but
fixed, value).
>>>>
>>>
>>> Assume that %x is known to be a poison value and have:
>>> %a = freeze(%x)
>>> %b = freeze(%x)
>>>
>>> Is %a == %a true?
>>>
>>
>> Yes, %a is always == %a.  It is a normal SSA value with some unspecific
>> content.
>>
>> Is %a == %b true?
>>>
>>
>> Not necessarily; but the compiler can make it true by (consistently)
>> choosing equal values for %a and %b.
>>
>> By consistently I mean it can't fold one instance of %a == %b to
true
>> and fold another instance of %a == %b to false.
>>
>> -- Sanjoy
>>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161020/0c2fb0bb/attachment.html>

Daniel Berlin via llvm-dev

2016-Oct-20 21:20 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

On Thu, Oct 20, 2016 at 2:05 PM, Michael Kuperstein via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> The fact two IR values are defined the same way doesn't necessarily
imply
> they are actually the same value.
>
Is this true for anything other than memory (mainly curious) ?

> e.g.
>
> %a = load volatile i32, i32* %p
> %b = load volatile i32, i32* %p
>
It's worth pointing out these are just missing implicit state that could
easily be made explicit.

IE

%1 = new heap version
%a = load volatile i32, i32* %p, %1
%2 = new heap version
%a = load volatile i32, i32* %p, %2



>As Sanjoy said, though, it should always be legal to optimize all uses ofdifferent "freeze(%x)" values to use the same def - this is equivalent
to>choosing the same value for all freezes. It's just not necessary to do
so.
So just to be clear (mainly so i can go back to letting y'all hash this
out), in value numbering:

1. I can value number all freeze operations on the same operand to the same
value
 *as long as*
2. I  replace all freeze values of the same operand with a single one.

Or am i misunderstanding?

If i've got it right, the issue i see with #2 is whether i can hoist
freezes or not.

If not, it may not be possible to cover all uses.

IE

if (a)  {
  %1 = freeze (%x)
} else if (b) {
  %2 = freeze (%x)
} else if (c){
  %3 = freeze (%x)
}

I cannot replace these with a single freeze def without placing a freeze
above the if block.

*If* i need to replace them all with the same freeze def in order to be
able to consider them the same value, either i need to be able to hoist
them, or i have to assume the freezes have different values, or  this makes
value numbering much harder (I basically have to make some implicit state
explicit like we do for memoryssa, where the implicit state is "state of
the heap".  In this case, i would associate each freeze with a version
based on the SSA renaming algorithm, so i knew which freezes i could *ever*
replace with other existing freezes).

Normally the *value* doesn't depend on the location in the IR, and to the
degree it does, we try to make that explicit.

Again, just trying to understand, mainly so i know if i have to care or not.


On Thu, Oct 20, 2016 at 1:58 PM, Krzysztof Parzyszek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> In both of these cases, the expression tree in the IR is going to look like
>   == (freeze(%x), freeze(%x))
>
> The %a and %b are just labels on values, which are defined in the exact
> same way. How do you differentiate these two?
>
> If %a = freeze(%x), is %a+1 == %a+1?
>
> -Krzysztof
>
>
>
> On 10/20/2016 3:36 PM, Sanjoy Das wrote:
>
>> Hi Krzysztof,
>>
>> Krzysztof Parzyszek wrote:
>>
>>> On 10/18/2016 4:29 PM, Nuno Lopes wrote:
>>>
>>>> Even %a and %b might not be the same in "%a = freeze(%x),
%b >>>> freeze(%x)" (each freeze returns an arbitrary, but
fixed, value).
>>>>
>>>
>>> Assume that %x is known to be a poison value and have:
>>> %a = freeze(%x)
>>> %b = freeze(%x)
>>>
>>> Is %a == %a true?
>>>
>>
>> Yes, %a is always == %a.  It is a normal SSA value with some unspecific
>> content.
>>
>> Is %a == %b true?
>>>
>>
>> Not necessarily; but the compiler can make it true by (consistently)
>> choosing equal values for %a and %b.
>>
>> By consistently I mean it can't fold one instance of %a == %b to
true
>> and fold another instance of %a == %b to false.
>>
>> -- Sanjoy
>>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161020/8a713455/attachment.html>

Sanjoy Das via llvm-dev

2016-Oct-21 06:33 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

Hi Daniel,

On Thu, Oct 20, 2016 at 2:20 PM, Daniel Berlin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
>
> On Thu, Oct 20, 2016 at 2:05 PM, Michael Kuperstein via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> The fact two IR values are defined the same way doesn't necessarily
imply
>> they are actually the same value.
>
>
> Is this true for anything other than memory (mainly curious) ?
Memory, and everything we pretend is memory. :) E.g. inline assembly,
IO, @llvm.read_register.
>>As Sanjoy said, though, it should always be legal to optimize all uses
of
>> different "freeze(%x)" values to use the same def - this is
equivalent to
>> >choosing the same value for all freezes. It's just not
necessary to do so.
>
> So just to be clear (mainly so i can go back to letting y'all hash this
> out), in value numbering:
>
> 1. I can value number all freeze operations on the same operand to the same
> value
>  *as long as*
> 2. I  replace all freeze values of the same operand with a single one.
>
> Or am i misunderstanding?
I have to think a bit more before committing to this, but to merely be
*correct* I think optimizations should be able to pretend the
"implementation" of freeze is:

freeze(x):
  if x is not poison: ;; This check is actually impossible to write in IR
    return x
  val = load atomic <ty>, <ty>* @global, unordered
  return val

where @global is not clobbered / written to by anything in the IR (so
no stores/calls etc. alias the location), but is being concurrently
written to by some other thread not visible to LLVM (so every load
from @global is racing with that other thread).

This means while two instances of freeze(A) are not guaranteed to
return the same value, they _can_ be CSE'ed (since that only decreases
or "refines" the set of available behaviors).

In that respect, freeze is somewhat like a memory operation.
> If i've got it right, the issue i see with #2 is whether i can hoist
freezes
> or not.
>
> If not, it may not be possible to cover all uses.
>
> IE
>
> if (a)  {
>   %1 = freeze (%x)
> } else if (b) {
>   %2 = freeze (%x)
> } else if (c){
>   %3 = freeze (%x)
> }
>
> I cannot replace these with a single freeze def without placing a freeze
> above the if block.
I don't think there is any problem with hoisting freezes, so you
should be able to hoist the freeze out and CSE %1, %2 and %3 to the
same instance of freeze.

(@Nuno: does that sound right ^?)

Let me know if that answers your question.

-- Sanjoy

> *If* i need to replace them all with the same freeze def in order to be
able
> to consider them the same value, either i need to be able to hoist them, or
> i have to assume the freezes have different values, or  this makes value
> numbering much harder (I basically have to make some implicit state
explicit
> like we do for memoryssa, where the implicit state is "state of the
heap".
> In this case, i would associate each freeze with a version based on the SSA
> renaming algorithm, so i knew which freezes i could *ever* replace with
> other existing freezes).
>
> Normally the *value* doesn't depend on the location in the IR, and to
the
> degree it does, we try to make that explicit.
>
> Again, just trying to understand, mainly so i know if i have to care or
not.
>
>
> On Thu, Oct 20, 2016 at 1:58 PM, Krzysztof Parzyszek via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> In both of these cases, the expression tree in the IR is going to look
>> like
>>   == (freeze(%x), freeze(%x))
>>
>> The %a and %b are just labels on values, which are defined in the exact
>> same way. How do you differentiate these two?
>>
>> If %a = freeze(%x), is %a+1 == %a+1?
>>
>> -Krzysztof
>>
>>
>>
>> On 10/20/2016 3:36 PM, Sanjoy Das wrote:
>>>
>>> Hi Krzysztof,
>>>
>>> Krzysztof Parzyszek wrote:
>>>>
>>>> On 10/18/2016 4:29 PM, Nuno Lopes wrote:
>>>>>
>>>>> Even %a and %b might not be the same in "%a =
freeze(%x), %b >>>>> freeze(%x)" (each freeze returns an
arbitrary, but fixed, value).
>>>>
>>>>
>>>> Assume that %x is known to be a poison value and have:
>>>> %a = freeze(%x)
>>>> %b = freeze(%x)
>>>>
>>>> Is %a == %a true?
>>>
>>>
>>> Yes, %a is always == %a.  It is a normal SSA value with some
unspecific
>>> content.
>>>
>>>> Is %a == %b true?
>>>
>>>
>>> Not necessarily; but the compiler can make it true by
(consistently)
>>> choosing equal values for %a and %b.
>>>
>>> By consistently I mean it can't fold one instance of %a == %b
to true
>>> and fold another instance of %a == %b to false.
>>>
>>> -- Sanjoy
>>
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted
>> by The Linux Foundation
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Krzysztof Parzyszek via llvm-dev

2016-Oct-21 23:18 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

On 10/20/2016 4:05 PM, Michael Kuperstein wrote:> As Sanjoy said, though, it should always be legal to optimize all uses
> of different "freeze(%x)" values to use the same def - this is
> equivalent to choosing the same value for all freezes. It's just not
> necessary to do so.
The problem is that once such choice occurs, it takes effect for all 
occurrences of of that value.  For example:

   %x = freeze(provably-poison)
   %y = freeze(provably-poison)
   ...

   if (%x == %y) {
     // lots of code
   }

   ...

   if (%x < %y) {
     // lots of code
   }

   ...

   if (2*%x < %y*(%y+1)) {
     // even more code
   }

If we want to fold the first comparison to save on code size, what 
effects does it have on the other comparisons?  To remain consistent, 
the compiler will need to pick concrete values for %x and %y and fold 
the other two comparisons according to what the corresponding conditions 
turn out to be.  If we are ambitious, we could solve the system of 
inequalities and pick values that eliminate all of the conditional code, 
but that isn't always feasible. Of course, we could fold the first one 
and then leave the other two alone, but after the first folding, we 
would have to know that we are no longer allowed to arbitrarily fold the 
remaining ones.

If the "strong UB" corresponds to cases that could cause the hardware
to
crash (unaligned load or indirect branch to an undefined address), then 
maybe we should have "unspecified behavior" as a counterpart of 
"unspecified value"? There is no reason to treat a conditional branch 
(if-then-else) on an unspecified condition in the same way as an 
indirect branch to an unspecified address. The freeze kind of sort of 
does that, but it carries the "unspecifiness" forward allowing it to 
accumulate into a complex networks of dependencies.

-Krzysztof

Sanjoy Das via llvm-dev

2016-Oct-21 23:42 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

Hi Krzysztof,

Krzysztof Parzyszek wrote:
 > On 10/20/2016 4:05 PM, Michael Kuperstein wrote:
 >> As Sanjoy said, though, it should always be legal to optimize all uses
 >> of different "freeze(%x)" values to use the same def - this
is
 >> equivalent to choosing the same value for all freezes. It's just
not
 >> necessary to do so.
 >
 > The problem is that once such choice occurs, it takes effect for all
 > occurrences of of that value. For example:
 >
 > %x = freeze(provably-poison)
 > %y = freeze(provably-poison)
 > ...
 >
 > if (%x == %y) {
 > // lots of code
 > }
 >
 > ...
 >
 > if (%x < %y) {
 > // lots of code
 > }
 >
 > ...
 >
 > if (2*%x < %y*(%y+1)) {
 > // even more code
 > }

I don't really see the complexity here.  If you want to fold "%x
=%y" to true, then you basically have to %y->rauw(%x).  This could lead
to lost opportunities down the line, but that's already the case with
transforms we do today e.g.

   if (x == y)
     f(x)

=>

   if (x == y)
     f(y)

The former would have been better for optimization if x is undef and y
isn't.  We just take a local best guess and go ahead with it.

However, there are cases where the consistency requirement will
prevent us from simplifying away freezes:

   %t = freeze <poison>
   store volatile %t to loc
   %x = load volatile other_loc
   %xor = xor %x, %t

We can't fold %xor to 0 because that would require us to %t->RAUW(%x)
which would break SSA.


 > If we want to fold the first comparison to save on code size, what
 > effects does it have on the other comparisons? To remain consistent, the
 > compiler will need to pick concrete values for %x and %y and fold the

Yes, it needs to take steps to avoid inconsistency down the line.

 > other two comparisons according to what the corresponding conditions
 > turn out to be. If we are ambitious, we could solve the system of
 > inequalities and pick values that eliminate all of the conditional code,
 > but that isn't always feasible. Of course, we could fold the first one
 > and then leave the other two alone, but after the first folding, we
 > would have to know that we are no longer allowed to arbitrarily fold the
 > remaining ones.

Agreed.

 > If the "strong UB" corresponds to cases that could cause the
hardware to
 > crash (unaligned load or indirect branch to an undefined address), then
 > maybe we should have "unspecified behavior" as a counterpart of
 > "unspecified value"? There is no reason to treat a conditional
branch
 > (if-then-else) on an unspecified condition in the same way as an
 > indirect branch to an unspecified address. The freeze kind of sort of

However, we need to justify control flow based correlated value
analysis.  That is:

   if (x != 0) {
     if (...)
       1/x; // safe to speculate
   }

 > does that, but it carries the "unspecifiness" forward allowing
it to
 > accumulate into a complex networks of dependencies.

-- Sanjoy

Sanjoy Das via llvm-dev

2016-Oct-22 01:55 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

Hi Krzysztof,

On Fri, Oct 21, 2016 at 4:18 PM, Krzysztof Parzyszek
<kparzysz at codeaurora.org> wrote:> On 10/20/2016 4:05 PM, Michael Kuperstein wrote:
>>
>> As Sanjoy said, though, it should always be legal to optimize all uses
>> of different "freeze(%x)" values to use the same def - this
is
>> equivalent to choosing the same value for all freezes. It's just
not
>> necessary to do so.
>
>
> The problem is that once such choice occurs, it takes effect for all
> occurrences of of that value.  For example:
>
>   %x = freeze(provably-poison)
>   %y = freeze(provably-poison)
>   ...
>
>   if (%x == %y) {
>     // lots of code
>   }
>
>   ...
>
>   if (%x < %y) {
>     // lots of code
>   }
>
>   ...
>
>   if (2*%x < %y*(%y+1)) {
>     // even more code
>   }
I mis-parsed your code -- I suppose you wanted to fold _away_ the "//
lots of code" bits, for which you want the inverse of %x->RAUW(%y).
For cases like these, I'd say initially we should go for doing
something simple like replacing %x with 0 and %y with -1.  If we do
end up seeing cases like the above in practice, we can consider
becoming smarter.

This tradeoff isn't very different from how today we fold, say, "sext
i1 undef to i32" to "i32 0", even though it is possible that
downstream uses would have benefited from "remembering" the fact that
"sext i1 undef to i32" can be 0 or -1 depending on the compiler's
whim.

-- Sanjoy

Nuno Lopes via llvm-dev

2016-Oct-23 16:30 UTC

head link

[llvm-dev] RFC: Killing undef and spreading poison

> So just to be clear (mainly so i can go back to letting y'all hash this
out), in value numbering:
> 1. I can value number all freeze operations on the same operand to the same
value
>  *as long as*
> 2. I  replace all freeze values of the same operand with a single one.
> 
> Or am i misunderstanding?
> If i've got it right, the issue i see with #2 is whether i can hoist
freezes or not.
> 
> If not, it may not be possible to cover all uses.
> 
> IE
> if (a)  {
>   %1 = freeze (%x)
> } else if (b) {
>   %2 = freeze (%x)
> } else if (c){ 
>   %3 = freeze (%x)
> }
> 
> I cannot replace these with a single freeze def without placing a freeze
above the if block.
As Sanjoy said, hoisting freeze is fine, but movement is not without
constraints.  For example, you cannot duplicate execution of freeze.
e.g. the following is wrong (assuming loop body may execute multiple times):

%x = freeze %y
while(...) { use %x }
=>
while(...) {
   %x = freeze %y
   use %x
}

So Sanjoy saying that freeze can be thought as an atomic/volatile operation
seems like a good intuition.

> *If* i need to replace them all with the same freeze def in order to be
able to consider them the same value,  (...)
Right, so just to make it very clear, the following would be wrong:
%x = freeze %y
%w = freeze %y
use_1 %x
use_2 %w
use_3 %x
=>
%x = freeze %y
%w = freeze %y
use_1 %w
use_2 %w
use_3 %x

If you replace one use of %x, all uses have to be replaced, since different
freezes of the same operand *may* return different values.
I guess that since we are never (err, most of the times) sure of whether a value
may come out of freeze, this rule needs to be applied to any value.
Daniel: is that fine for GVN, or would it make things too complicated?

Nuno

Reasonably Related Threads

Search for more seemingly similar threads

llvm dev - Oct 2016 - RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

[llvm-dev] RFC: Killing undef and spreading poison

Reasonably Related Threads