Michael Kuperstein via llvm-dev
2016-Oct-20 21:05 UTC
[llvm-dev] RFC: Killing undef and spreading poison
The fact two IR values are defined the same way doesn't necessarily imply they are actually the same value. e.g. %a = load volatile i32, i32* %p %b = load volatile i32, i32* %p As Sanjoy said, though, it should always be legal to optimize all uses of different "freeze(%x)" values to use the same def - this is equivalent to choosing the same value for all freezes. It's just not necessary to do so. On Thu, Oct 20, 2016 at 1:58 PM, Krzysztof Parzyszek via llvm-dev < llvm-dev at lists.llvm.org> wrote:> In both of these cases, the expression tree in the IR is going to look like > == (freeze(%x), freeze(%x)) > > The %a and %b are just labels on values, which are defined in the exact > same way. How do you differentiate these two? > > If %a = freeze(%x), is %a+1 == %a+1? > > -Krzysztof > > > > On 10/20/2016 3:36 PM, Sanjoy Das wrote: > >> Hi Krzysztof, >> >> Krzysztof Parzyszek wrote: >> >>> On 10/18/2016 4:29 PM, Nuno Lopes wrote: >>> >>>> Even %a and %b might not be the same in "%a = freeze(%x), %b >>>> freeze(%x)" (each freeze returns an arbitrary, but fixed, value). >>>> >>> >>> Assume that %x is known to be a poison value and have: >>> %a = freeze(%x) >>> %b = freeze(%x) >>> >>> Is %a == %a true? >>> >> >> Yes, %a is always == %a. It is a normal SSA value with some unspecific >> content. >> >> Is %a == %b true? >>> >> >> Not necessarily; but the compiler can make it true by (consistently) >> choosing equal values for %a and %b. >> >> By consistently I mean it can't fold one instance of %a == %b to true >> and fold another instance of %a == %b to false. >> >> -- Sanjoy >> > > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted > by The Linux Foundation > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161020/0c2fb0bb/attachment.html>
Daniel Berlin via llvm-dev
2016-Oct-20 21:20 UTC
[llvm-dev] RFC: Killing undef and spreading poison
On Thu, Oct 20, 2016 at 2:05 PM, Michael Kuperstein via llvm-dev < llvm-dev at lists.llvm.org> wrote:> The fact two IR values are defined the same way doesn't necessarily imply > they are actually the same value. >Is this true for anything other than memory (mainly curious) ?> e.g. > > %a = load volatile i32, i32* %p > %b = load volatile i32, i32* %p >It's worth pointing out these are just missing implicit state that could easily be made explicit. IE %1 = new heap version %a = load volatile i32, i32* %p, %1 %2 = new heap version %a = load volatile i32, i32* %p, %2>As Sanjoy said, though, it should always be legal to optimize all uses ofdifferent "freeze(%x)" values to use the same def - this is equivalent to>choosing the same value for all freezes. It's just not necessary to do so.So just to be clear (mainly so i can go back to letting y'all hash this out), in value numbering: 1. I can value number all freeze operations on the same operand to the same value *as long as* 2. I replace all freeze values of the same operand with a single one. Or am i misunderstanding? If i've got it right, the issue i see with #2 is whether i can hoist freezes or not. If not, it may not be possible to cover all uses. IE if (a) { %1 = freeze (%x) } else if (b) { %2 = freeze (%x) } else if (c){ %3 = freeze (%x) } I cannot replace these with a single freeze def without placing a freeze above the if block. *If* i need to replace them all with the same freeze def in order to be able to consider them the same value, either i need to be able to hoist them, or i have to assume the freezes have different values, or this makes value numbering much harder (I basically have to make some implicit state explicit like we do for memoryssa, where the implicit state is "state of the heap". In this case, i would associate each freeze with a version based on the SSA renaming algorithm, so i knew which freezes i could *ever* replace with other existing freezes). Normally the *value* doesn't depend on the location in the IR, and to the degree it does, we try to make that explicit. Again, just trying to understand, mainly so i know if i have to care or not. On Thu, Oct 20, 2016 at 1:58 PM, Krzysztof Parzyszek via llvm-dev < llvm-dev at lists.llvm.org> wrote:> In both of these cases, the expression tree in the IR is going to look like > == (freeze(%x), freeze(%x)) > > The %a and %b are just labels on values, which are defined in the exact > same way. How do you differentiate these two? > > If %a = freeze(%x), is %a+1 == %a+1? > > -Krzysztof > > > > On 10/20/2016 3:36 PM, Sanjoy Das wrote: > >> Hi Krzysztof, >> >> Krzysztof Parzyszek wrote: >> >>> On 10/18/2016 4:29 PM, Nuno Lopes wrote: >>> >>>> Even %a and %b might not be the same in "%a = freeze(%x), %b >>>> freeze(%x)" (each freeze returns an arbitrary, but fixed, value). >>>> >>> >>> Assume that %x is known to be a poison value and have: >>> %a = freeze(%x) >>> %b = freeze(%x) >>> >>> Is %a == %a true? >>> >> >> Yes, %a is always == %a. It is a normal SSA value with some unspecific >> content. >> >> Is %a == %b true? >>> >> >> Not necessarily; but the compiler can make it true by (consistently) >> choosing equal values for %a and %b. >> >> By consistently I mean it can't fold one instance of %a == %b to true >> and fold another instance of %a == %b to false. >> >> -- Sanjoy >> > > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted > by The Linux Foundation > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >_______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161020/8a713455/attachment.html>
Sanjoy Das via llvm-dev
2016-Oct-21 06:33 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Hi Daniel, On Thu, Oct 20, 2016 at 2:20 PM, Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > > On Thu, Oct 20, 2016 at 2:05 PM, Michael Kuperstein via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> The fact two IR values are defined the same way doesn't necessarily imply >> they are actually the same value. > > > Is this true for anything other than memory (mainly curious) ?Memory, and everything we pretend is memory. :) E.g. inline assembly, IO, @llvm.read_register.>>As Sanjoy said, though, it should always be legal to optimize all uses of >> different "freeze(%x)" values to use the same def - this is equivalent to >> >choosing the same value for all freezes. It's just not necessary to do so. > > So just to be clear (mainly so i can go back to letting y'all hash this > out), in value numbering: > > 1. I can value number all freeze operations on the same operand to the same > value > *as long as* > 2. I replace all freeze values of the same operand with a single one. > > Or am i misunderstanding?I have to think a bit more before committing to this, but to merely be *correct* I think optimizations should be able to pretend the "implementation" of freeze is: freeze(x): if x is not poison: ;; This check is actually impossible to write in IR return x val = load atomic <ty>, <ty>* @global, unordered return val where @global is not clobbered / written to by anything in the IR (so no stores/calls etc. alias the location), but is being concurrently written to by some other thread not visible to LLVM (so every load from @global is racing with that other thread). This means while two instances of freeze(A) are not guaranteed to return the same value, they _can_ be CSE'ed (since that only decreases or "refines" the set of available behaviors). In that respect, freeze is somewhat like a memory operation.> If i've got it right, the issue i see with #2 is whether i can hoist freezes > or not. > > If not, it may not be possible to cover all uses. > > IE > > if (a) { > %1 = freeze (%x) > } else if (b) { > %2 = freeze (%x) > } else if (c){ > %3 = freeze (%x) > } > > I cannot replace these with a single freeze def without placing a freeze > above the if block.I don't think there is any problem with hoisting freezes, so you should be able to hoist the freeze out and CSE %1, %2 and %3 to the same instance of freeze. (@Nuno: does that sound right ^?) Let me know if that answers your question. -- Sanjoy> *If* i need to replace them all with the same freeze def in order to be able > to consider them the same value, either i need to be able to hoist them, or > i have to assume the freezes have different values, or this makes value > numbering much harder (I basically have to make some implicit state explicit > like we do for memoryssa, where the implicit state is "state of the heap". > In this case, i would associate each freeze with a version based on the SSA > renaming algorithm, so i knew which freezes i could *ever* replace with > other existing freezes). > > Normally the *value* doesn't depend on the location in the IR, and to the > degree it does, we try to make that explicit. > > Again, just trying to understand, mainly so i know if i have to care or not. > > > On Thu, Oct 20, 2016 at 1:58 PM, Krzysztof Parzyszek via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> In both of these cases, the expression tree in the IR is going to look >> like >> == (freeze(%x), freeze(%x)) >> >> The %a and %b are just labels on values, which are defined in the exact >> same way. How do you differentiate these two? >> >> If %a = freeze(%x), is %a+1 == %a+1? >> >> -Krzysztof >> >> >> >> On 10/20/2016 3:36 PM, Sanjoy Das wrote: >>> >>> Hi Krzysztof, >>> >>> Krzysztof Parzyszek wrote: >>>> >>>> On 10/18/2016 4:29 PM, Nuno Lopes wrote: >>>>> >>>>> Even %a and %b might not be the same in "%a = freeze(%x), %b >>>>> freeze(%x)" (each freeze returns an arbitrary, but fixed, value). >>>> >>>> >>>> Assume that %x is known to be a poison value and have: >>>> %a = freeze(%x) >>>> %b = freeze(%x) >>>> >>>> Is %a == %a true? >>> >>> >>> Yes, %a is always == %a. It is a normal SSA value with some unspecific >>> content. >>> >>>> Is %a == %b true? >>> >>> >>> Not necessarily; but the compiler can make it true by (consistently) >>> choosing equal values for %a and %b. >>> >>> By consistently I mean it can't fold one instance of %a == %b to true >>> and fold another instance of %a == %b to false. >>> >>> -- Sanjoy >> >> >> -- >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted >> by The Linux Foundation >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Krzysztof Parzyszek via llvm-dev
2016-Oct-21 23:18 UTC
[llvm-dev] RFC: Killing undef and spreading poison
On 10/20/2016 4:05 PM, Michael Kuperstein wrote:> As Sanjoy said, though, it should always be legal to optimize all uses > of different "freeze(%x)" values to use the same def - this is > equivalent to choosing the same value for all freezes. It's just not > necessary to do so.The problem is that once such choice occurs, it takes effect for all occurrences of of that value. For example: %x = freeze(provably-poison) %y = freeze(provably-poison) ... if (%x == %y) { // lots of code } ... if (%x < %y) { // lots of code } ... if (2*%x < %y*(%y+1)) { // even more code } If we want to fold the first comparison to save on code size, what effects does it have on the other comparisons? To remain consistent, the compiler will need to pick concrete values for %x and %y and fold the other two comparisons according to what the corresponding conditions turn out to be. If we are ambitious, we could solve the system of inequalities and pick values that eliminate all of the conditional code, but that isn't always feasible. Of course, we could fold the first one and then leave the other two alone, but after the first folding, we would have to know that we are no longer allowed to arbitrarily fold the remaining ones. If the "strong UB" corresponds to cases that could cause the hardware to crash (unaligned load or indirect branch to an undefined address), then maybe we should have "unspecified behavior" as a counterpart of "unspecified value"? There is no reason to treat a conditional branch (if-then-else) on an unspecified condition in the same way as an indirect branch to an unspecified address. The freeze kind of sort of does that, but it carries the "unspecifiness" forward allowing it to accumulate into a complex networks of dependencies. -Krzysztof
Sanjoy Das via llvm-dev
2016-Oct-21 23:42 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Hi Krzysztof, Krzysztof Parzyszek wrote: > On 10/20/2016 4:05 PM, Michael Kuperstein wrote: >> As Sanjoy said, though, it should always be legal to optimize all uses >> of different "freeze(%x)" values to use the same def - this is >> equivalent to choosing the same value for all freezes. It's just not >> necessary to do so. > > The problem is that once such choice occurs, it takes effect for all > occurrences of of that value. For example: > > %x = freeze(provably-poison) > %y = freeze(provably-poison) > ... > > if (%x == %y) { > // lots of code > } > > ... > > if (%x < %y) { > // lots of code > } > > ... > > if (2*%x < %y*(%y+1)) { > // even more code > } I don't really see the complexity here. If you want to fold "%x =%y" to true, then you basically have to %y->rauw(%x). This could lead to lost opportunities down the line, but that's already the case with transforms we do today e.g. if (x == y) f(x) => if (x == y) f(y) The former would have been better for optimization if x is undef and y isn't. We just take a local best guess and go ahead with it. However, there are cases where the consistency requirement will prevent us from simplifying away freezes: %t = freeze <poison> store volatile %t to loc %x = load volatile other_loc %xor = xor %x, %t We can't fold %xor to 0 because that would require us to %t->RAUW(%x) which would break SSA. > If we want to fold the first comparison to save on code size, what > effects does it have on the other comparisons? To remain consistent, the > compiler will need to pick concrete values for %x and %y and fold the Yes, it needs to take steps to avoid inconsistency down the line. > other two comparisons according to what the corresponding conditions > turn out to be. If we are ambitious, we could solve the system of > inequalities and pick values that eliminate all of the conditional code, > but that isn't always feasible. Of course, we could fold the first one > and then leave the other two alone, but after the first folding, we > would have to know that we are no longer allowed to arbitrarily fold the > remaining ones. Agreed. > If the "strong UB" corresponds to cases that could cause the hardware to > crash (unaligned load or indirect branch to an undefined address), then > maybe we should have "unspecified behavior" as a counterpart of > "unspecified value"? There is no reason to treat a conditional branch > (if-then-else) on an unspecified condition in the same way as an > indirect branch to an unspecified address. The freeze kind of sort of However, we need to justify control flow based correlated value analysis. That is: if (x != 0) { if (...) 1/x; // safe to speculate } > does that, but it carries the "unspecifiness" forward allowing it to > accumulate into a complex networks of dependencies. -- Sanjoy
Sanjoy Das via llvm-dev
2016-Oct-22 01:55 UTC
[llvm-dev] RFC: Killing undef and spreading poison
Hi Krzysztof, On Fri, Oct 21, 2016 at 4:18 PM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:> On 10/20/2016 4:05 PM, Michael Kuperstein wrote: >> >> As Sanjoy said, though, it should always be legal to optimize all uses >> of different "freeze(%x)" values to use the same def - this is >> equivalent to choosing the same value for all freezes. It's just not >> necessary to do so. > > > The problem is that once such choice occurs, it takes effect for all > occurrences of of that value. For example: > > %x = freeze(provably-poison) > %y = freeze(provably-poison) > ... > > if (%x == %y) { > // lots of code > } > > ... > > if (%x < %y) { > // lots of code > } > > ... > > if (2*%x < %y*(%y+1)) { > // even more code > }I mis-parsed your code -- I suppose you wanted to fold _away_ the "// lots of code" bits, for which you want the inverse of %x->RAUW(%y). For cases like these, I'd say initially we should go for doing something simple like replacing %x with 0 and %y with -1. If we do end up seeing cases like the above in practice, we can consider becoming smarter. This tradeoff isn't very different from how today we fold, say, "sext i1 undef to i32" to "i32 0", even though it is possible that downstream uses would have benefited from "remembering" the fact that "sext i1 undef to i32" can be 0 or -1 depending on the compiler's whim. -- Sanjoy
Nuno Lopes via llvm-dev
2016-Oct-23 16:30 UTC
[llvm-dev] RFC: Killing undef and spreading poison
> So just to be clear (mainly so i can go back to letting y'all hash this out), in value numbering: > 1. I can value number all freeze operations on the same operand to the same value > *as long as* > 2. I replace all freeze values of the same operand with a single one. > > Or am i misunderstanding? > If i've got it right, the issue i see with #2 is whether i can hoist freezes or not. > > If not, it may not be possible to cover all uses. > > IE > if (a) { > %1 = freeze (%x) > } else if (b) { > %2 = freeze (%x) > } else if (c){ > %3 = freeze (%x) > } > > I cannot replace these with a single freeze def without placing a freeze above the if block.As Sanjoy said, hoisting freeze is fine, but movement is not without constraints. For example, you cannot duplicate execution of freeze. e.g. the following is wrong (assuming loop body may execute multiple times): %x = freeze %y while(...) { use %x } => while(...) { %x = freeze %y use %x } So Sanjoy saying that freeze can be thought as an atomic/volatile operation seems like a good intuition.> *If* i need to replace them all with the same freeze def in order to be able to consider them the same value, (...)Right, so just to make it very clear, the following would be wrong: %x = freeze %y %w = freeze %y use_1 %x use_2 %w use_3 %x => %x = freeze %y %w = freeze %y use_1 %w use_2 %w use_3 %x If you replace one use of %x, all uses have to be replaced, since different freezes of the same operand *may* return different values. I guess that since we are never (err, most of the times) sure of whether a value may come out of freeze, this rule needs to be applied to any value. Daniel: is that fine for GVN, or would it make things too complicated? Nuno