Nuno Lopes via llvm-dev
2016-Oct-18 15:10 UTC
[llvm-dev] RFC: Killing undef and spreading poison
>> A use of freeze is to enable speculative execution. For example, loop >> switching does the following transformation: >> while (C) { >> if (C2) { >> A >> } else { >> B >> } >> } >> => >> if (C2) { >> while (C) >> A >> } else { >> while (C) >> B >> } >> >> Here we are evaluating C2 before C. If the original loop never >> executed then we had never evaluated C2, while now we do. So we need >> to make sure there's no UB for branching on C2. Freeze ensures that >> so we would actually have 'if (freeze(C2))' instead. >> Note that having branch on poison not trigger UB has its own problems. > > Can you please elaborate on these problems?Sure! For example, the following transformation would be wrong if branch on poison was a non-deterministic branch instead of UB: %a = add i32 %x, 1 %c = icmp eq i32 %a, %poison br i1 %c, label %bb1, label %bb2 bb1: %b = add i32 %x, 1 %d = call i32 @bar(i32 %b) -> bb1: %d = call i32 @bar(i32 % poison) GVN will perform this kind of transformation: it concludes that %a, %b, and %poison are all equal and picks one representative value. However, these values are equal only when they are not poison. If %poison is indeed poison, then the transformation is wrong because before it was passing an ok value to bar() and after the transformation it is passing poison. On the other hand, if branching on poison is UB, the original program was executing UB already because %poison (and therefore %c) were poison. So the transformation is ok.>> Discussion on select >> ===================>> Select is a tricky instruction to define semantics for. There are >> multiple views of select's purpose that are contradictory: >> 1) Select should be equivalent to arithmetic (e.g., allow 'select >> %c, true, %x' -> 'or %c, %x' and arithmetic -> select) >> 2) br + phi -> select should be allowed (simplifyCFG) >> 3) Select -> br + phi should be allowed >> >> Since we really want to have 2) since that's simplifyCFG, we propose >> to make 'select %c, %a, %b' poison if any of the following holds: >> - %c is poison >> - %c = true and %a is poison >> - %c = false and %b is poison >> >> This disallows 3) and some transformations of 1). Since 3) is only >> performed in CodeGenPrepare, we can safely ignore it (no aggressive >> optimizations are run afterwards). > > This strikes me as a dangerous game to play. CGP's transformations need to be semantically sound (even if they are anti-canonical). In part, this is because our IR-level analyses are used by CGP. Moreover, targets after CGP run IR-level transformations, and having different semantic rules there is going to make code reuse hard in subtle ways. Which brings up another issue: We currently have undef at the MI level; do you propose changing that, and if so, how?It is sound to do the transformation at IR level if you freeze the condition. My suggestion was to avoid introducing overhead in CGP. This is the current state of affairs and it seems to work right now. I'm not shocked to see the IR changing the semantics throughout the pipeline, since while the compiler proceeds, the type of transformations done also change. But I have to say that my understanding of LLVM after CGP is very limited. I was not aware of the code reuse from IR-level analyses. I agree this needs to be scrutinized carefully. Alternatively, we can just introduce freeze and pay the price (likely low anyway). Regarding undef at MI level, we are not proposing any change. I don't see any reason right now to change it since it seems that optimizations at MI level are fine with just having undef. There's no nsw/nuw stuff there and undef seems very useful. Poison is lowered into undef. And freeze is lowered into this pair of copy nodes that seems to stop propagation of undef (to ensure all uses see the same value). I don't know enough about MI to know if there's a better way to lower freeze, though. Thanks, Nuno
Hal Finkel via llvm-dev
2016-Oct-18 18:01 UTC
[llvm-dev] RFC: Killing undef and spreading poison
----- Original Message -----> From: "Nuno Lopes" <nuno.lopes at ist.utl.pt> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Sanjoy Das" <sanjoy at playingwithpointers.com>, "David Majnemer" <david.majnemer at gmail.com>, "John Regehr" > <regehr at cs.utah.edu>, "Chung-Kil Hur" <gil.hur at sf.snu.ac.kr>, "Juneyoung Lee" <juneyoung.lee at sf.snu.ac.kr>, > "Yoonseung Kim" <yoonseung.kim at sf.snu.ac.kr>, "Youngju Song" <youngju.song at sf.snu.ac.kr>, "Dan Gohman" > <dan433584 at gmail.com>, nlewycky at google.com, mbraun at apple.com, qcolombet at apple.com, "t p northover" > <t.p.northover at gmail.com>, llvm-dev at lists.llvm.org > Sent: Tuesday, October 18, 2016 10:10:41 AM > Subject: RE: RFC: Killing undef and spreading poison > > >> A use of freeze is to enable speculative execution. For example, > >> loop > >> switching does the following transformation: > >> while (C) { > >> if (C2) { > >> A > >> } else { > >> B > >> } > >> } > >> => > >> if (C2) { > >> while (C) > >> A > >> } else { > >> while (C) > >> B > >> } > >> > >> Here we are evaluating C2 before C. If the original loop never > >> executed then we had never evaluated C2, while now we do. So we > >> need > >> to make sure there's no UB for branching on C2. Freeze ensures > >> that > >> so we would actually have 'if (freeze(C2))' instead. > >> Note that having branch on poison not trigger UB has its own > >> problems. > > > > Can you please elaborate on these problems? > > Sure! For example, the following transformation would be wrong if > branch on poison was a non-deterministic branch instead of UB: > > %a = add i32 %x, 1 > %c = icmp eq i32 %a, %poison > br i1 %c, label %bb1, label %bb2 > bb1: > %b = add i32 %x, 1 > %d = call i32 @bar(i32 %b) > -> > bb1: > %d = call i32 @bar(i32 % poison) > > GVN will perform this kind of transformation: it concludes that %a, > %b, and %poison are all equal and picks one representative value. > However, these values are equal only when they are not poison.Okay; so the problem is that an instruction that is value-equivalent to a poison value is not itself necessarily poison?> If > %poison is indeed poison, then the transformation is wrong because > before it was passing an ok value to bar() and after the > transformation it is passing poison. > On the other hand, if branching on poison is UB, the original program > was executing UB already because %poison (and therefore %c) were > poison. So the transformation is ok. > > > >> Discussion on select > >> ===================> >> Select is a tricky instruction to define semantics for. There are > >> multiple views of select's purpose that are contradictory: > >> 1) Select should be equivalent to arithmetic (e.g., allow > >> 'select > >> %c, true, %x' -> 'or %c, %x' and arithmetic -> select) > >> 2) br + phi -> select should be allowed (simplifyCFG) > >> 3) Select -> br + phi should be allowed > >> > >> Since we really want to have 2) since that's simplifyCFG, we > >> propose > >> to make 'select %c, %a, %b' poison if any of the following holds: > >> - %c is poison > >> - %c = true and %a is poison > >> - %c = false and %b is poison > >> > >> This disallows 3) and some transformations of 1). Since 3) is only > >> performed in CodeGenPrepare, we can safely ignore it (no > >> aggressive > >> optimizations are run afterwards). > > > > This strikes me as a dangerous game to play. CGP's transformations > > need to be semantically sound (even if they are anti-canonical). > > In part, this is because our IR-level analyses are used by CGP. > > Moreover, targets after CGP run IR-level transformations, and > > having different semantic rules there is going to make code reuse > > hard in subtle ways. Which brings up another issue: We currently > > have undef at the MI level; do you propose changing that, and if > > so, how? > > It is sound to do the transformation at IR level if you freeze the > condition. > My suggestion was to avoid introducing overhead in CGP. This is the > current state of affairs and it seems to work right now. I'm not > shocked to see the IR changing the semantics throughout the > pipeline, since while the compiler proceeds, the type of > transformations done also change. > But I have to say that my understanding of LLVM after CGP is very > limited. I was not aware of the code reuse from IR-level analyses. > I agree this needs to be scrutinized carefully. Alternatively, we > can just introduce freeze and pay the price (likely low anyway).I think we'd need to do this to ensure consistency; I don't think that the price would be high, especially because we soon drop into different representations and the IR is used only for analysis (for pointer aliasing, etc.).> > Regarding undef at MI level, we are not proposing any change. I > don't see any reason right now to change it since it seems that > optimizations at MI level are fine with just having undef. There's > no nsw/nuw stuff there and undef seems very useful.We've already started propagating nsw/nuw into the SDAG, and as we move toward GlobalISel, we'll have them at the MI layer as well. I think we'll need to consider how to propagate this information to the MI layer directly. -Hal> Poison is lowered into undef. And freeze is lowered into this pair of > copy nodes that seems to stop propagation of undef (to ensure all > uses see the same value). I don't know enough about MI to know if > there's a better way to lower freeze, though. > > Thanks, > Nuno > >-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory
Nuno Lopes via llvm-dev
2016-Oct-18 19:35 UTC
[llvm-dev] Killing undef and spreading poison
>> >> Here we are evaluating C2 before C. If the original loop never >> >> executed then we had never evaluated C2, while now we do. So we >> >> need >> >> to make sure there's no UB for branching on C2. Freeze ensures >> >> that >> >> so we would actually have 'if (freeze(C2))' instead. >> >> Note that having branch on poison not trigger UB has its own >> >> problems. >> > >> > Can you please elaborate on these problems? >> >> Sure! For example, the following transformation would be wrong if >> branch on poison was a non-deterministic branch instead of UB: >> >> %a = add i32 %x, 1 >> %c = icmp eq i32 %a, %poison >> br i1 %c, label %bb1, label %bb2 >> bb1: >> %b = add i32 %x, 1 >> %d = call i32 @bar(i32 %b) >> -> >> bb1: >> %d = call i32 @bar(i32 % poison) >> >> GVN will perform this kind of transformation: it concludes that %a, >> %b, and %poison are all equal and picks one representative value. >> However, these values are equal only when they are not poison. > > Okay; so the problem is that an instruction that is value-equivalent to a > poison value is not itself necessarily poison?Right. I think there were other examples, but I don't have them here handy. But at least this one is very problematic for GVN.>> >> Discussion on select >> >> ===================>> >> Select is a tricky instruction to define semantics for. There are >> >> multiple views of select's purpose that are contradictory: >> >> 1) Select should be equivalent to arithmetic (e.g., allow >> >> 'select >> >> %c, true, %x' -> 'or %c, %x' and arithmetic -> select) >> >> 2) br + phi -> select should be allowed (simplifyCFG) >> >> 3) Select -> br + phi should be allowed >> >> >> >> Since we really want to have 2) since that's simplifyCFG, we >> >> propose >> >> to make 'select %c, %a, %b' poison if any of the following holds: >> >> - %c is poison >> >> - %c = true and %a is poison >> >> - %c = false and %b is poison >> >> >> >> This disallows 3) and some transformations of 1). Since 3) is only >> >> performed in CodeGenPrepare, we can safely ignore it (no >> >> aggressive >> >> optimizations are run afterwards). >> > >> > This strikes me as a dangerous game to play. CGP's transformations >> > need to be semantically sound (even if they are anti-canonical). >> > In part, this is because our IR-level analyses are used by CGP. >> > Moreover, targets after CGP run IR-level transformations, and >> > having different semantic rules there is going to make code reuse >> > hard in subtle ways. Which brings up another issue: We currently >> > have undef at the MI level; do you propose changing that, and if >> > so, how? >> >> It is sound to do the transformation at IR level if you freeze the >> condition. >> My suggestion was to avoid introducing overhead in CGP. This is the >> current state of affairs and it seems to work right now. I'm not >> shocked to see the IR changing the semantics throughout the >> pipeline, since while the compiler proceeds, the type of >> transformations done also change. >> But I have to say that my understanding of LLVM after CGP is very >> limited. I was not aware of the code reuse from IR-level analyses. >> I agree this needs to be scrutinized carefully. Alternatively, we >> can just introduce freeze and pay the price (likely low anyway). > > I think we'd need to do this to ensure consistency; I don't think that the > price would be high, especially because we soon drop into different > representations and the IR is used only for analysis (for pointer > aliasing, etc.).Ok, makes sense; thanks!>> Regarding undef at MI level, we are not proposing any change. I >> don't see any reason right now to change it since it seems that >> optimizations at MI level are fine with just having undef. There's >> no nsw/nuw stuff there and undef seems very useful. > > We've already started propagating nsw/nuw into the SDAG, and as we move > toward GlobalISel, we'll have them at the MI layer as well. I think we'll > need to consider how to propagate this information to the MI layer > directly.I see. Then it might make sense to introduce poison in MI then. But only if there's a plan to start doing optimizations that leverage nsw/nuw at MI level as well. Otherwise undef is just fine. I guess we need to do this in phases, though, to avoid breaking everything at the same time :) Nuno