In the section about poison values, the LLVM language reference manual says: "Values other than phi nodes depend on their operands." This implies that a select instruction's output can be poisoned by its not-selected argument value. If select were poisoned only by its selected argument, we would expect this fact to be mentioned specifically, as it is for phi. Next I'll show how poisoning by the not-selected value can introduce a problem. Take this C program: int printf(const char *, ...); int foo(int x0) { int x3 = x0 >> 27; int x4 = x3 - 27; int x2 = x4 ? x4 : (1 >> x4); int x1 = x2 != 0; return x1; } int arg; int main (void) { int x1 = foo(arg); printf("%x\n", x1); return 0; } This program has no UB and it should print "1'. Clang at -O2 turns foo() into this: define i32 @foo(i32 %x0) #0 { entry: %shr = ashr i32 %x0, 27 %sub = add nsw i32 %shr, -27 %tobool = icmp ne i32 %sub, 0 %shr1 = lshr i32 1, %sub %cond = select i1 %tobool, i32 %sub, i32 %shr1 %cmp = icmp ne i32 %cond, 0 %conv = zext i1 %cmp to i32 ret i32 %conv } Although the translation is very straightforward, there's a problem: %shr1 is a poison value and although it is not selected, it propagates through the rest of the function and poisons the return value. Obviously things are not going well if the source program contains no UB and the emitted code returns a poison value. Either there has been a miscompilation or else the LLVM documentation should be changed to make it clear that the select instruction only has a value dependence on the value that is selected. The immediate problem here is that both Souper and Alive are happy for foo() to return 0 instead of 1. Souper actually does this optimization -- legally, of course, according to the reference manual. Anyway this issue seemed tricky enough to bring up here instead of just filing a PR. Thanks, John
On Tue, Sep 9, 2014 at 9:50 AM, John Regehr <regehr at cs.utah.edu> wrote:> In the section about poison values, the LLVM language reference manual > says: > > "Values other than phi nodes depend on their operands." > > This implies that a select instruction's output can be poisoned by its > not-selected argument value. If select were poisoned only by its selected > argument, we would expect this fact to be mentioned specifically, as it is > for phi. > > Next I'll show how poisoning by the not-selected value can introduce a > problem. Take this C program: > > int printf(const char *, ...); > int foo(int x0) { > int x3 = x0 >> 27; > int x4 = x3 - 27; > int x2 = x4 ? x4 : (1 >> x4); > int x1 = x2 != 0; > return x1; > } > int arg; > int main (void) { > int x1 = foo(arg); > printf("%x\n", x1); > return 0; > } > > This program has no UB and it should print "1'. > > Clang at -O2 turns foo() into this: > > define i32 @foo(i32 %x0) #0 { > entry: > %shr = ashr i32 %x0, 27 > %sub = add nsw i32 %shr, -27 > %tobool = icmp ne i32 %sub, 0 > %shr1 = lshr i32 1, %sub > %cond = select i1 %tobool, i32 %sub, i32 %shr1 > %cmp = icmp ne i32 %cond, 0 > %conv = zext i1 %cmp to i32 > ret i32 %conv > } > > Although the translation is very straightforward, there's a problem: %shr1 > is a poison value and although it is not selected, it propagates through > the rest of the function and poisons the return value. > > Obviously things are not going well if the source program contains no UB > and the emitted code returns a poison value. Either there has been a > miscompilation or else the LLVM documentation should be changed to make it > clear that the select instruction only has a value dependence on the value > that is selected. >While the documentation may not match the facts on the ground, InstructionSimplify does (in my opinion) the right thing; an undef operand doesn't mean the entire SelectInst folds away to undef: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=217342&view=markup#l2880> > The immediate problem here is that both Souper and Alive are happy for > foo() to return 0 instead of 1. Souper actually does this optimization -- > legally, of course, according to the reference manual. > > Anyway this issue seemed tricky enough to bring up here instead of just > filing a PR. > > Thanks, > > John > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140910/3c9769ac/attachment.html>
> While the documentation may not match the facts on the ground, InstructionSimplify does (in my opinion) the > right thing; an undef operand doesn't mean the entire SelectInst folds away to undef: > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=217342&view=m > arkup#l2880It seems pretty clear that this is the right semantics for select. If we believe the documentation, then select is only useful when we can prove that it won't introduce a spurious poison value. This will not often be the case. I'll go file a bug against the LLVM instruction reference to remind someone to tweak the doc. My proposed fix would be: * Values other than phi nodes and select instructions depend on their operands. * Phi nodes depend on the operand corresponding to their dynamic predecessor basic block. * Select instructions depend on their selected operand. John
Original inventor of the poison concept here. Poison is a flawed concept. I proved it was flawed back in 2011 [0] [0] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-December/046368.html And since then, other holes have been found. I'm not aware of any serious proposals for its replacement that do not have similar problems. Also note that poison is very different from undef. Undef, as far as I know, remains an entirely coherent concept. Dan On Tue, Sep 9, 2014 at 6:50 AM, John Regehr <regehr at cs.utah.edu> wrote:> In the section about poison values, the LLVM language reference manual > says: > > "Values other than phi nodes depend on their operands." > > This implies that a select instruction's output can be poisoned by its > not-selected argument value. If select were poisoned only by its selected > argument, we would expect this fact to be mentioned specifically, as it is > for phi. > > Next I'll show how poisoning by the not-selected value can introduce a > problem. Take this C program: > > int printf(const char *, ...); > int foo(int x0) { > int x3 = x0 >> 27; > int x4 = x3 - 27; > int x2 = x4 ? x4 : (1 >> x4); > int x1 = x2 != 0; > return x1; > } > int arg; > int main (void) { > int x1 = foo(arg); > printf("%x\n", x1); > return 0; > } > > This program has no UB and it should print "1'. > > Clang at -O2 turns foo() into this: > > define i32 @foo(i32 %x0) #0 { > entry: > %shr = ashr i32 %x0, 27 > %sub = add nsw i32 %shr, -27 > %tobool = icmp ne i32 %sub, 0 > %shr1 = lshr i32 1, %sub > %cond = select i1 %tobool, i32 %sub, i32 %shr1 > %cmp = icmp ne i32 %cond, 0 > %conv = zext i1 %cmp to i32 > ret i32 %conv > } > > Although the translation is very straightforward, there's a problem: %shr1 > is a poison value and although it is not selected, it propagates through > the rest of the function and poisons the return value. > > Obviously things are not going well if the source program contains no UB > and the emitted code returns a poison value. Either there has been a > miscompilation or else the LLVM documentation should be changed to make it > clear that the select instruction only has a value dependence on the value > that is selected. > > The immediate problem here is that both Souper and Alive are happy for > foo() to return 0 instead of 1. Souper actually does this optimization -- > legally, of course, according to the reference manual. > > Anyway this issue seemed tricky enough to bring up here instead of just > filing a PR. > > Thanks, > > John > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140911/3f562d4b/attachment.html>
> Poison is a flawed concept. I proved it was flawed back in 2011 [0]Nice. My colleagues and I will dig through this material and possibly come back with some ideas. We're going to need some sort of semantics for UB in LLVM since otherwise these formal-methods-based tools like Souper and Alive risk not making sense. Thanks, John