thr3ads.net - llvm dev - [LLVMdev] poison and select [Sep 2014]

If this information is useful, please help other people find it:
Share via:

John Regehr

2014-Sep-09 13:50 UTC

[LLVMdev] poison and select

In the section about poison values, the LLVM language reference manual says:

   "Values other than phi nodes depend on their operands."

This implies that a select instruction's output can be poisoned by its 
not-selected argument value. If select were poisoned only by its 
selected argument, we would expect this fact to be mentioned 
specifically, as it is for phi.

Next I'll show how poisoning by the not-selected value can introduce a 
problem. Take this C program:

int printf(const char *, ...);
int foo(int x0) {
   int x3 = x0 >> 27;
   int x4 = x3 - 27;
   int x2 = x4 ? x4 : (1 >> x4);
   int x1 = x2 != 0;
   return x1;
}
int arg;
int main (void) {
   int x1 = foo(arg);
   printf("%x\n", x1);
   return 0;
}

This program has no UB and it should print "1'.

Clang at -O2 turns foo() into this:

define i32 @foo(i32 %x0) #0 {
entry:
   %shr = ashr i32 %x0, 27
   %sub = add nsw i32 %shr, -27
   %tobool = icmp ne i32 %sub, 0
   %shr1 = lshr i32 1, %sub
   %cond = select i1 %tobool, i32 %sub, i32 %shr1
   %cmp = icmp ne i32 %cond, 0
   %conv = zext i1 %cmp to i32
   ret i32 %conv
}

Although the translation is very straightforward, there's a problem: 
%shr1 is a poison value and although it is not selected, it propagates 
through the rest of the function and poisons the return value.

Obviously things are not going well if the source program contains no UB 
and the emitted code returns a poison value. Either there has been a 
miscompilation or else the LLVM documentation should be changed to make 
it clear that the select instruction only has a value dependence on the 
value that is selected.

The immediate problem here is that both Souper and Alive are happy for 
foo() to return 0 instead of 1. Souper actually does this optimization 
-- legally, of course, according to the reference manual.

Anyway this issue seemed tricky enough to bring up here instead of just 
filing a PR.

Thanks,

John

David Majnemer

2014-Sep-10 14:39 UTC

head link

[LLVMdev] poison and select

On Tue, Sep 9, 2014 at 9:50 AM, John Regehr <regehr at cs.utah.edu> wrote:
> In the section about poison values, the LLVM language reference manual
> says:
>
>   "Values other than phi nodes depend on their operands."
>
> This implies that a select instruction's output can be poisoned by its
> not-selected argument value. If select were poisoned only by its selected
> argument, we would expect this fact to be mentioned specifically, as it is
> for phi.
>
> Next I'll show how poisoning by the not-selected value can introduce a
> problem. Take this C program:
>
> int printf(const char *, ...);
> int foo(int x0) {
>   int x3 = x0 >> 27;
>   int x4 = x3 - 27;
>   int x2 = x4 ? x4 : (1 >> x4);
>   int x1 = x2 != 0;
>   return x1;
> }
> int arg;
> int main (void) {
>   int x1 = foo(arg);
>   printf("%x\n", x1);
>   return 0;
> }
>
> This program has no UB and it should print "1'.
>
> Clang at -O2 turns foo() into this:
>
> define i32 @foo(i32 %x0) #0 {
> entry:
>   %shr = ashr i32 %x0, 27
>   %sub = add nsw i32 %shr, -27
>   %tobool = icmp ne i32 %sub, 0
>   %shr1 = lshr i32 1, %sub
>   %cond = select i1 %tobool, i32 %sub, i32 %shr1
>   %cmp = icmp ne i32 %cond, 0
>   %conv = zext i1 %cmp to i32
>   ret i32 %conv
> }
>
> Although the translation is very straightforward, there's a problem:
%shr1
> is a poison value and although it is not selected, it propagates through
> the rest of the function and poisons the return value.
>
> Obviously things are not going well if the source program contains no UB
> and the emitted code returns a poison value. Either there has been a
> miscompilation or else the LLVM documentation should be changed to make it
> clear that the select instruction only has a value dependence on the value
> that is selected.
>
While the documentation may not match the facts on the
ground, InstructionSimplify does (in my opinion) the right thing; an undef
operand doesn't mean the entire SelectInst folds away to undef:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=217342&view=markup#l2880

>
> The immediate problem here is that both Souper and Alive are happy for
> foo() to return 0 instead of 1. Souper actually does this optimization --
> legally, of course, according to the reference manual.
>
> Anyway this issue seemed tricky enough to bring up here instead of just
> filing a PR.
>
> Thanks,
>
> John
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140910/3c9769ac/attachment.html>

John Regehr

2014-Sep-10 15:19 UTC

head link

[LLVMdev] poison and select

> While the documentation may not match the facts on the
ground, InstructionSimplify does (in my opinion) the
> right thing; an undef operand doesn't mean the entire SelectInst folds
away to undef:
>
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=217342&view=m
> arkup#l2880
It seems pretty clear that this is the right semantics for select.

If we believe the documentation, then select is only useful when we can 
prove that it won't introduce a spurious poison value. This will not often 
be the case.

I'll go file a bug against the LLVM instruction reference to remind 
someone to tweak the doc. My proposed fix would be:

* Values other than phi nodes and select instructions depend on their 
operands.

* Phi nodes depend on the operand corresponding to their dynamic 
predecessor basic block.

* Select instructions depend on their selected operand.

John

Dan Gohman

2014-Sep-11 21:46 UTC

head link

[LLVMdev] poison and select

Original inventor of the poison concept here.

Poison is a flawed concept. I proved it was flawed back in 2011 [0]

[0] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-December/046368.html

And since then, other holes have been found. I'm not aware of any serious
proposals for its replacement that do not have similar problems.

Also note that poison is very different from undef. Undef, as far as I
know, remains an entirely coherent concept.

Dan

On Tue, Sep 9, 2014 at 6:50 AM, John Regehr <regehr at cs.utah.edu> wrote:
> In the section about poison values, the LLVM language reference manual
> says:
>
>   "Values other than phi nodes depend on their operands."
>
> This implies that a select instruction's output can be poisoned by its
> not-selected argument value. If select were poisoned only by its selected
> argument, we would expect this fact to be mentioned specifically, as it is
> for phi.
>
> Next I'll show how poisoning by the not-selected value can introduce a
> problem. Take this C program:
>
> int printf(const char *, ...);
> int foo(int x0) {
>   int x3 = x0 >> 27;
>   int x4 = x3 - 27;
>   int x2 = x4 ? x4 : (1 >> x4);
>   int x1 = x2 != 0;
>   return x1;
> }
> int arg;
> int main (void) {
>   int x1 = foo(arg);
>   printf("%x\n", x1);
>   return 0;
> }
>
> This program has no UB and it should print "1'.
>
> Clang at -O2 turns foo() into this:
>
> define i32 @foo(i32 %x0) #0 {
> entry:
>   %shr = ashr i32 %x0, 27
>   %sub = add nsw i32 %shr, -27
>   %tobool = icmp ne i32 %sub, 0
>   %shr1 = lshr i32 1, %sub
>   %cond = select i1 %tobool, i32 %sub, i32 %shr1
>   %cmp = icmp ne i32 %cond, 0
>   %conv = zext i1 %cmp to i32
>   ret i32 %conv
> }
>
> Although the translation is very straightforward, there's a problem:
%shr1
> is a poison value and although it is not selected, it propagates through
> the rest of the function and poisons the return value.
>
> Obviously things are not going well if the source program contains no UB
> and the emitted code returns a poison value. Either there has been a
> miscompilation or else the LLVM documentation should be changed to make it
> clear that the select instruction only has a value dependence on the value
> that is selected.
>
> The immediate problem here is that both Souper and Alive are happy for
> foo() to return 0 instead of 1. Souper actually does this optimization --
> legally, of course, according to the reference manual.
>
> Anyway this issue seemed tricky enough to bring up here instead of just
> filing a PR.
>
> Thanks,
>
> John
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140911/3f562d4b/attachment.html>

John Regehr

2014-Sep-12 02:17 UTC

head link

[LLVMdev] poison and select

> Poison is a flawed concept. I proved it was flawed back in 2011 [0]
Nice. My colleagues and I will dig through this material and possibly come 
back with some ideas. We're going to need some sort of semantics for UB in 
LLVM since otherwise these formal-methods-based tools like Souper and 
Alive risk not making sense.

Thanks,

John

llvm dev - Sep 2014 - [LLVMdev] poison and select

[LLVMdev] poison and select

[LLVMdev] poison and select

[LLVMdev] poison and select

[LLVMdev] poison and select

[LLVMdev] poison and select