One way around this is to say that there are some special instructions, icmp, sext and zext which produce a value solely composed of poison bits if any of their input bits is poison. So `<poison> icmp X` is poison for any value of X, including INT_MAX. This is one way poison could be fundamentally different from undef. -- Sanjoy On Thu, Jan 29, 2015 at 8:05 PM, Matthias Braun <matze at braunis.de> wrote:> Having though about this some more I think optimizing > > (x+1 > x) <=> true > > and at the same time modeling undefined behavior as a posion value is impossible. This is because we also want the following rule: > > (x > INT_MAX) <=> false > > Now if poison is a value, then the second replacement tells us (poison > INT_MAX) == false which contradicts the first rule. > > > > The only way out of this while still allowing (x+1>x)<=>true I can see at the moment is defining that add nsw does produce actual undefined behavior allowing us to “freely rewrite" the following > to true in the UB cases. Of course having add produce actual undefined behaviour greatly limits us in the way we can actually move the instruction around without changing program semantics. > > The only way allowing moving instructions and having add produce real UB I can see is that as soon as we start moving instructions around (specifically moving the add to a place it does not dominate or moving any other instruction over the add) we change the add to an instruction that does not produce real UB anymore; something like add swu = “add signed wrap gives undef”... > > - Matthias > >> On Jan 29, 2015, at 11:29 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: >> >>> I've been discussing a model with David that might steer poison back towards >>> something that simply supports algebraic simplification. If we have a math >>> operation that cannot wrap, then it notionally produces as many bits of >>> undef as the operation could possibly produce. For example, "add nsw i8" can >>> produce an i9 undef, and "mul nsw i8" can produce an undefined 16 bit >>> bitpattern. This is strong enough to do things like "a + 1 > a --> true", >>> because on overflow of "a + 1" we can choose an poison value of "MAX_INT + >>> 1", even though that is not a valid i8 bit pattern. >>> >>> So, a short version would be that poison is like undef, except you get to >>> include the overflow bits of the computation in your undef value. >> >> I suspect it will be hard to justify reg2mem is meaning preserving in >> this scheme. But if this can be made to work, then I agree that this >> is the right thing to do -- an i32 poison effectively has the >> semantics of a wider type, and it makes sense to be explicit in that. >> >> -- Sanjoy >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
But (Poison > INT_MAX) <=> poison contradicts (X > INT_MAX) <=> false and I don't think you want to abandon the second rule just because x might be poison. - Matthias> On Jan 29, 2015, at 9:43 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > One way around this is to say that there are some special > instructions, icmp, sext and zext which produce a value solely > composed of poison bits if any of their input bits is poison. So > `<poison> icmp X` is poison for any value of X, including INT_MAX. > This is one way poison could be fundamentally different from undef. > > -- Sanjoy > >> On Thu, Jan 29, 2015 at 8:05 PM, Matthias Braun <matze at braunis.de> wrote: >> Having though about this some more I think optimizing >> >> (x+1 > x) <=> true >> >> and at the same time modeling undefined behavior as a posion value is impossible. This is because we also want the following rule: >> >> (x > INT_MAX) <=> false >> >> Now if poison is a value, then the second replacement tells us (poison > INT_MAX) == false which contradicts the first rule. >> >> >> >> The only way out of this while still allowing (x+1>x)<=>true I can see at the moment is defining that add nsw does produce actual undefined behavior allowing us to “freely rewrite" the following > to true in the UB cases. Of course having add produce actual undefined behaviour greatly limits us in the way we can actually move the instruction around without changing program semantics. >> >> The only way allowing moving instructions and having add produce real UB I can see is that as soon as we start moving instructions around (specifically moving the add to a place it does not dominate or moving any other instruction over the add) we change the add to an instruction that does not produce real UB anymore; something like add swu = “add signed wrap gives undef”... >> >> - Matthias >> >>>> On Jan 29, 2015, at 11:29 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: >>>> >>>> I've been discussing a model with David that might steer poison back towards >>>> something that simply supports algebraic simplification. If we have a math >>>> operation that cannot wrap, then it notionally produces as many bits of >>>> undef as the operation could possibly produce. For example, "add nsw i8" can >>>> produce an i9 undef, and "mul nsw i8" can produce an undefined 16 bit >>>> bitpattern. This is strong enough to do things like "a + 1 > a --> true", >>>> because on overflow of "a + 1" we can choose an poison value of "MAX_INT + >>>> 1", even though that is not a valid i8 bit pattern. >>>> >>>> So, a short version would be that poison is like undef, except you get to >>>> include the overflow bits of the computation in your undef value. >>> >>> I suspect it will be hard to justify reg2mem is meaning preserving in >>> this scheme. But if this can be made to work, then I agree that this >>> is the right thing to do -- an i32 poison effectively has the >>> semantics of a wider type, and it makes sense to be explicit in that. >>> >>> -- Sanjoy >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Thu, Jan 29, 2015 at 10:01 PM, Matthias Braun <matze at braunis.de> wrote:> But > (Poison > INT_MAX) <=> poison > contradicts > (X > INT_MAX) <=> false > > and I don't think you want to abandon the second rule just because x might be poison.Maybe we could define poison in such a way that it is safe to pretend it "is" false, as per our convenience. In this sense, could be defined to be very similar to undef. -- Sanjoy> > - Matthias > >> On Jan 29, 2015, at 9:43 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: >> >> One way around this is to say that there are some special >> instructions, icmp, sext and zext which produce a value solely >> composed of poison bits if any of their input bits is poison. So >> `<poison> icmp X` is poison for any value of X, including INT_MAX. >> This is one way poison could be fundamentally different from undef. >> >> -- Sanjoy >> >>> On Thu, Jan 29, 2015 at 8:05 PM, Matthias Braun <matze at braunis.de> wrote: >>> Having though about this some more I think optimizing >>> >>> (x+1 > x) <=> true >>> >>> and at the same time modeling undefined behavior as a posion value is impossible. This is because we also want the following rule: >>> >>> (x > INT_MAX) <=> false >>> >>> Now if poison is a value, then the second replacement tells us (poison > INT_MAX) == false which contradicts the first rule. >>> >>> >>> >>> The only way out of this while still allowing (x+1>x)<=>true I can see at the moment is defining that add nsw does produce actual undefined behavior allowing us to “freely rewrite" the following > to true in the UB cases. Of course having add produce actual undefined behaviour greatly limits us in the way we can actually move the instruction around without changing program semantics. >>> >>> The only way allowing moving instructions and having add produce real UB I can see is that as soon as we start moving instructions around (specifically moving the add to a place it does not dominate or moving any other instruction over the add) we change the add to an instruction that does not produce real UB anymore; something like add swu = “add signed wrap gives undef”... >>> >>> - Matthias >>> >>>>> On Jan 29, 2015, at 11:29 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: >>>>> >>>>> I've been discussing a model with David that might steer poison back towards >>>>> something that simply supports algebraic simplification. If we have a math >>>>> operation that cannot wrap, then it notionally produces as many bits of >>>>> undef as the operation could possibly produce. For example, "add nsw i8" can >>>>> produce an i9 undef, and "mul nsw i8" can produce an undefined 16 bit >>>>> bitpattern. This is strong enough to do things like "a + 1 > a --> true", >>>>> because on overflow of "a + 1" we can choose an poison value of "MAX_INT + >>>>> 1", even though that is not a valid i8 bit pattern. >>>>> >>>>> So, a short version would be that poison is like undef, except you get to >>>>> include the overflow bits of the computation in your undef value. >>>> >>>> I suspect it will be hard to justify reg2mem is meaning preserving in >>>> this scheme. But if this can be made to work, then I agree that this >>>> is the right thing to do -- an i32 poison effectively has the >>>> semantics of a wider type, and it makes sense to be explicit in that. >>>> >>>> -- Sanjoy >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Thu, Jan 29, 2015 at 9:43 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> One way around this is to say that there are some special > instructions, icmp, sext and zext which produce a value solely > composed of poison bits if any of their input bits is poison. So > `<poison> icmp X` is poison for any value of X, including INT_MAX. > This is one way poison could be fundamentally different from undef. >So far, this is the model I like the best, but I do still have some concern. The primary concern I have is that with this model, zext is no longer 100% equivalent to anyext + mask. Much like you say, you *could* implement zext that way, but once you convert them, the poison is lost. Maybe that's OK though. I'm curious what others think. I haven't really had enough time to fully explore this in my head. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150129/318772ed/attachment.html>