> On Jun 29, 2017, at 4:39 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > On 06/28/2017 05:33 PM, Peter Lawrence wrote: >> Chandler, >> where we disagree is in whether the current project is moving the issue >> forward. It is not. It is making the compiler more complex for no additional value. >> >> The current project is not based in evidence, I have asked for any SPEC benchmark >> that shows performance gain by the compiler taking advantage of “undefined behavior” >> and no one can show that. > > I can't comment on SPEC, but this does remind me of code I was working on recently. To abstract the relevant parts, it looked something like this: > > template <typename T> > int do_something(T mask, bool cond) { > if (mask & 2) > return 1; > > if (cond) { > T high_mask = mask >> 48; > if (high_mask > 5) > do_something_1(high_mask); > else if (high_mask > 3) > do_something_2(); > } > > return 0; > } > > This function ended up being instantiated on different types T (e.g. unsigned char, unsigned int, unsigned long, etc.) and, dynamically, cond was always false when T was char. The question is: Can the compiler eliminate all of the code predicated on cond for the smaller types? In this case, this code was hot, and moreover, performance depended on the fact that, for T = unsigned char, the function was inlined and the branch on cond was eliminated. In the relevant translation unit, however, the compiler would never see how cond was set. > > Luckily, we do the right thing here currently. In the case where T = unsigned char, we end up folding both of the high_mask tests as though they were false. That entire part of the code is eliminated, the function is inlined, and everyone is happy. > > Why was I looking at this? As it turns out, if the 'else if' in this example is just 'else', we don't actually eliminate both sides of the branch. The same is true for many other variants of the conditionals (i.e. 
we don't recognize all of the code as dead).

I apologize in advance if I have missed something here and am misreading your example...

This doesn’t make sense to me: a shift amount of 48 is “undefined” for unsigned char. How do we know this isn’t a source code bug? What makes us think the user intended the result to be “0”?

This strikes me as odd: we are mis-interpreting the user’s code in such a way as to improve performance, but that isn’t necessarily what the user intended.

Here’s one way to look at this issue: if something is “C undefined behavior”, then the standard says, among other things, that we could trap here. Why aren’t we doing that rather than optimizing it?

Here’s another way to look at it: no one has ever filed a bug that reads “I used undefined behavior in my program, but the optimizer isn’t taking advantage of it”. But if they do, I think the response should be “you should not expect that; the standard says nothing positive about what undefined behavior does”.

> Once we have a self-consistent model for undef, we should be able to fix that. The user was confused, however, why seemingly innocuous changes to the code changed the performance characteristics of their application. The proposed semantics by John, et al. should fix this uniformly.
>
> In any case, to your point about:
>
>> if (a == a)
>>   S;
>
> I have the same thought. If a == undef here, the code should be dead. Dead code must be aggressively dropped to enable inlining and further optimization. This is an important way we eliminate abstraction penalties. Dead code also has costs in terms of register allocation, speculative execution, inlining, etc.

And yet IIRC Sanjoy in his last email was arguing for consistent behavior in cases like

if (x != 0) {
  /* we can optimize in the then-clause assuming x != 0 */
}

and in the case above when it is a function that gets inlined.

Here’s what Sanjoy said about the function-inline case:

> This too is fixed in the semantics mentioned in the paper.
> This also isn't new to us; it is covered in section 3.1 "Duplicate SSA Uses".

So this issue seems to be up in the air.

> I've also seen cases where templated types are used with fixed-sized arrays where the compiler leveraged knowledge of UB on uninitialized values and out-of-bounds accesses to eliminate unnecessary parts of the code. In short, "optimizing on undefined behavior" can end up being an important tool.

As you can tell from my first comments, I am not yet convinced, and would still like to see real evidence.

Peter Lawrence.

> -Hal
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
On Thu, Jun 29, 2017 at 08:41:59AM -0700, Peter Lawrence via llvm-dev wrote:

> This doesn’t make sense to me, a shift amount of 48 is “undefined” for unsigned char,
> How do we know this isn’t a source code bug,
> What makes us think the user intended the result to be “0”.

It is a source bug, if the code is ever executed. This is in fact a class of real-world bugs, as CPUs *do* implement overly large shifts differently. There are two different views here:

(1) It obviously means the result should be zero, since all bits are shifted out.
(2) It is faster to just mask the shift amount and avoid any compares in the ALU. It has the nice side effect of simplifying rotation in software.

ARM and x86 are examples of those two views, respectively.

Joerg
On 06/29/2017 10:41 AM, Peter Lawrence wrote:> >> On Jun 29, 2017, at 4:39 AM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> On 06/28/2017 05:33 PM, Peter Lawrence wrote: >>> Chandler, >>> where we disagree is in whether the current project >>> is moving the issue >>> forward. It is not. It is making the compiler more complex for no >>> additional value. >>> >>> The current project is not based in evidence, I have asked for any >>> SPEC benchmark >>> that shows performance gain by the compiler taking advantage of >>> “undefined behavior” >>> and no one can show that. >> >> I can't comment on SPEC, but this does remind me of code I was >> working on recently. To abstract the relevant parts, it looked >> something like this: >> >> template <typename T> >> int do_something(T mask, bool cond) { >> if (mask & 2) >> return 1; >> >> if (cond) { >> T high_mask = mask >> 48; >> if (high_mask > 5) >> do_something_1(high_mask); >> else if (high_mask > 3) >> do_something_2(); >> } >> >> return 0; >> } >> >> This function ended up being instantiated on different types T (e.g. >> unsigned char, unsigned int, unsigned long, etc.) and, dynamically, >> cond was always false when T was char. The question is: Can the >> compiler eliminate all of the code predicated on cond for the smaller >> types? In this case, this code was hot, and moreover, performance >> depended on the fact that, for T = unsigned char, the function was >> inlined and the branch on cond was eliminated. In the relevant >> translation unit, however, the compiler would never see how cond was set. >> >> Luckily, we do the right thing here currently. In the case where T = >> unsigned char, we end up folding both of the high_mask tests as >> though they were false. That entire part of the code is eliminated, >> the function is inlined, and everyone is happy. >> >> Why was I looking at this? 
>> As it turns out, if the 'else if' in this example is just 'else', we don't actually eliminate both sides of the branch. The same is true for many other variants of the conditionals (i.e. we don't recognize all of the code as dead).
>
> I apologize in advance if I have missed something here and am misreading your example...
>
> This doesn’t make sense to me, a shift amount of 48 is “undefined” for unsigned char,
> How do we know this isn’t a source code bug,
> What makes us think the user intended the result to be “0”.

As I said, this is a representation of what the real code did, and looked like, after other inlining had taken place, etc. In the original form, the user's intent was clear. That code is never executed when T is a small integer type.

> This strikes me as odd, we are mis-interpreting the user’s code
> in such a way so as to improve performance, but that isn’t necessarily
> what the user intended.

That is exactly what the user intended. That's why I brought it up as an example.

> Here’s one way to look at this issue, if something is “C undefined behavior” then
> the standard says, among other things, that we could trap here
> Why aren’t we doing that rather than optimizing it?

We could. In fact, we have great tools (UBSan, ASan, etc.) that will instrument the code to do exactly that.

> Here’s another way to look at it, no one has ever filed a bug that reads
> “I used undefined behavior in my program, but the optimizer isn’t
> taking advantage of it”

You say that as though it is true. It is not. Yes, users file bugs like that (although they don't often phrase it as "undefined behavior", but rather, "the compiler should figure out that...", and often, taking advantage of UB is the only available way for the compiler to figure that thing out).
Type-based aliasing rules are another common case where this UB-exploitation comes up (although not in a way that directly deals with undef/poison).

> But if they do I think the response should be
> “you should not expect that, standard says nothing positive about what
> undefined behavior does"

And, of course, often we do have to tell our users that the compiler has no way to figure something out. When we have a tool, and sometimes that tool is exploiting our assumptions that UB does not happen, then we use it. You may disagree with decisions to exploit certain classes of UB in certain situations, and that's fine. We do use our professional judgment and experience to draw a line somewhere in this regard.

>> Once we have a self-consistent model for undef, we should be able to
>> fix that. The user was confused, however, why seemingly innocuous
>> changes to the code changed the performance characteristics of their
>> application. The proposed semantics by John, et al. should fix this
>> uniformly.
>>
>> In any case, to your point about:
>>
>>> if (a == a)
>>>   S;
>>
>> I have the same thought. If a == undef here, the code should be dead.
>> Dead code must be aggressively dropped to enable inlining and further
>> optimization. This is an important way we eliminate abstraction
>> penalties. Dead code also has costs in terms of register allocation,
>> speculative execution, inlining, etc.
>
> And yet IIRC Sanjoy in his last email was arguing for consistent
> behavior in cases like
>
> if (x != 0) {
>   /* we can optimize in the then-clause assuming x != 0 */
> }
>
> and in the case above when it is a function that gets inlined

I don't believe these are contradictory statements. In the proposed semantics, we get to assume that branching on poison is UB, and thus, doesn't happen.
So, if it were inevitable on some code path, that code path must be dead.

> Here’s what Sanjoy said about the function-inline case
>
>> This too is fixed in the semantics mentioned in the paper. This also
>> isn't new to us, it is covered in section 3.1 "Duplicate SSA Uses".
>
> So this issue seems to be up in the air
>
>> I've also seen cases where templated types are used with fixed-sized
>> arrays where the compiler leveraged knowledge of UB on
>> uninitialized values and out-of-bounds accesses to eliminate
>> unnecessary parts of the code. In short, "optimizing on undefined
>> behavior" can end up being an important tool.
>
> As you can tell from my first comments, I am not yet convinced, and
> would still like to see real evidence

I understand. However, to say that it is not useful to optimize based on UB, even explicit UB, or that this is never something that users desire, is not true.

 -Hal

> Peter Lawrence.
>
>> -Hal
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
Hi Peter,

On Thu, Jun 29, 2017 at 8:41 AM, Peter Lawrence <peterl95124 at sbcglobal.net> wrote:

> Here’s another way to look at it, no one has ever filed a bug that reads
> “I used undefined behavior in my program, but the optimizer isn’t taking advantage of it”
> But if they do I think the response should be
> “you should not expect that, standard says nothing positive about what undefined behavior does"

Of course no one would file such a bug (since if your program has UB, the first thing you do is fix your program). However, there are plenty of bugs where people complain about: "LLVM does not optimize my (UB-free) program under the assumption that it does not have UB" (which is what poison allows):

https://bugs.llvm.org/show_bug.cgi?id=28429
https://groups.google.com/forum/#!topic/llvm-dev/JGsDrfvS5wc

> Once we have a self-consistent model for undef, we should be able to fix
> that. The user was confused, however, why seemingly innocuous changes to the
> code changed the performance characteristics of their application. The
> proposed semantics by John, et al. should fix this uniformly.
>
> In any case, to your point about:
>
> if (a == a)
>   S;
>
> I have the same thought. If a == undef here, the code should be dead. Dead
> code must be aggressively dropped to enable inlining and further
> optimization. This is an important way we eliminate abstraction penalties.
> Dead code also has costs in terms of register allocation, speculative
> execution, inlining, etc.
>
> And yet IIRC Sanjoy in his last email was arguing for consistent behavior
> in cases like
>
> if (x != 0) {
>   /* we can optimize in the then-clause assuming x != 0 */
> }
>
> and in the case above when it is a function that gets inlined
>
> Here’s what Sanjoy said about the function-inline case
>
>> This too is fixed in the semantics mentioned in the paper. This also
>> isn't new to us, it is covered in section 3.1 "Duplicate SSA Uses".
> So this issue seems to be up in the air

This issue is *not* up in the air -- the paper addresses this problem in the new semantics in the way Hal described: since "if (poison == poison)" is explicitly UB in the new semantics, we will be able to aggressively drop the comparison and everything that it dominates.

> I've also seen cases where templated types are used with fixed-sized arrays
> where the compiler leveraged knowledge of UB on uninitialized values and
> out-of-bounds accesses to eliminate unnecessary parts of the code. In short,
> "optimizing on undefined behavior" can end up being an important tool.
>
> As you can tell from my first comments, I am not yet convinced, and would
> still like to see real evidence

I'm not sure why what Hal mentioned does not count as real evidence. The things he mentioned are cases where "exploiting" undefined behavior results in smaller code size and better performance.

-- Sanjoy
> On Jun 29, 2017, at 9:32 AM, Hal Finkel <hfinkel at anl.gov> wrote: > > > On 06/29/2017 10:41 AM, Peter Lawrence wrote: >> >>> On Jun 29, 2017, at 4:39 AM, Hal Finkel <hfinkel at anl.gov <mailto:hfinkel at anl.gov>> wrote: >>> >>> On 06/28/2017 05:33 PM, Peter Lawrence wrote: >>>> Chandler, >>>> where we disagree is in whether the current project is moving the issue >>>> forward. It is not. It is making the compiler more complex for no additional value. >>>> >>>> The current project is not based in evidence, I have asked for any SPEC benchmark >>>> that shows performance gain by the compiler taking advantage of “undefined behavior” >>>> and no one can show that. >>> >>> I can't comment on SPEC, but this does remind me of code I was working on recently. To abstract the relevant parts, it looked something like this: >>> >>> template <typename T> >>> int do_something(T mask, bool cond) { >>> if (mask & 2) >>> return 1; >>> >>> if (cond) { >>> T high_mask = mask >> 48; >>> if (high_mask > 5) >>> do_something_1(high_mask); >>> else if (high_mask > 3) >>> do_something_2(); >>> } >>> >>> return 0; >>> } >>> >>> This function ended up being instantiated on different types T (e.g. unsigned char, unsigned int, unsigned long, etc.) and, dynamically, cond was always false when T was char. The question is: Can the compiler eliminate all of the code predicated on cond for the smaller types? In this case, this code was hot, and moreover, performance depended on the fact that, for T = unsigned char, the function was inlined and the branch on cond was eliminated. In the relevant translation unit, however, the compiler would never see how cond was set. >>> >>> Luckily, we do the right thing here currently. In the case where T = unsigned char, we end up folding both of the high_mask tests as though they were false. That entire part of the code is eliminated, the function is inlined, and everyone is happy. >>> >>> Why was I looking at this? 
>>> As it turns out, if the 'else if' in this example is just 'else', we don't actually eliminate both sides of the branch. The same is true for many other variants of the conditionals (i.e. we don't recognize all of the code as dead).
>>
>> I apologize in advance if I have missed something here and am misreading your example...
>>
>> This doesn’t make sense to me, a shift amount of 48 is “undefined” for unsigned char,
>> How do we know this isn’t a source code bug,
>> What makes us think the user intended the result to be “0”.
>
> As I said, this is a representation of what the real code did, and looked like, after other inlining had taken place, etc. In the original form, the user's intent was clear. That code is never executed when T is a small integer type.

I will still have a hard time believing this until I see a real example; can you fill in the details?

Peter Lawrence.
On 6/29/17 9:41 AM, Peter Lawrence via llvm-dev wrote:

> This doesn’t make sense to me, a shift amount of 48 is “undefined” for
> unsigned char,
> How do we know this isn’t a source code bug,
> What makes us think the user intended the result to be “0”.
>
> This strikes me as odd, we are mis-interpreting the user’s code
> in such a way so as to improve performance, but that isn’t necessarily
> what the user intended.

The quoted text above is indicative of a serious misunderstanding, and I would like to stop it from leading anyone else astray.

The error is in thinking that we should consider the intent of a developer when we decide which optimizations to perform. That isn't how this works. LLVM code has a mathematical meaning: it describes computations. Any transformation that we do is either mathematically correct or it isn't.

A transformation is correct when it refines the meaning of a piece of IR. Refinement mostly means "preserves equivalence", but not quite, because it also allows undefined behaviors to be removed. For example, "add nsw" is not equivalent to "add", but an "add nsw" can always be turned into an "add". The opposite transformation is only permissible when the add can be proven not to overflow.

This is like the laws of physics for compiler optimizations; it is not open to debate.

The place to consider developer intent, if one wanted to do that, is in the frontend that generates IR. If we don't want undef or poison to ever happen, then we must make the frontend generate IR that includes appropriate checks in front of operations that are sometimes undefined. To do this we have sanitizers and safe programming languages.

SUMMARY: The intent, whatever it is, must be translated into IR. The LLVM middle end and backends are then obligated to preserve that meaning. They generally do this extremely well. But they are not, and must not be, obligated to infer the mental state of the developer who wrote the code that is being translated.

John
> On Jun 29, 2017, at 10:38 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> Hi Peter,
>
> On Thu, Jun 29, 2017 at 8:41 AM, Peter Lawrence <peterl95124 at sbcglobal.net> wrote:
>> Here’s another way to look at it, no one has ever filed a bug that reads
>> “I used undefined behavior in my program, but the optimizer isn’t taking advantage of it”
>> But if they do I think the response should be
>> “you should not expect that, standard says nothing positive about what undefined behavior does"
>
> Of course no one would file such a bug (since if your program has UB,
> the first thing you do is fix your program). However, there are
> plenty of bugs where people complain about: "LLVM does not optimize my
> (UB-free) program under the assumption that it does not have UB"
> (which is what poison allows):

Sanjoy,
I copied the bug reports below for reference.

The way I read the first report is that the reason this problem exists today is the existence of “poison” in the LangRef, i.e., we have to call isSCEVExprNeverPoison(), which returns true in this case. So this is a reason to not have “poison”: “poison” is not enabling anything here; it is actually making it harder to do this optimization. This entire problem disappears in the alternate proposal (fix “undef”, delete “poison” and “freeze”, no hoisting of nsw attributes) because the function isSCEVExprNeverPoison() would be deleted.

The way I read the second report is that a special case needs to be added for "+nsw". This does not seem to have anything to do with “undef” or “poison”.

Note that I don’t consider optimizing “nsw” an example of optimizing “undefined behavior”; “nsw” is simply an attribute that we take advantage of. So I am still waiting for someone to come up with a real source code example where “undef” / “poison” occurs in the IR and the optimizer takes advantage.
Peter Lawrence.

> https://bugs.llvm.org/show_bug.cgi?id=28429

The following loop is not unrolled because the trip count is not a simple constant:

void foo(int x, int *p) {
  x += 2; // any value bigger than 1
  for (int i = x; i <= x+1; ++i)
    p[i] = i;
}

The attached IR is generated compiling the above source with:

./bin/clang --target=aarch64-arm-none-eabi -mllvm -debug-only=loop-unroll -O3

SCEV finds the trip count as:

(-2 + (-1 * %x) + ((2 + %x) smax (3 + %x)))

This expression can be simplified into a constant if the arguments of smax are not wrapping. But while the original source has non-wrapping expressions, they are not marked as such by SCEV. The reason is that SCEV considers these expressions possible poison values and marks them as wrapping. This seems to be a limitation in the llvm::ScalarEvolution::isSCEVExprNeverPoison function.

Analysis so far: The actual add instructions have the nsw flags and reside in a basic block just before the loop body. The isSCEVExprNeverPoison function is called on them but assumes the instruction must be part of a loop. If the instruction is not part of any loop, it is reported as a possible poison value. A possible fix would be to assume the instruction is loop independent if it is not part of any loop. But I am not sure if the loop information is complete (does absence of loop info on a basic block mean there is no loop in the code in the scalar evolution pass?). With such a fix the resulting expression would be:

(-2 + (-1 * %x) + ((2 + %x)<nsw> smax (3 + %x)<nsw>))

This is foldable, although the current folding expression constructors don’t handle this case yet.

> https://groups.google.com/forum/#!topic/llvm-dev/JGsDrfvS5wc

It looks like ScalarEvolution bails out of loop backedge computation if it cannot prove the IV stride as either positive or negative (based on the loop control condition).
I think this logic can be refined for signed IVs. Consider this simple loop:

void foo(int *A, int n, int s) {
  int i;
  for (i = 0; i < n; i += s) {
    A[i]++;
  }
}

The IV of this loop has this SCEV form:

{0,+,%s}<nsw><%for.body>

Can someone please clarify why it is not OK to deduce the stride to be positive, based on the assumption that the IV cannot have a signed underflow or overflow due to the presence of the NSW flag, since otherwise the program has undefined behavior?
On Thu, Jun 29, 2017 at 11:28 AM, John Regehr via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On 6/29/17 9:41 AM, Peter Lawrence via llvm-dev wrote: > > This doesn’t make sense to me, a shift amount of 48 is “undefined” for >> unsigned char, >> How do we know this isn’t a source code bug, >> What makes us think the the user intended the result to be “0”. >> >> This strikes me as odd, we are mis-interpreting the user’s code >> In such a way so as to improve performance, but that isn’t necessarily >> what the user intended. >> > > The quoted text above is indicative of a serious misunderstanding and I > would like to stop it from leading anyone else astray. > > The error is in thinking that we should consider the intent of a developer > when we decide which optimizations to perform. That isn't how this works. > LLVM code has a mathematical meaning: it describes computations. Any > transformation that we do is either mathematically correct or it isn't. > > A transformation is correct when it refines the meaning of a piece of IR. > Refinement mostly means "preserves equivalence" but not quite because it > also allows undefined behaviors to be removed. For example "add nsw" is not > equivalent to "add" but an "add nsw" can always be turned into an "add". > The opposite transformation is only permissible when the add can be proven > to not overflow. > > This is like the laws of physics for compiler optimizations, it is not > open to debate. > > The place to consider developer intent, if one wanted to do that, is in > the frontend that generates IR. If we don't want undef or poison to ever > happen, then we must make the frontend generate IR that includes > appropriate checks in front of operations that are sometimes undefined. To > do this we have sanitizers and safe programming languages. > > SUMMARY: The intent, whatever it is, must be translated into IR. The LLVM > middle end and backends are then obligated to preserve that meaning. 
> They generally do this extremely well. But they are not, and must not be,
> obligated to infer the mental state of the developer who wrote the code
> that is being translated.

Thanks so much for writing this, John. This is something that I always have to explain to my interns or other folks that I'm bringing up to speed on compiler development (or, on a bad day, to angry users :P). For some reason, it doesn't seem to be widely known or written down in very many (any?) places suitable for people new to the topic.

I especially like how you've phrased this as "the laws of physics for compiler optimizations"; I think I'll be stealing that, as it's a bit more memorable than "fundamental rule of compiler optimizations".

-- Sean Silva

> John
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> On Jun 29, 2017, at 9:32 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>>> On Jun 29, 2017, at 4:39 AM, Hal Finkel <hfinkel at anl.gov <mailto:hfinkel at anl.gov>> wrote:
>>>
>>> On 06/28/2017 05:33 PM, Peter Lawrence wrote:
>>>> Chandler,
>>>> where we disagree is in whether the current project is moving the issue
>>>> forward. It is not. It is making the compiler more complex for no additional value.
>>>>
>>>> The current project is not based in evidence, I have asked for any SPEC benchmark
>>>> that shows performance gain by the compiler taking advantage of “undefined behavior”
>>>> and no one can show that.
>>>
>>> I can't comment on SPEC, but this does remind me of code I was working on recently. To abstract the relevant parts, it looked something like this:
>>>
>>> template <typename T>
>>> int do_something(T mask, bool cond) {
>>>   if (mask & 2)
>>>     return 1;
>>>
>>>   if (cond) {
>>>     T high_mask = mask >> 48;
>>>     if (high_mask > 5)
>>>       do_something_1(high_mask);
>>>     else if (high_mask > 3)
>>>       do_something_2();
>>>   }
>>>
>>>   return 0;
>>> }

Hal,
yes, there are times when it is expedient to suppress or ignore warnings, but I don’t believe this is one of them. A suggestion could have been made to change the source code along the lines of

if (48 < 8*sizeof(mask) && cond)

sizeof is always a compile-time constant, so this would always be used to dead-code eliminate the offending block: no "undefined behavior”, no warning, and no performance issue.

So I respectfully remain skeptical until I see a real-world source code example where "optimizing away undefined behavior” is of benefit.

Peter Lawrence.