Here's a fun spin on this same topic (I can't file a bug at this moment since llvm.org is down). Consider: define i32 @x(i32* %x, i1* %y) { entry: br label %loop loop: %v = phi i32 [ 0 , %entry ], [ %v.inc, %exit.inner ], [ %v, %loop ] %c = load volatile i1, i1* %y br i1 %c, label %exit.inner, label %loop exit.inner: %c1 = load volatile i1, i1* %y %x.val = load i32, i32* %x %v.inc = add i32 %v, %x.val br i1 %c1, label %exit.real, label %loop exit.real: ret i32 %v } Now if %c is false every time, %x is never dereferenced. Since this is an infinitely looping program that does a volatile load in every iteration, it is well defined (going by what Hal said). However, opt -licm transforms this to ; ModuleID = '/Users/sanjoy/tmp/inf.ll' define i32 @x(i32* %x, i1* %y) { entry: %x.val = load i32, i32* %x br label %loop.outer loop.outer: ; preds = %exit.inner, %entry %v.ph = phi i32 [ %v.inc, %exit.inner ], [ 0, %entry ] br label %loop loop: ; preds = %loop.outer, %loop %c = load volatile i1, i1* %y br i1 %c, label %exit.inner, label %loop exit.inner: ; preds = %loop %c1 = load volatile i1, i1* %y %v.inc = add i32 %v.ph, %x.val br i1 %c1, label %exit.real, label %loop.outer exit.real: ; preds = %exit.inner %v.ph.lcssa = phi i32 [ %v.ph, %exit.inner ] ret i32 %v.ph.lcssa } where it unconditionally dereferences %x, effectively introducing UB if %x is not dereferenceable. The bug is in isGuaranteedToExecute. It assumes that if an instruction dominates all the loop exits then it will always be executed by the loop "on its way out". But there may never be a way out of the loop. -- Sanjoy
This is going to be pretty annoying. At the *LLVM IR* level, i'm not aware of us having guarantees that match the C/C++ standard rules. If we do, i don't see them documented :) Unless someone shows me different, this looks legal at the LLVM IR, and thus, a but in clang for generating LLVM ir that doesn't represent the semantics of hte language properly :) (I'm half-joking) Note at the LLVM IR level, t he only way to deal with this in general would be to mark every loop that has a non-computable set of bounds as maybe infinite. Then, using metadata that contains the set of rules or something, mark as finite those that meet those rules (ie it contains no volatile loads, and ... and ... etc). Note that execution and termination are, as sanjoy demonstrates, two separate issues completely. Given a program that says: while (a) volatile load while (b) nothing special The second loop is guaranteed to terminate in C++, but not guaranteed to *execute*. Unless the standard believes that termination implies execution as well ( (i can't find it anwhere) ) On Thu, Jul 9, 2015 at 12:29 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> Here's a fun spin on this same topic (I can't file a bug at this > moment since llvm.org is down). > > Consider: > > define i32 @x(i32* %x, i1* %y) { > entry: > br label %loop > > loop: > %v = phi i32 [ 0 , %entry ], [ %v.inc, %exit.inner ], [ %v, %loop ] > %c = load volatile i1, i1* %y > br i1 %c, label %exit.inner, label %loop > > exit.inner: > %c1 = load volatile i1, i1* %y > %x.val = load i32, i32* %x > %v.inc = add i32 %v, %x.val > br i1 %c1, label %exit.real, label %loop > > exit.real: > ret i32 %v > } > > Now if %c is false every time, %x is never dereferenced. Since this > is an infinitely looping program that does a volatile load in every > iteration, it is well defined (going by what Hal said). > > However, opt -licm transforms this to > > ; ModuleID = '/Users/sanjoy/tmp/inf.ll' > > define i32 @x(i32* %x, i1* %y) { > entry: > %x.val = load i32, i32* %x > br label %loop.outer > > loop.outer: ; preds = %exit.inner, %entry > %v.ph = phi i32 [ %v.inc, %exit.inner ], [ 0, %entry ] > br label %loop > > loop: ; preds = %loop.outer, %loop > %c = load volatile i1, i1* %y > br i1 %c, label %exit.inner, label %loop > > exit.inner: ; preds = %loop > %c1 = load volatile i1, i1* %y > %v.inc = add i32 %v.ph, %x.val > br i1 %c1, label %exit.real, label %loop.outer > > exit.real: ; preds = %exit.inner > %v.ph.lcssa = phi i32 [ %v.ph, %exit.inner ] > ret i32 %v.ph.lcssa > } > > where it unconditionally dereferences %x, effectively introducing UB > if %x is not dereferenceable. > > > The bug is in isGuaranteedToExecute. It assumes that if an > instruction dominates all the loop exits then it will always be > executed by the loop "on its way out". But there may never be a way > out of the loop. > > -- Sanjoy > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
I guess it's also worth pointing out that 9 years ago in GCC, starting to do this instead of treating all infinite loops as "always exiting" cost 3% geomean on SPEC. (How useful this is as a measure for LLVM, no idea :P) On Thu, Jul 9, 2015 at 1:26 PM, Daniel Berlin <dberlin at dberlin.org> wrote:> This is going to be pretty annoying. > At the *LLVM IR* level, i'm not aware of us having guarantees that > match the C/C++ standard rules. > If we do, i don't see them documented :) > > Unless someone shows me different, this looks legal at the LLVM IR, > and thus, a but in clang for generating LLVM ir that doesn't represent > the semantics of hte language properly :) > > (I'm half-joking) > > Note at the LLVM IR level, t he only way to deal with this in general > would be to mark every loop that has a non-computable set of bounds as > maybe infinite. > Then, using metadata that contains the set of rules or something, mark > as finite those that meet those rules (ie it contains no volatile > loads, and ... and ... etc). > > Note that execution and termination are, as sanjoy demonstrates, two > separate issues completely. > > Given a program that says: > > while (a) > volatile load > > while (b) > nothing special > > > > The second loop is guaranteed to terminate in C++, but not guaranteed > to *execute*. > > Unless the standard believes that termination implies execution as > well ( (i can't find it anwhere) ) > > > > > > On Thu, Jul 9, 2015 at 12:29 PM, Sanjoy Das > <sanjoy at playingwithpointers.com> wrote: >> Here's a fun spin on this same topic (I can't file a bug at this >> moment since llvm.org is down). >> >> Consider: >> >> define i32 @x(i32* %x, i1* %y) { >> entry: >> br label %loop >> >> loop: >> %v = phi i32 [ 0 , %entry ], [ %v.inc, %exit.inner ], [ %v, %loop ] >> %c = load volatile i1, i1* %y >> br i1 %c, label %exit.inner, label %loop >> >> exit.inner: >> %c1 = load volatile i1, i1* %y >> %x.val = load i32, i32* %x >> %v.inc = add i32 %v, %x.val >> br i1 %c1, label %exit.real, label %loop >> >> exit.real: >> ret i32 %v >> } >> >> Now if %c is false every time, %x is never dereferenced. Since this >> is an infinitely looping program that does a volatile load in every >> iteration, it is well defined (going by what Hal said). >> >> However, opt -licm transforms this to >> >> ; ModuleID = '/Users/sanjoy/tmp/inf.ll' >> >> define i32 @x(i32* %x, i1* %y) { >> entry: >> %x.val = load i32, i32* %x >> br label %loop.outer >> >> loop.outer: ; preds = %exit.inner, %entry >> %v.ph = phi i32 [ %v.inc, %exit.inner ], [ 0, %entry ] >> br label %loop >> >> loop: ; preds = %loop.outer, %loop >> %c = load volatile i1, i1* %y >> br i1 %c, label %exit.inner, label %loop >> >> exit.inner: ; preds = %loop >> %c1 = load volatile i1, i1* %y >> %v.inc = add i32 %v.ph, %x.val >> br i1 %c1, label %exit.real, label %loop.outer >> >> exit.real: ; preds = %exit.inner >> %v.ph.lcssa = phi i32 [ %v.ph, %exit.inner ] >> ret i32 %v.ph.lcssa >> } >> >> where it unconditionally dereferences %x, effectively introducing UB >> if %x is not dereferenceable. >> >> >> The bug is in isGuaranteedToExecute. It assumes that if an >> instruction dominates all the loop exits then it will always be >> executed by the loop "on its way out". But there may never be a way >> out of the loop. >> >> -- Sanjoy >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
----- Original Message -----> From: "Daniel Berlin" <dberlin at dberlin.org> > To: "Sanjoy Das" <sanjoy at playingwithpointers.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, July 9, 2015 3:26:32 PM > Subject: Re: [LLVMdev] readonly and infinite loops > > This is going to be pretty annoying. > At the *LLVM IR* level, i'm not aware of us having guarantees that > match the C/C++ standard rules. > If we do, i don't see them documented :) > > Unless someone shows me different, this looks legal at the LLVM IR, > and thus, a but in clang for generating LLVM ir that doesn't > represent > the semantics of hte language properly :) > > (I'm half-joking) > > Note at the LLVM IR level, t he only way to deal with this in general > would be to mark every loop that has a non-computable set of bounds > as > maybe infinite. > Then, using metadata that contains the set of rules or something, > mark > as finite those that meet those rules (ie it contains no volatile > loads, and ... and ... etc). > > Note that execution and termination are, as sanjoy demonstrates, two > separate issues completely. > > Given a program that says: > > while (a) > volatile load > > while (b) > nothing special > > > > The second loop is guaranteed to terminate in C++, but not guaranteed > to *execute*. > > Unless the standard believes that termination implies execution as > well ( (i can't find it anwhere) ) >I don't believe that it does. -Hal> > > > > On Thu, Jul 9, 2015 at 12:29 PM, Sanjoy Das > <sanjoy at playingwithpointers.com> wrote: > > Here's a fun spin on this same topic (I can't file a bug at this > > moment since llvm.org is down). > > > > Consider: > > > > define i32 @x(i32* %x, i1* %y) { > > entry: > > br label %loop > > > > loop: > > %v = phi i32 [ 0 , %entry ], [ %v.inc, %exit.inner ], [ %v, %loop > > ] > > %c = load volatile i1, i1* %y > > br i1 %c, label %exit.inner, label %loop > > > > exit.inner: > > %c1 = load volatile i1, i1* %y > > %x.val = load i32, i32* %x > > %v.inc = add i32 %v, %x.val > > br i1 %c1, label %exit.real, label %loop > > > > exit.real: > > ret i32 %v > > } > > > > Now if %c is false every time, %x is never dereferenced. Since > > this > > is an infinitely looping program that does a volatile load in every > > iteration, it is well defined (going by what Hal said). > > > > However, opt -licm transforms this to > > > > ; ModuleID = '/Users/sanjoy/tmp/inf.ll' > > > > define i32 @x(i32* %x, i1* %y) { > > entry: > > %x.val = load i32, i32* %x > > br label %loop.outer > > > > loop.outer: ; preds > > %exit.inner, %entry > > %v.ph = phi i32 [ %v.inc, %exit.inner ], [ 0, %entry ] > > br label %loop > > > > loop: ; preds > > %loop.outer, %loop > > %c = load volatile i1, i1* %y > > br i1 %c, label %exit.inner, label %loop > > > > exit.inner: ; preds = %loop > > %c1 = load volatile i1, i1* %y > > %v.inc = add i32 %v.ph, %x.val > > br i1 %c1, label %exit.real, label %loop.outer > > > > exit.real: ; preds > > %exit.inner > > %v.ph.lcssa = phi i32 [ %v.ph, %exit.inner ] > > ret i32 %v.ph.lcssa > > } > > > > where it unconditionally dereferences %x, effectively introducing > > UB > > if %x is not dereferenceable. > > > > > > The bug is in isGuaranteedToExecute. It assumes that if an > > instruction dominates all the loop exits then it will always be > > executed by the loop "on its way out". But there may never be a > > way > > out of the loop. > > > > -- Sanjoy > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
----- Original Message -----> From: "Sanjoy Das" <sanjoy at playingwithpointers.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Nick Lewycky" <nlewycky at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Jeremy Lakeman" > <Jeremy.Lakeman at gmail.com>, "Nuno Lopes" <nunoplopes at sapo.pt> > Sent: Thursday, July 9, 2015 2:29:22 PM > Subject: Re: [LLVMdev] readonly and infinite loops > > Here's a fun spin on this same topic (I can't file a bug at this > moment since llvm.org is down). > > Consider: > > define i32 @x(i32* %x, i1* %y) { > entry: > br label %loop > > loop: > %v = phi i32 [ 0 , %entry ], [ %v.inc, %exit.inner ], [ %v, %loop ] > %c = load volatile i1, i1* %y > br i1 %c, label %exit.inner, label %loop > > exit.inner: > %c1 = load volatile i1, i1* %y > %x.val = load i32, i32* %x > %v.inc = add i32 %v, %x.val > br i1 %c1, label %exit.real, label %loop > > exit.real: > ret i32 %v > } > > Now if %c is false every time, %x is never dereferenced. Since this > is an infinitely looping program that does a volatile load in every > iteration, it is well defined (going by what Hal said). > > However, opt -licm transforms this to > > ; ModuleID = '/Users/sanjoy/tmp/inf.ll' > > define i32 @x(i32* %x, i1* %y) { > entry: > %x.val = load i32, i32* %x > br label %loop.outer > > loop.outer: ; preds > %exit.inner, %entry > %v.ph = phi i32 [ %v.inc, %exit.inner ], [ 0, %entry ] > br label %loop > > loop: ; preds > %loop.outer, %loop > %c = load volatile i1, i1* %y > br i1 %c, label %exit.inner, label %loop > > exit.inner: ; preds = %loop > %c1 = load volatile i1, i1* %y > %v.inc = add i32 %v.ph, %x.val > br i1 %c1, label %exit.real, label %loop.outer > > exit.real: ; preds > %exit.inner > %v.ph.lcssa = phi i32 [ %v.ph, %exit.inner ] > ret i32 %v.ph.lcssa > } > > where it unconditionally dereferences %x, effectively introducing UB > if %x is not dereferenceable. > > > The bug is in isGuaranteedToExecute. It assumes that if an > instruction dominates all the loop exits then it will always be > executed by the loop "on its way out". But there may never be a way > out of the loop.That's unfortunately correct. This is a bug. -Hal> > -- Sanjoy >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory