> You dropped some context...> A daemon program wouldn't be readonly. An infinite loop can be.Right. To prevent miscommunication, here is a quick analysis of a problematic (IMO) example: We start with ``` define void @infloop(i1 %c) { entry: br i1 %c, label %l, label %e l: br label %l e: ret void } define void @main_func() { entry: call void @infloop(i1 1) ret void } ``` In this program `@main_func`, when called, will loop infinitely. If we run this program through `opt -functionattrs -prune-eh -early-cse`, we get ``` ; Function Attrs: nounwind readnone define void @infloop(i1 %c) #0 { entry: br i1 %c, label %l, label %e l: ; preds = %l, %entry br label %l e: ; preds = %entry ret void } ; Function Attrs: nounwind define void @main_func() #1 { entry: ret void } attributes #0 = { nounwind readnone } attributes #1 = { nounwind } ``` LLVM has optimized `@main_func` to return immediately. `-functionattrs` and `-prune-eh` infer `nounwind readnone` for `@infloop` and `-early-cse` delets such calls that are dead. There are three ways to justify what happened: 1. The optimization was correct. Infinite loops [1] in LLVM IR are UB and since the original program had UB, it can be optimized to anything. This is problematic since if this were true, we would have trouble writing frontends for programming languages that actually have a well defined `while(1);`. 2. The bug is in `-early-cse`: a `readnone nounwind` function can loop infinitely, so it is not correct to remove a dead call to such a function. 3. The bug is in `-functionattrs`: a function cannot be marked as `readonly` if it may loop infinitely. Needless to say, both (2) and (3) are "easy to fix"; the question is really about the deeper semantic issue about how LLVM treats `while(1);`. There are interesting related questions on the semantics of the equivalent C/C++ programs, but I personally am interested only in "pure" LLVM IR semantics. -- Sanjoy [1]: We could make the distinction between infinite loops that are empty vs. infinite loops that are non-empty but that's tricky -- if only empty infinite loops were UB then e.g. LICM gets harder, since sinking the store out of loops like `while(1) *addr = 42;` now introduces UB.
What's really I missing, I believe, is the equivalent of the GCC "pure" attribute - which explicitly forbids infinite loops. I think readnone + halting (like in the proposed 2010 patch) is good enough, though. Then every function emitted by a C/C++ FE could be marked as halting, and we could then fix bug [2] by requiring "readnone halting". One potential issue with this is that "readnone halting" isn't strong enough to allow speculative execution of functions (because they may have other UB), but I'm not entirely sure GCC "pure" allows that either. Regarding Java - what are the expected semantics of infinite loops without side-effects, per spec? It seems that for "trivial" infinite loops (while(1);) the FE should actually generate a compile-time error, per JLS 14.21: https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.21 The JLS does, however, explicitly claim that this is legal code: { int n = 5; while (n > 7) k = 2; } But I don't see what the semantics of this are. Does it actually mandate non-termination? Michael -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Sanjoy Das Sent: Sunday, June 28, 2015 09:50 To: Jeremy Lakeman Cc: LLVM Developers Mailing List Subject: Re: [LLVMdev] readonly and infinite loops> You dropped some context...> A daemon program wouldn't be readonly. An infinite loop can be.Right. To prevent miscommunication, here is a quick analysis of a problematic (IMO) example: We start with ``` define void @infloop(i1 %c) { entry: br i1 %c, label %l, label %e l: br label %l e: ret void } define void @main_func() { entry: call void @infloop(i1 1) ret void } ``` In this program `@main_func`, when called, will loop infinitely. If we run this program through `opt -functionattrs -prune-eh -early-cse`, we get ``` ; Function Attrs: nounwind readnone define void @infloop(i1 %c) #0 { entry: br i1 %c, label %l, label %e l: ; preds = %l, %entry br label %l e: ; preds = %entry ret void } ; Function Attrs: nounwind define void @main_func() #1 { entry: ret void } attributes #0 = { nounwind readnone } attributes #1 = { nounwind } ``` LLVM has optimized `@main_func` to return immediately. `-functionattrs` and `-prune-eh` infer `nounwind readnone` for `@infloop` and `-early-cse` delets such calls that are dead. There are three ways to justify what happened: 1. The optimization was correct. Infinite loops [1] in LLVM IR are UB and since the original program had UB, it can be optimized to anything. This is problematic since if this were true, we would have trouble writing frontends for programming languages that actually have a well defined `while(1);`. 2. The bug is in `-early-cse`: a `readnone nounwind` function can loop infinitely, so it is not correct to remove a dead call to such a function. 3. The bug is in `-functionattrs`: a function cannot be marked as `readonly` if it may loop infinitely. Needless to say, both (2) and (3) are "easy to fix"; the question is really about the deeper semantic issue about how LLVM treats `while(1);`. There are interesting related questions on the semantics of the equivalent C/C++ programs, but I personally am interested only in "pure" LLVM IR semantics. -- Sanjoy [1]: We could make the distinction between infinite loops that are empty vs. infinite loops that are non-empty but that's tricky -- if only empty infinite loops were UB then e.g. LICM gets harder, since sinking the store out of loops like `while(1) *addr = 42;` now introduces UB. _______________________________________________ LLVM Developers mailing list LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
> What's really I missing, I believe, is the equivalent of the GCC "pure" attribute - which explicitly forbids infinite loops. I think readnone + halting (like in the proposed 2010 patch) is good enough, though. Then every function emitted by a C/C++ FE could be marked as halting, and we could then fix bug [2] by requiring "readnone halting".This looks like the right fix to me too.> One potential issue with this is that "readnone halting" isn't strong enough to allow speculative execution of functions (because they may have other UB)Agreed.> Regarding Java - what are the expected semantics of infinite loops without side-effects, per spec? > It seems that for "trivial" infinite loops (while(1);) the FE should actually generate a compile-time error, per JLS 14.21: > https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.21By hit and trial with `javac`, it looks like cases like these are legal: public class P { public static int f() { while (true); } public static void main(String[] argv) { f(); } } But you're supposed to get compiler errors in cases like this: public class P { public static int f() { while (true); return 10; } public static void main(String[] argv) { f(); } } Since "return 10;" is now an unreachable statement. In any case, it was misleading of me to talk about Java (the language) semantics; what I really care about are java *bytecode* semantics, which is what we JIT compile via LLVM IR. And as far as I can tell, there is nothing in the bytecode spec that rules out trivial infinite loops. I will take a closer look and discuss this with my colleagues, in case I've managed to miss something.> The JLS does, however, explicitly claim that this is legal code: > { > int n = 5; > while (n > 7) k = 2; > } > > But I don't see what the semantics of this are. Does it actually mandate non-termination?The execution will never enter the while loop, as far as I can tell. -- Sanjoy
People have been aware of this issue for a long time (as you can see by Nick's 2010 patch). I guess it was not pushed further because of lack of practical interest. Maybe Nick remembers more about the issue and why the patch was dropped. Sanjoy: do you have a practical concern about this issue? I mean, of course this can be "fixed", but it will require some work, and even a termination checker for -functionattrs for the non-C/C++ frontends. Nuno -----Original Message----- From: Sanjoy Das Sent: Sunday, June 28, 2015 7:50 AM Subject: Re: [LLVMdev] readonly and infinite loops> You dropped some context...> A daemon program wouldn't be readonly. An infinite loop can be.Right. To prevent miscommunication, here is a quick analysis of a problematic (IMO) example: We start with ``` define void @infloop(i1 %c) { entry: br i1 %c, label %l, label %e l: br label %l e: ret void } define void @main_func() { entry: call void @infloop(i1 1) ret void } ``` In this program `@main_func`, when called, will loop infinitely. If we run this program through `opt -functionattrs -prune-eh -early-cse`, we get ``` ; Function Attrs: nounwind readnone define void @infloop(i1 %c) #0 { entry: br i1 %c, label %l, label %e l: ; preds = %l, %entry br label %l e: ; preds = %entry ret void } ; Function Attrs: nounwind define void @main_func() #1 { entry: ret void } attributes #0 = { nounwind readnone } attributes #1 = { nounwind } ``` LLVM has optimized `@main_func` to return immediately. `-functionattrs` and `-prune-eh` infer `nounwind readnone` for `@infloop` and `-early-cse` delets such calls that are dead. There are three ways to justify what happened: 1. The optimization was correct. Infinite loops [1] in LLVM IR are UB and since the original program had UB, it can be optimized to anything. This is problematic since if this were true, we would have trouble writing frontends for programming languages that actually have a well defined `while(1);`. 2. The bug is in `-early-cse`: a `readnone nounwind` function can loop infinitely, so it is not correct to remove a dead call to such a function. 3. The bug is in `-functionattrs`: a function cannot be marked as `readonly` if it may loop infinitely. Needless to say, both (2) and (3) are "easy to fix"; the question is really about the deeper semantic issue about how LLVM treats `while(1);`. There are interesting related questions on the semantics of the equivalent C/C++ programs, but I personally am interested only in "pure" LLVM IR semantics. -- Sanjoy [1]: We could make the distinction between infinite loops that are empty vs. infinite loops that are non-empty but that's tricky -- if only empty infinite loops were UB then e.g. LICM gets harder, since sinking the store out of loops like `while(1) *addr = 42;` now introduces UB.
----- Original Message -----> From: "Nuno Lopes" <nunoplopes at sapo.pt> > To: "Sanjoy Das" <sanjoy at playingwithpointers.com>, "Jeremy Lakeman" <Jeremy.Lakeman at gmail.com>, nlewycky at google.com > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Sunday, June 28, 2015 6:08:52 AM > Subject: Re: [LLVMdev] readonly and infinite loops > > People have been aware of this issue for a long time (as you can see > by > Nick's 2010 patch). I guess it was not pushed further because of > lack of > practical interest.I can't comment about the rationale in 2010, but this came up again when I was originally working on heap-to-stack conversion. Being able to mark functions as halting is essential for doing heap-to-stack conversion. This was put on hold, however, essentially awaiting the new pass manager. The issue is that you'd like to be able to use SCEV in a module/CGSCC-level pass to infer the attribute on functions with loops, and this cannot be done directly until we have the new pass manager. On the other hand, maybe we don't need to wait for that now. We now also have a scheme for tagging loops with metadata, and we could use that to add metadata to loops known to have a finite trip count, and then use that metadata in a pass that infers the attribute (FunctionAttrs presumably). -Hal> Maybe Nick remembers more about the issue and why the patch was > dropped. > > Sanjoy: do you have a practical concern about this issue? I mean, of > course > this can be "fixed", but it will require some work, and even a > termination > checker for -functionattrs for the non-C/C++ frontends. > > Nuno > > -----Original Message----- > From: Sanjoy Das > Sent: Sunday, June 28, 2015 7:50 AM > Subject: Re: [LLVMdev] readonly and infinite loops > > > You dropped some context... > > > A daemon program wouldn't be readonly. An infinite loop can be. > > Right. > > To prevent miscommunication, here is a quick analysis of a > problematic > (IMO) example: > > We start with > > ``` > define void @infloop(i1 %c) { > entry: > br i1 %c, label %l, label %e > > l: > br label %l > > e: > ret void > } > > define void @main_func() { > entry: > call void @infloop(i1 1) > ret void > } > ``` > > In this program `@main_func`, when called, will loop infinitely. > > If we run this program through `opt -functionattrs -prune-eh > -early-cse`, we get > > ``` > ; Function Attrs: nounwind readnone > define void @infloop(i1 %c) #0 { > entry: > br i1 %c, label %l, label %e > > l: ; preds = %l, > %entry > br label %l > > e: ; preds = %entry > ret void > } > > ; Function Attrs: nounwind > define void @main_func() #1 { > entry: > ret void > } > > attributes #0 = { nounwind readnone } > attributes #1 = { nounwind } > ``` > > LLVM has optimized `@main_func` to return immediately. > `-functionattrs` and `-prune-eh` infer `nounwind readnone` for > `@infloop` and `-early-cse` delets such calls that are dead. > > There are three ways to justify what happened: > > 1. The optimization was correct. Infinite loops [1] in LLVM IR are > UB and since the original program had UB, it can be optimized to > anything. This is problematic since if this were true, we would > have trouble writing frontends for programming languages that > actually have a well defined `while(1);`. > > 2. The bug is in `-early-cse`: a `readnone nounwind` function can > loop infinitely, so it is not correct to remove a dead call to > such a function. > > 3. The bug is in `-functionattrs`: a function cannot be marked as > `readonly` if it may loop infinitely. > > > Needless to say, both (2) and (3) are "easy to fix"; the question is > really about the deeper semantic issue about how LLVM treats > `while(1);`. > > There are interesting related questions on the semantics of the > equivalent C/C++ programs, but I personally am interested > only in "pure" LLVM IR semantics. > > -- Sanjoy > > [1]: We could make the distinction between infinite loops that are > empty vs. infinite loops that are non-empty but that's tricky -- if > only empty infinite loops were UB then e.g. LICM gets harder, since > sinking the store out of loops like `while(1) *addr = 42;` now > introduces UB. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
> Sanjoy: do you have a practical concern about this issue? I mean, of courseI don't have any immediate practical concerns and I've not actually seen miscompiles resulting from this. But fixing this is certainly more important than a "nice to have" -- we cannot miscompile "while (true) ;" loops. -- Sanjoy> this can be "fixed", but it will require some work, and even a termination > checker for -functionattrs for the non-C/C++ frontends. > > Nuno > > -----Original Message----- From: Sanjoy Das > Sent: Sunday, June 28, 2015 7:50 AM > Subject: Re: [LLVMdev] readonly and infinite loops > > >> You dropped some context... > > >> A daemon program wouldn't be readonly. An infinite loop can be. > > > Right. > > To prevent miscommunication, here is a quick analysis of a problematic > (IMO) example: > > We start with > > ``` > define void @infloop(i1 %c) { > entry: > br i1 %c, label %l, label %e > > l: > br label %l > > e: > ret void > } > > define void @main_func() { > entry: > call void @infloop(i1 1) > ret void > } > ``` > > In this program `@main_func`, when called, will loop infinitely. > > If we run this program through `opt -functionattrs -prune-eh > -early-cse`, we get > > ``` > ; Function Attrs: nounwind readnone > define void @infloop(i1 %c) #0 { > entry: > br i1 %c, label %l, label %e > > l: ; preds = %l, %entry > br label %l > > e: ; preds = %entry > ret void > } > > ; Function Attrs: nounwind > define void @main_func() #1 { > entry: > ret void > } > > attributes #0 = { nounwind readnone } > attributes #1 = { nounwind } > ``` > > LLVM has optimized `@main_func` to return immediately. > `-functionattrs` and `-prune-eh` infer `nounwind readnone` for > `@infloop` and `-early-cse` delets such calls that are dead. > > There are three ways to justify what happened: > > 1. The optimization was correct. Infinite loops [1] in LLVM IR are > UB and since the original program had UB, it can be optimized to > anything. This is problematic since if this were true, we would > have trouble writing frontends for programming languages that > actually have a well defined `while(1);`. > > 2. The bug is in `-early-cse`: a `readnone nounwind` function can > loop infinitely, so it is not correct to remove a dead call to > such a function. > > 3. The bug is in `-functionattrs`: a function cannot be marked as > `readonly` if it may loop infinitely. > > > Needless to say, both (2) and (3) are "easy to fix"; the question is > really about the deeper semantic issue about how LLVM treats > `while(1);`. > > There are interesting related questions on the semantics of the > equivalent C/C++ programs, but I personally am interested > only in "pure" LLVM IR semantics. > > -- Sanjoy > > [1]: We could make the distinction between infinite loops that are > empty vs. infinite loops that are non-empty but that's tricky -- if > only empty infinite loops were UB then e.g. LICM gets harder, since > sinking the store out of loops like `while(1) *addr = 42;` now > introduces UB.