thr3ads.net - llvm dev - [LLVMdev] readonly and infinite loops [Jun 2015]

If this information is useful, please help other people find it:
Share via:

Sanjoy Das

2015-Jun-28 06:50 UTC

[LLVMdev] readonly and infinite loops

> You dropped some context...
> A daemon program wouldn't be readonly. An infinite loop can be.
Right.

To prevent miscommunication, here is a quick analysis of a problematic
(IMO) example:

We start with

```
define void @infloop(i1 %c) {
 entry:
  br i1 %c,  label %l, label %e

 l:
  br label %l

 e:
  ret void
}

define void @main_func() {
 entry:
  call void @infloop(i1 1)
  ret void
}
```

In this program `@main_func`, when called, will loop infinitely.

If we run this program through `opt -functionattrs -prune-eh
-early-cse`, we get

```
; Function Attrs: nounwind readnone
define void @infloop(i1 %c) #0 {
entry:
  br i1 %c, label %l, label %e

l:                                                ; preds = %l, %entry
  br label %l

e:                                                ; preds = %entry
  ret void
}

; Function Attrs: nounwind
define void @main_func() #1 {
entry:
  ret void
}

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind }
```

LLVM has optimized `@main_func` to return immediately.
`-functionattrs` and `-prune-eh` infer `nounwind readnone` for
`@infloop` and `-early-cse` delets such calls that are dead.

There are three ways to justify what happened:

 1. The optimization was correct.  Infinite loops [1] in LLVM IR are
    UB and since the original program had UB, it can be optimized to
    anything.  This is problematic since if this were true, we would
    have trouble writing frontends for programming languages that
    actually have a well defined `while(1);`.

 2. The bug is in `-early-cse`: a `readnone nounwind` function can
    loop infinitely, so it is not correct to remove a dead call to
    such a function.

 3. The bug is in `-functionattrs`: a function cannot be marked as
    `readonly` if it may loop infinitely.


Needless to say, both (2) and (3) are "easy to fix"; the question is
really about the deeper semantic issue about how LLVM treats
`while(1);`.

There are interesting related questions on the semantics of the
equivalent C/C++ programs, but I personally am interested
only in "pure" LLVM IR semantics.

-- Sanjoy

[1]: We could make the distinction between infinite loops that are
empty vs. infinite loops that are non-empty but that's tricky -- if
only empty infinite loops were UB then e.g. LICM gets harder, since
sinking the store out of loops like `while(1) *addr = 42;` now
introduces UB.

Kuperstein, Michael M

2015-Jun-28 08:07 UTC

head link

[LLVMdev] readonly and infinite loops

What's really I missing, I believe, is the equivalent of the GCC
"pure" attribute - which explicitly forbids infinite loops. I think
readnone + halting (like in the proposed 2010 patch) is good enough, though.
Then every function emitted by a C/C++ FE could be marked as halting, and we
could then fix bug [2] by requiring "readnone halting".
One potential issue with this is that "readnone halting" isn't
strong enough to allow speculative execution of functions (because they may have
other UB), but I'm not entirely sure GCC "pure" allows that
either.

Regarding Java - what are the expected semantics of infinite loops without
side-effects, per spec?
It seems that for "trivial" infinite loops (while(1);) the FE should
actually generate a compile-time error, per JLS 14.21:
https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.21

The JLS does, however, explicitly claim that this is legal code:
{
    int n = 5;
    while (n > 7) k = 2;
}

But I don't see what the semantics of this are. Does it actually mandate
non-termination?

Michael

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Sanjoy Das
Sent: Sunday, June 28, 2015 09:50
To: Jeremy Lakeman
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] readonly and infinite loops
> You dropped some context...
> A daemon program wouldn't be readonly. An infinite loop can be.
Right.

To prevent miscommunication, here is a quick analysis of a problematic
(IMO) example:

We start with

```
define void @infloop(i1 %c) {
 entry:
  br i1 %c,  label %l, label %e

 l:
  br label %l

 e:
  ret void
}

define void @main_func() {
 entry:
  call void @infloop(i1 1)
  ret void
}
```

In this program `@main_func`, when called, will loop infinitely.

If we run this program through `opt -functionattrs -prune-eh -early-cse`, we get

```
; Function Attrs: nounwind readnone
define void @infloop(i1 %c) #0 {
entry:
  br i1 %c, label %l, label %e

l:                                                ; preds = %l, %entry
  br label %l

e:                                                ; preds = %entry
  ret void
}

; Function Attrs: nounwind
define void @main_func() #1 {
entry:
  ret void
}

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind }
```

LLVM has optimized `@main_func` to return immediately.
`-functionattrs` and `-prune-eh` infer `nounwind readnone` for `@infloop` and
`-early-cse` delets such calls that are dead.

There are three ways to justify what happened:

 1. The optimization was correct.  Infinite loops [1] in LLVM IR are
    UB and since the original program had UB, it can be optimized to
    anything.  This is problematic since if this were true, we would
    have trouble writing frontends for programming languages that
    actually have a well defined `while(1);`.

 2. The bug is in `-early-cse`: a `readnone nounwind` function can
    loop infinitely, so it is not correct to remove a dead call to
    such a function.

 3. The bug is in `-functionattrs`: a function cannot be marked as
    `readonly` if it may loop infinitely.


Needless to say, both (2) and (3) are "easy to fix"; the question is
really about the deeper semantic issue about how LLVM treats `while(1);`.

There are interesting related questions on the semantics of the equivalent C/C++
programs, but I personally am interested only in "pure" LLVM IR
semantics.

-- Sanjoy

[1]: We could make the distinction between infinite loops that are empty vs.
infinite loops that are non-empty but that's tricky -- if only empty
infinite loops were UB then e.g. LICM gets harder, since sinking the store out
of loops like `while(1) *addr = 42;` now introduces UB.
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Sanjoy Das

2015-Jun-28 08:40 UTC

head link

[LLVMdev] readonly and infinite loops

> What's really I missing, I believe, is the equivalent of the GCC
"pure" attribute - which explicitly forbids infinite loops. I think
readnone + halting (like in the proposed 2010 patch) is good enough, though.
Then every function emitted by a C/C++ FE could be marked as halting, and we
could then fix bug [2] by requiring "readnone halting".
This looks like the right fix to me too.
> One potential issue with this is that "readnone halting"
isn't strong enough to allow speculative execution of functions (because
they may have other UB)
Agreed.
> Regarding Java - what are the expected semantics of infinite loops without
side-effects, per spec?
> It seems that for "trivial" infinite loops (while(1);) the FE
should actually generate a compile-time error, per JLS 14.21:
> https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.21
By hit and trial with `javac`, it looks like cases like these are legal:

public class P {
     public static int f() {
          while (true);
     }

     public static void main(String[] argv) {
          f();
     }
}

But you're supposed to get compiler errors in cases like this:

public class P {
     public static int f() {
          while (true);
          return 10;
     }

     public static void main(String[] argv) {
          f();
     }
}

Since "return 10;" is now an unreachable statement.


In any case, it was misleading of me to talk about Java (the language)
semantics; what I really care about are java *bytecode* semantics,
which is what we JIT compile via LLVM IR.  And as far as I can tell,
there is nothing in the bytecode spec that rules out trivial infinite
loops.  I will take a closer look and discuss this with my colleagues,
in case I've managed to miss something.
> The JLS does, however, explicitly claim that this is legal code:
> {
>     int n = 5;
>     while (n > 7) k = 2;
> }
>
> But I don't see what the semantics of this are. Does it actually
mandate non-termination?
The execution will never enter the while loop, as far as I can tell.

-- Sanjoy

Nuno Lopes

2015-Jun-28 11:08 UTC

head link

[LLVMdev] readonly and infinite loops

People have been aware of this issue for a long time (as you can see by 
Nick's 2010 patch).  I guess it was not pushed further because of lack of 
practical interest.
Maybe Nick remembers more about the issue and why the patch was dropped.

Sanjoy: do you have a practical concern about this issue?  I mean, of course 
this can be "fixed", but it will require some work, and even a
termination
checker for -functionattrs for the non-C/C++ frontends.

Nuno

-----Original Message----- 
From: Sanjoy Das
Sent: Sunday, June 28, 2015 7:50 AM
Subject: Re: [LLVMdev] readonly and infinite loops
> You dropped some context...
> A daemon program wouldn't be readonly. An infinite loop can be.
Right.

To prevent miscommunication, here is a quick analysis of a problematic
(IMO) example:

We start with

```
define void @infloop(i1 %c) {
entry:
  br i1 %c,  label %l, label %e

l:
  br label %l

e:
  ret void
}

define void @main_func() {
entry:
  call void @infloop(i1 1)
  ret void
}
```

In this program `@main_func`, when called, will loop infinitely.

If we run this program through `opt -functionattrs -prune-eh
-early-cse`, we get

```
; Function Attrs: nounwind readnone
define void @infloop(i1 %c) #0 {
entry:
  br i1 %c, label %l, label %e

l:                                                ; preds = %l, %entry
  br label %l

e:                                                ; preds = %entry
  ret void
}

; Function Attrs: nounwind
define void @main_func() #1 {
entry:
  ret void
}

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind }
```

LLVM has optimized `@main_func` to return immediately.
`-functionattrs` and `-prune-eh` infer `nounwind readnone` for
`@infloop` and `-early-cse` delets such calls that are dead.

There are three ways to justify what happened:

1. The optimization was correct.  Infinite loops [1] in LLVM IR are
    UB and since the original program had UB, it can be optimized to
    anything.  This is problematic since if this were true, we would
    have trouble writing frontends for programming languages that
    actually have a well defined `while(1);`.

2. The bug is in `-early-cse`: a `readnone nounwind` function can
    loop infinitely, so it is not correct to remove a dead call to
    such a function.

3. The bug is in `-functionattrs`: a function cannot be marked as
    `readonly` if it may loop infinitely.


Needless to say, both (2) and (3) are "easy to fix"; the question is
really about the deeper semantic issue about how LLVM treats
`while(1);`.

There are interesting related questions on the semantics of the
equivalent C/C++ programs, but I personally am interested
only in "pure" LLVM IR semantics.

-- Sanjoy

[1]: We could make the distinction between infinite loops that are
empty vs. infinite loops that are non-empty but that's tricky -- if
only empty infinite loops were UB then e.g. LICM gets harder, since
sinking the store out of loops like `while(1) *addr = 42;` now
introduces UB.

Hal Finkel

2015-Jun-28 13:27 UTC

head link

[LLVMdev] readonly and infinite loops

----- Original Message -----> From: "Nuno Lopes" <nunoplopes at sapo.pt>
> To: "Sanjoy Das" <sanjoy at playingwithpointers.com>,
"Jeremy Lakeman" <Jeremy.Lakeman at gmail.com>, nlewycky at
google.com
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Sunday, June 28, 2015 6:08:52 AM
> Subject: Re: [LLVMdev] readonly and infinite loops
> 
> People have been aware of this issue for a long time (as you can see
> by
> Nick's 2010 patch).  I guess it was not pushed further because of
> lack of
> practical interest.
I can't comment about the rationale in 2010, but this came up again when I
was originally working on heap-to-stack conversion. Being able to mark functions
as halting is essential for doing heap-to-stack conversion. This was put on
hold, however, essentially awaiting the new pass manager. The issue is that
you'd like to be able to use SCEV in a module/CGSCC-level pass to infer the
attribute on functions with loops, and this cannot be done directly until we
have the new pass manager.

On the other hand, maybe we don't need to wait for that now. We now also
have a scheme for tagging loops with metadata, and we could use that to add
metadata to loops known to have a finite trip count, and then use that metadata
in a pass that infers the attribute (FunctionAttrs presumably).

 -Hal
> Maybe Nick remembers more about the issue and why the patch was
> dropped.
> 
> Sanjoy: do you have a practical concern about this issue?  I mean, of
> course
> this can be "fixed", but it will require some work, and even a
> termination
> checker for -functionattrs for the non-C/C++ frontends.
> 
> Nuno
> 
> -----Original Message-----
> From: Sanjoy Das
> Sent: Sunday, June 28, 2015 7:50 AM
> Subject: Re: [LLVMdev] readonly and infinite loops
> 
> > You dropped some context...
> 
> > A daemon program wouldn't be readonly. An infinite loop can be.
> 
> Right.
> 
> To prevent miscommunication, here is a quick analysis of a
> problematic
> (IMO) example:
> 
> We start with
> 
> ```
> define void @infloop(i1 %c) {
> entry:
>   br i1 %c,  label %l, label %e
> 
> l:
>   br label %l
> 
> e:
>   ret void
> }
> 
> define void @main_func() {
> entry:
>   call void @infloop(i1 1)
>   ret void
> }
> ```
> 
> In this program `@main_func`, when called, will loop infinitely.
> 
> If we run this program through `opt -functionattrs -prune-eh
> -early-cse`, we get
> 
> ```
> ; Function Attrs: nounwind readnone
> define void @infloop(i1 %c) #0 {
> entry:
>   br i1 %c, label %l, label %e
> 
> l:                                                ; preds = %l,
> %entry
>   br label %l
> 
> e:                                                ; preds = %entry
>   ret void
> }
> 
> ; Function Attrs: nounwind
> define void @main_func() #1 {
> entry:
>   ret void
> }
> 
> attributes #0 = { nounwind readnone }
> attributes #1 = { nounwind }
> ```
> 
> LLVM has optimized `@main_func` to return immediately.
> `-functionattrs` and `-prune-eh` infer `nounwind readnone` for
> `@infloop` and `-early-cse` delets such calls that are dead.
> 
> There are three ways to justify what happened:
> 
> 1. The optimization was correct.  Infinite loops [1] in LLVM IR are
>     UB and since the original program had UB, it can be optimized to
>     anything.  This is problematic since if this were true, we would
>     have trouble writing frontends for programming languages that
>     actually have a well defined `while(1);`.
> 
> 2. The bug is in `-early-cse`: a `readnone nounwind` function can
>     loop infinitely, so it is not correct to remove a dead call to
>     such a function.
> 
> 3. The bug is in `-functionattrs`: a function cannot be marked as
>     `readonly` if it may loop infinitely.
> 
> 
> Needless to say, both (2) and (3) are "easy to fix"; the question
is
> really about the deeper semantic issue about how LLVM treats
> `while(1);`.
> 
> There are interesting related questions on the semantics of the
> equivalent C/C++ programs, but I personally am interested
> only in "pure" LLVM IR semantics.
> 
> -- Sanjoy
> 
> [1]: We could make the distinction between infinite loops that are
> empty vs. infinite loops that are non-empty but that's tricky -- if
> only empty infinite loops were UB then e.g. LICM gets harder, since
> sinking the store out of loops like `while(1) *addr = 42;` now
> introduces UB.
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Sanjoy Das

2015-Jun-28 19:20 UTC

head link

[LLVMdev] readonly and infinite loops

> Sanjoy: do you have a practical concern about this issue?  I mean, of
course
I don't have any immediate practical concerns and I've not actually
seen miscompiles resulting from this.  But fixing this is certainly
more important than a "nice to have" -- we cannot miscompile
"while
(true) ;" loops.

-- Sanjoy
> this can be "fixed", but it will require some work, and even a
termination
> checker for -functionattrs for the non-C/C++ frontends.
>
> Nuno
>
> -----Original Message----- From: Sanjoy Das
> Sent: Sunday, June 28, 2015 7:50 AM
> Subject: Re: [LLVMdev] readonly and infinite loops
>
>
>> You dropped some context...
>
>
>> A daemon program wouldn't be readonly. An infinite loop can be.
>
>
> Right.
>
> To prevent miscommunication, here is a quick analysis of a problematic
> (IMO) example:
>
> We start with
>
> ```
> define void @infloop(i1 %c) {
> entry:
>  br i1 %c,  label %l, label %e
>
> l:
>  br label %l
>
> e:
>  ret void
> }
>
> define void @main_func() {
> entry:
>  call void @infloop(i1 1)
>  ret void
> }
> ```
>
> In this program `@main_func`, when called, will loop infinitely.
>
> If we run this program through `opt -functionattrs -prune-eh
> -early-cse`, we get
>
> ```
> ; Function Attrs: nounwind readnone
> define void @infloop(i1 %c) #0 {
> entry:
>  br i1 %c, label %l, label %e
>
> l:                                                ; preds = %l, %entry
>  br label %l
>
> e:                                                ; preds = %entry
>  ret void
> }
>
> ; Function Attrs: nounwind
> define void @main_func() #1 {
> entry:
>  ret void
> }
>
> attributes #0 = { nounwind readnone }
> attributes #1 = { nounwind }
> ```
>
> LLVM has optimized `@main_func` to return immediately.
> `-functionattrs` and `-prune-eh` infer `nounwind readnone` for
> `@infloop` and `-early-cse` delets such calls that are dead.
>
> There are three ways to justify what happened:
>
> 1. The optimization was correct.  Infinite loops [1] in LLVM IR are
>    UB and since the original program had UB, it can be optimized to
>    anything.  This is problematic since if this were true, we would
>    have trouble writing frontends for programming languages that
>    actually have a well defined `while(1);`.
>
> 2. The bug is in `-early-cse`: a `readnone nounwind` function can
>    loop infinitely, so it is not correct to remove a dead call to
>    such a function.
>
> 3. The bug is in `-functionattrs`: a function cannot be marked as
>    `readonly` if it may loop infinitely.
>
>
> Needless to say, both (2) and (3) are "easy to fix"; the question
is
> really about the deeper semantic issue about how LLVM treats
> `while(1);`.
>
> There are interesting related questions on the semantics of the
> equivalent C/C++ programs, but I personally am interested
> only in "pure" LLVM IR semantics.
>
> -- Sanjoy
>
> [1]: We could make the distinction between infinite loops that are
> empty vs. infinite loops that are non-empty but that's tricky -- if
> only empty infinite loops were UB then e.g. LICM gets harder, since
> sinking the store out of loops like `while(1) *addr = 42;` now
> introduces UB.

llvm dev - Jun 2015 - [LLVMdev] readonly and infinite loops

[LLVMdev] readonly and infinite loops

[LLVMdev] readonly and infinite loops

[LLVMdev] readonly and infinite loops

[LLVMdev] readonly and infinite loops

[LLVMdev] readonly and infinite loops

[LLVMdev] readonly and infinite loops