thr3ads.net - llvm dev - [LLVMdev] readonly and infinite loops [Jun 2015]

Hal Finkel

2015-Jun-30 00:19 UTC

[LLVMdev] readonly and infinite loops

----- Original Message -----
> From: "Nuno Lopes" <nunoplopes at sapo.pt>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>, "Jeremy Lakeman"
> <Jeremy.Lakeman at gmail.com>, nlewycky at google.com
> Sent: Sunday, June 28, 2015 4:39:44 PM
> Subject: Re: [LLVMdev] readonly and infinite loops
> 
> >> People have been aware of this issue for a long time (as you can see by
> >> Nick's 2010 patch).  I guess it was not pushed further because of lack of
> >> practical interest.
> >
> > I can't comment about the rationale in 2010, but this came up again when I
> > was originally working on heap-to-stack conversion. Being able to mark
> > functions as halting is essential for doing heap-to-stack conversion. This
> > was put on hold, however, essentially awaiting the new pass manager. The
> > issue is that you'd like to be able to use SCEV in a module/CGSCC-level
> > pass to infer the attribute on functions with loops, and this cannot be
> > done directly until we have the new pass manager.
> 
> Interesting.  Could you give an example why knowing a function will
> halt is
> essential for the heap-to-stack conversion?

To do heap-to-stack conversion, the key thing you need to establish is that
you can see the calls to free (or realloc, etc.) along all control-flow
paths. If you can, then you can perform the conversion (because any additional
calls to free that you can't observe would lead to a double free, and thus
undefined behavior). Thus, if we have this situation:

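#include <stdlib.h>

void bar(int *a);  /* defined elsewhere; body not visible here */
void foo() {
  int *a = (int*) malloc(sizeof(int)*40);
  bar(a);   /* may or may not return */
  free(a);  /* reached only if bar() returns normally */
}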

we can perform heap-to-stack conversion iff we know that bar(int *) always
returns normally. If it never returns (perhaps by looping indefinitely) then it
might capture the pointer and pass it off to some other thread, and that other
thread might call free() (or bar() might just call free() itself before looping
indefinitely). In short, we need to know whether the call to free() after the
call to bar() is dead. If we know that it is reached, then we can perform
heap-to-stack conversion. It is also worth noting that, because we
unconditionally free(a) after the call to bar(a), it would not be legal for
bar(a) to call realloc on a: if realloc did reallocate the buffer, we'd end up
freeing it twice when bar(a) eventually returned.
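
As a rough illustration of the foo()/bar() situation above, here is a minimal sketch of what the conversion would conceptually do once bar() is known to return. The body of bar(), the names foo_heap and foo_stack, and the int return value are assumptions added only so the two forms can be compared; none of this is LLVM code or output.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for bar(): it only writes through the pointer
   and always returns normally, which is the property the optimizer
   would have to prove. */
static void bar(int *a) {
  for (int i = 0; i < 40; ++i)
    a[i] = i;
}

/* Original form: heap allocation, freed on the only path out. */
int foo_heap(void) {
  int *a = (int *) malloc(sizeof(int) * 40);
  assert(a != NULL);  /* sketch only: no real error handling */
  bar(a);
  int last = a[39];   /* read a value so the result is observable */
  free(a);
  return last;
}

/* Conceptual result of heap-to-stack conversion: the malloc/free pair
   becomes a local array with the same lifetime. */
int foo_stack(void) {
  int a[40];
  bar(a);
  return a[39];
}
```

Both functions compute the same value here only because this bar() neither frees nor keeps the pointer, and always returns; with an opaque bar() the second form is not known to be equivalent.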
>  It's definitely an
> optimization
> we should be doing (although, correct me if I'm wrong, LLVM already
> has some
> form of this for very small mallocs, no?)
Not that I recall, although we will remove unused mallocs (those that are
immediately freed).
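
For concreteness, the kind of unused allocation meant here might look like the sketch below. The function name unused_alloc is made up for illustration, and whether a particular compiler version actually deletes the pair is not something verified here.

```c
#include <stdlib.h>

/* The allocation is never read or written and is freed right away, so
   removing the malloc/free pair does not change what the function
   returns. */
int unused_alloc(int x) {
  int *p = (int *) malloc(sizeof(int) * 40);
  free(p);  /* free(NULL) is a no-op, so this is safe even if malloc failed */
  return x + 1;
}
```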

Thanks again,
Hal
> 
> Thanks,
> Nuno
> 
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Nuno Lopes

2015-Jun-30 21:30 UTC

[LLVMdev] readonly and infinite loops

>> Interesting.  Could you give an example why knowing a function will
>> halt is
>> essential for the heap-to-stack conversion?
>
> The key situation you need to establish in order to do heap-to-stack 
> conversion, is that you can see the calls to free (or realloc, etc.) along 
> all control-flow paths. If you can, then you can perform the conversion 
> (because any additional calls to free that you can't observe would lead to
> a double free, and thus undefined behavior). Thus, if we have this 
> situation:
>
> void bar(int *a);
> void foo() {
>   int *a = (int*) malloc(sizeof(int)*40);
>   bar(a);
>   free(a);
> }
>
> we can perform heap-to-stack conversion iff we know that bar(int *) always 
> returns normally. If it never returns (perhaps by looping indefinitely) 
> then it might capture the pointer, pass it off to some other thread, and 
> that other thread might call free() (or it might just call free() itself 
> before looping indefinitely). In short, we need to know whether the call 
> to free() after the call to bar() is dead. If we know that it is reached, 
> then we can perform heap-to-stack conversion. Also worth noting is that 
> because we unconditionally free(a) after the call to bar(a), it would not 
> be legal for bar(a) to call realloc on a (because if realloc did 
> reallocate the buffer we'd end up freeing it twice when bar(a) did 
> eventually return).
I see, thanks!
Your argument is that knowing that bar returns implies that 'a' cannot be
captured or reallocated, otherwise it would be UB.  Makes sense, yes.

Thanks,
Nuno

Hal Finkel

2015-Jun-30 21:35 UTC

[LLVMdev] readonly and infinite loops

----- Original Message -----
> From: "Nuno Lopes" <nunoplopes at sapo.pt>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>, "Jeremy Lakeman"
> <Jeremy.Lakeman at gmail.com>, nlewycky at google.com
> Sent: Tuesday, June 30, 2015 4:30:15 PM
> Subject: Re: [LLVMdev] readonly and infinite loops
> 
> >> Interesting.  Could you give an example why knowing a function
> >> will
> >> halt is
> >> essential for the heap-to-stack conversion?
> >
> > The key situation you need to establish in order to do
> > heap-to-stack
> > conversion, is that you can see the calls to free (or realloc,
> > etc.) along
> > all control-flow paths. If you can, then you can perform the
> > conversion
> > (because any additional calls to free that you can't observe would
> > lead to
> > a double free, and thus undefined behavior). Thus, if we have this
> > situation:
> >
> > void bar(int *a);
> > void foo() {
> >   int *a = (int*) malloc(sizeof(int)*40);
> >   bar(a);
> >   free(a);
> > }
> >
> > we can perform heap-to-stack conversion iff we know that bar(int *)
> > always
> > returns normally. If it never returns (perhaps by looping
> > indefinitely)
> > then it might capture the pointer, pass it off to some other
> > thread, and
> > that other thread might call free() (or it might just call free()
> > itself
> > before looping indefinitely). In short, we need to know whether the
> > call
> > to free() after the call to bar() is dead. If we know that it is
> > reached,
> > then we can perform heap-to-stack conversion. Also worth noting is
> > that
> > because we unconditionally free(a) after the call to bar(a), it
> > would not
> > be legal for bar(a) to call realloc on a (because if realloc did
> > reallocate the buffer we'd end up freeing it twice when bar(a) did
> > eventually return).
> 
> I see, thanks!
> Your argument is that knowing that bar returns implies that 'a'
> cannot be
> captured or reallocated, otherwise it would be UB. 

Yes. Technically, it can be captured; the capture just does not matter,
because the captured value is 'dead' once you pass the call to free(a).

 -Hal
> Makes sense, yes.
> 
> Thanks,
> Nuno
> 
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Nick Lewycky

2015-Jul-01 22:15 UTC

[LLVMdev] readonly and infinite loops

On 30 June 2015 at 14:30, Nuno Lopes <nunoplopes at sapo.pt> wrote:
>>> Interesting.  Could you give an example why knowing a function will
>>> halt is essential for the heap-to-stack conversion?
>>
>> The key situation you need to establish in order to do heap-to-stack
>> conversion, is that you can see the calls to free (or realloc, etc.) along
>> all control-flow paths. If you can, then you can perform the conversion
>> (because any additional calls to free that you can't observe would lead to
>> a double free, and thus undefined behavior). Thus, if we have this
>> situation:
>>
>> void bar(int *a);
>> void foo() {
>>   int *a = (int*) malloc(sizeof(int)*40);
>>   bar(a);
>>   free(a);
>> }
>>
>> we can perform heap-to-stack conversion iff we know that bar(int *)
>> always returns normally. If it never returns (perhaps by looping
>> indefinitely) then it might capture the pointer, pass it off to some other
>> thread, and that other thread might call free() (or it might just call
>> free() itself before looping indefinitely). In short, we need to know
>> whether the call to free() after the call to bar() is dead. If we know that
>> it is reached, then we can perform heap-to-stack conversion. Also worth
>> noting is that because we unconditionally free(a) after the call to bar(a),
>> it would not be legal for bar(a) to call realloc on a (because if realloc
>> did reallocate the buffer we'd end up freeing it twice when bar(a) did
>> eventually return).
>
> I see, thanks!
> Your argument is that knowing that bar returns implies that 'a' cannot be
> captured or reallocated, otherwise it would be UB.  Makes sense, yes.

I'm afraid it's worse than that.

void callee(int *ptr);

void caller() {
  int *ptr = malloc(sizeof(int));
  callee(ptr);
  free(ptr);  /* double free if callee took the branch below */
}

void callee(int *ptr) {
  if (...) {
    free(ptr);
    log("get critical log message out to the humans, maybe my final message "
        "will be the last hint they need to finally resolve the bugs inside "
        "myself");
  }
}

C and C++ do permit undefined behaviour to have interactions backwards in
time. In essence, knowing that UB must happen later can cause impossible
things to happen now. However, LLVM offers an implementation-defined
guarantee that no UB occurs until the earliest instruction where it occurs.

Your reasoning that we can't have a free in the callee because we'd hit a
double-free (and hence UB) in the caller is not sufficient to perform
heap-to-stack transform with that guarantee, because it will move the UB
sooner to the free() in the callee. That violates our implementation
guarantee that the log message will be emitted (because we'll have already
entered UB-land).

The alternatives are to define double-free as not entering full UB (similar
to poison, but we also have problems to solve with load, store, call,
branch on value computed through signed overflow, etc.), or to remove the
guarantee that the log message will be emitted.
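
To make the ordering argument concrete, here is a toy model of the caller/callee example above. It is only a sketch for counting events: the enum names, the traces, and the function names are all invented for illustration, and nothing here is LLVM code or reflects how LLVM reasons internally. Each trace lists what happens when callee takes the branch, and first_ub reports the position of the first event that is undefined behaviour.

```c
/* Toy model, not LLVM code: an allocation is in one of three states. */
enum storage { HEAP_LIVE, HEAP_FREED, STACK };
enum event { EV_FREE, EV_LOG };

/* Index of the first event that is undefined behaviour, or -1 if none. */
static int first_ub(enum storage s, const enum event *trace, int n) {
  for (int i = 0; i < n; ++i) {
    if (trace[i] != EV_FREE)
      continue;              /* logging itself is always fine in this model */
    if (s == HEAP_LIVE)
      s = HEAP_FREED;        /* first free of a live heap block is valid */
    else
      return i;              /* double free, or free() of stack memory */
  }
  return -1;
}

/* Events when callee takes the branch. Before the transform the caller's
   free() is still present; after it, the caller's malloc/free are gone. */
static const enum event before[] = { EV_FREE, EV_LOG, EV_FREE };
static const enum event after[]  = { EV_FREE, EV_LOG };

int first_ub_before(void) { return first_ub(HEAP_LIVE, before, 3); }
int first_ub_after(void)  { return first_ub(STACK, after, 2); }
```

In this model the log event sits at index 1. Before the transform the first UB is at index 2, after the log; after the transform it is at index 0, before the log, which is the reordering described above.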

Nick