thr3ads.net - llvm dev - [LLVMdev] Non "folding" Stack Allocation [Aug 2011]

Matthieu Monrocq

2011-Aug-17 12:02 UTC

[LLVMdev] Non "folding" Stack Allocation

Following a question on StackOverflow [1], I was wondering if for big
allocations, LLVM would "delay" the allocation or rather perform it
upfront.

The following code was thus submitted to the LLVM Try Out page:

void doSomething(char*,char*);

void function(bool b)
{
    char b1[1 * 1024];
    if( b ) {
       char b2[1 * 1024];
       doSomething(b1, b2);
    } else {
       char b3[512 * 1024];
       doSomething(b1, b3);
    }
}

Certainly nothing spectacular.

I was however quite surprised by the output:

; ModuleID = '/tmp/webcompile/_28066_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"

define void @_Z8functionb(i1 zeroext %b) {
entry:
  %b1 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
  %b2 = alloca [1024 x i8], align 1               ; <[1024 x i8]*> [#uses=1]
  %b3 = alloca [524288 x i8], align 1             ; <[524288 x i8]*> [#uses=1]
  %arraydecay = getelementptr inbounds [1024 x i8]* %b1, i64 0, i64 0 ; <i8*> [#uses=2]
  br i1 %b, label %if.then, label %if.else

if.then:                                          ; preds = %entry
  %arraydecay2 = getelementptr inbounds [1024 x i8]* %b2, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay2)
  ret void

if.else:                                          ; preds = %entry
  %arraydecay6 = getelementptr inbounds [524288 x i8]* %b3, i64 0, i64 0 ; <i8*> [#uses=1]
  call void @_Z11doSomethingPcS_(i8* %arraydecay, i8* %arraydecay6)
  ret void
}

declare void @_Z11doSomethingPcS_(i8*, i8*)

(Compiled with "Standard" optimizations as C++ code)

My surprise stems from the fact that Clang/LLVM seems to reserve (at least
in its bytecode) space for all temporary variables, without taking into
account that some are mutually exclusive. I would have expected the space to
be *folded*. However, since this is LLVM IR, and not the final assembly, and
since LLVM IR is strongly typed, it makes sense to keep them separate.

Therefore I was wondering whether in the x86 representation (say) these
*would* be folded, and if so, what is the name of the Optimization/CodeGen
pass responsible?

-- Matthieu

[1] http://stackoverflow.com/questions/7089035/at-what-moment-is-memory-typically-allocated-for-local-variables-in-c

Chris Lattner

2011-Aug-17 18:22 UTC

On Aug 17, 2011, at 5:02 AM, Matthieu Monrocq wrote:
> My surprise stems from the fact that Clang/LLVM seems to reserve (at least
> in its bytecode) space for all temporary variables, not taking into account
> that some are mutually exclusive. I would have expected the space to be
> folded. However, since this is LLVM IR, and not the final assembly, and
> since LLVM IR is strongly typed, it makes sense to keep them separated.
>
> Therefore I was wondering if in the x86 representation (say) these would be
> folded, and if so what is the name of the Optimization/CodeGen pass
> responsible?

I commented on Stack Overflow.  The rough plan of record is captured here:
http://nondot.org/sabre/LLVMNotes/MemoryUseMarkers.txt

The basic idea is that we capture the lifetime of the memory object in IR, then
have the code generator allocate multiple allocas with non-overlapping
lifetimes to the same stack offset.

-Chris


Matthieu Monrocq

2011-Aug-18 16:46 UTC

2011/8/17 Chris Lattner <clattner at apple.com>
>
> On Aug 17, 2011, at 5:02 AM, Matthieu Monrocq wrote:
>
> My surprise stems from the fact that Clang/LLVM seems to reserve (at least
> in its bytecode) space for all temporary variables, not taking into account
> that some are mutually exclusive. I would have expected the space to be
> *folded*. However, since this is LLVM IR, and not the final assembly, and
> since LLVM IR is strongly typed, it makes sense to keep them separated.
>
> Therefore I was wondering if in the x86 representation (say) these *would*
> be folded, and if so what is the name of the Optimization/CodeGen pass
> responsible?
>
>
> I commented on stack overflow.  The rough plan of record is captured here:
> http://nondot.org/sabre/LLVMNotes/MemoryUseMarkers.txt
>
> The basic idea is that we capture the lifetime of the memory object in IR,
> then have the code generator allocate multiple alloca's with non-overlapping
> lifetimes to the same stack offset.
>
> -Chris

Thank you very much for the notes and answer!

--Matthieu
