thr3ads.net - llvm dev - [llvm-dev] alloca combining, not (yet) possible ? [Aug 2015]

Nat! via llvm-dev

2015-Aug-31 13:21 UTC

[llvm-dev] alloca combining, not (yet) possible ?

Caldarale, Charles R wrote:
> You have not provided us with the declaration for f().  Unless its argument
> is marked with the nocapture attribute, the compilation of g() cannot assume
> that f() has not retained a pointer to the x struct and is using it in the
> second call.
>
Thanks a lot for the input. Yes, I forgot to do that. The C function
declaration would have been

	void	f( struct a_b *p);

which compiled into

	declare void @f(%struct.a_b*) #2

with

	attributes #2 = { "disable-tail-calls"="false" 
"less-precise-fpmad"="false"
"no-frame-pointer-elim"="true"
"no-frame-pointer-elim-non-leaf"
"no-infs-fp-math"="false"
"no-nans-fp-math"="false"
"stack-protector-buffer-size"="8"
"target-cpu"="core2"
"target-features"="+cx16,+sse,+sse2,+sse3,+ssse3"
"unsafe-fp-math"="false"
"use-soft-float"="false" }

---

I could not figure out how to decorate my C code to emit the nocapture
attribute; __attribute__((nocapture)) is unknown. So I tried modifying the
IR code by hand to read:

	declare void @f(%struct.a_b* nocapture) #1

But in the end it didn't make a difference. When I compiled it with

../llvm-build.d/bin/llc -O3 -o test-combine-alloca.s test-combine-alloca.ir

it still used two allocas.

From a C perspective, I find it weird that it should concern the
caller if the called function "mistakenly" holds onto an alloca buffer
that will be invalid soon anyway. But I guess that's C++ magic somehow :)
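
A minimal sketch of the situation Charles describes, in plain C. The body of
f() here is an assumption made up for illustration (the thread only shows its
declaration), and saved, first_still_intact and run_g are hypothetical names.
Because x and y both live until g() returns, f() may legally keep the first
pointer and read through it during the second call, so giving x and y the
same stack slot would change what f() observes:

```c
#include <assert.h>
#include <stddef.h>

struct a_b { int a; int b; };

/* Hypothetical capturing f(): remembers the pointer from the previous call. */
static struct a_b *saved = NULL;
static int first_still_intact = 0;

void f(struct a_b *p)
{
    if (saved != NULL) {
        /* Second call: the first struct is still alive in g(), so this
           read is valid C. It must be a different object, still holding 2. */
        first_still_intact = (saved != p) && (saved->b == 2);
    }
    saved = p;
}

void g(void)
{
    struct a_b x;   /* both objects live until g() returns */
    struct a_b y;

    x.a = 1;
    x.b = 2;
    f(&x);
    y.a = 1;
    y.b = 3;
    f(&y);
}

/* Helper: run g() once and report what the second call to f() observed. */
int run_g(void)
{
    saved = NULL;
    first_still_intact = 0;
    g();
    saved = NULL;   /* do not keep a pointer to a dead object around */
    return first_still_intact;
}
```

run_g() returns 1 as long as x and y occupy distinct storage. Had the two
structs shared one slot, the second call would see saved == p and a b field
of 3, which is why the caller cannot ignore a capturing callee while the two
lifetimes overlap.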

Ciao
    Nat!
----
; ModuleID = 'test-combine-alloca.c'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"

%struct.a_b = type { i32, i32 }

declare void @f(%struct.a_b* nocapture) #1

; Function Attrs: nounwind ssp uwtable
define void @g() #0 {
entry:
   %x = alloca %struct.a_b, align 4
   %y = alloca %struct.a_b, align 4
   %a = getelementptr inbounds %struct.a_b, %struct.a_b* %x, i32 0, i32 0
   store i32 1, i32* %a, align 4
   %b = getelementptr inbounds %struct.a_b, %struct.a_b* %x, i32 0, i32 1
   store i32 2, i32* %b, align 4
   call void @f(%struct.a_b* %x)
   %a1 = getelementptr inbounds %struct.a_b, %struct.a_b* %y, i32 0, i32 0
   store i32 1, i32* %a1, align 4
   %b2 = getelementptr inbounds %struct.a_b, %struct.a_b* %y, i32 0, i32 1
   store i32 3, i32* %b2, align 4
   call void @f(%struct.a_b* %y)
   ret void
}


attributes #0 = { nounwind ssp uwtable
"disable-tail-calls"="false"
"less-precise-fpmad"="false"
"no-frame-pointer-elim"="true"
"no-frame-pointer-elim-non-leaf"
"no-infs-fp-math"="false"
"no-nans-fp-math"="false"
"stack-protector-buffer-size"="8"
"target-cpu"="core2"
"target-features"="+cx16,+sse,+sse2,+sse3,+ssse3"
"unsafe-fp-math"="false"
"use-soft-float"="false" }
attributes #1 = { "disable-tail-calls"="false" 
"less-precise-fpmad"="false"
"no-frame-pointer-elim"="true"
"no-frame-pointer-elim-non-leaf"
"no-infs-fp-math"="false"
"no-nans-fp-math"="false"
"stack-protector-buffer-size"="8"
"target-cpu"="core2"
"target-features"="+cx16,+sse,+sse2,+sse3,+ssse3"
"unsafe-fp-math"="false"
"use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"clang version 3.7.0 (http://llvm.org/git/clang.git 
36ba449caa88f710520cdce148457e5a75e9dabc) (http://llvm.org/git/llvm.git 
dccade93466c50834dbaa5f4dabb81e90d768c40)"}
----

Björn Steinbrink via llvm-dev

2015-Aug-31 13:32 UTC

Hi Nat,

LLVM currently only performs stack coloring to merge allocas if you
use lifetime intrinsics to tell it exactly where the lifetimes of the
alloca start and end. With your code, the lifetimes of both x and y
cover the entire function. Introducing a lexical scope to limit the
lifetime of x gives clang the necessary information to emit the
lifetime.end intrinsic, and declaring y after that scope makes it emit
the lifetime.start intrinsic appropriately as well.

struct a_b {
  long a;
  long b;
};

void f(struct a_b*);

void g(void)
{
  { // Lifetime of x starts here
    struct a_b   x;

    x.a = 1;
    x.b = 2;
    f(&x);
  } // Lifetime of x ends here

  // Lifetime of y starts here
  struct a_b   y;
  y.a = 1;
  y.b = 3;
  f(&y);
  // Lifetime of y ends here
}

It would be nice if LLVM could do this for non-escaping allocas
without the need for those intrinsics, but currently, this is the way
to go.
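
One way to check from C whether the two objects ended up in the same slot,
without reading the assembly, is to let f() record the addresses it receives.
This is only a sketch: the body of f() and the names seen_addr, seen_b,
n_calls and slots_shared are made up for illustration, and whether the slots
are actually shared depends on the compiler version and optimization level,
so the result is not guaranteed either way.

```c
#include <assert.h>
#include <stdint.h>

struct a_b {
    long a;
    long b;
};

/* Hypothetical bookkeeping: what f() saw on each of its first two calls. */
static uintptr_t seen_addr[2];
static long seen_b[2];
static int n_calls = 0;

void f(struct a_b *p)
{
    if (n_calls < 2) {
        seen_addr[n_calls] = (uintptr_t)p;  /* converted while *p is alive */
        seen_b[n_calls] = p->b;
    }
    n_calls++;
}

void g(void)
{
    {   /* Lifetime of x is limited to this scope */
        struct a_b x;

        x.a = 1;
        x.b = 2;
        f(&x);
    }

    struct a_b y;   /* declared after x is dead */
    y.a = 1;
    y.b = 3;
    f(&y);
}

/* Returns 1 if x and y were given the same address, 0 otherwise. */
int slots_shared(void)
{
    n_calls = 0;
    g();
    assert(n_calls == 2 && seen_b[0] == 2 && seen_b[1] == 3);
    return seen_addr[0] == seen_addr[1];
}
```

Comparing the result of slots_shared() between an unoptimized and an
optimized build is a quick, if informal, indication of whether the scoped
version was merged.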

Cheers,
Björn


Nat! via llvm-dev

2015-Aug-31 14:13 UTC

Björn Steinbrink wrote:
> ...
>    // Lifetime of y starts here
>    struct a_b   y;
>    y.a = 1;
>    y.b = 3;
>    f(&y);
>    // Lifetime of y ends here
> }

Nice, thanks very much. This does the alloca combining (even without 
having to specify "nocapture"). Wrapping my clang output with lifetime
calls shouldn't be a problem.

The code that does that optimization is, I assume:

	http://www.llvm.org/docs/doxygen/html/StackColoring_8cpp_source.html


I would like to take the alloca combining a step further still, which is 
the combining of allocas across functions, at least on tail calls.
My current idea would be to

* invent an attribute to mark my parameter. Let's say "reusealloca"

* at the beginning of the optimization pass, collect all parameters
marked reusealloca and place them in the alloca map with lifetimes ending
before the tail call (I would still need to figure out how to find it)

---
void  h( struct a_b  *p);

void  g( struct a_b __attribute((reusealloca)) *x)
{
     struct a_b   y;  // unneeded, use space provided by x
     y.a = 18;
     y.b = x->b;	     // unneeded, &y.b == &x->b
     h( &y);
}

void  f( void)
{
     struct a_b   x;

     x.a = 1;
     x.b = 3;
     g( &x);
}
---
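
Written out by hand, the effect this would have on g() is roughly the
following sketch. Note that reusealloca is only the invented attribute from
above, not something clang or LLVM provides. The body of h() and the names
g_copy, g_reuse and check_equivalent are made up for illustration, and the
whole thing assumes the caller no longer needs x after the call.

```c
#include <assert.h>

struct a_b { int a; int b; };

/* Hypothetical h(): just records the fields it was handed. */
static int last_a, last_b;

void h(struct a_b *p)
{
    last_a = p->a;
    last_b = p->b;
}

/* g() as written: a fresh local y, with one field copied over from *x. */
void g_copy(struct a_b *x)
{
    struct a_b y;

    y.a = 18;
    y.b = x->b;
    h(&y);
}

/* g() with the reuse applied by hand: y lives in the caller's storage,
   so the copy of b disappears and no second struct is allocated. */
void g_reuse(struct a_b *x)
{
    x->a = 18;
    h(x);
}

/* Returns 1 if h() sees the same values through both variants. */
int check_equivalent(void)
{
    struct a_b x1 = { 1, 3 };
    struct a_b x2 = { 1, 3 };
    int a1, b1;

    g_copy(&x1);
    a1 = last_a;
    b1 = last_b;
    g_reuse(&x2);
    /* x2.a == 18 shows the caller's struct was overwritten, which is only
       acceptable because the caller is assumed to be done with it. */
    return a1 == last_a && b1 == last_b && x1.a == 1 && x2.a == 18;
}
```

The visible difference is that g_reuse() clobbers the caller's x, so the
attribute would have to be a promise that the caller does not read x again
after the call.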

Does that sound feasible ?

Ciao
    Nat!
