thr3ads.net - llvm dev - [LLVMdev] missed optimizations [Sep 2008]

If this information is useful, please help other people find it:
Share via:

Nuno Lopes

2008-Sep-04 15:39 UTC

[LLVMdev] missed optimizations

Hi,

I have two questions about optimizations performed by llvm.

Consider these simple functions:
int x(int b) { return b?4:6; }
int y() { return x(0); }

int x2() { return 5; }
int y2() { return x2(); }

the optimized bitcode (with clang + opt -std-compiler-opts) is:
define i32 @y(...) nounwind {
entry:
 ret i32 6
}

define i32 @y2(...) nounwind {
entry:
 %call = call i32 (...)* @x2( )  ; <i32> [#uses=0]
 ret i32 5
}

So why does LLVM optimizes the more difficult case, but leaves behind the 
function call in the easiest case? :)


Second question:
int f();

int g() {
 if (f() == 5)  return 3;
 return 4;
}

int h() {
 if (g() == 6) return 2;
 return 1;
}

gives the following optimized bc:

define i32 @g(...) nounwind {
entry:
 %call = call i32 (...)* @f( )  ; <i32> [#uses=1]
 %cmp = icmp eq i32 %call, 5  ; <i1> [#uses=1]
 %retval = select i1 %cmp, i32 3, i32 4  ; <i32> [#uses=1]
 ret i32 %retval
}

define i32 @h(...) nounwind {
entry:
 %call = call i32 (...)* @g( )  ; <i32> [#uses=1]
 %cmp = icmp eq i32 %call, 6  ; <i1> [#uses=1]
 %retval = select i1 %cmp, i32 2, i32 1  ; <i32> [#uses=1]
 ret i32 %retval
}

In function h(), llvm doesn't realize that g() never returns 6, and thus it 
could reduce that function to { g(); return 1; }. Doesn't llvm produce some 
sort of summary for functions with e.g. their possible return values? If 
not, is there any pass where I could implement this?  (I have some code that 
has many opportunities for this kind of optimization)
Also, shouldn't the function g() get inlined in the h() function? It is 
inlined only if I change the g() function to be static. So isn't llvm being 
too conservative when inlining global functions?

Thanks,
Nuno

Cédric Venet

2008-Sep-04 16:19 UTC

head link

[LLVMdev] missed optimizations

Nuno Lopes a écrit :> Hi,
> 
> I have two questions about optimizations performed by llvm.
> 
> Consider these simple functions:
> int x(int b) { return b?4:6; }
> int y() { return x(0); }
> 
> int x2() { return 5; }
> int y2() { return x2(); }
> 
> the optimized bitcode (with clang + opt -std-compiler-opts) is:
> define i32 @y(...) nounwind {
> entry:
>  ret i32 6
> }
> 
> define i32 @y2(...) nounwind {
> entry:
>  %call = call i32 (...)* @x2( )  ; <i32> [#uses=0]
>  ret i32 5
> }
> 
> So why does LLVM optimizes the more difficult case, but leaves behind the 
> function call in the easiest case? :)
> 
I don't know why vararg would inhibit optimisation but maybe it is the 
problem.
You should try with:

int x2(void) { return 5; }
int y2(void) { return x2(); }

it may solve the problem (or use C++ as langage)

Cédric

OvermindDL1

2008-Sep-04 23:43 UTC

head link

[LLVMdev] missed optimizations

The LLVM browser test optimizes both examples as expected...  What
optimization options are different?  Although that optimization could
be gcc-llvm level and not actual llvm level...

On Thu, Sep 4, 2008 at 10:19 AM, Cédric Venet <cedric.venet at
laposte.net> wrote:> Nuno Lopes a écrit :
>> Hi,
>>
>> I have two questions about optimizations performed by llvm.
>>
>> Consider these simple functions:
>> int x(int b) { return b?4:6; }
>> int y() { return x(0); }
>>
>> int x2() { return 5; }
>> int y2() { return x2(); }
>>
>> the optimized bitcode (with clang + opt -std-compiler-opts) is:
>> define i32 @y(...) nounwind {
>> entry:
>>  ret i32 6
>> }
>>
>> define i32 @y2(...) nounwind {
>> entry:
>>  %call = call i32 (...)* @x2( )  ; <i32> [#uses=0]
>>  ret i32 5
>> }
>>
>> So why does LLVM optimizes the more difficult case, but leaves behind
the
>> function call in the easiest case? :)
>>
>
> I don't know why vararg would inhibit optimisation but maybe it is the
> problem.
> You should try with:
>
> int x2(void) { return 5; }
> int y2(void) { return x2(); }
>
> it may solve the problem (or use C++ as langage)
>
> Cédric
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Eli Friedman

2008-Sep-05 00:45 UTC

head link

[LLVMdev] missed optimizations

On Thu, Sep 4, 2008 at 8:39 AM, Nuno Lopes <nunoplopes at sapo.pt>
wrote:> Hi,
>
> I have two questions about optimizations performed by llvm.
>
> Consider these simple functions:
> int x(int b) { return b?4:6; }
> int y() { return x(0); }
>
> int x2() { return 5; }
> int y2() { return x2(); }
>
> the optimized bitcode (with clang + opt -std-compiler-opts) is:
> define i32 @y(...) nounwind {
> entry:
>  ret i32 6
> }
>
> define i32 @y2(...) nounwind {
> entry:
>  %call = call i32 (...)* @x2( )  ; <i32> [#uses=0]
>  ret i32 5
> }
>
> So why does LLVM optimizes the more difficult case, but leaves behind the
> function call in the easiest case? :)
Pretty simple: LLVM doesn't know how to eliminate the call.

There are a couple of ways it could be eliminated: DCE or inlining.
The current DCE infrastructure isn't clever enough to do
interprocedural analysis, so that doesn't eliminate it.  And the
inliner immediately gives up on varargs functions, so that doesn't
eliminate it.  And

That said, clang really should be turning int x2() { return x(0); }
into "define i32 @x2()" rather than "define i32 @x2(...)";
the
function isn't varargs, and marking it as such could lead to wrong
code for exotic calling conventions.
>
> Second question:
> int f();
>
> int g() {
>  if (f() == 5)  return 3;
>  return 4;
> }
>
> int h() {
>  if (g() == 6) return 2;
>  return 1;
> }
>
> gives the following optimized bc:
>
> define i32 @g(...) nounwind {
> entry:
>  %call = call i32 (...)* @f( )  ; <i32> [#uses=1]
>  %cmp = icmp eq i32 %call, 5  ; <i1> [#uses=1]
>  %retval = select i1 %cmp, i32 3, i32 4  ; <i32> [#uses=1]
>  ret i32 %retval
> }
>
> define i32 @h(...) nounwind {
> entry:
>  %call = call i32 (...)* @g( )  ; <i32> [#uses=1]
>  %cmp = icmp eq i32 %call, 6  ; <i1> [#uses=1]
>  %retval = select i1 %cmp, i32 2, i32 1  ; <i32> [#uses=1]
>  ret i32 %retval
> }
>
> In function h(), llvm doesn't realize that g() never returns 6, and
thus it
> could reduce that function to { g(); return 1; }. Doesn't llvm produce
some
> sort of summary for functions with e.g. their possible return values? If
> not, is there any pass where I could implement this?  (I have some code
that
> has many opportunities for this kind of optimization)
No, there isn't any such infrastructure at the moment.  The LLVM pass
that might do that sort of thing is predsimplify, but it isn't an
interprocedural pass.  It's not necessarily difficult to implement,
depending on how precise you want it to be, but nobody's implemented
it; as far as I know, it doesn't provide significant benefits for
normal C/C++ code.  If there's some specific pattern in your code, you
could probably throw together a specialized pass pretty quickly.
> Also, shouldn't the function g() get inlined in the h() function? It is
> inlined only if I change the g() function to be static. So isn't llvm
being
> too conservative when inlining global functions?
The inliner doesn't like varargs; see above.

-Eli

Matthijs Kooijman

2008-Sep-05 08:03 UTC

head link

[LLVMdev] missed optimizations

Hi Eli,
> That said, clang really should be turning int x2() { return x(0); }
> into "define i32 @x2()" rather than "define i32
@x2(...)"; the
> function isn't varargs, and marking it as such could lead to wrong
> code for exotic calling conventions.I always understood that this is correct per C language specification. For
functions that are internal (static), the DeadArgElim pass removes the vararg
spec if it's unused, but that's not applicable in this case.

Any particular reason the inliner doesn't touch varargs? I think that when
you
call a varargs function without any varargs, it should be able to inline it
just fine? Or at least when the function in question also doesn't call the
va_* functions.
> > Also, shouldn't the function g() get inlined in the h() function?
It is
> > inlined only if I change the g() function to be static. So isn't
llvm being
> > too conservative when inlining global functions?
> 
> The inliner doesn't like varargs; see above.
And the fact that it is inlined when it's static, is probably due to the
DeadArgElim I mentioned above.

Gr.

Matthijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080905/f0ce343d/attachment.sig>

Nuno Lopes

2008-Sep-16 18:59 UTC

head link

[LLVMdev] missed optimizations

Hi,

As a follow up of this thread I've made a patch that implements a simple 
approach to propagate the function return values as described previously.

It can transform e.g. the following program:

define i32 @f(...) nounwind {
(...)
 %cond = select i1 %tobool, i32 2, i32 3  ; <i32> [#uses=1]
 ret i32 %cond
}

define i32 @g(...) nounwind {
entry:
 %call = call i32 (...)* @f()  ; <i32> [#uses=1]
 switch i32 %call, label %sw.default [
  i32 5, label %sw.bb
  i32 2, label %sw.bb1
  i32 3, label %sw.bb2
 ]

sw.bb:  ; preds = %entry
 ret i32 -1

sw.bb1:  ; preds = %entry
 ret i32 3

sw.bb2:  ; preds = %entry
 ret i32 55

sw.default:  ; preds = %entry
 ret i32 0
}


into:

define i32 @g(...) nounwind {
entry:
 %call = call i32 (...)* @f()  ; <i32> [#uses=1]
 %cond = icmp eq i32 %call, 2  ; <i1> [#uses=1]
 %retval = select i1 %cond, i32 3, i32 55  ; <i32> [#uses=1]
 ret i32 %retval
}


This kind of transformation isn't currently done by LLVM (note that here 
this pass is only removing case statements. other transformations are not of 
my responsibility :).
The patch is available at 
http://web.ist.utl.pt/nuno.lopes/llvm_function_ret_infer.txt
I would love to ear some feedback, so that this pass can eventually be 
merged to the LLVM tree. In particular I'm not sure if there's a better
way
to track the possible return values of each function (e.g. is there any code 
that I can reuse?)


Thanks,
Nuno

Reasonably Related Threads

Search for more reasonably related threads

llvm dev - Sep 2008 - [LLVMdev] missed optimizations

[LLVMdev] missed optimizations

[LLVMdev] missed optimizations

[LLVMdev] missed optimizations

[LLVMdev] missed optimizations

[LLVMdev] missed optimizations

[LLVMdev] missed optimizations

Reasonably Related Threads