I recently downloaded LLVM 2.8 and started playing with the optimizations a
bit.
I saw something curious while trying the following function:
int g(unsigned int a) {
    unsigned int c[100];
    c[10] = a;
    c[11] = a;
    unsigned int b = c[10] + c[11];

    if(b > a*2) a = 4;
    else a = 8;
    return a + 7;
}
The code generated with -O3 is:

define i32 @g(i32 %a) nounwind readnone {
  %add = shl i32 %a, 1
  %mul = shl i32 %a, 1
  %cmp = icmp ugt i32 %add, %mul
  %a.addr.0 = select i1 %cmp, i32 11, i32 15
  ret i32 %a.addr.0
}
I find it strange that it hasn't found that %add and %mul have the same
value; %cmp would then be false, so the select would always return 15.
If 'a' is replaced by a constant, it works.

I'm also curious which pass detects that c[10] and c[11] are 'a' in
'b = c[10] + c[11]' (it isn't instcombine; at -O1 the getelementptr/load
are still there).
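As a hedged aside on why the comparison can never be true: for unsigned int,
both a + a and a * 2 are computed modulo 2^N (N being the width of unsigned
int), so they are equal for every a, the unsigned greater-than compare is
always false, and g always returns 15. The sketch below restates g and adds a
helper, sum_equals_double, which is a hypothetical name introduced only for
illustration and is not part of the original post.

```c
#include <limits.h>

/* The function from the post, behavior unchanged. Only c[10] and c[11]
   are ever written or read; the rest of the array is unused. */
int g(unsigned int a) {
    unsigned int c[100];
    c[10] = a;
    c[11] = a;
    unsigned int b = c[10] + c[11];   /* a + a, wraps modulo 2^N */

    if (b > a * 2) a = 4;             /* a * 2 wraps the same way */
    else a = 8;
    return a + 7;
}

/* Hypothetical helper (not in the original post): returns 1 when
   a + a and a * 2 agree for the given a. Unsigned arithmetic is
   defined to wrap, so this holds even when the sum overflows. */
int sum_equals_double(unsigned int a) {
    return (a + a) == (a * 2u);
}
```

Calling g with values such as 0, 1, or UINT_MAX should give 15 in each case,
including inputs where a + a overflows.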

On Dec 28, 2010, at 9:39 AM, Lup Gratian wrote:

> I recently downloaded LLVM 2.8 and started playing with the optimizations a bit.
> I saw something curious while trying the following function:
>
> int g(unsigned int a) {
>     unsigned int c[100];
>     c[10] = a;
>     c[11] = a;
>     unsigned int b = c[10] + c[11];
>
>     if(b > a*2) a = 4;
>     else a = 8;
>     return a + 7;
> }
>
> The generated code, with -O3 activated, is
>
> define i32 @g(i32 %a) nounwind readnone {
>   %add = shl i32 %a, 1
>   %mul = shl i32 %a, 1
>   %cmp = icmp ugt i32 %add, %mul
>   %a.addr.0 = select i1 %cmp, i32 11, i32 15
>   ret i32 %a.addr.0
> }
>
> I find it strange that it hasn't found that %add and %mul have the same value, %cmp would be then false, selecting and returning 15. If 'a' is replaced by a constant it works.

You're right, that is a missed optimization. I added it to the missed optimization notes in r122603. Did this come from a larger example, or was this just a test?

> I'm also curious which pass detects that c[10] and c[11] are 'a' in 'b = c[10] + c[11]' (it isn't instcombine, at -O1 getelementptr/load are still there).

There are several capable of picking this up, but GVN+MemDep is probably what you want.

-Chris

On Dec 28, 2010, at 12:48 PM, Chris Lattner wrote:

> On Dec 28, 2010, at 9:39 AM, Lup Gratian wrote:
>> I find it strange that it hasn't found that %add and %mul have the same value, %cmp would be then false, selecting and returning 15. If 'a' is replaced by a constant it works.
>
> You're right, that is a missed optimization. I added it to the missed optimization notes in r122603. Did this come from a larger example, or was this just a test?

Just as a note, if you run this example through the opt tool as well then the output is the expected "ret i32 15".

> clang -O3 -S -emit-llvm -o - example.c | opt -std-compile-opts -o - | llvm-dis

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128-n8:16:32"
target triple = "i386-apple-darwin10.0.0"

define i32 @g(i32 %a) nounwind readnone ssp {
entry:
  ret i32 15
}

--
Wesley Peck
University of Kansas
SLDG Laboratory