I recently downloaded LLVM 2.8 and started playing with the optimizations a bit.
I saw something curious while trying the following function:

int g(unsigned int a) {
    unsigned int c[100];
    c[10] = a;
    c[11] = a;
    unsigned int b = c[10] + c[11];

    if(b > a*2) a = 4;
    else a = 8;
    return a + 7;
}

The generated code, with -O3 activated, is

define i32 @g(i32 %a) nounwind readnone {
  %add = shl i32 %a, 1
  %mul = shl i32 %a, 1
  %cmp = icmp ugt i32 %add, %mul
  %a.addr.0 = select i1 %cmp, i32 11, i32 15
  ret i32 %a.addr.0
}

I find it strange that it hasn't found that %add and %mul have the same value; %cmp would then be false, selecting and returning 15. If 'a' is replaced by a constant, it works.

I'm also curious which pass detects that c[10] and c[11] are 'a' in 'b = c[10] + c[11]' (it isn't instcombine; at -O1 the getelementptr/load are still there).
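To make the expected result concrete, here is a minimal sketch in C that checks the claim on a handful of inputs, including values where the doubling wraps around. The helper name check_g and the choice of sample values are assumptions made for illustration; they do not appear in the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The function from the message above, reproduced as posted. */
int g(unsigned int a) {
    unsigned int c[100];
    c[10] = a;
    c[11] = a;
    unsigned int b = c[10] + c[11];

    if (b > a * 2) a = 4;
    else a = 8;
    return a + 7;
}

/* Hypothetical helper (not from the thread): checks on sample inputs
 * that a + a and a * 2 agree, so the comparison is never true and
 * g always takes the else branch. */
void check_g(void) {
    const unsigned int samples[] = {
        0u, 1u, 2u, UINT_MAX / 2u, UINT_MAX / 2u + 1u, UINT_MAX - 1u, UINT_MAX
    };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        unsigned int a = samples[i];
        /* Unsigned arithmetic wraps, so both sides wrap identically. */
        assert(a + a == a * 2u);
        /* else branch: a = 8, so the result is 8 + 7. */
        assert(g(a) == 15);
    }
}
```

Calling check_g() from a small main should pass without tripping an assertion. It only samples inputs, so it illustrates the reasoning rather than proving it.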
On Dec 28, 2010, at 9:39 AM, Lup Gratian wrote:

> I recently downloaded LLVM 2.8 and started playing with the optimizations a bit.
> I saw something curious while trying the following function:
>
> int g(unsigned int a) {
>     unsigned int c[100];
>     c[10] = a;
>     c[11] = a;
>     unsigned int b = c[10] + c[11];
>
>     if(b > a*2) a = 4;
>     else a = 8;
>     return a + 7;
> }
>
> The generated code, with -O3 activated, is
>
> define i32 @g(i32 %a) nounwind readnone {
>   %add = shl i32 %a, 1
>   %mul = shl i32 %a, 1
>   %cmp = icmp ugt i32 %add, %mul
>   %a.addr.0 = select i1 %cmp, i32 11, i32 15
>   ret i32 %a.addr.0
> }
>
> I find it strange that it hasn't found that %add and %mul have the same value; %cmp would then be false, selecting and returning 15. If 'a' is replaced by a constant, it works.

You're right, that is a missed optimization. I added it to the missed optimization notes in r122603. Did this come from a larger example, or was this just a test?

> I'm also curious which pass detects that c[10] and c[11] are 'a' in 'b = c[10] + c[11]' (it isn't instcombine; at -O1 the getelementptr/load are still there).

There are several passes capable of picking this up, but GVN+MemDep is probably what you want.

-Chris
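As a rough illustration of what forwarding the stored values to the loads amounts to, the sketch below applies the rewrite by hand in C. It is not output from GVN or any other pass, and the function names sum_via_array, sum_forwarded and check_forwarding are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Before: the two values travel through memory (two stores, two loads). */
unsigned int sum_via_array(unsigned int a) {
    unsigned int c[100];
    c[10] = a;
    c[11] = a;
    return c[10] + c[11];
}

/* After, rewritten by hand: each load is replaced by the value most
 * recently stored to the same location, which leaves the array unused. */
unsigned int sum_forwarded(unsigned int a) {
    return a + a;
}

/* Hypothetical helper: checks that both forms agree on sample inputs. */
void check_forwarding(void) {
    const unsigned int samples[] = { 0u, 1u, 21u, UINT_MAX / 2u + 1u, UINT_MAX };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        assert(sum_via_array(samples[i]) == sum_forwarded(samples[i]));
    }
}
```

Once the loads are gone, nothing reads the array any more, which is consistent with the -O3 output quoted above containing no getelementptr or load instructions.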
On Dec 28, 2010, at 12:48 PM, Chris Lattner wrote:

> On Dec 28, 2010, at 9:39 AM, Lup Gratian wrote:
>> I find it strange that it hasn't found that %add and %mul have the same value; %cmp would then be false, selecting and returning 15. If 'a' is replaced by a constant, it works.
>
> You're right, that is a missed optimization. I added it to the missed optimization notes in r122603. Did this come from a larger example, or was this just a test?

Just as a note, if you also run this example through the opt tool, the output is the expected "ret i32 15".

    > clang -O3 -S -emit-llvm -o - example.c |opt -std-compile-opts -o - |llvm-dis

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128-n8:16:32"
target triple = "i386-apple-darwin10.0.0"

define i32 @g(i32 %a) nounwind readnone ssp {
entry:
  ret i32 15
}

--
Wesley Peck
University of Kansas
SLDG Laboratory