Hi all,

I stumbled across a peculiarity regarding constant propagation that I don't understand. I'm not sure whether I am overlooking something or whether it's a missing feature.

I have created the following simple test function in C:

    int times_zero(int a)
    {
      return (a * 0);
    }

Compiling this with GCC using dragonegg generates the following code:

    %int = type i32

    define i32 @times_zero(i32 %a) nounwind {
    entry:
      %a_addr = alloca i32, align 4
      %memtmp = alloca i32
      %"alloca point" = bitcast i32 0 to i32
      store i32 %a, i32* %a_addr
      %"ssa point" = bitcast i32 0 to i32
      br label %"2"

    "2":                                    ; preds = %entry
      store i32 0, i32* %memtmp, align 1
      br label %return

    return:                                 ; preds = %"2"
      %retval = load i32* %memtmp
      ret i32 %retval
    }

Running this through "opt -O2" generates

    define i32 @times_zero(i32 %a) nounwind readnone {
    entry:
      ret i32 0
    }

So far everything works as expected. LLVM recognizes the multiplication by zero and replaces it with its result, zero. The same optimization is not applied for doubles, however:

    double times_zero(double a)
    {
      return (a * 0.0);
    }

Instead, the following code is generated after optimization:

    define double @times_zero(double %a) nounwind readnone {
    entry:
      %0 = fmul double %a, 0.000000e+00
      ret double %0
    }

Is there a reason why this optimization opportunity is not taken? For the record, I am using LLVM 2.9.

Regards,
Martin
Eli Friedman
2011-Jul-15 07:36 UTC
[LLVMdev] Missing optimization in constant propagation?
On Fri, Jul 15, 2011 at 12:21 AM, Martin Apel <martin.apel at simpack.de> wrote:
> [...]
> Doing the same for doubles however does not do the same optimization.
>
> double times_zero(double a)
> {
>   return (a * 0.0);
> }
>
> Instead the following code is generated after optimization:
>
> define double @times_zero(double %a) nounwind readnone {
> entry:
>   %0 = fmul double %a, 0.000000e+00
>   ret double %0
> }
>
> Is there a reason why this optimization opportunity is not taken? For the record, I am using LLVM 2.9

Because that isn't how FP arithmetic works; 1.0*0.0 is 0.0, but (-1.0)*0.0 is -0.0, and (0.0/0.0)*0.0 is NaN.

-Eli
On 15/07/11 09:36, Eli Friedman wrote:
> On Fri, Jul 15, 2011 at 12:21 AM, Martin Apel <martin.apel at simpack.de> wrote:
>> [...]
>> Is there a reason why this optimization opportunity is not taken? For the record, I am using LLVM 2.9
> Because that isn't how FP arithmetic works; 1.0*0.0 is 0.0, but
> (-1.0)*0.0 is -0.0, and (0.0/0.0)*0.0 is NaN.
>
> -Eli

Hi Eli,

thanks for the enlightenment! I did not think of this, but you are completely right. Floating point arithmetic is kind of strange sometimes.

Martin