Ralf Karrenberg
2010-Jan-29 13:58 UTC
[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
Hey Duncan, hey everybody else, I just stumbled upon a problem in the latest llvm-gcc trunk which is related to my previous problem with the 64bit ABI and structs: Given the following code: struct float3 { float x, y, z; }; extern "C" void __attribute__((noinline)) test(float3 a, float3* res) { res->y = a.y; } int main(void) { float3 a; float3 res; test(a, &res); } llvm-gcc -c -emit-llvm -O3 produces this: %struct.float3 = type { float, float, float } define void @test(double %a.0, float %a.1, %struct.float3* nocapture %res) nounwind noinline { entry: %tmp8 = bitcast double %a.0 to i64 ; <i64> [#uses=1] %tmp9 = zext i64 %tmp8 to i96 ; <i96> [#uses=1] %tmp1 = lshr i96 %tmp9, 32 ; <i96> [#uses=1] %tmp2 = trunc i96 %tmp1 to i32 ; <i32> [#uses=1] %tmp3 = bitcast i32 %tmp2 to float ; <float> [#uses=1] %0 = getelementptr inbounds %struct.float3* %res, i64 0, i32 1 ; <float*> [#uses=1] store float %tmp3, float* %0, align 4 ret void } define i32 @main() nounwind { entry: %res = alloca %struct.float3, align 8 ; <%struct.float3*> [#uses=1] call void @test(double undef, float 0.000000e+00, %struct.float3* %res) nounwind ret i32 0 } The former second value of the struct is casted from float to i64, zero-extended, shifted, truncated and casted back to float. Unfortunately, in my case, LLVM seems to be unable to remove this kind of code (in more complex functions of course) even though it gets inlined and optimized. I end up with functions like this one: define void @xyz(float %aX, float %aY, float %aZ, float* noalias nocapture %resX, float* noalias nocapture %resY, float* noalias nocapture %resZ) nounwind { entry: %0 = fadd float %aZ, 5.000000e-01 ; <float> [#uses=1] %1 = fadd float %aY, 5.000000e-01 ; <float> [#uses=1] %2 = fadd float %aX, 5.000000e-01 ; <float> [#uses=1] %tmp16.i.i = bitcast float %1 to i32 ; <i32> [#uses=1] %tmp17.i.i = zext i32 %tmp16.i.i to i96 ; <i96> [#uses=1] %tmp18.i.i = shl i96 %tmp17.i.i, 32 ; <i96> [#uses=1] %tmp19.i = zext i96 %tmp18.i.i to i128 ; <i128> [#uses=1] %tmp8.i = lshr i128 %tmp19.i, 32 ; <i128> [#uses=1] %tmp9.i = trunc i128 %tmp8.i to i32 ; <i32> [#uses=1] %tmp10.i = bitcast i32 %tmp9.i to float ; <float> [#uses=1] store float %2, float* %resX, align 4 store float %tmp10.i, float* %resY, align 4 store float %0, float* %resZ, align 4 ret void } llvm-gcc4.2-2.5 generates the following code for the same example: define void @test(double %a.0, float %a.1, %struct.float3* nocapture %res) nounwind noinline { entry: %a_addr = alloca %struct.float3, align 8 ; <%struct.float3*> [#uses=3] %0 = bitcast %struct.float3* %a_addr to double* ; <double*> [#uses=1] store double %a.0, double* %0 %1 = getelementptr %struct.float3* %a_addr, i64 0, i32 2 ; <float*> [#uses=1] store float %a.1, float* %1, align 8 %2 = getelementptr %struct.float3* %a_addr, i64 0, i32 1 ; <float*> [#uses=1] %3 = load float* %2, align 4 ; <float> [#uses=1] %4 = getelementptr %struct.float3* %res, i64 0, i32 1 ; <float*> [#uses=1] store float %3, float* %4, align 4 ret void } Apparently, the optimizer can work better with that code and after inlining, it all goes away as expected. Is this change intentional? Any ideas where that code comes from or why it cannot be removed? Best, Ralf
Duncan Sands
2010-Jan-29 14:25 UTC
[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
Hi Ralf,> llvm-gcc -c -emit-llvm -O3 produces this: > > %struct.float3 = type { float, float, float } > define void @test(double %a.0, float %a.1, %struct.float3* nocapture > %res) nounwind noinline { > entry: > %tmp8 = bitcast double %a.0 to i64 ; <i64> [#uses=1] > %tmp9 = zext i64 %tmp8 to i96 ; <i96> [#uses=1] > %tmp1 = lshr i96 %tmp9, 32 ; <i96> [#uses=1] > %tmp2 = trunc i96 %tmp1 to i32 ; <i32> [#uses=1] > %tmp3 = bitcast i32 %tmp2 to float ; <float> [#uses=1] > %0 = getelementptr inbounds %struct.float3* %res, i64 0, i32 1 ; > <float*> [#uses=1] > store float %tmp3, float* %0, align 4 > ret void > }it is reasonable to expect the optimizers to turn this at least into %tmp8 = bitcast double %a.0 to i64 %tmp2 = lshr i64 %tmp8, 32 %tmp3 = bitcast i32 %tmp2 to float> define void @xyz(float %aX, float %aY, float %aZ, float* noalias > nocapture %resX, float* noalias nocapture %resY, float* noalias > nocapture %resZ) nounwind { > entry: > %0 = fadd float %aZ, 5.000000e-01 ; <float> [#uses=1] > %1 = fadd float %aY, 5.000000e-01 ; <float> [#uses=1] > %2 = fadd float %aX, 5.000000e-01 ; <float> [#uses=1] > %tmp16.i.i = bitcast float %1 to i32 ; <i32> [#uses=1] > %tmp17.i.i = zext i32 %tmp16.i.i to i96 ; <i96> [#uses=1] > %tmp18.i.i = shl i96 %tmp17.i.i, 32 ; <i96> [#uses=1] > %tmp19.i = zext i96 %tmp18.i.i to i128 ; <i128> [#uses=1] > %tmp8.i = lshr i128 %tmp19.i, 32 ; <i128> [#uses=1] > %tmp9.i = trunc i128 %tmp8.i to i32 ; <i32> [#uses=1] > %tmp10.i = bitcast i32 %tmp9.i to float ; <float> [#uses=1] > store float %2, float* %resX, align 4 > store float %tmp10.i, float* %resY, align 4 > store float %0, float* %resZ, align 4 > ret void > }Likewise, here it is reasonable to expect the optimizers to be able to get rid of all the mucking around with %1, and understand that %tmp10.i is the same as %1. So I think you should open a bug report for this. Please include everything in this email, as well as the original C for the second testcase.> Any ideas where that code comes from or why it cannot be removed?I think it comes directly from the frontend ABI logic. The optimizers should be able to handle this, but since they aren't handling it they need to be improved. Ciao, Duncan.
Ralf Karrenberg
2010-Feb-01 11:08 UTC
[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
Hey Duncan, Duncan Sands wrote:> So I think you should open a bug report for this. Please include > everything > in this email, as well as the original C for the second testcase.I just opened a bug report ( http://llvm.org/bugs/show_bug.cgi?id=6194 ) and tried to put in some simple and reproducible test cases. I hope this is all comprehensible enough :). Best, Ralf
Seemingly Similar Threads
- [LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
- [LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
- [LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
- [LLVMdev] Missing Optimization Opportunities
- [LLVMdev] Labels