Chuck Rose III
2007-Oct-18 20:52 UTC
[LLVMdev] movaps being generated despite alignment 1 being specified
Hello LLVMers,

High order bit:

Presence of a called function is causing a store on an unrelated vector to generate an aligned store rather than an unaligned one, despite an unaligned store being indicated in the associated StoreInst.

Details:

I pulled down the latest source, so this is something I'm finding with the current LLVM. I'm hoping you'll have an idea what's going on or at least know if it's a new issue I should log. It's related to the stack alignment issue that I know is being worked on, but seems sufficiently different to ask about it here. I checked the bug database for "align" and "movaps" and didn't see this issue raised.

Ok, the first bit of code here seems to generate correct assembly for me. Basically, it copies the float4 stored at globalV into the address pointed to by dependentV. Along the way, it creates a <4 x float> and copies globalV into a temporary. I'm working on bridging the gap between the outside of our system and the LLVM generated code, so there is a little extra copying from and to parameters at the boundaries of this function. Since this is just a repro example, there is very little besides the boundaries here. :-) I fully admit the constructions below may not be optimal.
; ModuleID = 'hydra'
target datalayout = "E-p:32:32:32-i1:8:8:8-i8:8:8:8-i32:32:32:32-f32:32:32:32"

define void @evaluateDependents(float* %dependentV, float* %globalV) {
Entry_evaluateDependents:
    %Promoted_dependentV_Ptr = alloca <4 x float>, align 16    ; <<4 x float>*> [#uses=2]
    %Promoted_globalV_Ptr = alloca <4 x float>, align 16    ; <<4 x float>*> [#uses=2]
    %externalVectorPtrCast = bitcast float* %globalV to <4 x float>*    ; <<4 x float>*> [#uses=1]
    %externalVectorLoaded = load <4 x float>* %externalVectorPtrCast, align 1    ; <<4 x float>> [#uses=1]
    store <4 x float> %externalVectorLoaded, <4 x float>* %Promoted_globalV_Ptr, align 1
    %globalV1 = load <4 x float>* %Promoted_globalV_Ptr, align 1    ; <<4 x float>> [#uses=1]
    br label %Body_evaluateDependents

Body_evaluateDependents:    ; preds = %Entry_evaluateDependents
    store <4 x float> %globalV1, <4 x float>* %Promoted_dependentV_Ptr, align 1
    br label %Exit_evaluateDependents

Exit_evaluateDependents:    ; preds = %Body_evaluateDependents
    %vectorToDemote = load <4 x float>* %Promoted_dependentV_Ptr, align 1    ; <<4 x float>> [#uses=1]
    %externalVectorPtrCast2 = bitcast float* %dependentV to <4 x float>*    ; <<4 x float>*> [#uses=1]
    store <4 x float> %vectorToDemote, <4 x float>* %externalVectorPtrCast2, align 1
    ret void
}

This produces the following instructions, which obey all the align 1 directives on the LoadInsts and StoreInsts:

    ...
    15D10010 sub esp,2Ch
    15D10013 mov eax,dword ptr [esp+34h]
    15D10017 movups xmm0,xmmword ptr [eax]
    15D1001A movups xmmword ptr [esp],xmm0
    15D1001E mov eax,dword ptr [esp+30h]
    15D10022 movups xmmword ptr [esp+10h],xmm0
    15D10027 movups xmm0,xmmword ptr [esp+10h]
    15D1002C movups xmmword ptr [eax],xmm0
    15D1002F add esp,2Ch
    15D10032 ret

Here's where it gets weird and confusing to me. Let's make our evaluateDependents function do something else. In addition to copying globalV into dependentV, it's also going to set a singleton float pointed to by dependentF. We'll call a function foo to get the value.
(I tried setting dependentF directly and that did NOT cause the problem with the generated code.) Here's the LLVM code:

; ModuleID = 'hydra'
target datalayout = "E-p:32:32:32-i1:8:8:8-i8:8:8:8-i32:32:32:32-f32:32:32:32"

define float @foo(float %Y) {
Entry_foo:
    %_ReturnValuePtr = alloca float    ; <float*> [#uses=2]
    br label %Body_foo

Body_foo:    ; preds = %Entry_foo
    store float %Y, float* %_ReturnValuePtr, align 1
    br label %Exit_foo

Exit_foo:    ; preds = %Body_foo
    %finalValue = load float* %_ReturnValuePtr, align 1    ; <float> [#uses=1]
    ret float %finalValue
}

define void @evaluateDependents(float* %dependentF, float* %dependentV, float* %globalV) {
Entry_evaluateDependents:
    %Promoted_dependentV_Ptr = alloca <4 x float>, align 16    ; <<4 x float>*> [#uses=2]
    %Promoted_globalV_Ptr = alloca <4 x float>, align 16    ; <<4 x float>*> [#uses=2]
    %externalVectorPtrCast = bitcast float* %globalV to <4 x float>*    ; <<4 x float>*> [#uses=1]
    %externalVectorLoaded = load <4 x float>* %externalVectorPtrCast, align 1    ; <<4 x float>> [#uses=1]
    store <4 x float> %externalVectorLoaded, <4 x float>* %Promoted_globalV_Ptr, align 1
    %globalV1 = load <4 x float>* %Promoted_globalV_Ptr, align 1    ; <<4 x float>> [#uses=1]
    br label %Body_evaluateDependents

Body_evaluateDependents:    ; preds = %Entry_evaluateDependents
    %fooResult = call float @foo( float 2.000000e+000 )    ; <float> [#uses=1]
    store float %fooResult, float* %dependentF, align 1
    store <4 x float> %globalV1, <4 x float>* %Promoted_dependentV_Ptr, align 1
    br label %Exit_evaluateDependents

Exit_evaluateDependents:    ; preds = %Body_evaluateDependents
    %vectorToDemote = load <4 x float>* %Promoted_dependentV_Ptr, align 1    ; <<4 x float>> [#uses=1]
    %externalVectorPtrCast2 = bitcast float* %dependentV to <4 x float>*    ; <<4 x float>*> [#uses=1]
    store <4 x float> %vectorToDemote, <4 x float>* %externalVectorPtrCast2, align 1
    ret void
}

Here are the instructions for evaluateDependents. The JITter hasn't compiled foo yet.
What's confusing to me is why my movups suddenly became a movaps. All the stores and loads have align 1 on them.

    ...
    15D10012 sub esp,4Ch
    15D10015 mov eax,dword ptr [esp+60h]
    15D10019 movups xmm0,xmmword ptr [eax]
    15D1001C movaps xmmword ptr [esp+8],xmm0    <-- why did this become a movaps?
    15D10021 movups xmmword ptr [esp+28h],xmm0
    15D10026 mov esi,dword ptr [esp+58h]
    15D1002A mov edi,dword ptr [esp+5Ch]
    15D1002E mov dword ptr [esp],40000000h
    15D10035 call X86CompilationCallback (1335030h)

Thanks for the help!

Chuck.
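The question turns on the difference between the two SSE moves: movaps requires its memory operand to be 16-byte aligned and faults otherwise, while movups accepts any address. When checking a longer JIT listing for the same symptom, a small scan for movaps instructions with a memory operand can help. The following is only a sketch: `LISTING` is the second disassembly from the message above, copied without the inline annotation, and `find_aligned_memory_moves` is a hypothetical helper, not part of LLVM.

```python
import re

# Second evaluateDependents listing from the message (annotation removed).
LISTING = """\
15D10012 sub esp,4Ch
15D10015 mov eax,dword ptr [esp+60h]
15D10019 movups xmm0,xmmword ptr [eax]
15D1001C movaps xmmword ptr [esp+8],xmm0
15D10021 movups xmmword ptr [esp+28h],xmm0
15D10026 mov esi,dword ptr [esp+58h]
15D1002A mov edi,dword ptr [esp+5Ch]
15D1002E mov dword ptr [esp],40000000h
15D10035 call X86CompilationCallback (1335030h)
"""

# An aligned SSE move whose operands include a memory reference ([...]).
ALIGNED_MEM_MOVE = re.compile(
    r"^(?P<addr>[0-9A-Fa-f]+)\s+movaps\s+(?P<ops>.*\[.*\].*)$"
)


def find_aligned_memory_moves(listing):
    """Return (address, operands) for every movaps that touches memory."""
    hits = []
    for line in listing.splitlines():
        match = ALIGNED_MEM_MOVE.match(line.strip())
        if match:
            hits.append((match.group("addr"), match.group("ops")))
    return hits


if __name__ == "__main__":
    for addr, ops in find_aligned_memory_moves(LISTING):
        print(f"{addr}: movaps {ops}")
```

Register-to-register movaps instructions are deliberately ignored, since alignment only matters when one operand is in memory.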
Dale Johannesen
2007-Oct-19 03:01 UTC
[LLVMdev] movaps being generated despite alignment 1 being specified
On Oct 18, 2007, at 1:52 PM, Chuck Rose III wrote:
> High order bit:
>
> Presence of a called function is causing a store on an unrelated
> vector to generate an aligned store rather than an unaligned one despite
> unaligned store being indicated in the associated StoreInst.

This probably means the compiler believes the stack pointer is 16-byte aligned in non-leaf functions. This would be correct if (a) the SP was aligned coming in and (b) the size of the stack decrement (including return address, etc.) is a multiple of 16. I haven't been following the Linux problems closely, but I think "the stack issue being worked on" is that (a) is not always correct?
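The arithmetic behind conditions (a) and (b) can be checked against the two listings in the original message. The sketch below assumes a 32-bit convention in which the first stack argument sits at the caller's ESP just before the call, and it assumes that [esp+30h] in the first listing and [esp+58h] in the second are where the first argument is read (an inference from the order of the loads, not something the thread states). `slot_is_16_byte_aligned` is a hypothetical helper written for this illustration.

```python
def slot_is_16_byte_aligned(call_site_sp_mod16, first_arg_offset, slot_offset):
    """Return True if [esp + slot_offset] is 16-byte aligned in the callee.

    call_site_sp_mod16: caller's ESP modulo 16 just before the call
                        (condition (a) holds when this is 0).
    first_arg_offset:   offset from the callee's ESP, after its prologue,
                        at which the first stack argument is found; this
                        equals the total stack decrement including the
                        return address (condition (b)).
    slot_offset:        offset from the callee's ESP of the stack slot.
    """
    esp_mod16 = (call_site_sp_mod16 - first_arg_offset) % 16
    return (esp_mod16 + slot_offset) % 16 == 0


if __name__ == "__main__":
    # First listing: first argument at [esp+30h], slots at [esp] and [esp+10h].
    print(slot_is_16_byte_aligned(0, 0x30, 0x00))
    print(slot_is_16_byte_aligned(0, 0x30, 0x10))
    # Second listing: first argument at [esp+58h], slots at [esp+8] and [esp+28h].
    print(slot_is_16_byte_aligned(0, 0x58, 0x08))
    print(slot_is_16_byte_aligned(0, 0x58, 0x28))
    # Same slot when the incoming stack is only 4-byte aligned.
    print(slot_is_16_byte_aligned(4, 0x58, 0x08))
```

Under these assumptions the total decrement in the second listing is 58h, which is not a multiple of 16, but the slots at offsets 8 and 28h compensate for it. The movaps at [esp+8] is therefore safe only if the caller's stack pointer really was 16-byte aligned, which is exactly condition (a).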
Evan Cheng
2007-Oct-19 06:56 UTC
[LLVMdev] movaps being generated despite alignment 1 being specified
On Oct 18, 2007, at 1:52 PM, Chuck Rose III wrote:
>
> Here are the instructions for evaluateDependents. The JITter
> hasn’t compiled foo yet. What’s confusing to me is why did my
> movups suddenly become a movaps? All the stores and loads have
> align 1 on them.

Hi Chuck,

I believe this is a bug but am unable to reproduce it with the test case you've provided. I should be able to see the same problem using llc since the code generator is going through all the same passes. The only difference should be the relocation model.

Please file a bug and provide us with a test case. You should be able to set a breakpoint somewhere in ExecutionEngine.cpp / JIT.cpp and just dump out the bitcode with Module->dump() / print().

Evan
Evan Cheng
2007-Nov-05 07:34 UTC
[LLVMdev] movaps being generated despite alignment 1 being specified
Fixed. See PR1776 and http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20071105/055148.html

Evan