Craig Topper
2013-Jul-19 05:29 UTC
[LLVMdev] SIMD instructions and memory alignment on X86
That should map directly to sqrtpd, which can't modify ecx.

On Thu, Jul 18, 2013 at 10:27 PM, Peter Newman <peter at uformia.com> wrote:

> Sorry, that should have been llvm.x86.sse2.sqrt.pd
>
> On 19/07/2013 3:25 PM, Craig Topper wrote:
>
> What is "frep.x86.sse2.sqrt.pd"? I'm only familiar with things prefixed
> with "llvm.x86".
>
> On Thu, Jul 18, 2013 at 10:12 PM, Peter Newman <peter at uformia.com> wrote:
>
>> After stepping through the produced assembly, I believe I have a
>> culprit.
>>
>> One of the calls to @frep.x86.sse2.sqrt.pd is modifying the value of ECX
>> - while the produced code is expecting it to still contain its previous
>> value.
>>
>> Peter N
>>
>> On 19/07/2013 2:09 PM, Peter Newman wrote:
>>
>> I've attached the module->dump() that our code is producing.
>> Unfortunately this is the smallest test case I have available.
>>
>> This is before any optimization passes are applied. There are two
>> separate modules in existence at the time, and there are no guarantees
>> about the order the surrounding code calls those functions, so there may be
>> some interaction between them? There shouldn't be, they don't refer to any
>> common memory etc. There is no multi-threading occurring.
>>
>> The function in module-dump.ll (called crashfunc in this file) is called
>> with
>>   - func_params 0x0018f3b0 double [3]
>>     [0x0] -11.339976634695301 double
>>     [0x1] -9.7504239056205506 double
>>     [0x2] -5.2900856817382804 double
>> at the time of the exception.
>>
>> This is compiled on a "i686-pc-win32" triple. All of the non-intrinsic
>> functions referred to in these modules are the standard equivalents from
>> the MSVC library (e.g. @asin is the standard C lib double asin( double )).
>>
>> Hopefully this is reproducible for you.
>>
>> --
>> PeterN
>>
>> On 18/07/2013 4:37 PM, Craig Topper wrote:
>>
>> Are you able to send any IR for others to reproduce this issue?
>>
>> On Wed, Jul 17, 2013 at 11:23 PM, Peter Newman <peter at uformia.com> wrote:
>>
>>> Unfortunately, this doesn't appear to be the bug I'm hitting. I applied
>>> the fix to my source and it didn't make a difference.
>>>
>>> Also, further testing found me getting the same behavior with other SIMD
>>> instructions. The common factor is that in each case ECX is set to
>>> 0x7fffffff, and it's an operation using xmm ptr ecx+offset.
>>>
>>> Additionally, turning the optimization level passed to createJIT down
>>> appears to avoid it, so I'm now leaning towards a bug in one of the
>>> optimization passes.
>>>
>>> I'm going to dig through the passes controlled by that parameter and see
>>> if I can narrow down which optimization is causing it.
>>>
>>> Peter N
>>>
>>> On 17/07/2013 1:58 PM, Solomon Boulos wrote:
>>>
>>>> As someone off list just told me, perhaps my new bug is the same issue:
>>>>
>>>> http://llvm.org/bugs/show_bug.cgi?id=16640
>>>>
>>>> Do you happen to be using FastISel?
>>>>
>>>> Solomon
>>>>
>>>> On Jul 16, 2013, at 6:39 PM, Peter Newman <peter at uformia.com> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I'm currently in the process of debugging a crash occurring in our
>>>>> program. In LLVM 3.2 and 3.3 it appears that JIT generated code is
>>>>> attempting to access unaligned memory with an SSE2 instruction.
>>>>> However this only happens under certain conditions that seem (but may not
>>>>> be) related to the stack's state on calling the function.
>>>>>
>>>>> Our program acts as a front-end, using the LLVM C++ API to generate a
>>>>> JIT generated function. This function is primarily mathematical, so we use
>>>>> the Vector types to take advantage of SIMD instructions (as well as a few
>>>>> SSE2 intrinsics).
>>>>>
>>>>> This worked in LLVM 2.8 but started failing in 3.2 and has continued
>>>>> to fail in 3.3. It fails with no optimizations applied to the LLVM
>>>>> Function/Module. It crashes with what is reported as a memory access error
>>>>> (accessing 0xffffffff), however it's suggested that this is how the SSE
>>>>> fault raising mechanism appears.
>>>>>
>>>>> The generated instruction varies, but it seems to often be similar to
>>>>> (I don't have it in front of me, sorry):
>>>>>     movapd xmm0, xmm[ecx+0x???????]
>>>>> Where the xmm register changes, and the second parameter is a memory
>>>>> access.
>>>>> ECX is always set to 0x7ffffff - however I don't know if this is part
>>>>> of the SSE error reporting process or is part of the situation causing the
>>>>> error.
>>>>>
>>>>> I haven't worked out exactly what code path etc is causing this crash.
>>>>> I'm hoping that someone can tell me if there were any changed requirements
>>>>> for working with SIMD in LLVM 3.2 (or earlier, we haven't tried 3.0 or
>>>>> 3.1). I currently suspect the use of GlobalVariable (we first discovered
>>>>> the crash when using a feature that uses them), however I have attempted
>>>>> using setAlignment on the GlobalVariables without any change.
>>>>>
>>>>> --
>>>>> Peter N

--
~Craig
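The original report describes a fault on a movapd whose memory operand is addressed through ECX. As a minimal sketch of why that combination is suspicious: movapd requires its memory operand to be 16-byte aligned, and a base of 0x7fffffff plus any offset that is itself a multiple of 16 can never satisfy that. The helper names and the sample offsets below are hypothetical; only the ECX value comes from the thread.

```python
# Sketch: check whether an x86-32 effective address meets the 16-byte
# alignment that movapd requires for a memory operand.
# The offsets used below are illustrative, not taken from the crash.

MASK32 = 0xFFFFFFFF


def effective_address(base: int, offset: int) -> int:
    """Compute base + offset with 32-bit wraparound, as on i686."""
    return (base + offset) & MASK32


def is_aligned(address: int, alignment: int = 16) -> bool:
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


ECX_AT_CRASH = 0x7FFFFFFF  # value reported in the thread

for offset in (0x00, 0x10, 0x20):  # hypothetical, 16-byte-multiple offsets
    addr = effective_address(ECX_AT_CRASH, offset)
    print(f"ecx+{offset:#04x} = {addr:#010x} aligned={is_aligned(addr)}")
```

Under these assumptions every address ends in 0xf, so an aligned load would fault regardless of what setAlignment did to the globals; that points at the register value rather than at the data layout.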
Peter Newman
2013-Jul-19 05:45 UTC
[LLVMdev] SIMD instructions and memory alignment on X86
In the disassembly, I'm seeing three cases of

    call 76719BA1

I am assuming this is the sqrt function, as this is the only function
called in the LLVM IR.

The code at 76719BA1 is:

    76719BA1  push  ebp
    76719BA2  mov   ebp,esp
    76719BA4  sub   esp,20h
    76719BA7  and   esp,0FFFFFFF0h
    76719BAA  fld   st(0)
    76719BAC  fst   dword ptr [esp+18h]
    76719BB0  fistp qword ptr [esp+10h]
    76719BB4  fild  qword ptr [esp+10h]
    76719BB8  mov   edx,dword ptr [esp+18h]
    76719BBC  mov   eax,dword ptr [esp+10h]
    76719BC0  test  eax,eax
    76719BC2  je    76719DCF
    76719BC8  fsubp st(1),st
    76719BCA  test  edx,edx
    76719BCC  js    7671F9DB
    76719BD2  fstp  dword ptr [esp]
    76719BD5  mov   ecx,dword ptr [esp]
    76719BD8  add   ecx,7FFFFFFFh
    76719BDE  sbb   eax,0
    76719BE1  mov   edx,dword ptr [esp+14h]
    76719BE5  sbb   edx,0
    76719BE8  leave
    76719BE9  ret

As you can see at 76719BD5, it modifies ECX.

I don't know that this is the sqrtpd function (for example, I'm not seeing
any SSE instructions here?) but whatever it is, it's being called from the
IR I attached earlier, and is modifying ECX under some circumstances.

On 19/07/2013 3:29 PM, Craig Topper wrote:

> That should map directly to sqrtpd which can't modify ecx.
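One way to read the tail of the listing above is sketched here in Python. It models only the three instructions `fstp dword ptr [esp]`, `mov ecx,dword ptr [esp]` and `add ecx,7FFFFFFFh`, on the assumption that the value stored by fstp is the difference left by the preceding fsubp (the input minus its rounded integer form). The function names are hypothetical, and this is an interpretation of the disassembly rather than a confirmed description of the routine.

```python
import struct

MASK32 = 0xFFFFFFFF


def float32_bits(value: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def ecx_after_tail(residual: float) -> int:
    """Model fstp dword [esp]; mov ecx,[esp]; add ecx,7FFFFFFFh.

    residual is assumed to be the value fsubp left on the FPU stack.
    """
    ecx = float32_bits(residual)          # fstp + mov ecx,[esp]
    return (ecx + 0x7FFFFFFF) & MASK32    # add ecx,7FFFFFFFh (32-bit wrap)


# A residual of +0.0 has an all-zero bit pattern.
print(hex(ecx_after_tail(0.0)))
print(hex(ecx_after_tail(0.5)))
```

If that reading holds, a residual of +0.0 leaves exactly 0x7fffffff in ECX, which would match the value seen at the crash sites earlier in the thread.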
Craig Topper
2013-Jul-19 05:47 UTC
[LLVMdev] SIMD instructions and memory alignment on X86
Hmm, maybe SSE isn't being enabled, so it's falling back to emulating sqrt?

On Thu, Jul 18, 2013 at 10:45 PM, Peter Newman <peter at uformia.com> wrote:

> I don't know that this is the sqrtpd function (for example, I'm not seeing
> any SSE instructions here?) but whatever it is, it's being called from the
> IR I attached earlier, and is modifying ECX under some circumstances.

--
~Craig