Displaying 20 results from an estimated 36 matches for "xmmword".

2013 Jul 19
0
[LLVMdev] llvm.x86.sse2.sqrt.pd not using sqrtpd, calling a function that modifies ECX
...02E00D1 mov ebp,esp 002E00D3 push ebx 002E00D4 push edi 002E00D5 push esi 002E00D6 and esp,0FFFFFFF0h 002E00DC sub esp,110h 002E00E2 mov eax,dword ptr [ebp+8] 002E00E5 movddup xmm0,mmword ptr [eax+10h] 002E00EA movapd xmmword ptr [esp+80h],xmm0 002E00F3 movddup xmm0,mmword ptr [eax+8] 002E00F8 movapd xmmword ptr [esp+70h],xmm0 002E00FE movddup xmm0,mmword ptr [eax] 002E0102 movapd xmmword ptr [esp+60h],xmm0 002E0108 xorpd xmm0,xmm0 002E010C movapd xmmword ptr [esp+0C0h],xmm0 002...
2013 Jul 19
4
[LLVMdev] SIMD instructions and memory alignment on X86
Hmm, I'm not able to get those .ll files to compile if I disable SSE and I end up with SSE instructions (including sqrtpd) if I don't disable it. On Thu, Jul 18, 2013 at 10:53 PM, Peter Newman <peter at uformia.com> wrote: > Is there something specifically required to enable SSE? If it's not > detected as available (based from the target triple?) then I don't think
2004 Aug 06
2
Notes on 1.1.4 Windows. Testing of SSE Intrinics Code and others
...Compute next filter result */ 259: xx = _mm_load_ps1(x+i); 00413483 mov eax,dword ptr [ebp-64h] 00413486 mov ecx,dword ptr [ebx+8] 00413489 lea edx,[ecx+eax*4] 0041348C movss xmm0,dword ptr [edx] 00413490 shufps xmm0,xmm0,0 00413494 movaps xmmword ptr [xx],xmm0 260: yy = _mm_add_ss(xx, mem[0]); 00413498 movaps xmm0,xmmword ptr [ebp-60h] 0041349C movaps xmm1,xmmword ptr [xx] 004134A0 addss xmm1,xmm0 004134A4 movaps xmmword ptr [yy],xmm1 261: _mm_store_ss(y+i, yy); 004134AB movaps xmm0,xmmword...
2008 Jul 10
3
[LLVMdev] InstructionCombining forgets alignment of globals
Hi all, The InstructionCombining pass causes alignment of globals to be ignored. I've attached a replacement of Fibonacci.cpp which reproduces this (I used 2.3 release). Here's the x86 code it produces: 03C20019 movaps xmm0,xmmword ptr ds:[164E799h] 03C20020 mulps xmm0,xmmword ptr ds:[164E79Ah] 03C20027 movaps xmmword ptr ds:[164E799h],xmm0 03C2002E mov esp,ebp 03C20030 pop ebp 03C20031 ret All three SSE instructions will generate a fault for accessing unaligned memory....
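A note on the fault described in the result above: movaps, and SSE arithmetic instructions with a memory operand such as mulps, require the address to be a multiple of 16, and neither 164E799h nor 164E79Ah is. A minimal C sketch of that check follows; the helper names is_aligned16 and align_down16 are hypothetical and do not come from the thread or from LLVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of 16, i.e. its low four bits are zero.
   This is the condition movaps and SSE memory operands rely on. */
bool is_aligned16(uintptr_t addr) {
    return (addr & (uintptr_t)0xF) == 0;
}

/* Round addr down to the nearest 16-byte boundary. This mirrors what
   `and esp,0FFFFFFF0h` does to the stack pointer in a 32-bit prologue. */
uintptr_t align_down16(uintptr_t addr) {
    uintptr_t aligned = addr & ~(uintptr_t)0xF;
    assert(is_aligned16(aligned)); /* postcondition of the mask */
    return aligned;
}
```

Under these assumptions, is_aligned16(0x164E799) is false because the low nibble is 9, and align_down16(0x164E799) yields 0x164E790.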
2008 Jul 14
5
[LLVMdev] Spilled variables using unaligned moves
...ack is aligned. This seems like an optimization opportunity. The attached replacement of fibonacci.cpp generates x86 code like this: 03A70010 push ebp 03A70011 mov ebp,esp 03A70013 and esp,0FFFFFFF0h 03A70019 sub esp,1A0h ... 03A7006C movups xmmword ptr [esp+180h],xmm7 ... 03A70229 mulps xmm1,xmmword ptr [esp+180h] ... 03A70682 movups xmm0,xmmword ptr [esp+180h] Note how stores and loads use unaligned moves while it could use aligned moves. It's also interesting that the multiply does correctly assume the stack to be 1...
2008 Jul 12
2
[LLVMdev] Shuffle regression
...p to reproduce the issue. It runs fine on release 2.3 but revision 52648 fails, and I suspect that the issue is still present. 2.3 generates the following x86 code: 03A10010 push ebp 03A10011 mov ebp,esp 03A10013 and esp,0FFFFFFF0h 03A10019 movups xmm0,xmmword ptr ds:[141D280h] 03A10020 xorps xmm1,xmm1 03A10023 movaps xmm2,xmm0 03A10026 shufps xmm2,xmm1,32h 03A1002A movaps xmm1,xmm0 03A1002D shufps xmm1,xmm2,84h 03A10031 shufps xmm0,xmm1,23h 03A10035 shufps xmm1,xmm1,40h 03A10039 shufps xmm...
2007 Oct 18
3
[LLVMdev] movaps being generated despite alignment 1 being specified
...e, <4 x float>* %externalVectorPtrCast2, align 1 ret void } Produces these instructions, which obey all the align 1 directives on the LoadInsts and StoreInsts. ... 15D10010 sub esp,2Ch 15D10013 mov eax,dword ptr [esp+34h] 15D10017 movups xmm0,xmmword ptr [eax] 15D1001A movups xmmword ptr [esp],xmm0 15D1001E mov eax,dword ptr [esp+30h] 15D10022 movups xmmword ptr [esp+10h],xmm0 15D10027 movups xmm0,xmmword ptr [esp+10h] 15D1002C movups xmmword ptr [eax],xmm0 15D1002F add esp,2Ch 15D10032 r...
2008 Jul 12
0
[LLVMdev] Shuffle regression
...; runs fine on release 2.3 but revision 52648 fails, and I suspect > that the issue is still present. > > 2.3 generates the following x86 code: > > 03A10010 push ebp > 03A10011 mov ebp,esp > 03A10013 and esp,0FFFFFFF0h > 03A10019 movups xmm0,xmmword ptr ds:[141D280h] > 03A10020 xorps xmm1,xmm1 > 03A10023 movaps xmm2,xmm0 > 03A10026 shufps xmm2,xmm1,32h > 03A1002A movaps xmm1,xmm0 > 03A1002D shufps xmm1,xmm2,84h > 03A10031 shufps xmm0,xmm1,23h > 03A10035 shufps xmm1,xmm1,40h > 0...
2008 May 22
4
[LLVMdev] SSE intrinsic alignment bug?
...void (*func)(float*,float*) = (void(*)(float*,float*))executionEngine->getPointerToFunction(function); func(out, in); delete executionEngine; return 0; } It generates the following assembly code: mov eax,dword ptr [esp+8] rcpps xmm0,xmmword ptr [eax] mov eax,dword ptr [esp+4] movups xmmword ptr [eax],xmm0 ret Note that even though the LoadInst is specified to have an alignment of 1 (in fact no alignment), the rcpps tries to reference the memory directly, but it expects aligned memory. If "in" happens t...
2008 Jul 14
0
[LLVMdev] Spilled variables using unaligned moves
...optimization opportunity. > > The attached replacement of fibonacci.cpp generates x86 code like > this: > > 03A70010 push ebp > 03A70011 mov ebp,esp > 03A70013 and esp,0FFFFFFF0h > 03A70019 sub esp,1A0h > ... > 03A7006C movups xmmword ptr [esp+180h],xmm7 > ... > 03A70229 mulps xmm1,xmmword ptr [esp+180h] > ... > 03A70682 movups xmm0,xmmword ptr [esp+180h] > > Note how stores and loads use unaligned moves while it could use > aligned moves. It’s also interesting that the multiply does > co...
2008 Jul 10
0
[LLVMdev] InstructionCombining forgets alignment of globals
...ructionCombining forgets alignment of globals Hi all, The InstructionCombining pass causes alignment of globals to be ignored. I've attached a replacement of Fibonacci.cpp which reproduces this (I used 2.3 release). Here's the x86 code it produces: 03C20019 movaps xmm0,xmmword ptr ds:[164E799h] 03C20020 mulps xmm0,xmmword ptr ds:[164E79Ah] 03C20027 movaps xmmword ptr ds:[164E799h],xmm0 03C2002E mov esp,ebp 03C20030 pop ebp 03C20031 ret All three SSE instructions will generate a fault for accessing unaligned memory....
2008 Jul 14
0
[LLVMdev] Spilled variables using unaligned moves
...e aligned stack. -Chris > > The attached replacement of fibonacci.cpp generates x86 code like > this: > > 03A70010 push ebp > 03A70011 mov ebp,esp > 03A70013 and esp,0FFFFFFF0h > 03A70019 sub esp,1A0h > ... > 03A7006C movups xmmword ptr [esp+180h],xmm7 > ... > 03A70229 mulps xmm1,xmmword ptr [esp+180h] > ... > 03A70682 movups xmm0,xmmword ptr [esp+180h] > > Note how stores and loads use unaligned moves while it could use > aligned moves. It’s also interesting that the multiply does > co...
2007 Oct 19
0
[LLVMdev] movaps being generated despite alignment 1 being specified
...ld be able to set a break point somewhere in ExecutionEngine.cpp / JIT.cpp and just dump out the bitcode with Module->dump() / print(). Evan > > > … > > 15D10012 sub esp,4Ch > > 15D10015 mov eax,dword ptr [esp+60h] > > 15D10019 movups xmm0,xmmword ptr [eax] > > 15D1001C movaps xmmword ptr [esp+8],xmm0 <- why did this > become a movaps? > > 15D10021 movups xmmword ptr [esp+28h],xmm0 > > 15D10026 mov esi,dword ptr [esp+58h] > > 15D1002A mov edi,dword ptr [esp+5Ch] > > 15D1002E...
2008 Jul 15
1
[LLVMdev] Spilled variables using unaligned moves
...stack is aligned. This seems like an optimization opportunity. The attached replacement of fibonacci.cpp generates x86 code like this: 03A70010 push ebp 03A70011 mov ebp,esp 03A70013 and esp,0FFFFFFF0h 03A70019 sub esp,1A0h ... 03A7006C movups xmmword ptr [esp+180h],xmm7 ... 03A70229 mulps xmm1,xmmword ptr [esp+180h] ... 03A70682 movups xmm0,xmmword ptr [esp+180h] Note how stores and loads use unaligned moves while it could use aligned moves. It's also interesting that the multiply does correctly assume the stack to be 1...
2009 Jun 30
2
[LLVMdev] JIT on Windows x64
...tly broken but am attempting to use the hack/patch proposed in this bug http://llvm.org/bugs/show_bug.cgi?id=3739. I checked out the revision the patch was created for (66183) and applied it but the assembler generated seems to fail whenever it reaches a movaps instruction, e.g. movaps xmmword ptr [rsp+20h],xmm8 movaps xmmword ptr [rsp+30h],xmm7 movaps xmmword ptr [rsp+40h],xmm6 Would this have something to do with the stack alignment? I am wondering if anybody else has had any success using that patch to get Windows x64 JIT to work correctly. Or if my problem may be un...
2020 Aug 30
5
BUG: complete misunterstanding of the MS-ABI
...f __uint128_t __udivmodti4(__uint128_t dividend, __uint128_t divisor, __uint128_t *remainder) { if (remainder != 0) *remainder = divisor; return dividend; } --- EOF --- clang -c -O1 generates the following INCOMPATIBLE and WRONG code: __udivmodti4 proc public movaps xmm0, xmmword ptr [rcx] test r8, r8 jz 0f movaps xmm1, xmmword ptr [rdx] movaps xmmword ptr [r8], xmm1 0: ret __udivmodti4 endp clang's misunderstanding of the MS-ABI can be clearly seen here: - RCX holds the address of the return value, NOT the address of...
2008 May 22
0
[LLVMdev] SSE intrinsic alignment bug?
Small typo, for the correct assembly code I meant: mov eax,dword ptr [esp+8] movups xmm0,xmmword ptr [eax] rcpps xmm1,xmm0 mov eax,dword ptr [esp+4] movups xmmword ptr [eax],xmm1 ret
2008 May 22
2
[LLVMdev] SSE intrinsic alignment bug?
...ter is always 16-byte aligned; on other targets there should be code in the function prologue to force it to be aligned. On May 22, 2008, at 4:36 PM, Nicolas Capens wrote: > Small typo, for the correct assembly code I meant: > > mov eax,dword ptr [esp+8] > movups xmm0,xmmword ptr [eax] > rcpps xmm1,xmm0 > mov eax,dword ptr [esp+4] > movups xmmword ptr [eax],xmm1 > ret
2012 Mar 31
1
[LLVMdev] llvm.exp.f32 didn't work
...but sqrt works well. I implemented a function like define inlinehint float "my_exp"(float %.value) { .body: %0 = call float @llvm.exp.f32(float %.value) ret float %0 } declare float @llvm.exp.f32(float) nounwind readonly But it generates the following ASM: 00280072 movups xmm0,xmmword ptr [esp+8] 00280077 movss dword ptr [esp],xmm0 0028007C call 00000000 00280081 pop eax As you can see, line 0028007C will call CRT exp, I think, but it calls a NULL pointer. But sqrt is right. 005000D1 movss xmm0,dword ptr [esp+0Ch] 005000D7 movss dword ptr [esp]...
2015 Mar 09
2
crash on lpc_restore_signal_16_intrin_sse2
On 9.3.2015 20:43, lvqcl wrote: > Janne Hyvärinen wrote: > >> VLC 2.2.0 crashed with exception 0xc0000005 on the first file I tried. >> But libflac itself does not, for example flac.exe and foobar2000 have no >> issues. > *Very* interesting. > > I suspect that flac.exe and foobar2000 don't use > FLAC__lpc_restore_signal_16_intrin_sse2() function at all. This