thr3ads.net - search: "movup"

2008 Jul 12

2

[LLVMdev] Shuffle regression

...t of fibonacci.cpp to reproduce the issue. It runs fine on release 2.3 but revision 52648 fails, and I suspect that the issue is still present. 2.3 generates the following x86 code: 03A10010 push ebp 03A10011 mov ebp,esp 03A10013 and esp,0FFFFFFF0h 03A10019 movups xmm0,xmmword ptr ds:[141D280h] 03A10020 xorps xmm1,xmm1 03A10023 movaps xmm2,xmm0 03A10026 shufps xmm2,xmm1,32h 03A1002A movaps xmm1,xmm0 03A1002D shufps xmm1,xmm2,84h 03A10031 shufps xmm0,xmm1,23h 03A10035 shufps xmm1,xmm1,40h 03A100...

2008 Jul 14

5

[LLVMdev] Spilled variables using unaligned moves

...hough the stack is aligned. This seems like an optimization opportunity. The attached replacement of fibonacci.cpp generates x86 code like this: 03A70010 push ebp 03A70011 mov ebp,esp 03A70013 and esp,0FFFFFFF0h 03A70019 sub esp,1A0h ... 03A7006C movups xmmword ptr [esp+180h],xmm7 ... 03A70229 mulps xmm1,xmmword ptr [esp+180h] ... 03A70682 movups xmm0,xmmword ptr [esp+180h] Note how stores and loads use unaligned moves while it could use aligned moves. It's also interesting that the multiply does correctly assume the...
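The alignment argument in this snippet comes down to a few lines of arithmetic, sketched below. The incoming `esp` value is a made-up example, and the sketch assumes nothing else adjusts `esp` between the prologue and the spill. After the `and`, `esp` is a multiple of 16; both `1A0h` and `180h` are multiples of 16 as well, so the spill slot at `[esp+180h]` stays 16-byte aligned and an aligned move would be safe there.

```python
def is_16_byte_aligned(address: int) -> bool:
    """An address is 16-byte aligned when its low four bits are zero."""
    return address & 0xF == 0

# Hypothetical incoming stack pointer; the argument holds for any 32-bit value.
esp = 0x0012FF7C

esp &= 0xFFFFFFF0          # and esp,0FFFFFFF0h : round down to a multiple of 16
esp -= 0x1A0               # sub esp,1A0h       : 0x1A0 == 26 * 16, alignment kept
spill_slot = esp + 0x180   # [esp+180h]         : 0x180 == 24 * 16

print(hex(spill_slot), is_16_byte_aligned(spill_slot))
```

This is the same property the `mulps xmm1,xmmword ptr [esp+180h]` in the listing relies on, since an SSE arithmetic instruction with a memory operand requires that operand to be 16-byte aligned.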

2007 Oct 18

3

[LLVMdev] movaps being generated despite alignment 1 being specified

...> %vectorToDemote, <4 x float>* %externalVectorPtrCast2, align 1 ret void } Produces these instructions, which obey all the align 1 directives on the LoadInsts and StoreInsts. ... 15D10010 sub esp,2Ch 15D10013 mov eax,dword ptr [esp+34h] 15D10017 movups xmm0,xmmword ptr [eax] 15D1001A movups xmmword ptr [esp],xmm0 15D1001E mov eax,dword ptr [esp+30h] 15D10022 movups xmmword ptr [esp+10h],xmm0 15D10027 movups xmm0,xmmword ptr [esp+10h] 15D1002C movups xmmword ptr [eax],xmm0 15D1002F add es...

2008 Jul 12

0

[LLVMdev] Shuffle regression

...e issue. It > runs fine on release 2.3 but revision 52648 fails, and I suspect > that the issue is still present. > > 2.3 generates the following x86 code: > > 03A10010 push ebp > 03A10011 mov ebp,esp > 03A10013 and esp,0FFFFFFF0h > 03A10019 movups xmm0,xmmword ptr ds:[141D280h] > 03A10020 xorps xmm1,xmm1 > 03A10023 movaps xmm2,xmm0 > 03A10026 shufps xmm2,xmm1,32h > 03A1002A movaps xmm1,xmm0 > 03A1002D shufps xmm1,xmm2,84h > 03A10031 shufps xmm0,xmm1,23h > 03A10035 shufps x...

2008 Jul 14

0

[LLVMdev] Spilled variables using unaligned moves

This is on Windows / Cygwin? I think the dynamic stack pointer re-alignment doesn't happen until post-register allocation. Assuming there aren't other instructions between the prologue and the first movups that mess up esp (there shouldn't), this is indeed a bug. Please file a bug and attach a bc file. Thanks. Evan On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote: > Hi all, > > It looks like vector spills don’t use aligned moves even though the > stack is aligned. This seems...

2007 Oct 19

0

[LLVMdev] movaps being generated despite alignment 1 being specified

On Oct 18, 2007, at 1:52 PM, Chuck Rose III wrote: > > Here are the instructions for evaluateDependents. The JITter > hasn’t compiled foo yet. What’s confusing to me is why did my > movups suddenly become a movaps? All the stores and loads have > align 1 on them. Hi Chuck, I believe this is a bug but am unable to reproduce it with the test case you've provided. I should be able to see the same problem using llc since the code generator is going through all the same p...

2008 May 22

4

[LLVMdev] SSE intrinsic alignment bug?

...loat*))executionEngine->getPointerToFunction(function); func(out, in); delete executionEngine; return 0; } It generates the following assembly code: mov eax,dword ptr [esp+8] rcpps xmm0,xmmword ptr [eax] mov eax,dword ptr [esp+4] movups xmmword ptr [eax],xmm0 ret Note that even though the LoadInst is specified to have an alignment of 1 (in fact no alignment), the rcpps tries to reference the memory directly, but it expects aligned memory. If "in" happens to not be 16-byte aligned, an exception will be thrown....

2008 Jul 10

3

[LLVMdev] InstructionCombining forgets alignment of globals

...ebp 03C20031 ret All three SSE instructions will generate a fault for accessing unaligned memory. Disabling InstructionCombining gives me the following correct code: 03B10010 push ebp 03B10011 mov ebp,esp 03B10013 and esp,0FFFFFFF0h 03B10019 movups xmm0,xmmword ptr ds:[164E79Ah] 03B10020 movups xmm1,xmmword ptr ds:[164E799h] 03B10027 mulps xmm1,xmm0 03B1002A movups xmmword ptr ds:[164E799h],xmm1 03B10031 mov esp,ebp 03B10033 pop ebp 03B10034 ret Unless I'm missing something...
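A quick check of the two absolute addresses in the listing above shows why unaligned moves are needed for them. This is only a sketch; the helper name `misalignment` is made up for illustration.

```python
def misalignment(address: int, alignment: int = 16) -> int:
    """Return how many bytes `address` lies past the previous `alignment` boundary."""
    return address % alignment

# Absolute addresses taken from the movups/mulps listing in the snippet.
for address in (0x164E79A, 0x164E799):
    offset = misalignment(address)
    print(f"{address:#x}: {offset} bytes past a 16-byte boundary")
```

Neither address is a multiple of 16, so an aligned access such as `movaps` to either location would fault, which matches the fault described in the message.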

2008 May 22

2

[LLVMdev] SSE intrinsic alignment bug?

...n) the stack pointer is always 16-byte aligned; on other targets there should be code in the function prologue to force it to be aligned. On May 22, 2008, at 4:36 PM, Nicolas Capens wrote: > Small typo, for the correct assembly code I meant: > > mov eax,dword ptr [esp+8] > movups xmm0,xmmword ptr [eax] > rcpps xmm1,xmm0 > mov eax,dword ptr [esp+4] > movups xmmword ptr [eax],xmm1 > ret > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > htt...

2008 Jul 15

1

[LLVMdev] Spilled variables using unaligned moves

...at korobeynikov.info Subject: Re: [LLVMdev] Spilled variables using unaligned moves This is on Windows / Cygwin? I think the dynamic stack pointer re-alignment doesn't happen until post-register allocation. Assuming there aren't other instructions between the prologue and the first movups that mess up esp (there shouldn't), this is indeed a bug. Please file a bug and attach a bc file. Thanks. Evan On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote: Hi all, It looks like vector spills don't use aligned moves even though the stack is aligned. This seems like a...

2004 Aug 06

2

[PATCH] Make SSE Run Time option. Add Win32 SSE code

...m4, [eax+20] + mulps xmm4, xmm0 + addps xmm2, [ecx+4] + movaps xmm5, [ebx+20] + mulps xmm5, xmm1 + addps xmm4, [ecx+20] + subps xmm2, xmm3 + movups [ecx], xmm2 + subps xmm4, xmm5 + movups [ecx+16], xmm4 + + movss xmm2, [eax+36] + mulss xmm2, xmm0 + movss xmm3, [ebx+36] + mulss xmm3, xmm1 +...

2008 May 22

0

[LLVMdev] SSE intrinsic alignment bug?

Small typo, for the correct assembly code I meant: mov eax,dword ptr [esp+8] movups xmm0,xmmword ptr [eax] rcpps xmm1,xmm0 mov eax,dword ptr [esp+4] movups xmmword ptr [eax],xmm1 ret

2008 Jul 14

0

[LLVMdev] Spilled variables using unaligned moves

...ave a 16-byte aligned stack. -Chris > > The attached replacement of fibonacci.cpp generates x86 code like > this: > > 03A70010 push ebp > 03A70011 mov ebp,esp > 03A70013 and esp,0FFFFFFF0h > 03A70019 sub esp,1A0h > ... > 03A7006C movups xmmword ptr [esp+180h],xmm7 > ... > 03A70229 mulps xmm1,xmmword ptr [esp+180h] > ... > 03A70682 movups xmm0,xmmword ptr [esp+180h] > > Note how stores and loads use unaligned moves while it could use > aligned moves. It’s also interesting that the multiply...

2008 Jul 10

0

[LLVMdev] InstructionCombining forgets alignment of globals

...ebp 03C20031 ret All three SSE instructions will generate a fault for accessing unaligned memory. Disabling InstructionCombining gives me the following correct code: 03B10010 push ebp 03B10011 mov ebp,esp 03B10013 and esp,0FFFFFFF0h 03B10019 movups xmm0,xmmword ptr ds:[164E79Ah] 03B10020 movups xmm1,xmmword ptr ds:[164E799h] 03B10027 mulps xmm1,xmm0 03B1002A movups xmmword ptr ds:[164E799h],xmm1 03B10031 mov esp,ebp 03B10033 pop ebp 03B10034 ret Unless I'm missing something...

2013 Aug 22

2

New routine: FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_16

...xmm7, xmm0 + mulps xmm7, xmm2 + addps xmm6, xmm7 + movaps xmm7, xmm0 + mulps xmm7, xmm3 + mulps xmm0, xmm4 + addps xmm7, [esp] + addps xmm0, [esp + 16] + movaps [esp], xmm7 + movaps [esp + 16], xmm0 + + dec edx + jnz .loop_start +.loop_end: + ; store autoc + mov edx, [ebp + 20] ; edx == autoc + movups [edx], xmm5 + movups [edx + 16], xmm6 + movaps xmm5, [esp] + movaps xmm6, [esp + 16] + movups [edx + 32], xmm5 + movups [edx + 48], xmm6 +.end: + mov esp, ebp + pop ebp + ret + + ALIGN 16 cident FLAC__lpc_compute_autocorrelation_asm_ia32_3dnow ;[ebp + 32] autoc ;[ebp + 28] lag

2018 Nov 17

2

error: couldn't allocate input reg for constraint '{xmm0}'

Here is some zig code: pub fn setXmm0(comptime T: type, value: T) void { comptime assert(builtin.arch == builtin.Arch.x86_64); const aligned_value: T align(16) = value; asm volatile ( \\movaps (%[ptr]), %%xmm0 : : [ptr] "r" (&aligned_value) : "xmm0" ); } I want to improve this and integrate more tightly with LLVM IR,

2012 Mar 31

1

[LLVMdev] llvm.exp.f32 didn't work

...2 didn't work but sqrt works well. I implemented a function like define inlinehint float "my_exp"(float %.value) { .body: %0 = call float @llvm.exp.f32(float %.value) ret float %0 } declare float @llvm.exp.f32(float) nounwind readonly But it generates the following ASM: 00280072 movups xmm0,xmmword ptr [esp+8] 00280077 movss dword ptr [esp],xmm0 0028007C call 00000000 00280081 pop eax As you can see, line 0028007C is, I think, meant to call the CRT exp, but it calls a NULL pointer. But sqrt is right. 005000D1 movss xmm0,dword ptr [esp+0Ch] 005000D7 movss...

2012 Jun 28

0

[LLVMdev] buildbot with -vectorize

...+0200 > Tobias Grosser<tobias at grosser.es> wrote: > [..] > Also, since you're running these on an x86_64 machine, and I think they > don't have unaligned vector load/stores, you should probably add -mllvm > -bb-vectorize-aligned-only to the target flags. What about MOVUPS and MOVUPD? Tobi

2012 Jun 28

1

[LLVMdev] buildbot with -vectorize

...rosser.es> wrote: > > > [..] > > > Also, since you're running these on an x86_64 machine, and I think > > they don't have unaligned vector load/stores, you should probably > > add -mllvm -bb-vectorize-aligned-only to the target flags. > > What about MOVUPS and MOVUPD? Good point. Never mind. I suppose those can be used for the integer vectors too. Thanks again, Hal > > Tobi -- Hal Finkel Postdoctoral Appointee Leadership Computing Facility Argonne National Laboratory

2013 Feb 19

2

[LLVMdev] Is it a bug or am I missing something ?

....globl sample_test .align 16, 0x90 .type sample_test,@function sample_test: # @sample_test # BB#0: # %L.entry movl 4(%esp), %eax movss 304(%eax), %xmm0 xorps %xmm1, %xmm1 movl 8(%esp), %eax movups %xmm1, 624(%eax) pshufd $65, %xmm0, %xmm0 # xmm0 = xmm0[1,0,0,1] movdqu %xmm0, 608(%eax) ret .Ltmp0: .size sample_test, .Ltmp0-sample_test .section ".note.GNU-stack","",@progbits It seems to me that this sequence of instruction i...
