Displaying 20 results from an estimated 54 matches for "movup".
2008 Jul 12
2
[LLVMdev] Shuffle regression
...t of fibonacci.cpp to reproduce the issue. It runs fine on release
2.3 but revision 52648 fails, and I suspect that the issue is still present.
2.3 generates the following x86 code:
03A10010 push ebp
03A10011 mov ebp,esp
03A10013 and esp,0FFFFFFF0h
03A10019 movups xmm0,xmmword ptr ds:[141D280h]
03A10020 xorps xmm1,xmm1
03A10023 movaps xmm2,xmm0
03A10026 shufps xmm2,xmm1,32h
03A1002A movaps xmm1,xmm0
03A1002D shufps xmm1,xmm2,84h
03A10031 shufps xmm0,xmm1,23h
03A10035 shufps xmm1,xmm1,40h
03A100...
2008 Jul 14
5
[LLVMdev] Spilled variables using unaligned moves
...hough the stack is
aligned. This seems like an optimization opportunity.
The attached replacement of fibonacci.cpp generates x86 code like this:
03A70010 push ebp
03A70011 mov ebp,esp
03A70013 and esp,0FFFFFFF0h
03A70019 sub esp,1A0h
...
03A7006C movups xmmword ptr [esp+180h],xmm7
...
03A70229 mulps xmm1,xmmword ptr [esp+180h]
...
03A70682 movups xmm0,xmmword ptr [esp+180h]
Note how stores and loads use unaligned moves while it could use aligned
moves. It's also interesting that the multiply does correctly assume the...
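The alignment claim in this snippet can be checked with plain address arithmetic. The Python sketch below is only an illustration: the incoming value of esp is an arbitrary assumption, and the helper names `esp_after_prologue` and `spill_slot_is_aligned` are hypothetical. It models the prologue shown above (push ebp; mov ebp,esp; and esp,0FFFFFFF0h; sub esp,1A0h) and tests whether the spill slot at [esp+180h] lands on a 16-byte boundary.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide in this listing


def esp_after_prologue(entry_esp: int) -> int:
    """Model the effect of the prologue on esp (assumed entry value)."""
    esp = (entry_esp - 4) & MASK32   # push ebp
    esp &= 0xFFFFFFF0                # and esp,0FFFFFFF0h  (round down to 16)
    esp = (esp - 0x1A0) & MASK32     # sub esp,1A0h        (0x1A0 is a multiple of 16)
    return esp


def spill_slot_is_aligned(entry_esp: int, offset: int = 0x180) -> bool:
    """True if [esp+offset] is 16-byte aligned after the prologue."""
    return (esp_after_prologue(entry_esp) + offset) % 16 == 0


# Try every possible misalignment of the incoming stack pointer.
print(all(spill_slot_is_aligned(0x0012FF00 + k) for k in range(16)))
```

Because both 1A0h and 180h are multiples of 16, the slot stays aligned for any incoming esp, which is the basis for the remark that aligned moves could have been used for the spill and reload.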
2007 Oct 18
3
[LLVMdev] movaps being generated despite alignment 1 being specified
...t; %vectorToDemote, <4 x float>*
%externalVectorPtrCast2, align 1
ret void
}
This produces the following instructions, which obey all the align 1 directives on
the LoadInsts and StoreInsts:
...
15D10010 sub esp,2Ch
15D10013 mov eax,dword ptr [esp+34h]
15D10017 movups xmm0,xmmword ptr [eax]
15D1001A movups xmmword ptr [esp],xmm0
15D1001E mov eax,dword ptr [esp+30h]
15D10022 movups xmmword ptr [esp+10h],xmm0
15D10027 movups xmm0,xmmword ptr [esp+10h]
15D1002C movups xmmword ptr [eax],xmm0
15D1002F add es...
2008 Jul 12
0
[LLVMdev] Shuffle regression
...e issue. It
> runs fine on release 2.3 but revision 52648 fails, and I suspect
> that the issue is still present.
>
> 2.3 generates the following x86 code:
>
> 03A10010 push ebp
> 03A10011 mov ebp,esp
> 03A10013 and esp,0FFFFFFF0h
> 03A10019 movups xmm0,xmmword ptr ds:[141D280h]
> 03A10020 xorps xmm1,xmm1
> 03A10023 movaps xmm2,xmm0
> 03A10026 shufps xmm2,xmm1,32h
> 03A1002A movaps xmm1,xmm0
> 03A1002D shufps xmm1,xmm2,84h
> 03A10031 shufps xmm0,xmm1,23h
> 03A10035 shufps x...
2008 Jul 14
0
[LLVMdev] Spilled variables using unaligned moves
This is on Windows / Cygwin? I think the dynamic stack pointer
re-alignment doesn't happen until post-register allocation.
Assuming there aren't other instructions between the prologue and the
first movups that mess up esp (there shouldn't), this is indeed a bug.
Please file a bug and attach a bc file. Thanks.
Evan
On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote:
> Hi all,
>
> It looks like vector spills don’t use aligned moves even though the
> stack is aligned. This seems...
2007 Oct 19
0
[LLVMdev] movaps being generated despite alignment 1 being specified
On Oct 18, 2007, at 1:52 PM, Chuck Rose III wrote:
>
> Here are the instructions for evaluateDependents. The JITter
> hasn’t compiled foo yet. What’s confusing to me is why did my
> movups suddenly become a movaps? All the stores and loads have
> align 1 on them.
Hi Chuck,
I believe this is a bug but am unable to reproduce it with the test
case you've provided. I should be able to see the same problem using
llc since the code generator is going through all the same p...
2008 May 22
4
[LLVMdev] SSE intrinsic alignment bug?
...loat*))executionEngine->getPointerToFunction(function);
func(out, in);
delete executionEngine;
return 0;
}
It generates the following assembly code:
mov eax,dword ptr [esp+8]
rcpps xmm0,xmmword ptr [eax]
mov eax,dword ptr [esp+4]
movups xmmword ptr [eax],xmm0
ret
Note that even though the LoadInst is specified to have an alignment of 1
(in fact no alignment), the rcpps tries to reference the memory directly,
but it expects aligned memory. If "in" happens to not be 16-byte aligned, an
exception will be thrown....
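The constraint described in this message can be written as a small decision function. The sketch below is not LLVM's instruction selection; `emit_rcpps` and its parameters are hypothetical names used only for illustration. It encodes the rule that a legacy SSE instruction such as rcpps needs a 16-byte aligned memory operand, so the load can be folded into the instruction only when its known alignment is at least 16. Otherwise a separate movups is needed.

```python
def emit_rcpps(load_align: int, src_reg: str = "eax") -> list:
    """Return an instruction sequence for rcpps on a vector loaded from [src_reg].

    load_align is the alignment (in bytes) known for the load.
    """
    if load_align >= 16:
        # Safe to fold the load into the memory operand.
        return [f"rcpps xmm0,xmmword ptr [{src_reg}]"]
    # Unknown or small alignment: load with movups first, then operate on registers.
    return [
        f"movups xmm0,xmmword ptr [{src_reg}]",
        "rcpps xmm1,xmm0",
    ]


for align in (1, 16):
    print(align, emit_rcpps(align))
```

With an alignment of 1, the function yields the movups plus register-to-register rcpps pair, which avoids the fault when "in" is not 16-byte aligned.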
2008 Jul 10
3
[LLVMdev] InstructionCombining forgets alignment of globals
...ebp
03C20031 ret
All three SSE instructions will generate a fault for accessing unaligned
memory. Disabling InstructionCombining gives me the following correct code:
03B10010 push ebp
03B10011 mov ebp,esp
03B10013 and esp,0FFFFFFF0h
03B10019 movups xmm0,xmmword ptr ds:[164E79Ah]
03B10020 movups xmm1,xmmword ptr ds:[164E799h]
03B10027 mulps xmm1,xmm0
03B1002A movups xmmword ptr ds:[164E799h],xmm1
03B10031 mov esp,ebp
03B10033 pop ebp
03B10034 ret
Unless I'm missing something...
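A quick check of the absolute addresses in the listing above shows why aligned accesses to these globals would fault. The Python sketch below only redoes that arithmetic; the helper name `misalignment` is hypothetical.

```python
def misalignment(addr: int, boundary: int = 16) -> int:
    """Number of bytes by which addr exceeds the previous multiple of boundary."""
    return addr % boundary


# Addresses taken from the movups operands in the listing.
for addr in (0x164E79A, 0x164E799):
    print(hex(addr), misalignment(addr))
```

Both remainders are non-zero (10 and 9), so neither address sits on a 16-byte boundary and only the unaligned movups form is safe here.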
2008 May 22
2
[LLVMdev] SSE intrinsic alignment bug?
...n) the stack pointer is always 16-byte aligned; on
other targets there should be code in the function prologue to force
it to be aligned.
On May 22, 2008, at 4:36 PM, Nicolas Capens wrote:
> Small typo, for the correct assembly code I meant:
>
> mov eax,dword ptr [esp+8]
> movups xmm0,xmmword ptr [eax]
> rcpps xmm1,xmm0
> mov eax,dword ptr [esp+4]
> movups xmmword ptr [eax],xmm1
> ret
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> htt...
2008 Jul 15
1
[LLVMdev] Spilled variables using unaligned moves
...at korobeynikov.info
Subject: Re: [LLVMdev] Spilled variables using unaligned moves
This is on Windows / Cygwin? I think the dynamic stack pointer re-alignment
doesn't happen until post-register allocation.
Assuming there aren't other instructions between the prologue and the first
movups that mess up esp (there shouldn't), this is indeed a bug. Please file
a bug and attach a bc file. Thanks.
Evan
On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote:
Hi all,
It looks like vector spills don't use aligned moves even though the stack is
aligned. This seems like a...
2004 Aug 06
2
[PATCH] Make SSE Run Time option. Add Win32 SSE code
...m4, [eax+20]
+ mulps xmm4, xmm0
+ addps xmm2, [ecx+4]
+ movaps xmm5, [ebx+20]
+ mulps xmm5, xmm1
+ addps xmm4, [ecx+20]
+ subps xmm2, xmm3
+ movups [ecx], xmm2
+ subps xmm4, xmm5
+ movups [ecx+16], xmm4
+
+ movss xmm2, [eax+36]
+ mulss xmm2, xmm0
+ movss xmm3, [ebx+36]
+ mulss xmm3, xmm1
+...
2008 May 22
0
[LLVMdev] SSE intrinsic alignment bug?
Small typo, for the correct assembly code I meant:
mov eax,dword ptr [esp+8]
movups xmm0,xmmword ptr [eax]
rcpps xmm1,xmm0
mov eax,dword ptr [esp+4]
movups xmmword ptr [eax],xmm1
ret
2008 Jul 14
0
[LLVMdev] Spilled variables using unaligned moves
...ave a 16-byte aligned stack.
-Chris
>
> The attached replacement of fibonacci.cpp generates x86 code like
> this:
>
> 03A70010 push ebp
> 03A70011 mov ebp,esp
> 03A70013 and esp,0FFFFFFF0h
> 03A70019 sub esp,1A0h
> ...
> 03A7006C movups xmmword ptr [esp+180h],xmm7
> ...
> 03A70229 mulps xmm1,xmmword ptr [esp+180h]
> ...
> 03A70682 movups xmm0,xmmword ptr [esp+180h]
>
> Note how stores and loads use unaligned moves while it could use
> aligned moves. It’s also interesting that the multiply...
2008 Jul 10
0
[LLVMdev] InstructionCombining forgets alignment of globals
...ebp
03C20031 ret
All three SSE instructions will generate a fault for accessing unaligned
memory. Disabling InstructionCombining gives me the following correct code:
03B10010 push ebp
03B10011 mov ebp,esp
03B10013 and esp,0FFFFFFF0h
03B10019 movups xmm0,xmmword ptr ds:[164E79Ah]
03B10020 movups xmm1,xmmword ptr ds:[164E799h]
03B10027 mulps xmm1,xmm0
03B1002A movups xmmword ptr ds:[164E799h],xmm1
03B10031 mov esp,ebp
03B10033 pop ebp
03B10034 ret
Unless I'm missing something...
2013 Aug 22
2
New routine: FLAC__lpc_compute_autocorrelation_asm_ia32_sse_lag_16
...xmm7, xmm0
+ mulps xmm7, xmm2
+ addps xmm6, xmm7
+ movaps xmm7, xmm0
+ mulps xmm7, xmm3
+ mulps xmm0, xmm4
+ addps xmm7, [esp]
+ addps xmm0, [esp + 16]
+ movaps [esp], xmm7
+ movaps [esp + 16], xmm0
+
+ dec edx
+ jnz .loop_start
+.loop_end:
+ ; store autoc
+ mov edx, [ebp + 20] ; edx == autoc
+ movups [edx], xmm5
+ movups [edx + 16], xmm6
+ movaps xmm5, [esp]
+ movaps xmm6, [esp + 16]
+ movups [edx + 32], xmm5
+ movups [edx + 48], xmm6
+.end:
+ mov esp, ebp
+ pop ebp
+ ret
+
+ ALIGN 16
cident FLAC__lpc_compute_autocorrelation_asm_ia32_3dnow
;[ebp + 32] autoc
;[ebp + 28] lag
2018 Nov 17
2
error: couldn't allocate input reg for constraint '{xmm0}'
Here is some zig code:
pub fn setXmm0(comptime T: type, value: T) void {
comptime assert(builtin.arch == builtin.Arch.x86_64);
const aligned_value: T align(16) = value;
asm volatile (
\\movaps (%[ptr]), %%xmm0
:
: [ptr] "r" (&aligned_value)
: "xmm0"
);
}
I want to improve this and integrate more tightly with LLVM IR,
2012 Mar 31
1
[LLVMdev] llvm.exp.f32 didn't work
...2 didn't work but sqrt works well.
I implemented a function like
define inlinehint float "my_exp"(float %.value) {
.body:
%0 = call float @llvm.exp.f32(float %.value)
ret float %0
}
declare float @llvm.exp.f32(float) nounwind readonly
But it generates the following ASM:
00280072 movups xmm0,xmmword ptr [esp+8]
00280077 movss dword ptr [esp],xmm0
0028007C call 00000000
00280081 pop eax
As you can see, the call at 0028007C should call the CRT exp, I think, but it calls a NULL
pointer.
But sqrt is right.
005000D1 movss xmm0,dword ptr [esp+0Ch]
005000D7 movss...
2012 Jun 28
0
[LLVMdev] buildbot with -vectorize
...+0200
> Tobias Grosser<tobias at grosser.es> wrote:
>
[..]
> Also, since you're running these on an x86_64 machine, and I think they
> don't have unaligned vector load/stores, you should probably add -mllvm
> -bb-vectorize-aligned-only to the target flags.
What about MOVUPS and MOVUPD?
Tobi
2012 Jun 28
1
[LLVMdev] buildbot with -vectorize
...rosser.es> wrote:
> >
> [..]
>
> > Also, since you're running these on an x86_64 machine, and I think
> > they don't have unaligned vector load/stores, you should probably
> > add -mllvm -bb-vectorize-aligned-only to the target flags.
>
> What about MOVUPS and MOVUPD?
Good point. Never mind. I suppose those can be used for the integer
vectors too.
Thanks again,
Hal
>
> Tobi
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
2013 Feb 19
2
[LLVMdev] Is it a bug or am I missing something ?
....globl sample_test
.align 16, 0x90
.type sample_test,@function
sample_test: # @sample_test
# BB#0: # %L.entry
movl 4(%esp), %eax
movss 304(%eax), %xmm0
xorps %xmm1, %xmm1
movl 8(%esp), %eax
movups %xmm1, 624(%eax)
pshufd $65, %xmm0, %xmm0 # xmm0 = xmm0[1,0,0,1]
movdqu %xmm0, 608(%eax)
ret
.Ltmp0:
.size sample_test, .Ltmp0-sample_test
.section ".note.GNU-stack","",@progbits
It seems to me that this sequence of instruction i...