Displaying 20 results from an estimated 35 matches for "xmm14".
2014 Sep 05
3
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...A sample here:
>
> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>
> We'll continue to look into this and do additional testing.
>...
2012 Jul 06
0
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
...n't
> immediately obvious, trunk generates more instructions and spills more
> registers compared to 3.1.
>
Actually, here is an occurrence of that behavior when compiling the
code with trunk:
[...]
movaps %xmm1, %xmm0 ### xmm1 mov'ed to xmm0
movaps %xmm1, %xmm14 ### xmm1 mov'ed to xmm14
addps %xmm7, %xmm0
movaps %xmm7, %xmm13
movaps %xmm0, %xmm1 ### and now other data is mov'ed into xmm1, making one of the above movaps superfluous
[...]
amb
2014 Sep 04
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Greetings all,
As you may have noticed, there is a new vector shuffle lowering path in the
X86 backend. You can try it out with the
'-x86-experimental-vector-shuffle-lowering' flag to llc, or '-mllvm
-x86-experimental-vector-shuffle-lowering' to clang. Please test it out!
There may be some correctness bugs; I'm still fuzz testing it to shake them
out. But I expect fairly few
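A minimal sketch (not from the thread) of how one might exercise the new path from C; the file name, function name, and vector typedef below are made up for illustration, but the flags are the ones quoted above:

    /* shuffle.c -- hypothetical test case */
    typedef float v4f32 __attribute__((vector_size(16)));

    /* Take lane 0 from b and lanes 1-3 from a -- the same kind of blend the
       vinsertps sequences quoted in these threads implement. */
    v4f32 blend_low_lane(v4f32 a, v4f32 b) {
        return __builtin_shufflevector(a, b, 4, 1, 2, 3);
    }

    /* Compare the two lowerings, assuming a recent clang:
         clang -O2 -S shuffle.c -o default.s
         clang -O2 -S -mllvm -x86-experimental-vector-shuffle-lowering shuffle.c -o experimental.s */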
2012 Jul 06
2
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
>
>> I've noticed that LLVM tends to generate suboptimal code and spill an
>> excessive amount of registers in large functions, such as in those
>> that are automatically generated by FFTW.
>
2014 Sep 05
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...t;> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>
>>> We'll con...
2010 Oct 20
2
[LLVMdev] llvm register reload/spilling around calls
...call instructions are all prefixed
> by:
>
> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>
> This is the fixed list of call-clobbered registers. It should really
> be controlled by the calling convention of the called function
> instead.
>
> The WINCALL* instructions only exist because of this.
Ahh I see now. I hacked this up and indeed the code loo...
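A minimal sketch (not from the thread) of the underlying issue: because every XMM register appears in that Defs list, they are all treated as call-clobbered (matching the SysV x86-64 ABI, where no vector registers are callee-saved), so any vector value live across a call has to be spilled and reloaded around it. The names below are hypothetical:

    /* spill_around_call.c -- hypothetical example */
    typedef float v4f32 __attribute__((vector_size(16)));

    void external_helper(void);   /* assumed opaque external call */

    v4f32 live_across_call(v4f32 v) {
        external_helper();        /* v cannot stay in an XMM register across this
                                     call, so the compiler spills it to the stack */
        return v + v;
    }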
2014 Sep 06
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...>>
>> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>
>> We'll continue to look into this and d...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...all prefixed
>> by:
>>
>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>>
>> This is the fixed list of call-clobbered registers. It should really
>> be controlled by the calling convention of the called function
>> instead.
>>
>> The WINCALL* instructions only exist because of this.
> Ahh I see now. I hacked th...
2010 Oct 20
1
[LLVMdev] llvm register reload/spilling around calls
...t; by:
>>>
>>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
>>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>>>
>>> This is the fixed list of call-clobbered registers. It should really
>>> be controlled by the calling convention of the called function
>>> instead.
>>>
>>> The WINCALL* instructions only exist because of this.
>>...
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]...
2017 Oct 11
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
...ocks\operation:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \TMP1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \TMP1
PSHUFB_XMM \TMP1, %xmm\i
_get_AAD_rest_final\num_initial_blocks\operation:
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
@@ -584,7 +585,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \TMP1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \TMP1
PSHUFB...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...refixed by:
let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,
FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, ST1,
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7,
XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
This is the fixed list of call-clobbered registers. It should really be controlled by the calling convention of the called function instead.
The WINCALL* instructions only exist because of this.
One problem is that calling conventions are handled while building the selection DAG...
2010 Oct 20
3
[LLVMdev] llvm register reload/spilling around calls
Thanks for giving it a look!
On 19.10.2010 23:21, Jakob Stoklund Olesen wrote:
> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote:
>
>> So I saw that the code is doing lots of register
>> spilling/reloading. Now I understand that due to calling
>> conventions, there's not really a way to avoid this - I tried using
>> coldcc but apparently the backend
2008 Sep 03
2
[LLVMdev] Codegen/Register allocation question.
...MM3<imp-def,dead>, %XMM4<imp-def,dead>,
%XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
%XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
%XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
%XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EFLAGS<imp-def,dead>,
%EAX<imp-def>, %ECX<imp-def,dead>, %EDI<imp-def,dead>,
%EDX<imp-def,dead>, %ESI<imp-def,dead>
108 ADJCALLSTACKUP 0, 0, %ESP<imp-def>, %EFLAGS<imp-def,dead>, %ESP<imp-use>...
2008 Sep 04
0
[LLVMdev] Codegen/Register allocation question.
...>, %XMM4<imp-def,dead>,
> %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
> %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
> %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
> %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EFLAGS<imp-def,dead>,
> %EAX<imp-def>, %ECX<imp-def,dead>, %EDI<imp-def,dead>,
> %EDX<imp-def,dead>, %ESI<imp-def,dead>
> 108 ADJCALLSTACKUP 0, 0, %ESP<imp-def>, %EFLAGS<imp-def,dead>,
&...
2014 Sep 09
5
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...t;> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>
>>> We'll cont...
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote:
>
> On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
>
>> I'll explain what we see in the code.
>> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
>> YMMs are not in the set, so the caller does not save them.
>
> This is not how the register allocator
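A minimal sketch (not from the thread), assuming an AVX-capable target, of the situation being described: a 256-bit value live across a call. Under the SysV x86-64 ABI the full YMM registers are caller-saved, so the whole value, not just its low XMM half, must be preserved by the caller. Names below are hypothetical:

    /* ymm_across_call.c -- hypothetical example; build with e.g.
         clang -O2 -mavx -S ymm_across_call.c */
    typedef float v8f32 __attribute__((vector_size(32)));

    void callee(void);            /* assumed opaque external call */

    v8f32 ymm_live_across_call(v8f32 y) {
        callee();                 /* the caller must save/restore all 256 bits of y */
        return y * y;
    }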
2007 Jun 26
3
[LLVMdev] Live Intervals Question
...lt;imp-def,dead>, %XMM4<imp-def,dead>,
%XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
%XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
%XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
%XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EAX<imp-def>
CALL64pcrel32 <ga:printf> %mreg(78) %mreg(74)<d> %mreg(77)<d> %mreg(79)<d>
%mreg(81)<d> %mreg(78)<d> %mreg(66)<d> %mreg(70)<d> %mreg(42)<d> %mreg(46)<d>
%mreg(26)<d>...
2007 Jun 26
0
[LLVMdev] Live Intervals Question
...gt;, %XMM4<imp-def,dead>,
> %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
> %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
> %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
> %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EAX<imp-def>
> CALL64pcrel32 <ga:printf> %mreg(78) %mreg(74)<d> %mreg(77)<d> %mreg(79)<d>
> %mreg(81)<d> %mreg(78)<d> %mreg(66)<d> %mreg(70)<d> %mreg(42)<d> %mreg(46)<...
2014 Sep 09
1
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...$256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>...