search for: xmm14

Displaying 20 results from an estimated 35 matches for "xmm14".

2014 Sep 05
3
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...A sample here:
>
> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>
> We'll continue to look into this and do additional testing.
>...
2012 Jul 06
0
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
...n't
> immediately obvious, trunk generates more instructions and spills more
> registers compared to 3.1.
>
Actually, here is an occurrence of that behavior when compiling the code with trunk:

[...]
movaps %xmm1, %xmm0   ### xmm1 mov'ed to xmm0
movaps %xmm1, %xmm14  ### xmm1 mov'ed to xmm14
addps %xmm7, %xmm0
movaps %xmm7, %xmm13
movaps %xmm0, %xmm1   ### and now other data is mov'ed into xmm1, making one of the above movaps superfluous
[...]

amb
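The complaint in this excerpt is that the value of %xmm1 is copied twice and %xmm1 is then immediately overwritten with the sum, so at least one copy could be coalesced away. A minimal hand-written sketch of the tighter sequence, assuming %xmm0 holds no other live value afterwards (illustrative only, not output from any compiler):

    movaps %xmm1, %xmm14  ### keep the original xmm1 value in xmm14
    movaps %xmm7, %xmm13  ### keep a copy of xmm7 in xmm13
    addps  %xmm7, %xmm1   ### accumulate directly into xmm1, avoiding the round trip through xmm0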
2014 Sep 04
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Greetings all,

As you may have noticed, there is a new vector shuffle lowering path in the X86 backend. You can try it out with the '-x86-experimental-vector-shuffle-lowering' flag to llc, or '-mllvm -x86-experimental-vector-shuffle-lowering' to clang. Please test it out! There may be some correctness bugs; I'm still fuzz testing it to shake them out. But I expect fairly few
2012 Jul 06
2
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
>
>> I've noticed that LLVM tends to generate suboptimal code and spill an
>> excessive amount of registers in large functions, such as in those
>> that are automatically generated by FFTW.
>
2014 Sep 05
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...>>> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>
>>> We'll con...
2010 Oct 20
2
[LLVMdev] llvm register reload/spilling around calls
...call instructions are all prefixed
> by:
>
> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>
> This is the fixed list of call-clobbered registers. It should really
> be controlled by the calling convention of the called function
> instead.
>
> The WINCALL* instructions only exist because of this.

Ahh I see now. I hacked this up and indeed the code loo...
2014 Sep 06
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...>>
>> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>
>> We'll continue to look into this and d...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...all prefixed
>> by:
>>
>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>>
>> This is the fixed list of call-clobbered registers. It should really
>> be controlled by the calling convention of the called function
>> instead.
>>
>> The WINCALL* instructions only exist because of this.
> Ahh I see now. I hacked th...
2010 Oct 20
1
[LLVMdev] llvm register reload/spilling around calls
...
>>> by:
>>>
>>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
>>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>>>
>>> This is the fixed list of call-clobbered registers. It should really
>>> be controlled by the calling convention of the called function
>>> instead.
>>>
>>> The WINCALL* instructions only exist because of this.
>>...
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]...
2017 Oct 11
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
...ocks\operation:
 	vpshufb and an array of shuffle masks */
 	movq	%r12, %r11
 	salq	$4, %r11
-	movdqu	aad_shift_arr(%r11), \TMP1
+	leaq	aad_shift_arr(%rip), %rax
+	movdqu	(%rax,%r11,), \TMP1
 	PSHUFB_XMM \TMP1, %xmm\i
 _get_AAD_rest_final\num_initial_blocks\operation:
 	PSHUFB_XMM %xmm14, %xmm\i	# byte-reflect the AAD data
@@ -584,7 +585,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
 	vpshufb and an array of shuffle masks */
 	movq	%r12, %r11
 	salq	$4, %r11
-	movdqu	aad_shift_arr(%r11), \TMP1
+	leaq	aad_shift_arr(%rip), %rax
+	movdqu	(%rax,%r11,), \TMP1
 	PSHUFB...
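The hunks above apply one recurring pattern: an absolute-address memory operand is split into a RIP-relative leaq plus an indexed load so the code stays valid under PIE. A stand-alone sketch of the idiom, reusing the aad_shift_arr table and the %r11 byte offset from the patch context (illustrative only):

    # not PIE-safe: a 32-bit absolute address of aad_shift_arr is encoded in the instruction
    #   movdqu  aad_shift_arr(%r11), %xmm1
    # PIE-safe: materialize the table base relative to %rip, then index it
        leaq    aad_shift_arr(%rip), %rax    # base address, position-independent
        movdqu  (%rax,%r11), %xmm1           # load the shuffle mask at byte offset %r11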
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...refixed by:

  let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],

This is the fixed list of call-clobbered registers. It should really be controlled by the calling convention of the called function instead.

The WINCALL* instructions only exist because of this. One problem is that calling conventions are handled while building the selection DAG...
2010 Oct 20
3
[LLVMdev] llvm register reload/spilling around calls
Thanks for giving it a look!

On 19.10.2010 23:21, Jakob Stoklund Olesen wrote:
> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote:
>
>> So I saw that the code is doing lots of register
>> spilling/reloading. Now I understand that due to calling
>> conventions, there's not really a way to avoid this - I tried using
>> coldcc but apparently the backend
2008 Sep 03
2
[LLVMdev] Codegen/Register allocation question.
...MM3<imp-def,dead>, %XMM4<imp-def,dead>, %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>, %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>, %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EFLAGS<imp-def,dead>, %EAX<imp-def>, %ECX<imp-def,dead>, %EDI<imp-def,dead>, %EDX<imp-def,dead>, %ESI<imp-def,dead>
108 ADJCALLSTACKUP 0, 0, %ESP<imp-def>, %EFLAGS<imp-def,dead>, %ESP<imp-use>...
2008 Sep 04
0
[LLVMdev] Codegen/Register allocation question.
...>, %XMM4<imp-def,dead>,
> %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
> %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
> %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
> %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EFLAGS<imp-def,dead>,
> %EAX<imp-def>, %ECX<imp-def,dead>, %EDI<imp-def,dead>,
> %EDX<imp-def,dead>, %ESI<imp-def,dead>
> 108 ADJCALLSTACKUP 0, 0, %ESP<imp-def>, %EFLAGS<imp-def,dead>, ...
2014 Sep 09
5
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...>>> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>
>>> We'll cont...
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote:
>
> On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
>
>> I'll explain what we see in the code.
>> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
>> YMMs are not in the set, so caller does not take care.
>
> This is not how the register allocator
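For context on the excerpt above: "caller saves XMM registers across the call" means the caller itself must spill any vector value it still needs, because the convention treats those registers as clobbered by the callee. A minimal sketch with a hypothetical function foo and a value kept in %ymm0 (illustrative only; stack bookkeeping simplified):

        subq    $32, %rsp
        vmovups %ymm0, (%rsp)        # caller-save: spill the full 256-bit register before the call
        call    foo                  # foo is free to clobber %ymm0
        vmovups (%rsp), %ymm0        # reload after the call
        addq    $32, %rsp

The point raised in the excerpt is that the YMM registers were not in the Defs set, so nothing forced this treatment of the upper halves.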
2007 Jun 26
3
[LLVMdev] Live Intervals Question
...lt;imp-def,dead>, %XMM4<imp-def,dead>, %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>, %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>, %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EAX<imp-def> CALL64pcrel32 <ga:printf> %mreg(78) %mreg(74)<d> %mreg(77)<d> %mreg(79)<d> %mreg(81)<d> %mreg(78)<d> %mreg(66)<d> %mreg(70)<d> %mreg(42)<d> %mreg(46)<d> %mreg(26)<d&...
2007 Jun 26
0
[LLVMdev] Live Intervals Question
...gt;, %XMM4<imp-def,dead>, > %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>, > %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>, > %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>, > %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EAX<imp-def> > CALL64pcrel32 <ga:printf> %mreg(78) %mreg(74)<d> %mreg(77)<d> %mreg > (79)<d> > %mreg(81)<d> %mreg(78)<d> %mreg(66)<d> %mreg(70)<d> %mreg(42)<d> % > mreg(46)&l...
2014 Sep 09
1
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...$256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>...