Displaying 20 results from an estimated 35 matches for "xmm14".
2014 Sep 05
3
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...A sample here:
>
> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>
> We'll continue to look into this and do additional testing.
>...
2012 Jul 06
0
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
...n't
> immediately obvious, trunk generates more instructions and spills more
> registers compared to 3.1.
>
Actually, here is an occurrence of that behavior when compiling the
code with trunk:
[...]
movaps %xmm1, %xmm0 ### xmm1 mov'ed to xmm0
movaps %xmm1, %xmm14 ### xmm1 mov'ed to xmm14
addps %xmm7, %xmm0
movaps %xmm7, %xmm13
movaps %xmm0, %xmm1 ### and now other data is mov'ed into xmm1, making one of the above movaps superfluous
[...]
amb
2014 Sep 04
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Greetings all,
As you may have noticed, there is a new vector shuffle lowering path in the
X86 backend. You can try it out with the
'-x86-experimental-vector-shuffle-lowering' flag to llc, or '-mllvm
-x86-experimental-vector-shuffle-lowering' to clang. Please test it out!
There may be some correctness bugs; I'm still fuzz testing it to shake them
out. But I expect fairly few
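A minimal sketch (not from the thread) of how one might exercise the new path from C; the file name, function name, and vector typedef below are made up for illustration, but the flags are the ones quoted above:

    /* shuffle.c -- hypothetical test case */
    typedef float v4f32 __attribute__((vector_size(16)));

    /* Take lane 0 from b and lanes 1-3 from a -- the same kind of blend the
       vinsertps sequences quoted in these threads implement. */
    v4f32 blend_low_lane(v4f32 a, v4f32 b) {
        return __builtin_shufflevector(a, b, 4, 1, 2, 3);
    }

    /* Compare the two lowerings, assuming a recent clang:
         clang -O2 -S shuffle.c -o default.s
         clang -O2 -S -mllvm -x86-experimental-vector-shuffle-lowering shuffle.c -o experimental.s */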
2012 Jul 06
2
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
On Fri, Jul 6, 2012 at 6:39 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Jul 5, 2012, at 9:06 PM, Anthony Blake <amb33 at cs.waikato.ac.nz> wrote:
>
>> I've noticed that LLVM tends to generate suboptimal code and spill an
>> excessive amount of registers in large functions, such as in those
>> that are automatically generated by FFTW.
>
2014 Sep 05
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...t;> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>
>>> We'll con...
2010 Oct 20
2
[LLVMdev] llvm register reload/spilling around calls
...call instructions are all prefixed
> by:
>
> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>
> This is the fixed list of call-clobbered registers. It should really
> be controlled by the calling convention of the called function
> instead.
>
> The WINCALL* instructions only exist because of this.
Ahh I see now. I hacked this up and indeed the code loo...
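A minimal sketch (not from the thread) of the underlying issue: because every XMM register appears in that Defs list, they are all treated as call-clobbered (matching the SysV x86-64 ABI, where no vector registers are callee-saved), so any vector value live across a call has to be spilled and reloaded around it. The names below are hypothetical:

    /* spill_around_call.c -- hypothetical example */
    typedef float v4f32 __attribute__((vector_size(16)));

    void external_helper(void);   /* assumed opaque external call */

    v4f32 live_across_call(v4f32 v) {
        external_helper();        /* v cannot stay in an XMM register across this
                                     call, so the compiler spills it to the stack */
        return v + v;
    }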
2014 Sep 06
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...>>
>> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>
>> We'll continue to look into this and d...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...all prefixed
>> by:
>>
>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>>
>> This is the fixed list of call-clobbered registers. It should really
>> be controlled by the calling convention of the called function
>> instead.
>>
>> The WINCALL* instructions only exist because of this.
> Ahh I see now. I hacked th...
2010 Oct 20
1
[LLVMdev] llvm register reload/spilling around calls
...t; by:
>>>
>>> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2,
>>> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
>>> XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, XMM8, XMM9, XMM10,
>>> XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
>>>
>>> This is the fixed list of call-clobbered registers. It should really
>>> be controlled by the calling convention of the called function
>>> instead.
>>>
>>> The WINCALL* instructions only exist because of this.
>>...
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]...
2017 Oct 11
1
[PATCH v1 01/27] x86/crypto: Adapt assembly for PIE support
...ocks\operation:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \TMP1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \TMP1
PSHUFB_XMM \TMP1, %xmm\i
_get_AAD_rest_final\num_initial_blocks\operation:
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
@@ -584,7 +585,8 @@ _get_AAD_rest0\num_initial_blocks\operation:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \TMP1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \TMP1
PSHUFB...
2010 Oct 20
0
[LLVMdev] llvm register reload/spilling around calls
...refixed by:
let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,
FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0, ST1,
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7,
XMM8, XMM9, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, EFLAGS],
This is the fixed list of call-clobbered registers. It should really be controlled by the calling convention of the called function instead.
The WINCALL* instructions only exist because of this.
One problem is that calling conventions are handled while building the selection DAG...
2010 Oct 20
3
[LLVMdev] llvm register reload/spilling around calls
Thanks for giving it a look!
On 19.10.2010 23:21, Jakob Stoklund Olesen wrote:
> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote:
>
>> So I saw that the code is doing lots of register
>> spilling/reloading. Now I understand that due to calling
>> conventions, there's not really a way to avoid this - I tried using
>> coldcc but apparently the backend
2008 Sep 03
2
[LLVMdev] Codegen/Register allocation question.
...MM3<imp-def,dead>, %XMM4<imp-def,dead>,
%XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
%XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
%XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
%XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EFLAGS<imp-def,dead>,
%EAX<imp-def>, %ECX<imp-def,dead>, %EDI<imp-def,dead>,
%EDX<imp-def,dead>, %ESI<imp-def,dead>
108 ADJCALLSTACKUP 0, 0, %ESP<imp-def>, %EFLAGS<imp-def,dead>, %ESP<imp-use>...
2008 Sep 04
0
[LLVMdev] Codegen/Register allocation question.
...>, %XMM4<imp-def,dead>,
> %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
> %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
> %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
> %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EFLAGS<imp-def,dead>,
> %EAX<imp-def>, %ECX<imp-def,dead>, %EDI<imp-def,dead>,
> %EDX<imp-def,dead>, %ESI<imp-def,dead>
> 108 ADJCALLSTACKUP 0, 0, %ESP<imp-def>, %EFLAGS<imp-def,dead>,
&...
2014 Sep 09
5
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...t;> vinsertps $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>>>
>>> We'll cont...
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote:
>
> On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
>
>> I'll explain what we see in the code.
>> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
>> YMMs are not in the set, so the caller does not save them.
>
> This is not how the register allocator
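A minimal sketch (not from the thread), assuming an AVX-capable target, of the situation being described: a 256-bit value live across a call. Under the SysV x86-64 ABI the full YMM registers are caller-saved, so the whole value, not just its low XMM half, must be preserved by the caller. Names below are hypothetical:

    /* ymm_across_call.c -- hypothetical example; build with e.g.
         clang -O2 -mavx -S ymm_across_call.c */
    typedef float v8f32 __attribute__((vector_size(32)));

    void callee(void);            /* assumed opaque external call */

    v8f32 ymm_live_across_call(v8f32 y) {
        callee();                 /* the caller must save/restore all 256 bits of y */
        return y * y;
    }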
2007 Jun 26
3
[LLVMdev] Live Intervals Question
...lt;imp-def,dead>, %XMM4<imp-def,dead>,
%XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
%XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
%XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
%XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EAX<imp-def>
CALL64pcrel32 <ga:printf> %mreg(78) %mreg(74)<d> %mreg(77)<d> %mreg(79)<d>
%mreg(81)<d> %mreg(78)<d> %mreg(66)<d> %mreg(70)<d> %mreg(42)<d> %mreg(46)<d>
%mreg(26)<d>...
2007 Jun 26
0
[LLVMdev] Live Intervals Question
...gt;, %XMM4<imp-def,dead>,
> %XMM5<imp-def,dead>, %XMM6<imp-def,dead>, %XMM7<imp-def,dead>,
> %XMM8<imp-def,dead>, %XMM9<imp-def,dead>, %XMM10<imp-def,dead>,
> %XMM11<imp-def,dead>, %XMM12<imp-def,dead>, %XMM13<imp-def,dead>,
> %XMM14<imp-def,dead>, %XMM15<imp-def,dead>, %EAX<imp-def>
> CALL64pcrel32 <ga:printf> %mreg(78) %mreg(74)<d> %mreg(77)<d> %mreg(79)<d>
> %mreg(81)<d> %mreg(78)<d> %mreg(66)<d> %mreg(70)<d> %mreg(42)<d> %mreg(46)<...
2014 Sep 09
1
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...$256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>>>>> vinsertps $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>>>>> vinsertps $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>>>>> vinsertps $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>>>>> vinsertps $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>>>>> vinsertps $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>...