search for: vshufp

Displaying 12 results from an estimated 12 matches for "vshufp".

2015 Jan 29
2
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...is! > > > Another problem I'm seeing is that in some cases we can't fold memory > anymore: > vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2] > vblendps $0x1, %xmm2, %xmm0, %xmm0 > becomes: > vmovaps -0xXX(%rdx), %xmm2 > vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = xmm2[3,0],xmm0[0,0] > vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 = xmm3[0,2],xmm0[1,2] > > > Also, I see differences when some loads are shuffled, that I'm a bit > conflicted about: > vmovaps -0xXX(%rbp), %xmm3 ...
2015 Jan 30
4
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...ing is that in some cases we can't fold memory >>> anymore: >>> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2] >>> vblendps $0x1, %xmm2, %xmm0, %xmm0 >>> becomes: >>> vmovaps -0xXX(%rdx), %xmm2 >>> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = xmm2[3,0],xmm0[0,0] >>> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 = >>> xmm3[0,2],xmm0[1,2] >>> >>> >>> Also, I see differences when some loads are shuffled, that I'm a bit >>> conflicte...
2015 Jan 29
0
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...ther problem I'm seeing is that in some cases we can't fold memory >> anymore: >> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2] >> vblendps $0x1, %xmm2, %xmm0, %xmm0 >> becomes: >> vmovaps -0xXX(%rdx), %xmm2 >> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = xmm2[3,0],xmm0[0,0] >> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 = >> xmm3[0,2],xmm0[1,2] >> >> >> Also, I see differences when some loads are shuffled, that I'm a bit >> conflicted about: >> vm...
2015 Jan 30
0
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...s we can't fold memory >>>> anymore: >>>> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2] >>>> vblendps $0x1, %xmm2, %xmm0, %xmm0 >>>> becomes: >>>> vmovaps -0xXX(%rdx), %xmm2 >>>> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = >>>> xmm2[3,0],xmm0[0,0] >>>> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 = >>>> xmm3[0,2],xmm0[1,2] >>>> >>>> >>>> Also, I see differences when some loads are shuffled, that...
2015 Jan 23
5
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
Greetings LLVM hackers and x86 vector shufflers! I would like to flip on another chunk of the new vector shuffling, specifically the logic to mark ~all shuffles as "legal". This can be tested today with the flag "-x86-experimental-vector-shuffle-legality". I would essentially like to make this the default (by removing the "false" path). Doing this will allow me to
2015 Jan 25
4
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
I ran the benchmarking subset of test-suite on a btver2 machine and optimizing for btver2 (so enabling AVX codegen). I don't see anything outside of the noise with x86-experimental-vector-shuffle-legality=1. On Fri, Jan 23, 2015 at 5:19 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com > wrote: > Hi Chandler, > > On Fri, Jan 23, 2015 at 8:15 AM, Chandler Carruth
2014 Sep 09
5
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...<4 x float> %A, <4 x float> %B, <4 x i32> <i32 0, i32 5, i32 2, i32 7> ret <4 x float> %1 } ;;; llc (-mcpu=corei7-avx): vblendps $10, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7] llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx): vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3] vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3] 2) On SSE4.1, we should try not to emit an insertps if the shuffle mask identifies a blend. At the moment the new lowering logic is very aggressively emitting insertps instead of cheaper bl...
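The negative immediates in these snippets are the instruction's 8-bit control byte printed as a signed number. A minimal sketch of how that byte maps to the lane lists in the trailing comments, assuming the standard SHUFPS encoding (four 2-bit selectors, the low two picking from the first source and the high two from the second; in AT&T operand order that is `vshufps $imm, src2, src1, dst`). The helper names are hypothetical and are not part of LLVM:

```python
def decode_shufps_imm(imm):
    """Split an 8-bit SHUFPS-style immediate into four 2-bit lane selectors."""
    # The disassembly prints the byte as a signed value (e.g. -40),
    # so mask it back to the raw 8 bits first.
    imm &= 0xFF
    return [(imm >> (2 * i)) & 0b11 for i in range(4)]


def shufps_comment(imm, src1, src2):
    """Render the selectors in the style of the '# xmm0 = ...' comments.

    src1 supplies result lanes 0-1, src2 supplies result lanes 2-3.
    Both names are plain strings chosen by the caller (illustrative only).
    """
    s = decode_shufps_imm(imm)
    return f"{src1}[{s[0]},{s[1]}],{src2}[{s[2]},{s[3]}]"


# vshufps $-40, %xmm0, %xmm1, %xmm0  -> first source xmm1, second source xmm0
print(shufps_comment(-40, "xmm1", "xmm0"))
```

Under the same assumptions, `decode_shufps_imm(-0x68)` gives `[0, 2, 1, 2]` and `decode_shufps_imm(0x3)` gives `[3, 0, 0, 0]`, which agree with the lane comments quoted in the January 2015 results above.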
2014 Sep 10
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...i32> <i32 0, > i32 5, i32 2, i32 7> > ret <4 x float> %1 > } > ;;; > > llc (-mcpu=corei7-avx): > vblendps $10, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7] > > llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx): > vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3] > vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3] > > > 2) On SSE4.1, we should try not to emit an insertps if the shuffle > mask identifies a blend. At the moment the new lowering logic is very > aggressively emitting...
2014 Sep 19
4
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...stigate more on this bug (maybe we no longer trigger some ISel patterns?) and I will try to give you a small reproducible for this particular case. 2. There are cases where we no longer fold a vector load in one of the operands of a shuffle. This is an example: vmovaps 320(%rsp), %xmm0 vshufps $-27, %xmm0, %xmm0, %xmm0 # %xmm0 = %xmm0[1,1,2,3] Before, we used to emit the following sequence: # 16-byte Folded reload. vpshufd $1, 320(%rsp), %xmm0 # %xmm0 = mem[1,0,0,0] Note: the reason why the shuffle masks are different but still valid is because the upper bits in %xmm...
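The snippet is cut off before the explanation of the differing masks is complete, so the following does not try to reconstruct it. It is only a small sketch of what can be checked mechanically: applying both immediates to the same four-element vector, assuming the usual 2-bit-per-lane selector encoding and treating `vshufps` with identical sources as a single-source permute. The lane labels and the helper name are hypothetical:

```python
def select_lanes(imm, vec):
    """Apply a four-lane, 2-bits-per-lane selector immediate to one vector."""
    imm &= 0xFF  # immediates are printed signed in the disassembly
    return [vec[(imm >> (2 * i)) & 0b11] for i in range(4)]


# Hypothetical contents of the 16 bytes at 320(%rsp), one label per float lane.
v = ["a0", "a1", "a2", "a3"]

new = select_lanes(-27, v)  # vshufps $-27, %xmm0, %xmm0, %xmm0
old = select_lanes(1, v)    # vpshufd $1, 320(%rsp), %xmm0

print(new)  # lanes selected by the new lowering
print(old)  # lanes selected by the old, load-folding sequence
print(new[0] == old[0])
```

With these inputs the two results are `[a1, a1, a2, a3]` and `[a1, a0, a0, a0]`: they differ in the upper three lanes and agree in lane 0.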
2014 Sep 09
1
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...2 2, i32 7> >> ret <4 x float> %1 >> } >> ;;; >> >> llc (-mcpu=corei7-avx): >> vblendps $10, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7] >> >> llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx): >> vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3] >> vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3] >> >> >> 2) On SSE4.1, we should try not to emit an insertps if the shuffle >> mask identifies a blend. At the moment the new lowering logic is very >>...
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > Sure, > > Here is the command line: > clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114
2014 Sep 10
13
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote: > Awesome, thanks for all the information! > > See below: > > On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> > wrote: >> >> You have already mentioned how the new shuffle lowering is missing >> some features; for example, you explicitly