Displaying 12 results from an estimated 12 matches for "vshufp".
2015 Jan 29
2
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...is!
>
>
> Another problem I'm seeing is that in some cases we can't fold memory
> anymore:
> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2]
> vblendps $0x1, %xmm2, %xmm0, %xmm0
> becomes:
> vmovaps -0xXX(%rdx), %xmm2
> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = xmm2[3,0],xmm0[0,0]
> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 = xmm3[0,2],xmm0[1,2]
>
>
> Also, I see differences when some loads are shuffled, that I'm a bit
> conflicted about:
> vmovaps -0xXX(%rbp), %xmm3
>...
2015 Jan 30
4
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...ing is that in some cases we can't fold memory
>>> anymore:
>>> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2]
>>> vblendps $0x1, %xmm2, %xmm0, %xmm0
>>> becomes:
>>> vmovaps -0xXX(%rdx), %xmm2
>>> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = xmm2[3,0],xmm0[0,0]
>>> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 =
>>> xmm3[0,2],xmm0[1,2]
>>>
>>>
>>> Also, I see differences when some loads are shuffled, that I'm a bit
>>> conflicte...
2015 Jan 29
0
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...ther problem I'm seeing is that in some cases we can't fold memory
>> anymore:
>> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2]
>> vblendps $0x1, %xmm2, %xmm0, %xmm0
>> becomes:
>> vmovaps -0xXX(%rdx), %xmm2
>> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 = xmm2[3,0],xmm0[0,0]
>> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 =
>> xmm3[0,2],xmm0[1,2]
>>
>>
>> Also, I see differences when some loads are shuffled, that I'm a bit
>> conflicted about:
>> vm...
2015 Jan 30
0
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
...s we can't fold memory
>>>> anymore:
>>>> vpermilps $-0x6d, -0xXX(%rdx), %xmm2 ## xmm2 = mem[3,0,1,2]
>>>> vblendps $0x1, %xmm2, %xmm0, %xmm0
>>>> becomes:
>>>> vmovaps -0xXX(%rdx), %xmm2
>>>> vshufps $0x3, %xmm0, %xmm2, %xmm3 ## xmm3 =
>>>> xmm2[3,0],xmm0[0,0]
>>>> vshufps $-0x68, %xmm0, %xmm3, %xmm0 ## xmm0 =
>>>> xmm3[0,2],xmm0[1,2]
>>>>
>>>>
>>>> Also, I see differences when some loads are shuffled, that...
2015 Jan 23
5
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
Greetings LLVM hackers and x86 vector shufflers!
I would like to flip on another chunk of the new vector shuffling,
specifically the logic to mark ~all shuffles as "legal".
This can be tested today with the flag
"-x86-experimental-vector-shuffle-legality". I would essentially like to
make this the default (by removing the "false" path). Doing this will allow
me to
2015 Jan 25
4
[LLVMdev] RFB: Would like to flip the vector shuffle legality flag
I ran the benchmarking subset of test-suite on a btver2 machine and
optimizing for btver2 (so enabling AVX codegen).
I don't see anything outside of the noise with
x86-experimental-vector-shuffle-legality=1.
On Fri, Jan 23, 2015 at 5:19 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com
> wrote:
> Hi Chandler,
>
> On Fri, Jan 23, 2015 at 8:15 AM, Chandler Carruth
2014 Sep 09
5
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...<4 x float> %A, <4 x float> %B, <4 x i32> <i32 0,
i32 5, i32 2, i32 7>
ret <4 x float> %1
}
;;;
llc (-mcpu=corei7-avx):
vblendps $10, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7]
llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx):
vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3]
vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3]
2) On SSE4.1, we should try not to emit an insertps if the shuffle
mask identifies a blend. At the moment the new lowering logic is very
aggressively emitting insertps instead of cheaper bl...
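For readers checking the asm comments in the snippet above, here is a minimal sketch of how the immediates map to lanes. It assumes the standard SHUFPS encoding (four 2-bit fields, lanes 0-1 from the first source, lanes 2-3 from the second) and the BLENDPS encoding (bit i set takes lane i from the second source). The helper names decode_shufps and decode_blendps are hypothetical and are not part of LLVM. In the AT&T syntax used here, the operand written right after the immediate is the second source.

```python
def decode_shufps(imm, src1="src1", src2="src2"):
    """Decode a SHUFPS-style 8-bit immediate into (register, element) picks."""
    imm &= 0xFF  # the assembler prints the byte as signed, e.g. -40 == 0xD8
    fields = [(imm >> (2 * lane)) & 0b11 for lane in range(4)]
    # Result lanes 0-1 read from the first source, lanes 2-3 from the second.
    return [(src1, fields[0]), (src1, fields[1]),
            (src2, fields[2]), (src2, fields[3])]


def decode_blendps(imm, src1="src1", src2="src2"):
    """Decode a 4-lane BLENDPS immediate: bit i set selects lane i of src2."""
    imm &= 0xF
    return [(src2 if (imm >> lane) & 1 else src1, lane) for lane in range(4)]


# vshufps $-40, %xmm0, %xmm1, %xmm0  (first source xmm1, second source xmm0)
print(decode_shufps(-40, "xmm1", "xmm0"))
# vblendps $10, %xmm1, %xmm0, %xmm0  (first source xmm0, second source xmm1)
print(decode_blendps(10, "xmm0", "xmm1"))
```

The first call yields xmm1[0,2],xmm0[1,3], matching the comment on the vshufps line. The blend comment prints xmm1[5] and xmm1[7] because it numbers the second vector's elements 4 to 7, as shufflevector does, whereas this sketch reports per-register element indices 1 and 3.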
2014 Sep 10
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...i32> <i32 0,
> i32 5, i32 2, i32 7>
> ret <4 x float> %1
> }
> ;;;
>
> llc (-mcpu=corei7-avx):
> vblendps $10, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7]
>
> llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx):
> vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3]
> vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3]
>
>
> 2) On SSE4.1, we should try not to emit an insertps if the shuffle
> mask identifies a blend. At the moment the new lowering logic is very
> aggressively emitting...
2014 Sep 19
4
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...stigate more
on this bug (maybe we no longer trigger some ISel patterns?) and I
will try to give you a small reproducible for this particular case.
2. There are cases where we no longer fold a vector load in one of
the operands of a shuffle.
This is an example:
vmovaps 320(%rsp), %xmm0
vshufps $-27, %xmm0, %xmm0, %xmm0 # %xmm0 = %xmm0[1,1,2,3]
Before, we used to emit the following sequence:
# 16-byte Folded reload.
vpshufd $1, 320(%rsp), %xmm0 # %xmm0 = mem[1,0,0,0]
Note: the reason why the shuffle masks are different but still valid
is because the upper bits in %xmm...
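The two masks quoted in this snippet can be compared with a short sketch. It assumes the usual single-source encoding shared by vpshufd and vpermilps (and by vshufps when both sources are the same register): each 2-bit field of the immediate names the source element for one result lane. The function name decode_single_source is hypothetical.

```python
def decode_single_source(imm):
    """Decode an 8-bit shuffle immediate into four source element indices."""
    imm &= 0xFF  # the assembler shows the byte as signed, e.g. -27 == 0xE5
    return [(imm >> (2 * lane)) & 0b11 for lane in range(4)]


print(decode_single_source(-27))    # vshufps $-27, %xmm0, %xmm0, %xmm0 -> [1, 1, 2, 3]
print(decode_single_source(1))      # vpshufd $1, 320(%rsp), %xmm0      -> [1, 0, 0, 0]
print(decode_single_source(-0x6d))  # vpermilps $-0x6d, ...             -> [3, 0, 1, 2]
```

Both the $-27 and $1 masks place source element 1 in lane 0 and differ only in lanes 1 to 3, which is consistent with the note that the masks are different but still valid.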
2014 Sep 09
1
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
...2 2, i32 7>
>> ret <4 x float> %1
>> }
>> ;;;
>>
>> llc (-mcpu=corei7-avx):
>> vblendps $10, %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[5],xmm0[2],xmm1[7]
>>
>> llc -x86-experimental-vector-shuffle-lowering (-mcpu=corei7-avx):
>> vshufps $-40, %xmm0, %xmm1, %xmm0 # xmm0 = xmm1[0,2],xmm0[1,3]
>> vshufps $-40, %xmm0, %xmm0, %xmm0 # xmm0[0,2,1,3]
>>
>>
>> 2) On SSE4.1, we should try not to emit an insertps if the shuffle
>> mask identifies a blend. At the moment the new lowering logic is very
>>...
2014 Sep 08
2
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote:
>
> Sure,
>
> Here is the command line:
> clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114
2014 Sep 10
13
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote:
> Awesome, thanks for all the information!
>
> See below:
>
> On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com>
> wrote:
>>
>> You have already mentioned how the new shuffle lowering is missing
>> some features; for example, you explicitly