thr3ads.net - llvm dev - [LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon! [Sep 2014]

Chandler Carruth

2014-Sep-23 11:28 UTC

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

On Sun, Sep 21, 2014 at 1:15 PM, Simon Pilgrim <llvm-dev at redking.me.uk> wrote:
> On 20 Sep 2014, at 19:44, Chandler Carruth <chandlerc at google.com> wrote:
>
> > If AVX is available I would expect the vpermilps/vpermilpd instruction
> > to be used for all float/double single vector shuffles, especially as it
> > can deal with the folded load case as well - this would avoid the
> > integer/float execution domain transfer issue with using vpshufd.
> >
> > Yes, this is the obvious solution to folding memory loads. It just isn't
> > implemented yet.
> >
> > Well, actually it is, but I haven't finished writing tests for it. =]
>
> Thanks Chandler - vpermilps/vpermilpd generation looks great now.
>
> I've found another regression - byte shifts on pre-ssse3 targets are
> failing to make use of the vpslldq/vpsrldq instructions - I've attached
> some basic test cases.
>
> Could vpslldq/vpsrldq be used on ssse3+ targets for the cases where zeros
> are being shifted in? It avoids the need for a zero register (although they
> aren't as good for memory folding).

I'm curious, how important is this? This lowering has always seemed deeply
magical and unlikely to be necessary in practice. palignr at least allows
blending.

Simon Pilgrim

2014-Sep-23 21:35 UTC

On 23 Sep 2014, at 12:28, Chandler Carruth <chandlerc at google.com> wrote:
> > I've found another regression - byte shifts on pre-ssse3 targets are
> > failing to make use of the vpslldq/vpsrldq instructions - I've attached
> > some basic test cases.
> >
> > Could vpslldq/vpsrldq be used on ssse3+ targets for the cases where zeros
> > are being shifted in? It avoids the need for a zero register (although they
> > aren't as good for memory folding).
>
> I'm curious, how important is this? This lowering has always seemed deeply
> magical and unlikely to be necessary in practice. palignr at least allows
> blending.

Hi,

The general pre-ssse3 byte shift is very useful for the cases where we can’t
guarantee anything other than the cpu being x86_64 (or SSE2 only) - without
palignr the code can be pretty nasty if the shift is not a multiple of 4.
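
A minimal sketch of the byte-shift semantics under discussion, not taken from the thread: the functions below are scalar models of PSLLDQ, PSRLDQ and PALIGNR on 16-byte arrays, and every function name is made up for illustration. It assumes shift counts in the range 0..16 and shows that, on an SSE2-only target, a two-input byte shift can still be expressed as two single-input byte shifts plus an OR, which is one reason the plain byte-shift instructions matter when palignr is unavailable.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of PSLLDQ: move each byte n positions up, zero-filling the bottom. */
static void byte_shift_left(const uint8_t in[16], unsigned n, uint8_t out[16]) {
    memset(out, 0, 16);
    for (unsigned i = n; i < 16; ++i)
        out[i] = in[i - n];
}

/* Scalar model of PSRLDQ: move each byte n positions down, zero-filling the top. */
static void byte_shift_right(const uint8_t in[16], unsigned n, uint8_t out[16]) {
    memset(out, 0, 16);
    for (unsigned i = 0; i + n < 16; ++i)
        out[i] = in[i + n];
}

/* Scalar model of PALIGNR (SSSE3): concatenate hi:lo, shift right by n bytes,
 * keep the low 16 bytes. Only valid here for n in 0..16. */
static void byte_align_right(const uint8_t hi[16], const uint8_t lo[16],
                             unsigned n, uint8_t out[16]) {
    uint8_t cat[32];
    memcpy(cat, lo, 16);
    memcpy(cat + 16, hi, 16);
    for (unsigned i = 0; i < 16; ++i)
        out[i] = cat[i + n];
}

/* The same result using only the SSE2-style operations: two byte shifts and an OR.
 * Again assumes n in 0..16. */
static void byte_align_right_sse2(const uint8_t hi[16], const uint8_t lo[16],
                                  unsigned n, uint8_t out[16]) {
    uint8_t from_lo[16], from_hi[16];
    byte_shift_right(lo, n, from_lo);
    byte_shift_left(hi, 16 - n, from_hi);
    for (unsigned i = 0; i < 16; ++i)
        out[i] = from_lo[i] | from_hi[i];
}

/* Returns 1 if both models agree for every shift count in 0..16. */
int byte_shift_models_agree(void) {
    uint8_t hi[16], lo[16], a[16], b[16];
    for (unsigned i = 0; i < 16; ++i) {
        lo[i] = (uint8_t)(i + 1);
        hi[i] = (uint8_t)(0x80 + i);
    }
    for (unsigned n = 0; n <= 16; ++n) {
        byte_align_right(hi, lo, n, a);
        byte_align_right_sse2(hi, lo, n, b);
        if (memcmp(a, b, 16) != 0)
            return 0;
    }
    return 1;
}
```

When one of the two inputs is all zeros, the OR disappears and a single byte shift remains, which is the zero-vector case discussed next.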

I agree that the implicit byte shift with zero vector is much more of an edge
case - we do have situations where it’s been useful to help avoid the need for a
zero register but that is it. At the moment we’ve gone back to using the
_mm_slli_si128 intrinsic which still wraps to __builtin_ia32_pslldqi128 - but
have had some success in replacing this with a __builtin_shufflevector call
prior to testing with the new lowering.
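
For illustration only, here is roughly what such a replacement could look like. This is a sketch under assumptions, not the code referred to above: the type name v16u8 and both function names are hypothetical, the vector type relies on the GCC/Clang vector_size extension, and a scalar fallback is used where __builtin_shufflevector is not available. The mask takes the top four bytes of a zero vector followed by the low twelve bytes of the input, which should give the same bytes as _mm_slli_si128(v, 4).

```c
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0   /* older compilers: take the scalar fallback */
#endif

/* 16 x u8 vector via the GCC/Clang vector_size extension (hypothetical name). */
typedef uint8_t v16u8 __attribute__((vector_size(16)));

/* Shift v left by 4 bytes, shifting in zeros. */
static v16u8 shift_left_4_bytes(v16u8 v) {
#if __has_builtin(__builtin_shufflevector)
    v16u8 zero = {0};
    /* Indices 0..15 select from zero, 16..31 select from v. */
    return __builtin_shufflevector(zero, v,
                                   12, 13, 14, 15, 16, 17, 18, 19,
                                   20, 21, 22, 23, 24, 25, 26, 27);
#else
    v16u8 out = {0};
    for (int i = 4; i < 16; ++i)
        out[i] = v[i - 4];
    return out;
#endif
}

/* Returns 1 if the shuffle yields the expected zero-filled byte shift. */
int shuffle_matches_byte_shift(void) {
    v16u8 v = {0};
    for (int i = 0; i < 16; ++i)
        v[i] = (uint8_t)(i + 1);
    v16u8 r = shift_left_4_bytes(v);
    for (int i = 0; i < 16; ++i) {
        uint8_t expected = (i < 4) ? 0 : (uint8_t)(i - 3);
        if (r[i] != expected)
            return 0;
    }
    return 1;
}
```

Whether the compiler turns this shuffle into a single pslldq or into something longer depends on the shuffle lowering, which is the question this thread is about.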

If you don’t want to spend time on this, I’d be happy to create a candidate
patch for review? I’ve been unclear if you were taking patches for your shuffle
work prior to it becoming the default.

Cheers, SimonP.

Chandler Carruth

2014-Sep-23 21:53 UTC

On Tue, Sep 23, 2014 at 2:35 PM, Simon Pilgrim <llvm-dev at redking.me.uk> wrote:
> If you don’t want to spend time on this, I’d be happy to create a
> candidate patch for review? I’ve been unclear if you were taking patches
> for your shuffle work prior to it becoming the default.

While I'm happy to work on it, I'm even more happy to have patches. =D
