thr3ads.net - llvm dev - [LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon! [Sep 2014]

Chandler Carruth

2014-Sep-04 09:45 UTC

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Greetings all,

As you may have noticed, there is a new vector shuffle lowering path in the
X86 backend. You can try it out with the
'-x86-experimental-vector-shuffle-lowering' flag to llc, or '-mllvm
-x86-experimental-vector-shuffle-lowering' to clang. Please test it out!
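For anyone who wants to script an A/B comparison, here is a minimal sketch of how the two invocations above could be built and timed. Only the flag name and the llc/clang spellings come from this message; the helper names, the -O2 level, and the file names are assumptions made for illustration.

```python
import subprocess
import time

# Flag name taken from the message above.
FLAG = "-x86-experimental-vector-shuffle-lowering"


def clang_command(source, output, experimental):
    """Build a clang command line; -O2 is an assumed optimization level."""
    cmd = ["clang", "-O2", source, "-o", output]
    if experimental:
        # clang forwards backend options through -mllvm.
        cmd += ["-mllvm", FLAG]
    return cmd


def llc_command(ir_file, output, experimental):
    """Build an llc command line; llc takes the flag directly."""
    cmd = ["llc", ir_file, "-o", output]
    if experimental:
        cmd.append(FLAG)
    return cmd


def best_time(argv, runs=5):
    """Run argv several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

A typical use would be to build the same source twice, e.g. subprocess.run(clang_command("bench.c", "bench_old", False), check=True) and again with experimental=True into "bench_new", then compare best_time(["./bench_old"]) against best_time(["./bench_new"]). Taking the minimum of several runs is one common way to reduce noise; it is not the only reasonable choice.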

There may be some correctness bugs; I'm still fuzz testing it to shake them
out. But I expect fairly few of those.

I don't have any test cases which regress in performance with the new
shuffle lowering. I have several which improve by 1-3%, and a couple which
improve by 5-10%. YMMV.

There are still some missing features: AVX2 shuffles, SSE4.1 blends, and
handling all possible uses of the "mov*" style shuffles. However, as
indicated, I don't have any test cases on any microarchitectures that are
really showing regressions here. It's entirely possible I just don't have
access to them, so please help me benchmark!

Provided there aren't really terrible regressions in performance, I'd like
to switch the default in a couple of days and start getting bug reports
about what doesn't work yet. I've already talked to a couple of the regular
contributors to the x86 backend and they seem pretty happy, so I just
wanted to send a wider reaching email in case some folks had a chance to
benchmark more.

Inevitably, there will be some regressions, but they can be handled and
fixed like anything else provided they don't cause lots of trouble for
folks.

Thanks,
-Chandler

Robert Lougher

2014-Sep-05 16:32 UTC

Hi Chandler,

I've done some informal benchmarking on an AMD Jaguar core (amd16h)
with and without the experimental flag.  The tests were a mixture of
FP and Integer tests.  I didn't see any significant performance
regression, with most of the differences being in the noise (less than
1%).  One test, however, did show a performance improvement of ~4%.

Unfortunately, another team, while doing internal testing, has seen the
new path generating illegal insertps masks.  A sample here:

    vinsertps    $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
    vinsertps    $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
    vinsertps    $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
    vinsertps    $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
    vinsertps    $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
    vinsertps    $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]

We'll continue to look into this and do additional testing.
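
For reference, the insertps immediate is an 8-bit field: bits 7:6 select the source element (for the register-source form), bits 5:4 select the destination element, and bits 3:0 are the zero mask. Both $256 and $416 are larger than 255, which is why they are illegal. The sketch below decodes an immediate under that layout; the function name is hypothetical, and masking the sample values to their low 8 bits is only an observation about the numbers, not a diagnosis of the bug.

```python
def decode_insertps_imm(imm):
    """Split an insertps imm8 into (source element, destination element, zero mask)."""
    if not 0 <= imm <= 0xFF:
        raise ValueError(
            f"insertps immediate {imm} (0x{imm:X}) does not fit in 8 bits"
        )
    src_elt = (imm >> 6) & 0x3   # bits 7:6, element read from the source register
    dst_elt = (imm >> 4) & 0x3   # bits 5:4, element written in the destination
    zero_mask = imm & 0xF        # bits 3:0, destination elements forced to zero
    return src_elt, dst_elt, zero_mask


# The sample immediates are rejected as written...
for bad in (256, 416):
    try:
        decode_insertps_imm(bad)
    except ValueError as err:
        print(err)

# ...but their low 8 bits decode to the shuffles named in the asm comments.
print(decode_insertps_imm(256 & 0xFF))  # (0, 0, 0): source[0] into element 0
print(decode_insertps_imm(416 & 0xFF))  # (2, 2, 0): source[2] into element 2
```

In both samples only bit 8 (0x100) lies outside the legal range, and the remaining bits agree with the shuffle comments the assembler printed.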

Thanks,
Rob.

--

Robert Lougher
SN Systems - Sony Computer Entertainment Group

Chandler Carruth

2014-Sep-05 16:38 UTC

On Fri, Sep 5, 2014 at 9:32 AM, Robert Lougher <rob.lougher at gmail.com>
wrote:
> Unfortunately, another team, while doing internal testing has seen the
> new path generating illegal insertps masks.  A sample here:
>
>     vinsertps    $256, %xmm0, %xmm13, %xmm4 # xmm4 = xmm0[0],xmm13[1,2,3]
>     vinsertps    $256, %xmm1, %xmm0, %xmm6 # xmm6 = xmm1[0],xmm0[1,2,3]
>     vinsertps    $256, %xmm13, %xmm1, %xmm7 # xmm7 = xmm13[0],xmm1[1,2,3]
>     vinsertps    $416, %xmm1, %xmm4, %xmm14 # xmm14 = xmm4[0,1],xmm1[2],xmm4[3]
>     vinsertps    $416, %xmm13, %xmm6, %xmm13 # xmm13 = xmm6[0,1],xmm13[2],xmm6[3]
>     vinsertps    $416, %xmm0, %xmm7, %xmm0 # xmm0 = xmm7[0,1],xmm0[2],xmm7[3]
>
> We'll continue to look into this and do additional testing.
>
Interesting. Let me know if you get a test case. The insertps code path was
added recently though and has been much less well tested. I'll start fuzz
testing it and should hopefully uncover the bug.