Chandler Carruth
2014-Sep-30 05:48 UTC
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
Wow. Somehow, I forgot about vbroadcast and vpbroadcast. =[ Sorry about that. I'll fix those.

On Fri, Sep 26, 2014 at 3:39 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Hi Chandler,
>
> Here is another test.
>
> When looking at the AVX codegen, I noticed that, when using the new
> shuffle lowering, we no longer emit a single vbroadcastss in the case
> where the shuffle performs a splat of a scalar float loaded from
> memory.
>
> For example:
> (with -mcpu=corei7-avx -x86-experimental-vector-shuffle-lowering)
>   vmovss (%rdi), %xmm0
>   vpermilps $0, %xmm0, %xmm0 # xmm0 = xmm0[0,0,0,0]
>
> Instead of:
> (with -mcpu=corei7-avx)
>   vbroadcastss (%rdi), %xmm0
>
> I have attached a small reproducible for it.
>
> Basically, the old shuffle lowering logic calls function
> 'NormalizeVectorShuffle' to handle shuffles that perform a splat
> operation.
> On AVX, function 'NormalizeVectorShuffle' tries to lower a splat where
> the splat value comes from a load into a X86ISD::VBROADCAST dag node.
> Later on, during instruction selection, we emit a single avx_broadcast
> for the load+splat sequence (basically, we end up folding the load in
> the operand of the vbroadcastss).
>
> What happens is that the new shuffle lowering doesn't emit a
> vbroadcast node in this case and eventually we end up selecting the
> sequence of vmovss+vpermilps.
>
> I hope this helps.
> Andrea
>
> On Tue, Sep 23, 2014 at 10:53 PM, Chandler Carruth <chandlerc at google.com> wrote:
> >
> > On Tue, Sep 23, 2014 at 2:35 PM, Simon Pilgrim <llvm-dev at redking.me.uk> wrote:
> >>
> >> If you don’t want to spend time on this, I’d be happy to create a
> >> candidate patch for review? I’ve been unclear if you were taking
> >> patches for your shuffle work prior to it becoming the default.
> >
> > While I'm happy to work on it, I'm even more happy to have patches. =D
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
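[Editorial illustration, not part of the original message.] The two instruction sequences Andrea compares produce the same register contents; the difference is only that the second does it in one instruction with the load folded in. The Python sketch below models that equivalence on plain lists. The helper names (decode_vpermilps_imm, load_scalar, broadcast, and so on) are made up for this illustration and are not LLVM functions; the only hardware facts assumed are that vpermilps picks each 32-bit destination lane with a 2-bit field of its immediate, and that the memory form of vmovss zeroes the upper lanes.

```python
def decode_vpermilps_imm(imm8):
    """Return, for each of the 4 destination lanes, the source lane selected.

    Lane i is chosen by bits [2*i+1 : 2*i] of the 8-bit immediate.
    """
    return [(imm8 >> (2 * i)) & 0b11 for i in range(4)]


def permute(src, imm8):
    """Model of `vpermilps $imm8, src, dst` on a 4-lane register."""
    return [src[lane] for lane in decode_vpermilps_imm(imm8)]


def is_splat(mask):
    """A shuffle mask is a splat when every lane reads the same source lane."""
    return len(set(mask)) == 1


def load_scalar(memory, addr):
    """Model of `vmovss (addr), dst`: lane 0 is loaded, upper lanes are zeroed."""
    return [memory[addr], 0.0, 0.0, 0.0]


def broadcast(memory, addr):
    """Model of `vbroadcastss (addr), dst`: the loaded float fills every lane."""
    return [memory[addr]] * 4


# Hypothetical memory holding one float at "address" 0.
memory = {0: 1.5}

two_instructions = permute(load_scalar(memory, 0), 0)  # vmovss + vpermilps $0
one_instruction = broadcast(memory, 0)                 # vbroadcastss
```

In this model, a lowering that sees a splat mask whose input comes straight from a scalar load could choose the broadcast form instead of the load-then-permute pair, which is the pattern the report says the new lowering was missing.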
Chandler Carruth
2014-Oct-01 00:52 UTC
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
This has been added in r218724.

Based on the feedback here and from Quentin, I'm going to email the list shortly with a heads-up, and then flip the default over to the new shuffle lowering.

On Mon, Sep 29, 2014 at 10:48 PM, Chandler Carruth <chandlerc at google.com> wrote:
> Wow. Somehow, I forgot about vbroadcast and vpbroadcast. =[ Sorry about
> that. I'll fix those.
Andrea Di Biagio
2014-Oct-01 08:23 UTC
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Wed, Oct 1, 2014 at 1:52 AM, Chandler Carruth <chandlerc at google.com> wrote:
> This has been added in r218724.

Thanks Chandler!

> Based on the feedback here and from Quentin, I'm going to email the list
> shortly with a heads-up, and then flip the default over to the new shuffle
> lowering.

Nice.
Again, thanks for working on this!

-Andrea