David Greene

2011-Feb-25 21:19 UTC

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

In ToT, LowerVECTOR_SHUFFLE for x86 has this code:

  if (X86::isUNPCKLMask(SVOp))
    return getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);

why would this not be:

  if (X86::isUNPCKLMask(SVOp))
    return SVOp;

I'm trying to add support for VUNPCKL and am getting into trouble
because the existing code ends up creating:

VUNPCKLPS
  load
  load

which is badness come selection time.  Legalize doesn't get a chance to
look below the target shuffle node to see that there are two memory
operands.

Back in the 2.7 days, we used to just return the shuffle as is if it was
already legal.  Why the change to create a target node?  Will there be
problems if I change the code to return the shuffle as-is?
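
For readers unfamiliar with the predicate in the snippet above, here is a minimal standalone sketch of the kind of check an unpack-low mask test performs on a single 128-bit vector. It is not LLVM's implementation: the name `isUnpcklMaskSketch`, the plain `std::vector<int>` mask, and the use of a negative entry for an undefined lane are assumptions made for illustration. An unpack-low interleaves the low halves of the two inputs, so for four elements the expected mask is <0, 4, 1, 5>.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for X86::isUNPCKLMask, not LLVM's implementation.
// Mask entries index into the concatenation of V1 and V2: 0..N-1 select from
// V1, N..2N-1 select from V2. A negative entry is treated as "undefined" and
// matches anything.
bool isUnpcklMaskSketch(const std::vector<int> &Mask) {
  const std::size_t N = Mask.size();
  if (N == 0 || N % 2 != 0)
    return false;
  for (std::size_t i = 0; i < N; ++i) {
    if (Mask[i] < 0)
      continue; // undefined lane
    // Even result lanes take V1[i/2]; odd result lanes take V2[i/2].
    const int Expected = static_cast<int>(i / 2 + (i % 2 ? N : 0));
    if (Mask[i] != Expected)
      return false;
  }
  return true;
}
```

With this sketch, `{0, 4, 1, 5}` is accepted and `{2, 6, 3, 7}` (the unpack-high pattern) is rejected.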

Thanks!

                       -Dave

David A. Greene

2011-Feb-26 02:01 UTC

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

David Greene <dag at cray.com> writes:
> In ToT, LowerVECTOR_SHUFFLE for x86 has this code:
>
>   if (X86::isUNPCKLMask(SVOp))
>     return getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);
>
> why would this not be:
>
>   if (X86::isUNPCKLMask(SVOp))
>     return SVOp;

Ok, I discovered that Bruno did this in revisions 112934, 112942 and
113020 but the logs don't really make clear why.  I did figure out that
I needed new SDNode defs for VUNPCKLPSY and VUNPCKLPDY and corresponding
patterns.  Once I added them everything started working.

I found this all very confusing because it appears there are now two
ways to match certain shuffle instructions in .td files: one through the
traditional shuffle operators like unpckl and shufp and another through
these special X86* operators.

This is reflected in X86InstrSSE.td:

"Traditional":

    defm VUNPCKLPS: sse12_unpack_interleave<0x14, unpckl, v4f32, memopv4f32,
          VR128, f128mem, "unpcklps\t{$src2, $src1, $dst|$dst, $src1, $src2}",
                         SSEPackedSingle>, VEX_4V;

"New-style":

    def : Pat<(v4f32 (X86Unpcklps VR128:$src1, (memopv4f32 addr:$src2))),
              (VUNPCKLPSrm VR128:$src1, addr:$src2)>, Requires<[HasAVX]>;

I think these are basically the same pattern.

What's the purpose of these special operators and patterns?

                              -Dave

Bob Wilson

2011-Feb-26 05:33 UTC

[LLVMdev] X86 LowerVECTOR_SHUFFLE Question

It is inefficient and error-prone to recognize legal shuffles and then have isel
repeat the process.  For example, if the DAG combiner changes a shuffle in
between legalization and isel, it may stop being legal and break isel.  By
legalizing to target-specific DAG nodes, we avoid that possibility and also make
it much easier to match the shuffles during isel.
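
The difference between the two approaches can be sketched outside LLVM. The toy model below is an illustration under stated assumptions, not LLVM code: `Node`, `Opcode`, `lowerShuffle`, and `selectInstruction` are hypothetical names, and only the 4-element unpack-low mask is handled. Lowering inspects the mask once and rewrites the node to a target-specific opcode; selection then dispatches on the opcode alone, so nothing that later rewrites a generic shuffle's mask can invalidate a decision that was already made.

```cpp
#include <string>
#include <vector>

// Toy DAG node: hypothetical types for illustration only.
enum class Opcode { GenericShuffle, TargetUnpckl };

struct Node {
  Opcode Op;
  std::vector<int> Mask; // meaningful only for GenericShuffle
};

// 4-element unpack-low mask: <0, 4, 1, 5>.
static bool isUnpckl4(const std::vector<int> &M) {
  return M == std::vector<int>{0, 4, 1, 5};
}

// Legalization step: recognize the mask once and commit to a target node.
Node lowerShuffle(const Node &N) {
  if (N.Op == Opcode::GenericShuffle && isUnpckl4(N.Mask))
    return Node{Opcode::TargetUnpckl, {}};
  return N; // left as a generic shuffle for other lowering paths
}

// Selection step: dispatches on the opcode; no mask analysis is repeated.
std::string selectInstruction(const Node &N) {
  switch (N.Op) {
  case Opcode::TargetUnpckl:
    return "unpcklps";
  case Opcode::GenericShuffle:
    return "<unselected shuffle>";
  }
  return "<unknown>";
}
```

In this model the target node carries no mask at all, which is the point: once the opcode says "unpack low", selection has nothing left to re-derive.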

On Feb 25, 2011, at 6:01 PM, David A. Greene wrote:
> David Greene <dag at cray.com> writes:
> 
>> In ToT, LowerVECTOR_SHUFFLE for x86 has this code:
>> 
>>  if (X86::isUNPCKLMask(SVOp))
>>    return getTargetShuffleNode(getUNPCKLOpcode(VT), dl, VT, V1, V2, DAG);
>> 
>> why would this not be:
>> 
>>  if (X86::isUNPCKLMask(SVOp))
>>    return SVOp;
> 
> Ok, I discovered that Bruno did this in revisions 112934, 112942 and
> 113020 but the logs don't really make clear why.  I did figure out that
> I needed new SDNode defs for VUNPCKLPSY and VUNPCKLPDY and corresponding
> patterns.  Once I added them everything started working.
> 
> I found this all very confusing because it appears there are now two
> ways to match certain shuffle instructions in .td files: one through the
> traditional shuffle operators like unpckl and shufp and another through
> these special X86* operators.
> 
> This is reflected in X86InstrSSE.td:
> 
> "Traditional":
> 
>    defm VUNPCKLPS: sse12_unpack_interleave<0x14, unpckl, v4f32, memopv4f32,
>          VR128, f128mem, "unpcklps\t{$src2, $src1, $dst|$dst, $src1, $src2}",
>                         SSEPackedSingle>, VEX_4V;
> 
> "New-style":
> 
>    def : Pat<(v4f32 (X86Unpcklps VR128:$src1, (memopv4f32 addr:$src2))),
>              (VUNPCKLPSrm VR128:$src1, addr:$src2)>, Requires<[HasAVX]>;
> 
> I think these are basically the same pattern.
> 
> What's the purpose of these special operators and patterns?
> 
>                              -Dave