thr3ads.net - llvm dev - [llvm-dev] RegBankSelect complex value mappings [Dec 2018]


Matt Arsenault via llvm-dev

2018-Dec-20 05:25 UTC

[llvm-dev] RegBankSelect complex value mappings

Hi,

I’m looking at RegBankSelect’s partially implemented support for deciding to
split a value between multiple registers and I’m wondering if it’s actually
intended to solve the problem I’m trying to use it for. RegisterBankInfo.h has
this example mapping table:
  /// E.g.,
  /// Let say we have a 32-bit add and a <2 x 32-bit> vadd. We
  /// can expand the
  /// <2 x 32-bit> add into 2 x 32-bit add.
  ///
  /// Currently the TableGen-like file would look like:
  /// \code
  /// PartialMapping[] = {
  /// /*32-bit add*/ {0, 32, GPR},
  /// /*2x32-bit add*/ {0, 32, GPR}, {0, 32, GPR}, // <-- Same entry 3x
  /// /*<2x32-bit> vadd {0, 64, VPR}
  /// }; // PartialMapping duplicated.
  ///
  /// ValueMapping[] {
  ///   /*plain 32-bit add*/ {&PartialMapping[0], 1},
  ///   /*expanded vadd on 2xadd*/ {&PartialMapping[1], 2},
  ///   /*plain <2x32-bit> vadd*/ {&PartialMapping[3], 1}
  /// };
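
A minimal, self-contained sketch of how such tables fit together may help here. The struct names mirror the ones in the comment, but these are simplified stand-ins and not the LLVM declarations. As an assumption for illustration, the second piece of the expanded add starts at bit 32, so that the two pieces tile the 64-bit value. coversExactly and checkExampleTables are hypothetical helpers.

```cpp
#include <cassert>

// Simplified stand-ins for the structures named in the comment above.
// These are NOT the LLVM declarations, only a model of the idea.
enum class Bank { GPR, VPR };

struct PartialMapping {
  unsigned StartIdx; // first bit of the value covered by this piece
  unsigned Length;   // number of bits covered by this piece
  Bank RegBank;      // register bank this piece is assigned to
};

struct ValueMapping {
  const PartialMapping *BreakDown; // first piece in the shared table
  unsigned NumBreakDowns;          // number of consecutive pieces used
};

// Assumption: the high half of the expanded add starts at bit 32.
const PartialMapping PartMappings[] = {
    {0, 32, Bank::GPR},  // [0] plain 32-bit add
    {0, 32, Bank::GPR},  // [1] low half of the expanded <2 x 32-bit> add
    {32, 32, Bank::GPR}, // [2] high half of the expanded <2 x 32-bit> add
    {0, 64, Bank::VPR},  // [3] plain <2 x 32-bit> vadd
};

const ValueMapping ValMappings[] = {
    {&PartMappings[0], 1}, // plain 32-bit add
    {&PartMappings[1], 2}, // expanded vadd on 2 x add
    {&PartMappings[3], 1}, // plain <2 x 32-bit> vadd
};

// Hypothetical helper: true if the pieces tile [0, SizeInBits) in order,
// with no gap and no overlap.
bool coversExactly(const ValueMapping &VM, unsigned SizeInBits) {
  unsigned NextBit = 0;
  for (unsigned I = 0; I != VM.NumBreakDowns; ++I) {
    if (VM.BreakDown[I].StartIdx != NextBit)
      return false;
    NextBit += VM.BreakDown[I].Length;
  }
  return NextBit == SizeInBits;
}

// Usage example: each mapping covers the value it describes.
void checkExampleTables() {
  assert(coversExactly(ValMappings[0], 32));
  assert(coversExactly(ValMappings[1], 64));
  assert(coversExactly(ValMappings[2], 64));
}
```

The point of the shared table is that a ValueMapping is only a pointer plus a count, so a split value is expressed by pointing at several consecutive pieces.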

This looks almost like the problem I want to solve for AMDGPU. There are 2 main
register banks. On the SALU, some 64-bit operations are available which can only
be 32-bit on the VALU. For example, unless all of the input operands are in the
scalar bank, a 64-bit AND needs to be split into two 32-bit ANDs. It’s illegal to
copy from the vector to the scalar bank, since these don’t mean what vector and
scalar mean on other targets.

The current code seems very operand centric and computes costs only based on
copies. Decomposing the operation into 2 pieces requires rewriting the entire
instruction, not just copying from one offending operand. Is this intended to
handle this kind of case, or do I need to introduce a separate register bank
aware legalizer pass?

-Matt

Quentin Colombet via llvm-dev

2018-Dec-21 00:15 UTC


[llvm-dev] RegBankSelect complex value mappings

Hi Matt,

Your use case definitely falls within what RegBankSelect is meant to solve.
That said, the support you need is not implemented because we didn't
have use cases to test the code against.

Regarding the cost, if the mapping produces more than 1 partial value,
right now RegBankSelect::getRepairCost will say this is too expensive.
This is where you need to patch the pass to add a target
hook that computes the cost of the instructions used to decompose the
value.

On Wed, Dec 19, 2018 at 21:25, Matt Arsenault <arsenm2 at gmail.com> wrote:
> Hi,
>
> I’m looking at RegBankSelect’s partially implemented support for deciding
> to split a value between multiple registers and I’m wondering if it’s actually
> intended to solve the problem I’m trying to use it for. RegisterBankInfo.h has
> this example mapping table:
>   /// E.g.,
>   /// Let say we have a 32-bit add and a <2 x 32-bit> vadd. We
>   /// can expand the
>   /// <2 x 32-bit> add into 2 x 32-bit add.
>   ///
>   /// Currently the TableGen-like file would look like:
>   /// \code
>   /// PartialMapping[] = {
>   /// /*32-bit add*/ {0, 32, GPR},
>   /// /*2x32-bit add*/ {0, 32, GPR}, {0, 32, GPR}, // <-- Same entry 3x
>   /// /*<2x32-bit> vadd {0, 64, VPR}
>   /// }; // PartialMapping duplicated.
>   ///
>   /// ValueMapping[] {
>   ///   /*plain 32-bit add*/ {&PartialMapping[0], 1},
>   ///   /*expanded vadd on 2xadd*/ {&PartialMapping[1], 2},
>   ///   /*plain <2x32-bit> vadd*/ {&PartialMapping[3], 1}
>   /// };
>
> This looks almost like the problem I want to solve for AMDGPU. There are 2
> main register banks. On the SALU, some 64-bit operation are available which can
> only be 32-bit on the VALU. For example, if all of the input operands aren’t in
> the scalar bank, a 64-bit and needs to be split into 2 32-bit ands. It’s illegal
> to copy from the vector to the scalar bank, since these don’t mean what vector
> and scalar mean on other targets.
>
> The current code seems very operand centric and computes costs only based
> on copies. Decomposing the operation into 2 pieces requires rewriting the entire
> instruction,
I covered the copy part of the cost above. The cost of rewriting the
instruction completely is captured by InstructionMapping::getCost.
The idea of InstructionMapping::getCost is to reflect the cost of
transforming the current instruction into the instruction we get after
applying this mapping. The RepairCost then accounts for the
cost of "bringing" every operand to the right place for this mapping,
using a copy or some target-specific sequence.
Like the cost computation, the target-specific sequences are not
implemented, but should happen in RegBankSelect::repairReg. Right now,
this asserts that the number of breakdowns is 1, but the
code to decompose the operand should go there.
Finally, the rewriting of the current instruction is supposed to
happen in RegisterBankInfo::applyMapping.
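
The split between the two costs can be modelled roughly as follows. This is only a sketch under stated assumptions, not the RegBankSelect implementation: OperandRepair, BreakDownCostFn, the two example hooks and the cost constants are all hypothetical names and values. The total is the instruction mapping cost plus a repair cost per operand, and an operand broken into more than one piece is rejected unless a target-provided hook supplies a finite cost.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical, simplified model of the cost split described above.
// None of these names are the real RegBankSelect API.
const unsigned TooExpensive = std::numeric_limits<unsigned>::max();

struct OperandRepair {
  unsigned NumBreakDowns; // number of pieces the operand is mapped to
  bool NeedsCopy;         // operand currently lives in a different bank
};

// Stand-in for a target hook that prices decomposing a value.
using BreakDownCostFn = unsigned (*)(unsigned NumBreakDowns);

// Models the behaviour described in the thread: any split is rejected.
unsigned rejectBreakDowns(unsigned) { return TooExpensive; }

// Models a target where the unmerge/merge is expected to fold away.
unsigned freeBreakDowns(unsigned) { return 0; }

// Addition that sticks at TooExpensive instead of wrapping around.
unsigned saturatingAdd(unsigned A, unsigned B) {
  return (A > TooExpensive - B) ? TooExpensive : A + B;
}

// Cost of bringing one operand to the place the mapping wants it.
unsigned repairCost(const OperandRepair &Op, BreakDownCostFn Hook) {
  if (Op.NumBreakDowns != 1)
    return Hook(Op.NumBreakDowns);
  return Op.NeedsCopy ? 1 : 0; // assumed unit cost for a cross-bank copy
}

// Instruction mapping cost plus the repair cost of every operand.
unsigned totalCost(unsigned InstrMappingCost,
                   const std::vector<OperandRepair> &Ops,
                   BreakDownCostFn Hook) {
  unsigned Cost = InstrMappingCost;
  for (const OperandRepair &Op : Ops)
    Cost = saturatingAdd(Cost, repairCost(Op, Hook));
  return Cost;
}
```

With rejectBreakDowns, a mapping that splits both inputs of a 64-bit AND is never selected. With freeBreakDowns, its cost is just the instruction mapping cost, which is where the "it is now 2 operations" penalty would go.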

If you have an example (.mir) that you can share, we can work together
to make this happen.

Cheers,
-Quentin
> not just copying from one offending operand. Is this intended to handle
> this kind of case, or do I need to introduce a separate register bank aware
> legalizer pass?
>
> -Matt
>

Matt Arsenault via llvm-dev

2018-Dec-21 07:51 UTC


[llvm-dev] RegBankSelect complex value mappings

> On Dec 21, 2018, at 11:15 AM, Quentin Colombet <quentin.colombet at gmail.com> wrote:
> 
> Hi Matt,
> 
> Your use case falls definitely in what RegBankSelect meant to solve.
> That said, the support you need is not implemented because we didn't
> have use cases to test the code against.
> 
> Regarding the cost, if the mapping produces more than 1 partial value,
> right now RegBankSelect::getRepairCost will say this is too expensive
> and this is actually where you need to patch the pass to add a target
> hook to compute something that would use instruction to decompose the
> value.
Yes, this is what happens with greedy. With fast I get a little further.

> 
> So the copy part cost I covered it. For the cost of rewriting the
> instruction completely, this is captured by
> InstructionMapping::getCost.
> The idea of InstructionMapping::getCost is to reflect the cost for
> transforming the current instruction into the instruction after we
> apply this mapping. Then the RepairCost is here to account for the
> cost of "bringing" every operand to the right place for this mapping
> using copy or some target specific sequence.
> Like the cost computation, the target specific sequences are not
> implemented, but should happen in RegBankSelect::repairReg.
This seems to contradict the comment on repairReg?
  /// \note The caller is supposed to do the rewriting of op if need be.
  /// I.e., Reg = op ... => <NewRegs> = NewOp …


> Right now,
> this will assert that the number of break downs should be == 1 but the
> code to decompose the operand should happen there.
> Finally, the rewriting of the current instruction is supposed to
> happen in RegisterBankInfo::applyMapping.
> 
> If you have an example (.mir) that you can share, we can work together
> to make this happen.
> 
> Cheers,
> -Quentin

The simplest case is this, where there’s only one register bank involved. The
cost of the unmerge and merge should be 0; the only real cost comes from the
fact that it is now 2 operations.

---
name: and_i64_vv
legalized: true

body: |
  bb.0:
    ; Should turn into something like this, although the merge_values and
    ; unmerge_values can be optimized out
    ; %0:vgpr(s64) = COPY $vgpr0_vgpr1
    ; %1:vgpr(s64) = COPY $vgpr2_vgpr3
    ; %2:vgpr(s32), %3:vgpr(s32) = G_UNMERGE_VALUES %0
    ; %4:vgpr(s32), %5:vgpr(s32) = G_UNMERGE_VALUES %1
    ; %6:vgpr(s32) = G_AND %2, %4
    ; %7:vgpr(s32) = G_AND %3, %5
    ; %8:vgpr(s64) = G_MERGE_VALUES %6, %7

    liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
    %0:_(s64) = COPY $vgpr0_vgpr1
    %1:_(s64) = COPY $vgpr2_vgpr3
    %2:_(s64) = G_AND %0, %1
...
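
As a sanity check on the intended rewrite, here is a small sketch that models the unmerge / AND / merge sequence on plain integers. The function names are hypothetical and this is not LLVM code. It assumes the usual convention that the first result of G_UNMERGE_VALUES and the first operand of G_MERGE_VALUES hold the low bits, and it pairs the low halves of both inputs with each other and the high halves with each other.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Models G_UNMERGE_VALUES on an s64: first = low 32 bits, second = high.
std::pair<std::uint32_t, std::uint32_t> unmerge64(std::uint64_t V) {
  return {static_cast<std::uint32_t>(V),
          static_cast<std::uint32_t>(V >> 32)};
}

// Models G_MERGE_VALUES of two s32 values: the first operand is the low half.
std::uint64_t merge64(std::uint32_t Lo, std::uint32_t Hi) {
  return (static_cast<std::uint64_t>(Hi) << 32) | Lo;
}

// A 64-bit AND expressed as two independent 32-bit ANDs.
std::uint64_t splitAnd64(std::uint64_t A, std::uint64_t B) {
  const std::pair<std::uint32_t, std::uint32_t> AParts = unmerge64(A);
  const std::pair<std::uint32_t, std::uint32_t> BParts = unmerge64(B);
  const std::uint32_t Lo = AParts.first & BParts.first;   // low with low
  const std::uint32_t Hi = AParts.second & BParts.second; // high with high
  return merge64(Lo, Hi);
}
```

Because AND works bit by bit, the two halves never interact, which is why the split is legal and why the only real cost is the second operation.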

Part of my confusion about the operand focus is the use of RepairPts. In this
case the inputs %0 and %1 have been trivially assigned already, but I kind of
expected those to be present as something to handle here if that makes sense.

-Matt

