thr3ads.net - llvm dev - [llvm-dev] RFC: Implement variable-sized register classes [Oct 2016]

Krzysztof Parzyszek via llvm-dev

2016-Sep-20 17:32 UTC

[llvm-dev] RFC: Implement variable-sized register classes

I posted a patch earlier that switches the API to one that supports this
(as yet non-existent) functionality:
https://reviews.llvm.org/D24631

The comments from that were incorporated into the following RFC.


Motivation:

Certain targets feature "variable-sized" registers, i.e. a situation 
where the register size can be configured by a hardware switch.  A 
common instruction set would then operate on these registers regardless 
of what size they have been configured to have.  A specific example of 
that is the HVX coprocessor on Hexagon. HVX provides a set of vector 
registers, and can be configured in one of two modes: one in which
vectors are 512 bits long, and one in which vectors are 1024 bits long.
The size only determines the number of elements in the vector register,
and so the semantics of each HVX instruction do not change: it
performs a given operation on all vector elements. The encoding of the
instruction does not change between modes; in fact, it is possible to
have a binary that runs in both modes.

Currently the register size (strictly speaking, "spill slot size") and
related properties are fixed and immutable in a RegisterClass. In order 
to allow multiple possible register sizes, several RegisterClass objects 
may need to be defined, which then will require each instruction to be 
defined twice. This is what the HVX code does.  Another approach may be 
to define several sets of physical registers corresponding to different 
sizes, and have a large RegisterClass which would be the union of all of 
them. This could avoid having to duplicate the instructions, but would 
lead to problems with getting the actual spill slot size or alignment.

Since the number of targets allowing this kind of variability is growing
(besides Hexagon, there are RISC-V, MIPS, and out-of-tree targets such
as CHERI), LLVM should allow convenient handling of this type of
situation. See comments in https://reviews.llvm.org/D23561 for more details.


General approach:

1. Introduce a concept of a hardware "mode". This "mode" should be
immutable, that is, it should be treated as a fixed property of the
hardware throughout the execution of the program being compiled. This is
different from, for example, floating point rounding mode, which can be
changed at run-time.  In LLVM, the mode would be determined by subtarget
features (reflected in TargetSubtargetInfo).

2. Move the register/spill size and alignment information from 
MCRegisterClass, and into TargetRegisterInfo. This means that this data 
will no longer be available to the MC layer. Note that the 
size/alignment information will be provided by the TargetRegisterInfo 
object, and not by each individual TargetRegisterClass. A 
TargetRegisterInfo object would be created for a specific hardware mode, 
so that it would be able to provide the necessary information without 
having to consult TargetSubtargetInfo.

3. Introduce TableGen support for specifying instruction selection 
patterns involving data types depending on the hardware mode.

4. Require that the sub-/super-class relationships between register 
classes are the same across all hardware modes.
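
As a rough illustration of point 2, the following C++ sketch shows a
register-info object that is constructed for one hardware mode and then
answers size/alignment queries without consulting the subtarget again.
Every name in it (RegInfoSketch, HwModeId, RegClassId, SizeTable) is
hypothetical and not an LLVM API; the numbers are taken from the
Mode64/Mode128 example used in this RFC.

```cpp
#include <cassert>
#include <cstddef>

// All names below are hypothetical; none of them are LLVM APIs.
enum HwModeId : std::size_t { Mode64 = 0, Mode128 = 1, NumModes = 2 };
enum RegClassId : std::size_t { MyIntReg = 0, NumRegClasses = 1 };

struct SizeInfo {
  unsigned RegSize;        // same units as the RFC's example values
  unsigned SpillSize;
  unsigned SpillAlignment;
};

// One row per hardware mode, one column per register class.
static const SizeInfo SizeTable[NumModes][NumRegClasses] = {
    {{64, 64, 64}},    // Mode64
    {{128, 128, 128}}, // Mode128
};

// The mode is fixed at construction time, so queries only need the
// register class.
class RegInfoSketch {
public:
  explicit RegInfoSketch(HwModeId M) : Mode(M) {
    assert(M < NumModes && "unknown hardware mode");
  }

  unsigned getRegSize(RegClassId RC) const { return entry(RC).RegSize; }
  unsigned getSpillSize(RegClassId RC) const { return entry(RC).SpillSize; }
  unsigned getSpillAlignment(RegClassId RC) const {
    return entry(RC).SpillAlignment;
  }

private:
  const SizeInfo &entry(RegClassId RC) const {
    assert(RC < NumRegClasses && "unknown register class");
    return SizeTable[Mode][RC];
  }

  HwModeId Mode;
};
```

In this sketch the register class itself carries no size, which mirrors
the idea that the data is provided by the register-info object rather
than by each individual register class.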


The largest impact of this change would be on TableGen, since it needs 
to be aware of the fact that value types under consideration would 
depend on a hardware mode. For example, when having an add-registers 
instruction defined to work on 64-bit registers, providing an additional 
selection pattern for 128-bit registers would present difficulties:

   def AddReg : Instruction {
     let OutOperandList = (outs GPR64:$Rd);
     let InOperandList = (ins GPR64:$Rs, GPR64:$Rt);
     let Pattern = [(set GPR64:$Rd, (add GPR64:$Rs, GPR64:$Rt))];
   }

the pattern

   def: Pat<(add GPR128:$Rs, GPR128:$Rt), (AddReg $Rs, $Rt)>;

would result in a type inference error from TableGen. If the class
GPR64 were amended to also allow the value type i128, TableGen would no
longer complain, but might generate invalid instruction selection code.

To solve this, TableGen would need to be aware of the association 
between value types and hardware modes. The rest of this proposal 
describes the programming interface to provide necessary information to 
TableGen.

1. Define a mode class. It will be recognized by TableGen as having a 
special meaning.

   class HwMode<list<Predicate> Ps> {
     // List of Predicate objects that determine whether this mode
     // applies. This is used in situations where the code generated by
     // TableGen, as opposed to TableGen itself, needs to determine this,
     // for example in the isel pattern-matching code.
     list<Predicate> ModeDef = Ps;
   }

From the point of view of the code generated by TableGen, HwMode is 
equivalent to a list of Predicate objects. The difference is in how 
TableGen itself treats it: TableGen will distinguish two objects of 
class HwMode if they have different names, regardless of what sets of 
predicates they contain. One way to think of it is that the name of the 
object would serve as a tag denoting the hardware mode.

In the example of the AddReg instruction, we could define two modes:

   def Mode64: HwMode<[...]>;
   def Mode128: HwMode<[...]>;

but so far there would not be much more that we could do.

2. To make use of the mode information, provide a class to associate a 
HwMode object with a particular value. This will be done by having two 
lists: one with HwMode objects and another with the corresponding 
values.  Since TableGen does not provide a way to define class templates 
(in the same sense as C++ does), the actual interface will be split in 
two parts.  First is the "mode selection" base class:

   class HwModeSelect<list<HwMode> Ms> {
     list<HwMode> Modes = Ms;  // List of unique hw modes.
   }

This will be a "built-in" class for TableGen. It will be a base class,
and treated as "abstract" since it only contains half of the
information.  Each derived class would then need to define a member
"Values", which is a list of corresponding values, of the same length
as the list of modes.  The following definitions will be useful for
defining register classes and selection patterns:

   class IntSelect<list<HwMode> Ms, list<int> Is>
       : HwModeSelect<Ms> {
     // Select an integer literal.
     list<int> Values = Is;
   }

   class ValueTypeSelect<list<HwMode> Ms, list<ValueType> Ts>
       : HwModeSelect<Ms> {
     // Select a value type.
     list<ValueType> Values = Ts;
   }

   class ValueTypeListSelect<list<HwMode> Ms, list<list<ValueType>> Ls>
       : HwModeSelect<Ms> {
     // Select a list of value types.
     list<list<ValueType>> Values = Ls;
   }
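
The two parallel lists can be pictured with a small C++ sketch. It is
only an analogy for how a consumer of the "select" classes might look up
the value for the active mode; ModeSelectSketch and its members are
hypothetical names, and modes are represented by their names, in line
with the idea that the name serves as the tag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical analogue of HwModeSelect plus a "Values" list: entry I of
// Values belongs to entry I of Modes.
template <typename T> class ModeSelectSketch {
public:
  ModeSelectSketch(std::vector<std::string> Ms, std::vector<T> Vs)
      : Modes(std::move(Ms)), Values(std::move(Vs)) {
    assert(Modes.size() == Values.size() &&
           "mode and value lists must have the same length");
  }

  // Returns the value associated with the named mode, or nullptr if the
  // mode is not listed.
  const T *select(const std::string &Mode) const {
    for (std::size_t I = 0; I != Modes.size(); ++I)
      if (Modes[I] == Mode)
        return &Values[I];
    return nullptr;
  }

private:
  std::vector<std::string> Modes;
  std::vector<T> Values;
};
```

For instance, `ModeSelectSketch<int>({"Mode64", "Mode128"}, {64, 128})`
would play the role of an IntSelect holding a register size per mode.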

3. The class RegisterClass would get new members to hold the 
configurable size/alignment information. If defined, they would take 
precedence over the existing members RegTypes/Size/Alignment.

   class RegisterClass {
     ...
     ValueTypeListSelect VarRegTypes;  // The names of these members
     IntSelect VarRegSize;             // could likely be improved...
     IntSelect VarSpillSize;           //
     IntSelect VarSpillAlignment;      //
   }


To fully implement the AddReg instruction, the target would then define 
the register class:

   class MyRegisterClass : RegisterClass<...> {
     let VarRegTypes = ValueTypeListSelect<[Mode64, Mode128],
             [[i64, v2i32, v4i16, v8i8],             // Mode64
              [i128, v2i64, v4i32, v8i16, v16i8]]>;  // Mode128
     let VarRegSize = IntSelect<[Mode64, Mode128], [64, 128]>;
     let VarSpillSize = IntSelect<[Mode64, Mode128], [64, 128]>;
     let VarSpillAlignment = IntSelect<[Mode64, Mode128], [64, 128]>;
   }

   def MyIntReg: MyRegisterClass { ... };

And following that, the instruction:

   def AddReg: Instruction {
     let OutOperandList = (outs MyIntReg:$Rd);
     let InOperandList = (ins MyIntReg:$Rs, MyIntReg:$Rt);
     let AsmString = "add $Rd, $Rs, $Rt";
     let Pattern = [(set MyIntReg:$Rd, (add MyIntReg:$Rs,
                                            MyIntReg:$Rt))];
   }




-Krzysztof


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Sean Silva via llvm-dev

2016-Sep-23 20:01 UTC

On Tue, Sep 20, 2016 at 10:32 AM, Krzysztof Parzyszek via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Since the number of targets allowing this kind of variability is growing
> (besides Hexagon, there is RISC-V, MIPS, and out of tree targets, such as
> CHERI), LLVM should allow convenient handling of this type of a situation.
> See comments in https://reviews.llvm.org/D23561 for more details.
>
ARM SVE sounds like it will have similar issues:
https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture

-- Sean Silva


Matthias Braun via llvm-dev

2016-Sep-23 20:08 UTC

> On Sep 23, 2016, at 1:01 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Tue, Sep 20, 2016 at 10:32 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> ARM SVE sounds like it will have similar issues:
> https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture

From glancing over the slides, it seems like SVE has dynamically sized
registers (i.e. the size is not yet known at compile time), which would be
a step further than this. Of course, the stuff in here wouldn't hurt for
that, as it pushes the code in a direction that relies less on
well-known/fixed register sizes.

- Matthias

Alex Bradbury via llvm-dev

2016-Sep-24 12:20 UTC

On 20 September 2016 at 18:32, Krzysztof Parzyszek via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I have posted a patch that switches the API to one that supports this (yet
> non-existent functionality) earlier:
> https://reviews.llvm.org/D24631
>
> The comments from that were incorporated into the following RFC.

Thank you for writing this up. Your proposal is now much clearer to me.

> 1. Introduce a concept of a hardware "mode". This "mode" should be
> immutable, that is, it should be treated as a fixed property of the hardware
> throughout the execution of the program being compiled. This is different
> from, for example, floating point rounding mode, which can be changed at
> run-time.  In LLVM, the mode would be determined by subtarget features
> (reflected in TargetSubtargetInfo).
>
> 2. Move the register/spill size and alignment information from
> MCRegisterClass, and into TargetRegisterInfo. This means that this data will
> no longer be available to the MC layer. Note that the size/alignment
> information will be provided by the TargetRegisterInfo object, and not by
> each individual TargetRegisterClass. A TargetRegisterInfo object would be
> created for a specific hardware mode, so that it would be able to provide
> the necessary information without having to consult TargetSubtargetInfo.

Having thought about it somewhat, the ways that come to my mind of
approaching this problem are:

* Put up with the code duplication and duplicate everything for
different register classes (current approach taken by in-tree
backends)
* Make use of a multiclass to define multiple instructions with
minimal duplication. I trialled this, but only on a RISC-V
InstrInfo.td that doesn't yet support codegen
https://reviews.llvm.org/P7637
* Use a for loop in tablegen and some !cast<> magic to do something
with a similar effect to the multiclass approach
* Extend TableGen with some sort of AST macro support that would,
again, allow you to generate a second (and third..) version of each
instruction with a different RegisterClass substituted
* Add support for implicit parameterisation. e.g. allowing def MyRC :
Predicated<Is32Bit, GPR32, GPR64>. Invasive and complex, but still an
option.
* Adding support for variable-sized register classes, as you've done
here. This definitely feels like the least invasive and is potentially
less fiddly than using multiclasses.

> 1. Define a mode class. It will be recognized by TableGen as having a
> special meaning.
>
>   class HwMode<list<Predicate> Ps> {
>     // List of Predicate objects that determine whether this mode
>     // applies. This is used for situation where the code generated by
>     // TableGen needs to determine this, as opposed to TableGen itself,
>     // for example in the isel pattern-matching code.
>     list<Predicate> ModeDef = Ps;
>   }
<snip>
> 2. To make a use of the mode information, provide a class to associate a
> HwMode object with a particular value. This will be done by having two
> lists: one with HwMode objects and another with the corresponding values.
> Since TableGen does not provide a way to define class templates (in the same
> sense as C++ does), the actual interface will be split in two parts.  First
> is the "mode selection" base class:
>
>   class HwModeSelect<list<HwMode> Ms> {
>     list<HwMode> Modes;  // List of unique hw modes.
>   }
>
> This will be a "built-in" class for TableGen. It will be a base class, and
> treated as "abstract" since it only contains half of the information.
<snip>
> 3. The class RegisterClass would get new members to hold the configurable
> size/alignment information. If defined, they would take precedence over the
> existing members RegTypes/Size/Alignment.
>
>   class RegisterClass {
>     ...
>     ValueTypeListSelect VarRegTypes;  // The names of these members
>     IntSelect VarRegSize;             // could likely be improved...
>     IntSelect VarSpillSize;           //
>     IntSelect VarSpillAlignment       //
>   }
>
>
> To fully implement the AddReg instruction, the target would then define the
> register class:
>
>   class MyRegisterClass : RegisterClass<...> {
>     let VarRegTypes = ValueTypeListSelect<[Mode64, Mode128],
>             [[i64, v2i32, v4i16, v8i8],             // Mode64
>              [i128, v2i64, v4i32, v8i16, v16i8]]>;  // Mode128
>     let VarRegSize = IntSelect<[Mode64, Mode128], [64, 128]>;
>     let VarSpillSize = IntSelect<[Mode64, Mode128], [64, 128]>;
>     let VarSpillAlignment = IntSelect<[Mode64, Mode128], [64, 128]>;
>   }
>
>   def MyIntReg: MyRegisterClass { ... };

My concern is that all of the above adds yet more complexity to what
is already (in my view) a fairly difficult part of LLVM to understand.
The definition of MyRegisterClass is not so bad though, and perhaps it
doesn't matter how it works under the hood to the average backend
writer.

What if RegisterClass contained a `list<RCInfo>`? Each RCInfo contains
RegTypes, RegSize, SpillSize, and SpillAlignment as well as a
Predicate that determines whether this individual RCInfo is the one
that should apply. To my taste this seems easier to understand than
the {Int,ValueType,ValueTypeList}Select mechanism.

def Is64Bit : Predicate<"Subtarget->is64Bit()">;
def RCInfo64 : RCInfo<Is64Bit> {
  let RegTypes = [i64, v2i32, v4i16, v8i8];
  .....
}

class MyRegisterClass : RegisterClass<...> {
  let RCInfos = [RCInfo32, RCInfo64];
}

Then for e.g. RISC-V I might end up with one GPR RegisterClass that
contains RCInfo for 32-bit and 64-bit, which is used in the definition
of all instructions. I might also want to define an explicit GPR32
RegisterClass for use with instructions like ADDW, where the two input
operands will always come from the 32-bit subregisters.
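
To make the RCInfo idea concrete, here is a rough C++ sketch of the
lookup it implies: walk the list and take the first entry whose
predicate holds for the current subtarget. All names (SubtargetSketch,
RCInfoSketch, pickRCInfo, exampleInfos) are hypothetical, the "first
match wins" rule is an assumption, and the 32/64 values are placeholders
rather than data from any real target.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the subtarget queried by the predicates.
struct SubtargetSketch {
  bool Is64Bit;
};

// Hypothetical analogue of one RCInfo entry: a predicate plus the
// size/alignment data that applies when the predicate holds.
struct RCInfoSketch {
  std::function<bool(const SubtargetSketch &)> Applies;
  unsigned RegSize;
  unsigned SpillSize;
  unsigned SpillAlignment;
};

// Returns the first entry whose predicate holds (assumed rule), or
// nullptr if no entry applies.
inline const RCInfoSketch *pickRCInfo(const std::vector<RCInfoSketch> &Infos,
                                      const SubtargetSketch &ST) {
  for (const RCInfoSketch &Info : Infos)
    if (Info.Applies(ST))
      return &Info;
  return nullptr;
}

// Placeholder data playing the role of [RCInfo32, RCInfo64].
inline std::vector<RCInfoSketch> exampleInfos() {
  return {
      {[](const SubtargetSketch &ST) { return !ST.Is64Bit; }, 32, 32, 32},
      {[](const SubtargetSketch &ST) { return ST.Is64Bit; }, 64, 64, 64},
  };
}
```

Note that in this sketch the choice is made when the predicates are
evaluated against a subtarget, that is, at the run-time of the generated
code rather than while the description is being processed.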

Alex

Krzysztof Parzyszek via llvm-dev

2016-Sep-24 13:12 UTC

On 9/24/2016 7:20 AM, Alex Bradbury wrote:
> My concern is that all of the above adds yet more complexity to what
> is already (in my view) a fairly difficult part of LLVM to understand.
> The definition of MyRegisterClass is not so bad though, and perhaps it
> doesn't matter how it works under the hood to the average backend
> writer.

I agree with the complexity, but I would hope that more documentation,
examples and explanations would clarify it.

> What if RegisterClass contained a `list<RCInfo>`. Each RCInfo contains
> RegTypes, RegSize, SpillSize, and SpillAlignment as well as a
> Predicate the determines whether this individual RCInfo is the one
> that should apply. To my taste this seems easier to understand than
> the {Int,ValueType,ValueTypeList}Select mechanism.

The "select" mechanism was intended to be extendable to be able to
select any object of any type based on the predefined mode. It is
entirely possible to use it in a similar way to what you describe below.

> def Is64Bit : Predicate<"Subtarget->is64Bit()">;
> def RCInfo64 : RCInfo<Is64Bit> {
>   let RegTypes = [i64, v2i32, v4i16, v8i8];
>   .....
> }
>
> class MyRegisterClass : RegisterClass<...> {
>   let RCInfos = [RCInfo32, RCInfo64]
> }

With the RCInfo data, the new register class definition would be
something like

class MyRegisterClass : RegisterClass<...> {
   let RCInfos = HwModeSelect<[Is32Bit,  Is64Bit,  Is128Bit],
                              [RCInfo32, RCInfo64, RCInfo128]>;
}

In either case, aggregating the info in an RCInfo class would require
additional changes in TableGen so that it picks up the
size/alignment/type data from the RCInfos list, instead of from
individual members. This is doable and there are no technical barriers
to doing it. It may actually be a good idea, since it would isolate that
part of the register class definition into a single object.

On a side note---there is a distinction between "mode" and "predicate":
modes are distinguished by name, which is necessary because they need to 
be distinguishable during the run-time of TableGen. Predicates are 
evaluated after TableGen is done, during the run-time of the code 
generated by it. I didn't want to differentiate predicates based on 
their names, since that would go against expectations of how predicates 
have behaved so far.
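
The mode/predicate distinction can be illustrated with a short C++
sketch. It is only an analogy: HwModeSketch, FeatureBits, isSameMode and
modeApplies are hypothetical names, and treating the predicate list as a
conjunction is an assumption. Identity is decided by name alone (the
"TableGen time" view), while applicability is decided by evaluating the
predicates later (the "run time" view).

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for whatever the predicates inspect at run time.
struct FeatureBits {
  bool Wide;
};

using PredicateFn = std::function<bool(const FeatureBits &)>;

// Hypothetical analogue of an HwMode record: a name acting as the tag,
// plus the predicates that define when the mode applies.
struct HwModeSketch {
  std::string Name;
  std::vector<PredicateFn> ModeDef;
};

// "TableGen time": two modes are the same only if their names match,
// regardless of which predicates they contain.
inline bool isSameMode(const HwModeSketch &A, const HwModeSketch &B) {
  return A.Name == B.Name;
}

// "Run time": the mode applies if every predicate holds (assumed to be a
// conjunction for this sketch).
inline bool modeApplies(const HwModeSketch &M, const FeatureBits &FB) {
  for (const PredicateFn &P : M.ModeDef)
    if (!P(FB))
      return false;
  return true;
}
```

Two modes built from the same predicate list are therefore still
different modes in this sketch, even though they apply under exactly the
same run-time conditions.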

-Krzysztof

Alex Elsayed via llvm-dev

2016-Sep-27 00:03 UTC

On Fri, 23 Sep 2016 13:01:47 -0700, Sean Silva via llvm-dev wrote:
> On Tue, Sep 20, 2016 at 10:32 AM, Krzysztof Parzyszek via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> 
>> I have posted a patch that switches the API to one that supports this
>> (yet non-existent functionality) earlier:
>> https://reviews.llvm.org/D24631
>>
>> The comments from that were incorporated into the following RFC.
>>
>>
>> Motivation:
>>
>> Certain targets feature "variable-sized" registers, i.e. a situation
>> where the register size can be configured by a hardware switch.  A
>> common instruction set would then operate on these registers regardless
>> of what size they have been configured to have.
<snip>


One thing I'll note is that the RISC-V "V" (Vector) extension is likely
to work this feature very hard indeed - see the following
papers/slides/talks:
"A Case for MVPs: Mixed-Precision Vector Processors"
http://hwacha.org/papers/hwacha-mvp-prism2014.pdf

"2nd RISC-V Workshop: Vector Extension Proposal"
http://riscv.wpengine.com/wp-content/uploads/2015/06/riscv-vector-workshop-june2015.pdf
https://youtu.be/NlZr19lFxRg

In such a design, it's very likely that the width of the registers in the 
vector processor may change between individual stripmine loops - that is, 
in fact, rather the point.

Krzysztof Parzyszek via llvm-dev

2016-Oct-04 18:50 UTC

If there are no objections, I'd like to start working on this soon...

For the AMDGPU target this implies that RC->getSize will no longer be 
available in the MC layer.

-Krzysztof


On 9/20/2016 12:32 PM, Krzysztof Parzyszek wrote:
> I have posted a patch that switches the API to one that supports this
> (yet non-existent functionality) earlier:
> https://reviews.llvm.org/D24631
>
> The comments from that were incorporated into the following RFC.
>
>
> Motivation:
>
> Certain targets feature "variable-sized" registers, i.e. a situation
> where the register size can be configured by a hardware switch.  A
> common instruction set would then operate on these registers regardless
> of what size they have been configured to have.  A specific example of
> that is the HVX coprocessor on Hexagon. HVX provides a set of vector
> registers, and can be configured in one of two modes: one in which
> vectors are 512 bits long, and one in vectors are 1024 bits in length.
> The size only determines the number of elements in the vector register,
> and so the semantics of each HVX instruction does not change: it
> performs a given operation on all vector elements. The encoding of the
> instruction does not change between modes, in fact, it is possible to
> have a binary that runs in both modes.
>
> Currently the register size (strictly speaking, "spill slot
size") and
> related properties are fixed and immutable in a RegisterClass. In order
> to allow multiple possible register sizes, several RegisterClass objects
> may need to be defined, which then will require each instruction to be
> defined twice. This is what the HVX code does.  Another approach may be
> to define several sets of physical registers corresponding to different
> sizes, and have a large RegisterClass which would be the union of all of
> them. This could avoid having to duplicate the instructions, but would
> lead to problems with getting the actual spill slot size or alignment.
>
> Since the number of targets allowing this kind of variability is growing
> (besides Hexagon, there is RISC-V, MIPS, and out of tree targets, such
> as CHERI), LLVM should allow convenient handling of this type of a
> situation. See comments in https://reviews.llvm.org/D23561 for more
> details.
>
>
> General approach:
>
> 1. Introduce a concept of a hardware "mode". This
"mode" should be
> immutable, that is, it should be treated as a fixed property of the
> hardware throughout the execution of the program being compiled. This is
> different from, for example, floating point rounding mode, which can be
> changed at run-time.  In LLVM, the mode would be determined by subtarget
> features (reflected in TargetSubtargetInfo).
>
> 2. Move the register/spill size and alignment information from
> MCRegisterClass, and into TargetRegisterInfo. This means that this data
> will no longer be available to the MC layer. Note that the
> size/alignment information will be provided by the TargetRegisterInfo
> object, and not by each individual TargetRegisterClass. A
> TargetRegisterInfo object would be created for a specific hardware mode,
> so that it would be able to provide the necessary information without
> having to consult TargetSubtargetInfo.
>
> 3. Introduce TableGen support for specifying instruction selection
> patterns involving data types depending on the hardware mode.
>
> 4. Require that the sub-/super-class relationships between register
> classes are the same across all hardware modes.
>
>
> The largest impact of this change would be on TableGen, since it needs
> to be aware of the fact that value types under consideration would
> depend on a hardware mode. For example, when having an add-registers
> instruction defined to work on 64-bit registers, providing an additional
> selection pattern for 128-bit registers would present difficulties:
>
>   def AddReg : Instruction {
>     let OutOperandList = (outs GPR64:$Rd);
>     let InOperandList = (ins GPR64:$Rs, GPR64:$Rt);
>     let Pattern = [(set GPR64:$Rd, (add GPR64:$Rs, GPR64:$Rt))]>;
>   }
>
> the pattern
>
>   def: Pat<(add GPR128:$Rs, GPR128:$Rt), (AddReg $Rs, $Rt)>;
>
> would result in a type interference error from TableGen. If the class
> GPR64 was amended to also allow the value type i128, TableGen would no
> longer complain, but may generate invalid instruction selection code.
>
> To solve this, TableGen would need to be aware of the association
> between value types and hardware modes. The rest of this proposal
> describes the programming interface to provide necessary information to
> TableGen.
>
> 1. Define a mode class. It will be recognized by TableGen as having a
> special meaning.
>
>   class HwMode<list<Predicate> Ps> {
>     // List of Predicate objects that determine whether this mode
>     // applies. This is used for situation where the code generated by
>     // TableGen needs to determine this, as opposed to TableGen itself,
>     // for example in the isel pattern-matching code.
>     list<Predicate> ModeDef = Ps;
>   }
>
> From the point of view of the code generated by TableGen, HwMode is
> equivalent to a list of Predicate objects. The difference is in how
> TableGen itself treats it: TableGen will distinguish two objects of
> class HwMode if they have different names, regardless of what sets of
> predicates they contain. One way to think of it is that the name of the
> object would serve as a tag denoting the hardware mode.
>
> In the example of the AddReg instruction, we could define two modes:
>
>   def Mode64: Mode<[...]>;
>   def Mode128: Mode<[...]>;
>
> but so far there would not be much more that we could do.
>
> 2. To make a use of the mode information, provide a class to associate a
> HwMode object with a particular value. This will be done by having two
> lists: one with HwMode objects and another with the corresponding
> values.  Since TableGen does not provide a way to define class templates
> (in the same sense as C++ does), the actual interface will be split in
> two parts.  First is the "mode selection" base class:
>
>   class HwModeSelect<list<HwMode> Ms> {
>     list<HwMode> Modes;  // List of unique hw modes.
>   }
>
> This will be a "built-in" class for TableGen. It will be a base
class,
> and treated as "abstract" since it only contains half of the
> information.  Each derived class would then need to define a member
> "Values", which is a list of corresponding values, of the same
length as
> the list of modes.  The following definitions will be useful for
> defining register classes and selection patterns:
>
>   class IntSelect<list<Mode> Ms, list<int> Is>
>       : HwModeSelect<Ms> {
>     // Select an integer literal.
>     list<int> Values = Is;
>   }
>
>   class ValueTypeSelect<list<Mode> Ms, list<ValueType>
Ts>
>       : HwModeSelect<Ms> {
>     // Select a value type.
>     list<ValueType> Values = Ts;
>   }
>
>   class ValueTypeListSelect<list<Mode> Ms,
list<list<ValueType>> Ls>
>       : HwModeSelect<Ms> {
>     // Select a list of value types.
>     list<list<ValueType>> Values = Ls;
>   }
>
> 3. The class RegisterClass would get new members to hold the
> configurable size/alignment information. If defined, they would take
> precedence over the existing members RegTypes/Size/Alignment.
>
>   class RegisterClass {
>     ...
>     ValueTypeListSelect VarRegTypes;  // The names of these members
>     IntSelect VarRegSize;             // could likely be improved...
>     IntSelect VarSpillSize;           //
>     IntSelect VarSpillAlignment       //
>   }
>
>
> To fully implement the AddReg instruction, the target would then define
> the register class:
>
>   class MyRegisterClass : RegisterClass<...> {
>     let VarRegTypes = ValueTypeListSelect<[Mode64, Mode128],
>             [[i64, v2i32, v4i16, v8i8],             // Mode64
>              [i128, v2i64, v4i32, v8i16, v16i8]]>;  // Mode128
>     let VarRegSize = IntSelect<[Mode64, Mode128], [64, 128]>;
>     let VarSpillSize = IntSelect<[Mode64, Mode128], [64, 128]>;
>     let VarSpillAlignment = IntSelect<[Mode64, Mode128], [64, 128]>;
>   }
>
>   def MyIntReg: MyRegisterClass { ... };
>
> And following that, the instruction:
>
>   def AddReg: Instruction {
>     let OutOperandList = (outs MyIntReg:$Rd);
>     let InOperandList = (ins MyIntReg:$Rs, MyIntReg:$Rt);
>     let AsmString = "add $Rd, $Rs, $Rt";
>     let Pattern = [(set MyIntReg:$Rd, (add MyIntReg:$Rs,
>                                            MyIntReg:$Rt))]>;
>   }
>
>
>
>
> -Krzysztof
>
>
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Alex Bradbury via llvm-dev

2016-Oct-08 19:52 UTC

[llvm-dev] RFC: Implement variable-sized register classes

On 4 October 2016 at 19:50, Krzysztof Parzyszek via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> If there are no objections, I'd like to start working on this soon...
>
> For the AMDGPU target this implies that RC->getSize will no longer be
> available in the MC layer.

Another advantage of this work that hasn't been mentioned yet is that it
will reduce the number of uses of isCodeGenOnly. The comment in
Target.td indicates the long-term plan is to remove the distinction
between isPseudo and isCodeGenOnly.

An issue closely related to variable-sized register classes is the case
where you have multiple registers with the same AsmName. This crops up in
the same kind of cases where you have multiple instructions with the
same encoding. Without a workaround, an assert is tripped in
llvm-tblgen when trying to produce a StringSwitch for
MatchRegisterName. The solution in Mips, PPC and others seems to
involve the generation of MatchRegisterName. What has been discussed
so far with regards to HwMode and variable-size register classes
points to a solution, but I don't think it's quite enough. Options
include:

1. Only have one set of register definitions, and have the variable
sized register class determine the bit width. The problem is there are
often some instructions where I think you need to have registers
modelled as subregisters. e.g. SLLW, ADDW etc in 64-bit RISC-V. These
operate on 32-bit values and write the results sign-extended to the
target 64-bit register.

2. Define both the 64-bit registers and the 32-bit subregisters, but
make MatchRegisterName's behaviour change based on the HwMode. This
works around the fact there are multiple registers with the same
AsmName. Although I doubt this would actually cause problems, this
still isn't quite right. For an `SLLIW x1, x2, 5` I think the correct
interpretation would have x1 as a 64-bit target register and x2 as the
32-bit subregister that happens to have the same AsmName as the 64-bit
x2 register.
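
For readers unfamiliar with the *W instructions mentioned above, the following C++ sketch models their semantics on RV64: the operation is done on the low 32 bits and the 32-bit result is sign-extended into the 64-bit destination. The function names are made up for this example; only the arithmetic is meant to mirror the ISA.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit result into a 64-bit register value.
// (The unsigned-to-signed narrowing is modular on all mainstream
// compilers and guaranteed to be so from C++20 onwards.)
std::int64_t signExtend32(std::uint32_t V) {
  return static_cast<std::int64_t>(static_cast<std::int32_t>(V));
}

// ADDW: add the low 32 bits of both sources, sign-extend the result.
std::int64_t addw(std::int64_t Rs1, std::int64_t Rs2) {
  std::uint32_t Sum =
      static_cast<std::uint32_t>(Rs1) + static_cast<std::uint32_t>(Rs2);
  return signExtend32(Sum);
}

// SLLIW: shift the low 32 bits of the source left by a 5-bit immediate,
// sign-extend the result.
std::int64_t slliw(std::int64_t Rs1, unsigned Shamt) {
  assert(Shamt < 32 && "SLLIW takes a 5-bit shift amount");
  return signExtend32(static_cast<std::uint32_t>(Rs1) << Shamt);
}
```

The source is read as a 32-bit quantity while the destination is written as a full 64-bit register, which is the asymmetry that makes a single set of register definitions awkward.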

Have you thought about how the HwMode/variable-sized register class
proposal might interact with register AsmNames at all?
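
To illustrate option 2, here is a minimal C++ sketch of a register-name matcher whose result depends on the hardware mode. The enums and the function signature are assumptions made for this example; the matcher that llvm-tblgen generates takes only the name and has no notion of a mode.

```cpp
#include <string>

// Hypothetical mode and register enums, for illustration only.
enum class HwModeId { RV32, RV64 };

enum class Reg {
  NoRegister, // returned when the name does not match anything
  X1_32, X2_32, // 32-bit (sub)registers
  X1_64, X2_64  // 64-bit registers sharing the same AsmNames
};

// Two registers share each AsmName; the mode picks which one a name means.
Reg matchRegisterName(const std::string &Name, HwModeId Mode) {
  const bool Is64 = (Mode == HwModeId::RV64);
  if (Name == "x1")
    return Is64 ? Reg::X1_64 : Reg::X1_32;
  if (Name == "x2")
    return Is64 ? Reg::X2_64 : Reg::X2_32;
  return Reg::NoRegister;
}
```

A matcher like this resolves each name to exactly one register per mode, so in 64-bit mode the x2 operand of `SLLIW x1, x2, 5` would come back as the 64-bit register rather than its 32-bit subregister, which is the imprecision described in option 2.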

This old patch that never landed
<http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20141201/246835.html>
is, I think, also related. Backends like Mips and PPC end up defining
RegisterOperand with a ParserMatchClass (in the Mips case, this
specifies the 'parseAnyRegister' ParserMethod). Adding a
ParserMatchClass field to RegisterClass would be a minor
simplification.

Best,

Alex
