Krzysztof Parzyszek via llvm-dev
2016-Sep-20 17:32 UTC
[llvm-dev] RFC: Implement variable-sized register classes
I have posted a patch that switches the API to one that supports this (as yet non-existent) functionality earlier:
https://reviews.llvm.org/D24631

The comments from that were incorporated into the following RFC.


Motivation:

Certain targets feature "variable-sized" registers, i.e. a situation where the register size can be configured by a hardware switch. A common instruction set would then operate on these registers regardless of what size they have been configured to have. A specific example of that is the HVX coprocessor on Hexagon. HVX provides a set of vector registers, and can be configured in one of two modes: one in which vectors are 512 bits long, and one in which vectors are 1024 bits long. The size only determines the number of elements in the vector register, and so the semantics of each HVX instruction do not change: it performs a given operation on all vector elements. The encoding of the instruction does not change between modes; in fact, it is possible to have a binary that runs in both modes.

Currently the register size (strictly speaking, "spill slot size") and related properties are fixed and immutable in a RegisterClass. In order to allow multiple possible register sizes, several RegisterClass objects may need to be defined, which then will require each instruction to be defined twice. This is what the HVX code does. Another approach may be to define several sets of physical registers corresponding to different sizes, and have a large RegisterClass which would be the union of all of them. This could avoid having to duplicate the instructions, but would lead to problems with getting the actual spill slot size or alignment.

Since the number of targets allowing this kind of variability is growing (besides Hexagon, there is RISC-V, MIPS, and out of tree targets, such as CHERI), LLVM should allow convenient handling of this type of a situation. See comments in https://reviews.llvm.org/D23561 for more details.


General approach:

1. Introduce a concept of a hardware "mode". This "mode" should be immutable, that is, it should be treated as a fixed property of the hardware throughout the execution of the program being compiled. This is different from, for example, floating point rounding mode, which can be changed at run-time. In LLVM, the mode would be determined by subtarget features (reflected in TargetSubtargetInfo).

2. Move the register/spill size and alignment information out of MCRegisterClass and into TargetRegisterInfo. This means that this data will no longer be available to the MC layer. Note that the size/alignment information will be provided by the TargetRegisterInfo object, and not by each individual TargetRegisterClass. A TargetRegisterInfo object would be created for a specific hardware mode, so that it would be able to provide the necessary information without having to consult TargetSubtargetInfo.

3. Introduce TableGen support for specifying instruction selection patterns involving data types depending on the hardware mode.

4. Require that the sub-/super-class relationships between register classes are the same across all hardware modes.


The largest impact of this change would be on TableGen, since it needs to be aware of the fact that value types under consideration would depend on a hardware mode. For example, when having an add-registers instruction defined to work on 64-bit registers, providing an additional selection pattern for 128-bit registers would present difficulties:

  def AddReg : Instruction {
    let OutOperandList = (outs GPR64:$Rd);
    let InOperandList = (ins GPR64:$Rs, GPR64:$Rt);
    let Pattern = [(set GPR64:$Rd, (add GPR64:$Rs, GPR64:$Rt))];
  }

the pattern

  def: Pat<(add GPR128:$Rs, GPR128:$Rt), (AddReg $Rs, $Rt)>;

would result in a type inference error from TableGen. If the class GPR64 was amended to also allow the value type i128, TableGen would no longer complain, but may generate invalid instruction selection code.
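The inference problem described above can be pictured with a toy model in which the value type of an operand is inferred by intersecting the type sets allowed by each use of that operand. This is only a sketch under that assumption and not TableGen's actual algorithm; the function `infer_types` and the Python sets are hypothetical, and only the names GPR64, GPR128, i64 and i128 come from the RFC.

```python
# Toy model only: value-type inference as set intersection.
# GPR64/GPR128 mirror the RFC's example; everything else is hypothetical.

def infer_types(*constraints):
    """Intersect the value-type sets that each use of an operand allows."""
    result = set(constraints[0])
    for allowed in constraints[1:]:
        result &= set(allowed)
    return result

GPR64 = {"i64"}
GPR128 = {"i128"}

# The instruction declares $Rs as GPR64, the extra pattern uses GPR128:
# no common type remains, which corresponds to the inference error.
conflict = infer_types(GPR64, GPR128)

# Amending GPR64 to also accept i128 makes the sets overlap ...
GPR64_amended = {"i64", "i128"}
pattern_types = infer_types(GPR64_amended, GPR128)

# ... but the instruction's own pattern now allows both types at once:
# nothing records that i64 belongs to one mode and i128 to the other.
own_pattern_types = infer_types(GPR64_amended, GPR64_amended)
```

In this model the missing piece is exactly the association between a value type and a hardware mode: without it, the amended class cannot say which of its two types is valid in a given configuration.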
To solve this, TableGen would need to be aware of the association between value types and hardware modes. The rest of this proposal describes the programming interface to provide necessary information to TableGen.

1. Define a mode class. It will be recognized by TableGen as having a special meaning.

  class HwMode<list<Predicate> Ps> {
    // List of Predicate objects that determine whether this mode
    // applies. This is used for situations where the code generated by
    // TableGen needs to determine this, as opposed to TableGen itself,
    // for example in the isel pattern-matching code.
    list<Predicate> ModeDef = Ps;
  }

From the point of view of the code generated by TableGen, HwMode is equivalent to a list of Predicate objects. The difference is in how TableGen itself treats it: TableGen will distinguish two objects of class HwMode if they have different names, regardless of what sets of predicates they contain. One way to think of it is that the name of the object would serve as a tag denoting the hardware mode.

In the example of the AddReg instruction, we could define two modes:

  def Mode64:  HwMode<[...]>;
  def Mode128: HwMode<[...]>;

but so far there would not be much more that we could do.

2. To make use of the mode information, provide a class to associate a HwMode object with a particular value. This will be done by having two lists: one with HwMode objects and another with the corresponding values. Since TableGen does not provide a way to define class templates (in the same sense as C++ does), the actual interface will be split in two parts. First is the "mode selection" base class:

  class HwModeSelect<list<HwMode> Ms> {
    list<HwMode> Modes = Ms;  // List of unique hw modes.
  }

This will be a "built-in" class for TableGen. It will be a base class, and treated as "abstract" since it only contains half of the information. Each derived class would then need to define a member "Values", which is a list of corresponding values, of the same length as the list of modes.
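The pairing of a Modes list with a Values list can be sketched outside TableGen as follows. This is a minimal Python model, not the proposed implementation: the `select` method, the validation checks and the predicate strings are assumptions made for illustration, while the names HwMode, HwModeSelect, Mode64 and Mode128 come from the RFC. Modes compare by name only, mirroring the statement that the name serves as the tag.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class HwMode:
    name: str  # the tag; the only thing equality looks at
    # Predicates matter to generated code only, so they are
    # excluded from comparison (hypothetical strings below).
    predicates: tuple = field(default=(), compare=False)

class HwModeSelect:
    """Two parallel lists: unique modes and their corresponding values."""

    def __init__(self, modes, values):
        if len(modes) != len(values):
            raise ValueError("Modes and Values must have the same length")
        if len(set(modes)) != len(modes):
            raise ValueError("Modes must be unique")
        self.modes = list(modes)
        self.values = list(values)

    def select(self, mode):
        # Look up the value at the same position as the mode.
        return self.values[self.modes.index(mode)]

# Same predicate list, different names: still two distinct modes.
Mode64 = HwMode("Mode64", ("Is64Bit",))
Mode128 = HwMode("Mode128", ("Is64Bit",))

reg_size = HwModeSelect([Mode64, Mode128], [64, 128])
```

With these definitions, `reg_size.select(Mode128)` yields the value stored at the position of Mode128, which is the role the derived "Values" member plays in the proposal.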
The following definitions will be useful for defining register classes and selection patterns:

  class IntSelect<list<HwMode> Ms, list<int> Is>
      : HwModeSelect<Ms> {
    // Select an integer literal.
    list<int> Values = Is;
  }

  class ValueTypeSelect<list<HwMode> Ms, list<ValueType> Ts>
      : HwModeSelect<Ms> {
    // Select a value type.
    list<ValueType> Values = Ts;
  }

  class ValueTypeListSelect<list<HwMode> Ms, list<list<ValueType>> Ls>
      : HwModeSelect<Ms> {
    // Select a list of value types.
    list<list<ValueType>> Values = Ls;
  }

3. The class RegisterClass would get new members to hold the configurable size/alignment information. If defined, they would take precedence over the existing members RegTypes/Size/Alignment.

  class RegisterClass {
    ...
    ValueTypeListSelect VarRegTypes;  // The names of these members
    IntSelect VarRegSize;             // could likely be improved...
    IntSelect VarSpillSize;           //
    IntSelect VarSpillAlignment;      //
  }


To fully implement the AddReg instruction, the target would then define the register class:

  class MyRegisterClass : RegisterClass<...> {
    let VarRegTypes = ValueTypeListSelect<[Mode64, Mode128],
          [[i64, v2i32, v4i16, v8i8],             // Mode64
           [i128, v2i64, v4i32, v8i16, v16i8]]>;  // Mode128
    let VarRegSize = IntSelect<[Mode64, Mode128], [64, 128]>;
    let VarSpillSize = IntSelect<[Mode64, Mode128], [64, 128]>;
    let VarSpillAlignment = IntSelect<[Mode64, Mode128], [64, 128]>;
  }

  def MyIntReg: MyRegisterClass { ... };

And following that, the instruction:

  def AddReg: Instruction {
    let OutOperandList = (outs MyIntReg:$Rd);
    let InOperandList = (ins MyIntReg:$Rs, MyIntReg:$Rt);
    let AsmString = "add $Rd, $Rs, $Rt";
    let Pattern = [(set MyIntReg:$Rd, (add MyIntReg:$Rs, MyIntReg:$Rt))];
  }


-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Sean Silva via llvm-dev
2016-Sep-23 20:01 UTC
[llvm-dev] RFC: Implement variable-sized register classes
On Tue, Sep 20, 2016 at 10:32 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:

<snip>

> Since the number of targets allowing this kind of variability is growing
> (besides Hexagon, there is RISC-V, MIPS, and out of tree targets, such as
> CHERI), LLVM should allow convenient handling of this type of a situation.
> See comments in https://reviews.llvm.org/D23561 for more details.

ARM SVE sounds like it will have similar issues:
https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture

-- Sean Silva
Matthias Braun via llvm-dev
2016-Sep-23 20:08 UTC
[llvm-dev] RFC: Implement variable-sized register classes
> On Sep 23, 2016, at 1:01 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> <snip>
>
> ARM SVE sounds like it will have similar issues:
> https://community.arm.com/groups/processors/blog/2016/08/22/technology-update-the-scalable-vector-extension-sve-for-the-armv8-a-architecture

From glancing over the slides, it seems like SVE has dynamically sized registers (i.e. the size is not yet known at compile time), which would be a step further than this. Of course, the changes proposed here wouldn't hurt for that, as they push the code in a direction that relies less on well-known/fixed register sizes.

- Matthias
Alex Bradbury via llvm-dev
2016-Sep-24 12:20 UTC
[llvm-dev] RFC: Implement variable-sized register classes
On 20 September 2016 at 18:32, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I have posted a patch that switches the API to one that supports this
> (as yet non-existent) functionality earlier:
> https://reviews.llvm.org/D24631
>
> The comments from that were incorporated into the following RFC.

Thank you for writing this up. Your proposal is now much clearer to me.

> 1. Introduce a concept of a hardware "mode". This "mode" should be
> immutable, that is, it should be treated as a fixed property of the hardware
> throughout the execution of the program being compiled. This is different
> from, for example, floating point rounding mode, which can be changed at
> run-time. In LLVM, the mode would be determined by subtarget features
> (reflected in TargetSubtargetInfo).
>
> 2. Move the register/spill size and alignment information from
> MCRegisterClass, and into TargetRegisterInfo. This means that this data will
> no longer be available to the MC layer. Note that the size/alignment
> information will be provided by the TargetRegisterInfo object, and not by
> each individual TargetRegisterClass. A TargetRegisterInfo object would be
> created for a specific hardware mode, so that it would be able to provide
> the necessary information without having to consult TargetSubtargetInfo.

Having thought about it somewhat, the ways that come to my mind of approaching this problem are:
* Put up with the code duplication and duplicate everything for different register classes (current approach taken by in-tree backends)
* Make use of a multiclass to define multiple instructions with minimal duplication. I trialled this, but only on a RISC-V InstrInfo.td that doesn't yet support codegen https://reviews.llvm.org/P7637
* Use a for loop in tablegen and some !cast<> magic to do something with a similar effect to the multiclass approach
* Extend TableGen with some sort of AST macro support that would, again, allow you to generate a second (and third..) version of each instruction with a different RegisterClass substituted
* Add support for implicit parameterisation, e.g. allowing def MyRC : Predicated<Is32Bit, GPR32, GPR64>. Invasive and complex, but still an option.
* Add support for variable-sized register classes, as you've done here. This definitely feels like the least invasive option and is potentially less fiddly than using multiclasses.

> 1. Define a mode class. It will be recognized by TableGen as having a
> special meaning.
>
>   class HwMode<list<Predicate> Ps> {
>     // List of Predicate objects that determine whether this mode
>     // applies. This is used for situation where the code generated by
>     // TableGen needs to determine this, as opposed to TableGen itself,
>     // for example in the isel pattern-matching code.
>     list<Predicate> ModeDef = Ps;
>   }

<snip>

> 2. To make a use of the mode information, provide a class to associate a
> HwMode object with a particular value. This will be done by having two
> lists: one with HwMode objects and another with the corresponding values.
> Since TableGen does not provide a way to define class templates (in the same
> sense as C++ does), the actual interface will be split in two parts. First
> is the "mode selection" base class:
>
>   class HwModeSelect<list<HwMode> Ms> {
>     list<HwMode> Modes = Ms; // List of unique hw modes.
>   }
>
> This will be a "built-in" class for TableGen. It will be a base class, and
> treated as "abstract" since it only contains half of the information.

<snip>

> 3. The class RegisterClass would get new members to hold the configurable
> size/alignment information. If defined, they would take precedence over the
> existing members RegTypes/Size/Alignment.
>
>   class RegisterClass {
>     ...
>     ValueTypeListSelect VarRegTypes;  // The names of these members
>     IntSelect VarRegSize;             // could likely be improved...
>     IntSelect VarSpillSize;           //
>     IntSelect VarSpillAlignment;      //
>   }
>
>
> To fully implement the AddReg instruction, the target would then define the
> register class:
>
>   class MyRegisterClass : RegisterClass<...> {
>     let VarRegTypes = ValueTypeListSelect<[Mode64, Mode128],
>           [[i64, v2i32, v4i16, v8i8],             // Mode64
>            [i128, v2i64, v4i32, v8i16, v16i8]]>;  // Mode128
>     let VarRegSize = IntSelect<[Mode64, Mode128], [64, 128]>;
>     let VarSpillSize = IntSelect<[Mode64, Mode128], [64, 128]>;
>     let VarSpillAlignment = IntSelect<[Mode64, Mode128], [64, 128]>;
>   }
>
>   def MyIntReg: MyRegisterClass { ... };

My concern is that all of the above adds yet more complexity to what is already (in my view) a fairly difficult part of LLVM to understand. The definition of MyRegisterClass is not so bad though, and perhaps it doesn't matter to the average backend writer how it works under the hood.

What if RegisterClass contained a `list<RCInfo>`? Each RCInfo contains RegTypes, RegSize, SpillSize, and SpillAlignment, as well as a Predicate that determines whether this individual RCInfo is the one that should apply. To my taste this seems easier to understand than the {Int,ValueType,ValueTypeList}Select mechanism.

  def Is64Bit : Predicate<"Subtarget->is64Bit()">;
  def RCInfo64 : RCInfo<Is64Bit> {
    let RegTypes = [i64, v2i32, v4i16, v8i8];
    .....
  }

  class MyRegisterClass : RegisterClass<...> {
    let RCInfos = [RCInfo32, RCInfo64];
  }

Then for e.g. RISC-V I might end up with one GPR RegisterClass that contains RCInfo for 32-bit and 64-bit and is used in the definition of all instructions. I might also want to define an explicit GPR32 RegisterClass for use with instructions like ADDW, where the two input operands will always come from the 32-bit subregisters.

Alex
Krzysztof Parzyszek via llvm-dev
2016-Sep-24 13:12 UTC
[llvm-dev] RFC: Implement variable-sized register classes
On 9/24/2016 7:20 AM, Alex Bradbury wrote:
> My concern is that all of the above adds yet more complexity to what
> is already (in my view) a fairly difficult part of LLVM to understand.
> The definition of MyRegisterClass is not so bad though, and perhaps it
> doesn't matter how it works under the hood to the average backend
> writer.

I agree about the complexity, but I would hope that more documentation, examples and explanations would clarify it.

> What if RegisterClass contained a `list<RCInfo>`. Each RCInfo contains
> RegTypes, RegSize, SpillSize, and SpillAlignment as well as a
> Predicate the determines whether this individual RCInfo is the one
> that should apply. To my taste this seems easier to understand than
> the {Int,ValueType,ValueTypeList}Select mechanism.

The "select" mechanism was intended to be extensible, so that it could select any object of any type based on the predefined mode. It is entirely possible to use it in a way similar to what you describe below.

> def Is64Bit : Predicate<"Subtarget->is64Bit()">;
> def RCInfo64 : RCInfo<Is64Bit> {
>   let RegTypes = [i64, v2i32, v4i16, v8i8];
>   .....
> }
>
> class MyRegisterClass : RegisterClass<...> {
>   let RCInfos = [RCInfo32, RCInfo64]
> }

With the RCInfo data, the new register class definition would be something like

  class MyRegisterClass : RegisterClass<...> {
    let RCInfos = HwModeSelect<[Is32Bit, Is64Bit, Is128Bit],
                               [RCInfo32, RCInfo64, RCInfo128]>;
  }

In either case, aggregating the info in an RCInfo class would require additional changes in TableGen so that it picks up the size/alignment/type data from the RCInfos list, instead of from individual members. This is doable and there are no technical barriers to doing it. It may actually be a good idea, since it would isolate the mode-dependent part of the register class definition in a single object.
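The aggregation discussed here, one RCInfo record per mode instead of four separate per-member selections, can be sketched in Python. This is an illustration under stated assumptions rather than the TableGen change itself: the field names, the dictionary keyed by mode name and the helper `spill_slot` are hypothetical, and the numeric values are borrowed from the MyRegisterClass example in the RFC.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RCInfo:
    # Everything that may differ between hardware modes, in one record.
    reg_types: tuple
    reg_size: int
    spill_size: int
    spill_alignment: int

# Values taken from the RFC's MyRegisterClass example.
RCInfo64 = RCInfo(("i64", "v2i32", "v4i16", "v8i8"), 64, 64, 64)
RCInfo128 = RCInfo(("i128", "v2i64", "v4i32", "v8i16", "v16i8"), 128, 128, 128)

# One mode -> RCInfo table per register class (mode names as plain tags).
MyIntReg = {"Mode64": RCInfo64, "Mode128": RCInfo128}

def spill_slot(rc_infos, mode):
    """Return (size, alignment) of the spill slot for the given mode."""
    info = rc_infos[mode]
    return info.spill_size, info.spill_alignment
```

The design point this shows is that a consumer picks the record for its mode once and then reads all properties from it, so the types, sizes and alignment for a mode cannot drift apart across separately maintained lists.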
On a side note: there is a distinction between "mode" and "predicate". Modes are distinguished by name, which is necessary because they need to be distinguishable while TableGen itself is running. Predicates are evaluated after TableGen is done, during the run-time of the code generated by it. I didn't want to differentiate predicates based on their names, since that would go against expectations of how predicates have behaved so far.

-Krzysztof
Alex Elsayed via llvm-dev
2016-Sep-27 00:03 UTC
[llvm-dev] RFC: Implement variable-sized register classes
On Fri, 23 Sep 2016 13:01:47 -0700, Sean Silva via llvm-dev wrote:
> On Tue, Sep 20, 2016 at 10:32 AM, Krzysztof Parzyszek via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> I have posted a patch that switches the API to one that supports this
>> (yet non-existent functionality) earlier:
>> https://reviews.llvm.org/D24631
>>
>> The comments from that were incorporated into the following RFC.
>>
>>
>> Motivation:
>>
>> Certain targets feature "variable-sized" registers, i.e. a situation
>> where the register size can be configured by a hardware switch. A
>> common instruction set would then operate on these registers regardless
>> of what size they have been configured to have.

<snip>

One thing I'll note is that the RISC-V "V" (Vector) extension is likely to work this feature very hard indeed - see the following papers/slides/talks:

"A Case for MVPs: Mixed-Precision Vector Processors"
http://hwacha.org/papers/hwacha-mvp-prism2014.pdf

"2nd RISC-V Workshop: Vector Extension Proposal"
http://riscv.wpengine.com/wp-content/uploads/2015/06/riscv-vector-workshop-june2015.pdf
https://youtu.be/NlZr19lFxRg

In such a design, it's very likely that the width of the registers in the vector processor may change between individual stripmine loops - that is, in fact, rather the point.
Krzysztof Parzyszek via llvm-dev
2016-Oct-04 18:50 UTC
[llvm-dev] RFC: Implement variable-sized register classes
If there are no objections, I'd like to start working on this soon...

For the AMDGPU target this implies that RC->getSize will no longer be available in the MC layer.

-Krzysztof


On 9/20/2016 12:32 PM, Krzysztof Parzyszek wrote:
> I have posted a patch that switches the API to one that supports this
> (as yet non-existent) functionality earlier:
> https://reviews.llvm.org/D24631
>
> The comments from that were incorporated into the following RFC.
>
>
> Motivation:
>
> Certain targets feature "variable-sized" registers, i.e. a situation
> where the register size can be configured by a hardware switch. A
> common instruction set would then operate on these registers regardless
> of what size they have been configured to have. A specific example of
> that is the HVX coprocessor on Hexagon. HVX provides a set of vector
> registers, and can be configured in one of two modes: one in which
> vectors are 512 bits long, and one in which vectors are 1024 bits long.
> The size only determines the number of elements in the vector register,
> and so the semantics of each HVX instruction do not change: it
> performs a given operation on all vector elements. The encoding of the
> instruction does not change between modes; in fact, it is possible to
> have a binary that runs in both modes.
>
> Currently the register size (strictly speaking, "spill slot size") and
> related properties are fixed and immutable in a RegisterClass. In order
> to allow multiple possible register sizes, several RegisterClass objects
> may need to be defined, which then will require each instruction to be
> defined twice. This is what the HVX code does. Another approach may be
> to define several sets of physical registers corresponding to different
> sizes, and have a large RegisterClass which would be the union of all of
> them. This could avoid having to duplicate the instructions, but would
> lead to problems with getting the actual spill slot size or alignment.
>
> Since the number of targets allowing this kind of variability is growing
> (besides Hexagon, there is RISC-V, MIPS, and out of tree targets, such
> as CHERI), LLVM should allow convenient handling of this type of a
> situation. See comments in https://reviews.llvm.org/D23561 for more
> details.
>
>
> General approach:
>
> 1. Introduce a concept of a hardware "mode". This "mode" should be
> immutable, that is, it should be treated as a fixed property of the
> hardware throughout the execution of the program being compiled. This is
> different from, for example, floating point rounding mode, which can be
> changed at run-time. In LLVM, the mode would be determined by subtarget
> features (reflected in TargetSubtargetInfo).
>
> 2. Move the register/spill size and alignment information out of
> MCRegisterClass and into TargetRegisterInfo. This means that this data
> will no longer be available to the MC layer. Note that the
> size/alignment information will be provided by the TargetRegisterInfo
> object, and not by each individual TargetRegisterClass. A
> TargetRegisterInfo object would be created for a specific hardware mode,
> so that it would be able to provide the necessary information without
> having to consult TargetSubtargetInfo.
>
> 3. Introduce TableGen support for specifying instruction selection
> patterns involving data types depending on the hardware mode.
>
> 4. Require that the sub-/super-class relationships between register
> classes are the same across all hardware modes.
>
>
> The largest impact of this change would be on TableGen, since it needs
> to be aware of the fact that value types under consideration would
> depend on a hardware mode. For example, when having an add-registers
> instruction defined to work on 64-bit registers, providing an additional
> selection pattern for 128-bit registers would present difficulties:
>
>   def AddReg : Instruction {
>     let OutOperandList = (outs GPR64:$Rd);
>     let InOperandList = (ins GPR64:$Rs, GPR64:$Rt);
>     let Pattern = [(set GPR64:$Rd, (add GPR64:$Rs, GPR64:$Rt))];
>   }
>
> the pattern
>
>   def: Pat<(add GPR128:$Rs, GPR128:$Rt), (AddReg $Rs, $Rt)>;
>
> would result in a type inference error from TableGen. If the class
> GPR64 was amended to also allow the value type i128, TableGen would no
> longer complain, but may generate invalid instruction selection code.
>
> To solve this, TableGen would need to be aware of the association
> between value types and hardware modes. The rest of this proposal
> describes the programming interface to provide necessary information to
> TableGen.
>
> 1. Define a mode class. It will be recognized by TableGen as having a
> special meaning.
>
>   class HwMode<list<Predicate> Ps> {
>     // List of Predicate objects that determine whether this mode
>     // applies. This is used for situations where the code generated by
>     // TableGen needs to determine this, as opposed to TableGen itself,
>     // for example in the isel pattern-matching code.
>     list<Predicate> ModeDef = Ps;
>   }
>
> From the point of view of the code generated by TableGen, HwMode is
> equivalent to a list of Predicate objects. The difference is in how
> TableGen itself treats it: TableGen will distinguish two objects of
> class HwMode if they have different names, regardless of what sets of
> predicates they contain. One way to think of it is that the name of the
> object would serve as a tag denoting the hardware mode.
>
> In the example of the AddReg instruction, we could define two modes:
>
>   def Mode64:  HwMode<[...]>;
>   def Mode128: HwMode<[...]>;
>
> but so far there would not be much more that we could do.
>
> 2. To make use of the mode information, provide a class to associate a
> HwMode object with a particular value. This will be done by having two
> lists: one with HwMode objects and another with the corresponding
> values. Since TableGen does not provide a way to define class templates
> (in the same sense as C++ does), the actual interface will be split in
> two parts. First is the "mode selection" base class:
>
>   class HwModeSelect<list<HwMode> Ms> {
>     list<HwMode> Modes = Ms;  // List of unique hw modes.
>   }
>
> This will be a "built-in" class for TableGen. It will be a base class,
> and treated as "abstract" since it only contains half of the
> information. Each derived class would then need to define a member
> "Values", which is a list of corresponding values, of the same length as
> the list of modes. The following definitions will be useful for
> defining register classes and selection patterns:
>
>   class IntSelect<list<HwMode> Ms, list<int> Is>
>       : HwModeSelect<Ms> {
>     // Select an integer literal.
>     list<int> Values = Is;
>   }
>
>   class ValueTypeSelect<list<HwMode> Ms, list<ValueType> Ts>
>       : HwModeSelect<Ms> {
>     // Select a value type.
>     list<ValueType> Values = Ts;
>   }
>
>   class ValueTypeListSelect<list<HwMode> Ms, list<list<ValueType>> Ls>
>       : HwModeSelect<Ms> {
>     // Select a list of value types.
>     list<list<ValueType>> Values = Ls;
>   }
>
> 3. The class RegisterClass would get new members to hold the
> configurable size/alignment information. If defined, they would take
> precedence over the existing members RegTypes/Size/Alignment.
>
>   class RegisterClass {
>     ...
>     ValueTypeListSelect VarRegTypes;  // The names of these members
>     IntSelect VarRegSize;             // could likely be improved...
> IntSelect VarSpillSize; // > IntSelect VarSpillAlignment // > } > > > To fully implement the AddReg instruction, the target would then define > the register class: > > class MyRegisterClass : RegisterClass<...> { > let VarRegTypes = ValueTypeListSelect<[Mode64, Mode128], > [[i64, v2i32, v4i16, v8i8], // Mode64 > [i128, v2i64, v4i32, v8i16, v16i8]]>; // Mode128 > let VarRegSize = IntSelect<[Mode64, Mode128], [64, 128]>; > let VarSpillSize = IntSelect<[Mode64, Mode128], [64, 128]>; > let VarSpillAlignment = IntSelect<[Mode64, Mode128], [64, 128]>; > } > > def MyIntReg: MyRegisterClass { ... }; > > And following that, the instruction: > > def AddReg: Instruction { > let OutOperandList = (outs MyIntReg:$Rd); > let InOperandList = (ins MyIntReg:$Rs, MyIntReg:$Rt); > let AsmString = "add $Rd, $Rs, $Rt"; > let Pattern = [(set MyIntReg:$Rd, (add MyIntReg:$Rs, > MyIntReg:$Rt))]>; > } > > > > > -Krzysztof > >-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
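As a rough illustration of what moving the size query out of the register class could look like, here is a minimal C++ sketch. It assumes a register-info object that is constructed for one fixed hardware mode and answers size questions per register class. Every name in it (`HwModeSketch`, `RegClassID`, `RegSizeInfo`, `RegisterInfoSketch`, `getSizeInfo`) is hypothetical and is not LLVM's actual API; the 512/1024-bit vector sizes come from the HVX example in the RFC, and the remaining table values are placeholders.

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration only; these are not LLVM classes.
enum class HwModeSketch { Vec512 = 0, Vec1024 = 1 };

enum RegClassID : std::size_t { IntRegs = 0, VecRegs = 1, NumRegClasses = 2 };

struct RegSizeInfo {
  unsigned RegSizeBits;    // width of a register in the class
  unsigned SpillSizeBits;  // size of a spill slot for the class
  unsigned SpillAlignBits; // alignment of that spill slot
};

// The hardware mode is fixed at construction, so a size query needs only the
// register class and never has to consult subtarget information again.
class RegisterInfoSketch {
  HwModeSketch Mode;

public:
  explicit RegisterInfoSketch(HwModeSketch M) : Mode(M) {}

  RegSizeInfo getSizeInfo(RegClassID RC) const {
    // One row per hardware mode, one column per register class.
    // The values are illustrative placeholders.
    static const RegSizeInfo Table[2][NumRegClasses] = {
        /* Vec512  */ {{32, 32, 32}, {512, 512, 512}},
        /* Vec1024 */ {{32, 32, 32}, {1024, 1024, 1024}},
    };
    return Table[static_cast<std::size_t>(Mode)][RC];
  }
};
```

With this shape, code that previously asked the register class for its size would ask the register-info object instead, for example `RegisterInfoSketch(HwModeSketch::Vec1024).getSizeInfo(VecRegs).SpillSizeBits`. Code that only has the register class description, as in the MC layer, would have no way to obtain the answer.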
Alex Bradbury via llvm-dev
2016-Oct-08 19:52 UTC
[llvm-dev] RFC: Implement variable-sized register classes
On 4 October 2016 at 19:50, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> If there are no objections, I'd like to start working on this soon...
>
> For the AMDGPU target this implies that RC->getSize will no longer be
> available in the MC layer.

Another advantage of this work that hasn't been mentioned yet is that it will reduce the number of uses of isCodeGenOnly. The comment in Target.td indicates the long-term plan is to remove the distinction between isPseudo and isCodeGenOnly.

A closely related issue to variable-sized register classes is the case where you have multiple registers with the same AsmName. This crops up in the same kind of cases where you have multiple instructions with the same encoding. Without a workaround, an assert is tripped in llvm-tblgen when trying to produce a StringSwitch for MatchRegisterName. The solution in Mips, PPC and others seems to involve the generation of MatchRegisterName. What has been discussed so far with regard to HwMode and variable-sized register classes points to a solution, but I don't think it's quite enough. Options include:

1. Only have one set of register definitions, and have the variable-sized register class determine the bit width. The problem is that there are often some instructions where I think you need to have registers modelled as subregisters, e.g. SLLW, ADDW etc. in 64-bit RISC-V. These operate on 32-bit values and write the results sign-extended to the target 64-bit register.

2. Define both the 64-bit registers and the 32-bit subregisters, but make MatchRegisterName's behaviour change based on the HwMode. This works around the fact that there are multiple registers with the same AsmName. Although I doubt this would actually cause problems, this still isn't quite right. For an `SLLIW x1, x2, 5` I think the correct interpretation would have x1 as a 64-bit target register and x2 as the 32-bit subregister that happens to have the same AsmName as the 64-bit x2 register.
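Option 2 can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the register table, the `RV32`/`RV64` mode names and the function `matchRegisterNameForMode` are all hypothetical, and this is not the code that llvm-tblgen generates for MatchRegisterName. It shows two registers sharing the AsmName "x2", with the lookup disambiguated by hardware mode.

```cpp
#include <string>

// Hypothetical names for illustration only; not generated LLVM code.
enum class HwModeSketch { RV32, RV64 };

enum class Reg { None, X1_32, X2_32, X1_64, X2_64 };

struct RegEntry {
  const char *AsmName; // name as written in assembly
  Reg Id;              // which register definition this entry refers to
  unsigned Bits;       // register width in bits
};

// Each AsmName appears twice: once for the 64-bit register and once for its
// 32-bit subregister. A lookup keyed on the name alone would be ambiguous.
static const RegEntry RegTable[] = {
    {"x1", Reg::X1_32, 32},
    {"x2", Reg::X2_32, 32},
    {"x1", Reg::X1_64, 64},
    {"x2", Reg::X2_64, 64},
};

// Resolve a name to the register whose width matches the hardware mode.
// Returns Reg::None when no register has that name.
Reg matchRegisterNameForMode(const std::string &Name, HwModeSketch Mode) {
  const unsigned WantBits = (Mode == HwModeSketch::RV64) ? 64u : 32u;
  for (const RegEntry &E : RegTable)
    if (Name == E.AsmName && E.Bits == WantBits)
      return E.Id;
  return Reg::None;
}
```

In this sketch a lookup of "x2" in `RV64` mode always yields the 64-bit register. That is exactly the imprecision described in option 2: the source operand of `SLLIW x1, x2, 5` would arguably be the 32-bit subregister, which a purely mode-keyed lookup cannot express.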
Have you thought about how the HwMode/variable-sized register class proposal might interact with register AsmNames at all?

This old patch that never landed <http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20141201/246835.html> is also related, I think. Backends like Mips and PPC end up defining RegisterOperand with a ParserMatchClass (in the Mips case, this specifies the 'parseAnyRegister' ParserMethod). Adding a ParserMatchClass field to RegisterClass would be a minor simplification.

Best,

Alex