Matthias Braun via llvm-dev
2016-Feb-26 21:54 UTC
[llvm-dev] Reserved/Unallocatable Registers
Let's try this again after some longer offline discussions:

= Reserved Registers =
The primary use of reserved registers is to hold values required by runtime conventions. Typical examples are the stack pointer, the frame pointer, and possibly the TLS base address, GOT address ...
Zero registers and program counters are an odd special case for which we may be able to provide looser rules.

== Rules ==
1) Reserved registers are always live: They are live-in and live-out of functions. There are no dead-defs; they are still alive after being clobbered by a regmask.
2) It is not legal to add or remove definitions of a reserved register. This implies that we cannot replace it with a different register or temporarily spill/reload it.
3) Calls are considered uses of reserved registers. That means you cannot reorder a write to a reserved register over a call, even if there is no explicit use operand on the call.
4) The value of the reserved register can only change for instructions with a Def operand or a regmask clobbering the register. This rule is just for clarification; all registers behave like this. See [1] for a note on program counter/time stamp registers.

== Implications ==
- We skip liveness analysis because we know a reserved register is live anyway.
- Register allocators cannot use a reserved register: It is never free and therefore considered unallocatable.
- Scheduling has to consider the implicit use on calls.
- No special considerations are necessary for copy propagation.
- Writes to a reserved register are not dead code, because the value is always live-out!

== Examples ==
Assume r0 is a normal register and r1 is a reserved register:

- We can remove the 2nd COPY here:
    r0 = COPY r1
    ... use v0
    r0 = COPY r1
    ... use v0
- We can remove this COPY because r0 is unused:
    r0 = COPY r1
    return
- We cannot remove this COPY because r1 is live-out:
    r1 = COPY r0
    return
- We cannot reorder the add before the call. The call reads r1, so it has an anti dependency on the add.
    call foobar
    r1 = add r1, 10

== [1] Constant Registers ==
The rules above are designed for the case of normal registers which are reserved for runtime conventions. We also have the case of the zero register. We have the concept of a constant register for them, which allows us to ignore any reordering constraints and assume all uses read the same value.

We should even be able to fit the program counter into the class of constant registers: The only practical use of reading the program counter is to find relative positions in position independent code (PIC). It is always used in combination with a relocation, which is adjusted to the actual position of the instruction. The value after adding this relocation is constant in the function!

= Unallocatable Registers =

A reserved register is not allocatable. However, there are also registers which are unallocatable just because they are explicitly excluded from allocation orders; they are not reserved registers. This can be confusing, so I added this section talking about those!

== Rules ==
They behave like normal registers; the only difference is that:
1) The register allocator will never assign an unallocatable register to a virtual register.

== Motivation ==
Typical examples of unallocatable but not reserved registers are:
- A CPU's flags register: The scheduler has to respect the order! We are interested in liveness, but we do not necessarily want to spill/reload them or perform register allocation on a single register.
- X87 floating point stack registers need special handling on top of the generic register allocation.

== Implications ==
Except for the register allocator not using them, they behave like normal registers:
- We track the liveness of unallocatable registers.
- The scheduler respects data/output dependencies for unallocatable registers.

== Examples ==
Assume r0 is a normal register and r1 is an unallocatable register (but not a reserved one):

- We can remove the 2nd COPY here:
    r0 = COPY r1
    ... use v0
    r0 = COPY r1
    ... use v0

- We can remove the following two COPYs because r0 and r1 are not used afterwards:
    r0 = COPY r1
    r1 = COPY r0
    return

- We can replace r1 with a different (normal) register here (provided we replace all following uses):
    r1 = ...
    // ...
    = use r1

> On Feb 26, 2016, at 11:41 AM, Matt Arsenault via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On 02/25/2016 06:14 PM, Matthias Braun via llvm-dev wrote:
>> 1) The value read from a reserved register cannot be predicted. Reading a reserved register twice may each time produce a different result.
> This seems broken to me that treating another copy should be assumed to produce a different result. This seems like it should be optimized, and have a special volatile_copy instruction for the special cases where the reserved register may randomly change.
>
> -Matt
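
To make the difference between the two register classes in the message above concrete, here is a minimal Python sketch of the COPY-removal and call-reordering examples. It is a toy model, not LLVM code: the `Instr` class and the helpers `live_after`, `is_removable_copy` and `can_hoist_above_call` are hypothetical names invented for this illustration, and the block is assumed to end the function with no return-value registers live-out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy machine instruction: explicit register defs/uses plus a call flag."""
    name: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()
    is_call: bool = False


def live_after(block, index, reserved):
    """Registers live right after block[index], found by a backward scan.

    Assumption: the block ends the function and only reserved registers
    are live-out (return-value registers are not modelled).
    """
    live = set(reserved)                   # rule 1: reserved regs are always live-out
    for instr in reversed(block[index + 1:]):
        live -= instr.defs - reserved      # rule 1: a def never kills a reserved reg
        live |= instr.uses
        if instr.is_call:
            live |= reserved               # rule 3 (already implied by rule 1 here)
    return live


def is_removable_copy(block, index, reserved):
    """A COPY is dead only if nothing it defines is live afterwards."""
    return not (block[index].defs & live_after(block, index, reserved))


def can_hoist_above_call(instr, call, reserved):
    """May `instr`, currently placed after `call`, be moved in front of it?"""
    call_uses = call.uses | reserved       # rule 3: implicit uses on the call
    anti = instr.defs & call_uses          # instr would overwrite what the call reads
    true = instr.uses & call.defs          # instr reads what the call writes
    output = instr.defs & call.defs        # both write the same register
    return not (anti or true or output)
```

With `reserved = {"r1"}`, the copy `r1 = COPY r0` before a return is reported as not removable, while the same copy with an empty reserved set (r1 merely unallocatable) is removable, matching the two Examples sections. A real scheduler would also consult regmask clobbers, which this sketch leaves out.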
Quentin Colombet via llvm-dev
2016-Feb-27 01:09 UTC
[llvm-dev] Reserved/Unallocatable Registers
> On Feb 26, 2016, at 1:54 PM, Matthias Braun <mbraun at apple.com> wrote:
>
> Let's try this again after some longer offline discussions:
>
> = Reserved Registers =
> The primary use of reserved registers is to hold values required by runtime conventions. Typical examples are the stack pointer, the frame pointer, and possibly the TLS base address, GOT address ...
> Zero registers and program counters are an odd special case for which we may be able to provide looser rules.
>
> == Rules ==
> 1) Reserved registers are always live: They are live-in and live-out of functions. There are no dead-defs; they are still alive after being clobbered by a regmask.
> 2) It is not legal to add or remove definitions of a reserved register. This implies that we cannot replace it with a different register or temporarily spill/reload it.

Zero registers don’t follow that rule. AArch64 has a pass that sets XZR for unused results and this is fine.
I.e., we need to add a note like you did for #4.

> 3) Calls are considered uses of reserved registers. That means you cannot reorder a write to a reserved register over a call, even if there is no explicit use operand on the call.
> 4) The value of the reserved register can only change for instructions with a Def operand or a regmask clobbering the register. This rule is just for clarification; all registers behave like this. See [1] for a note on program counter/time stamp registers.

Hmm, I don’t see how pc can fit this rule.

> == Implications ==
> - We skip liveness analysis because we know a reserved register is live anyway.
> - Register allocators cannot use a reserved register: It is never free and therefore considered unallocatable.
> - Scheduling has to consider the implicit use on calls.
> - No special considerations are necessary for copy propagation.

Ditto for pc.

> - Writes to a reserved register are not dead code, because the value is always live-out!
>
> == Examples ==
> Assume r0 is a normal register and r1 is a reserved register:
>
> - We can remove the 2nd COPY here:
>     r0 = COPY r1
>     ... use v0
>     r0 = COPY r1
>     ... use v0

If r1 is pc, r0 is a different value now.

> - We can remove this COPY because r0 is unused:
>     r0 = COPY r1
>     return
> - We cannot remove this COPY because r1 is live-out:
>     r1 = COPY r0
>     return
> - We cannot reorder the add before the call. The call reads r1, so it has an anti dependency on the add.
>     call foobar
>     r1 = add r1, 10
>
> == [1] Constant Registers ==
> The rules above are designed for the case of normal registers which are reserved for runtime conventions. We also have the case of the zero register. We have the concept of a constant register for them, which allows us to ignore any reordering constraints and assume all uses read the same value.
>
> We should even be able to fit the program counter into the class of constant registers: The only practical use of reading the program counter is to find relative positions in position independent code (PIC). It is always used in combination with a relocation, which is adjusted to the actual position of the instruction. The value after adding this relocation is constant in the function!

That sounds like a far stretch to me. How can pc be considered constant?

To summarize my thoughts, I believe reserved registers were introduced to fill the gap of what we don’t model. E.g., for pc, each instruction should implicitly define it; then the actual uses are predictable. Since we don’t do that, we need to conservatively assume that the value of a reserved register is unknown and that rule #4 is not true.

Cheers,
-Quentin
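
The disagreement over the "remove the 2nd COPY" example can be sketched in a few lines of Python. This is only an illustration of the argument, under the assumption that pc-like registers are modelled as being implicitly redefined by every instruction; `Instr`, `second_copy_is_redundant` and the `implicit_defs` parameter are hypothetical names, not LLVM APIs.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """Toy instruction: opcode plus explicit register defs and uses."""
    op: str
    defs: tuple = ()
    uses: tuple = ()


def second_copy_is_redundant(block, first, second, implicit_defs=()):
    """True if block[second] repeats block[first] and would yield the same value.

    `implicit_defs` names registers that every instruction is assumed to
    redefine as a side effect (a hypothetical way to model pc).
    """
    a, b = block[first], block[second]
    if a.op != "COPY" or a != b:
        return False
    watched = set(a.defs) | set(a.uses)
    for index in range(first, second):
        clobbered = set(implicit_defs)           # e.g. pc advances on every instruction
        if index > first:
            clobbered |= set(block[index].defs)  # explicit defs between the two copies
        if clobbered & watched:
            return False
    return True
```

Under rule #4 as proposed (`implicit_defs` empty), the second `r0 = COPY r1` is redundant. Once r1 is listed in `implicit_defs`, the same query returns False, which is the "if r1 is pc, r0 is a different value now" objection.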
Krzysztof Parzyszek via llvm-dev
2016-Feb-27 15:46 UTC
[llvm-dev] Reserved/Unallocatable Registers
On 2/26/2016 7:09 PM, Quentin Colombet via llvm-dev wrote:
> To summarize my thoughts, I believe reserved registers were introduced
> to fill the gap of what we don’t model. E.g., for pc for instance, each
> instruction should implicitly define it, then the actual use are
> predictable. Since we don’t do that, we need to conservatively assume
> that the value of a reserved register is unknown and that rule #4 is not
> true.

Maybe we need to extend the model then? From the point of view of register allocation, none of these registers are allocatable, so nothing would change, but from the point of view of scheduling, for example, certain moves involving reserved registers are legal, while others are not.

For example:

1. Status (flags) register, such as the one in x86: it's a special register, but it cannot be modified except by an instruction that is known to alter it. (I know there was no direct copy from flags to a register on x86, but I can't think of a better example.) Dependencies on this register need to be respected.
   a. Special cases: Hexagon has USR (user status register), where one of the bits indicates "overflow". This bit is sticky and can only go from 0 to 1, except where it's cleared explicitly. Stores to this bit of the register can be reordered (except the "clear bit" case).[1]

2. Timer/cycle count registers, PC, etc.: these registers are "volatile" in the sense that they are modified in a way that does not depend on the semantics of the executed code in a predictable or controllable manner. Dependencies on such registers can be ignored, and two subsequent reads of their value are not guaranteed to be equal.
   a. Special case: one could imagine a special register, the reading of which in itself can have side-effects. In such a case the reads should not be eliminated or duplicated.

[1] In our local repository we have a hook in the subtarget info, which is called with the DAG as its argument after its construction, and where we remove the write-write dependencies on the overflow bit (which we model as a subregister of USR). This is really important for us for performance reasons, and it would be great to have an "official" way of handling such cases.

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
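
The sticky overflow bit case from the message above can be illustrated with a small dependency builder. This is a from-scratch Python sketch, not the Hexagon hook from footnote [1]: instead of pruning edges from an existing DAG, it builds the edges directly so that sticky "set" writes are never ordered against each other while reads and explicit clears stay ordered. The function name `build_sticky_bit_deps` and the op names `set`, `clear` and `read` are assumptions made for this example.

```python
def build_sticky_bit_deps(ops):
    """Return (pred, succ, kind) edges for a sequence of 'set'/'clear'/'read' ops.

    'set' is a sticky write (the bit can only go 0 -> 1), 'clear' resets the
    bit explicitly, and 'read' observes it. Indices refer to positions in ops.
    """
    edges = []
    last_clear = None
    sets, reads = [], []      # ops seen since the last clear
    for i, op in enumerate(ops):
        if op == "set":
            if last_clear is not None:
                edges.append((last_clear, i, "output"))
            edges += [(r, i, "anti") for r in reads]    # earlier reads must not see this set
            sets.append(i)                              # note: no edge to earlier sets
        elif op == "read":
            if last_clear is not None:
                edges.append((last_clear, i, "true"))
            edges += [(s, i, "true") for s in sets]     # any earlier set may have set the bit
            reads.append(i)
        elif op == "clear":
            if last_clear is not None:
                edges.append((last_clear, i, "output"))
            edges += [(s, i, "output") for s in sets]   # clears are ordered against all sets
            edges += [(r, i, "anti") for r in reads]
            last_clear, sets, reads = i, [], []
        else:
            raise ValueError(f"unknown op: {op!r}")
    return edges
```

For `["clear", "set", "set", "read", "clear"]` the two sets each depend on the first clear and feed the read, but no edge connects them to each other, so a scheduler working from these edges would be free to reorder them.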