thr3ads.net - llvm dev - [llvm-dev] Reserved/Unallocatable Registers [Feb 2016]

Matthias Braun via llvm-dev

2016-Feb-26 21:54 UTC

[llvm-dev] Reserved/Unallocatable Registers

Let's try this again after some longer offline discussions:

= Reserved Registers =
The primary use of reserved registers is to hold values required by runtime
conventions. Typical examples are the stack pointer, the frame pointer, and
possibly the TLS base address or the GOT address.
Zero registers and program counters are an odd special case for which we may be
able to provide looser rules.

== Rules ==
1) Reserved registers are always live: they are live-in and live-out of
functions. There are no dead defs, and they are still live after being clobbered
by a regmask.
2) It is not legal to add or remove definitions of a reserved register. This
implies that we cannot replace it with a different register or temporarily
spill/reload it.
3) Calls are considered uses of reserved registers. That means you cannot
reorder a write to a reserved register over a call, even if there is no explicit
use operand on the call.
4) The value of a reserved register can only change at instructions with a
Def operand or regmask clobbering the register. This rule is just for
clarification; all registers behave like this. See [1] for a note on program
counter/time stamp registers.

== Implications ==
- We skip liveness analysis because we know a reserved register is live anyway.
- Register allocators cannot use a reserved register: it is never free and
therefore considered unallocatable.
- Scheduling has to consider the implicit use on calls.
- No special considerations are necessary for copy propagation.
- Writes to a reserved register are not dead code, because the value is always
live-out!
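
To make the last implication concrete, here is a minimal sketch in Python. It is a toy model, not LLVM code: the `(defs, uses)` encoding of an instruction, the name `remove_dead_defs`, and the choice of `r1` as the reserved register are all assumptions made for illustration.

```python
# Toy model: an instruction is a (defs, uses) pair of register-name lists.
RESERVED = {"r1"}  # assumption: r1 plays the role of the reserved register


def remove_dead_defs(block, live_out):
    """Backward scan that drops instructions whose defs are all dead.

    Reserved registers are treated as live at every point, so an
    instruction that writes one is never removed.
    """
    live = set(live_out)
    kept = []
    for defs, uses in reversed(block):
        # Instructions without defs are kept conservatively.
        needed = not defs or any(d in live or d in RESERVED for d in defs)
        if not needed:
            continue  # every def is dead: drop the instruction
        live -= set(defs)
        live |= set(uses)
        kept.append((defs, uses))
    kept.reverse()
    return kept
```

With nothing live-out, `remove_dead_defs([(["r0"], ["r1"])], [])` drops the copy into `r0`, while a copy into `r1` survives because the reserved register counts as live.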

== Examples ==
Assume r0 is a normal register and r1 is a reserved register:

- We can remove the 2nd COPY here:
    r0 = COPY r1
        ... use r0
    r0 = COPY r1
        ... use r0
- We can remove this COPY because r0 is unused:
    r0 = COPY r1
    return
- We cannot remove this COPY because r1 is live-out:
    r1 = COPY r0
    return
- We cannot reorder the add before the call. The call reads r1, so there is an
anti-dependency between the call and the add.
    call foobar
    r1 = add r1, 10
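
The call/add example can be sketched as a small dependency check. This is a hedged toy model in Python: the `Instr` class, `effective_uses`, and `can_swap` are hypothetical names, and regmask clobbers on calls are deliberately not modeled.

```python
from dataclasses import dataclass

RESERVED = frozenset({"r1"})  # assumption: r1 is the reserved register


@dataclass
class Instr:
    name: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()
    is_call: bool = False

    def effective_uses(self):
        # Rule 3: a call reads every reserved register, even without an
        # explicit use operand.
        return self.uses | RESERVED if self.is_call else self.uses


def can_swap(first, second):
    """True if two adjacent instructions may exchange places."""
    raw = first.defs & second.effective_uses()  # read after write
    war = first.effective_uses() & second.defs  # write after read (anti)
    waw = first.defs & second.defs              # write after write (output)
    return not (raw or war or waw)
```

Here `can_swap(Instr("call foobar", is_call=True), Instr("add", defs=frozenset({"r1"}), uses=frozenset({"r1"})))` is false because of the implicit read of `r1`, whereas the same add on `r0` would be free to move in this model.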

== [1] Constant Registers ==
The rules above are designed for the case of normal registers which are
reserved for runtime conventions. We also have the case of zero registers. For
them we have the concept of a constant register, which allows us to ignore any
reordering constraints and assume all uses read the same value.

We should even be able to fit the program counter into the class of constant
registers: the only practical use of reading the program counter is to find
relative positions in position independent code (PIC). It is always used in
combination with a relocation, which is adjusted to the actual position of the
instruction. The value after adding this relocation is constant in the function!
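
A rough sketch of what "assume all uses read the same value" buys us, again as a toy Python model rather than LLVM code. The register name `zr`, the `(defs, uses)` encoding, and the function name `reads_same_value` are assumptions for illustration only.

```python
CONSTANT = {"zr"}  # assumption: zr stands in for a zero register


def reads_same_value(block, reg, i, j):
    """Do the reads of `reg` at instruction indices i < j see the same value?

    `block` is a list of (defs, uses) pairs. For a constant register the
    answer is always yes; writes to it are assumed to have no effect.
    For any other register, a def at index i..j-1 may change the value.
    """
    if reg in CONSTANT:
        return True
    return all(reg not in defs for defs, _ in block[i:j])
```

For a normal register, an intervening def makes the two reads differ; for a register listed in `CONSTANT`, the intervening write is ignored.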

= Unallocatable Registers =
A reserved register is not allocatable. However, there are also registers which
are unallocatable just because they are explicitly excluded from allocation
orders; they are not reserved registers. This can be confusing, so I added this
section talking about those!

== Rules ==
They behave like normal registers; the only difference is that:
1) The register allocator will never assign an unallocatable register to a
virtual register.

== Motivation ==
Typical examples of unallocatable but not reserved registers are:
- A CPU's flags register: the scheduler has to respect the order! We are
interested in liveness, but we do not necessarily want to spill/reload it or
perform register allocation on a single register.
- X87 floating point stack registers, which need special handling on top of the
generic register allocation.

== Implications ==
Except for the register allocator not using them, they behave like normal
registers:
- We track the liveness of unallocatable registers.
- The scheduler respects data/output dependencies for unallocatable registers.

== Examples ==
Assume r0 is a normal register and r1 is an unallocatable register (but not a
reserved one):

- We can remove the 2nd COPY here:
    r0 = COPY r1
        ... use r0
    r0 = COPY r1
        ... use r0

- We can remove the following two COPYs because r0 and r1 are not used
afterwards:
    r0 = COPY r1
    r1 = COPY r0
    return

- We can replace r1 with a different (normal) register here (provided we replace
all following uses):
    r1 = ...
        // ...
        = use r1
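
The distinction between the two categories can be summarized in a small sketch. This is a toy Python model under stated assumptions: the register names, `ALLOCATION_ORDER`, and the helper functions are hypothetical and only mirror the rules described above.

```python
# Toy register file; all names here are illustrative.
ALLOCATION_ORDER = ["r0", "r2", "sp"]  # "flags" appears in no allocation order
RESERVED = {"sp"}                      # reserved registers are never handed out


def is_allocatable(reg):
    """Allocatable means: listed in an allocation order and not reserved."""
    return reg in ALLOCATION_ORDER and reg not in RESERVED


def tracks_liveness(reg):
    """Liveness is skipped only for reserved registers (always live)."""
    return reg not in RESERVED


def classify(reg):
    if reg in RESERVED:
        return "reserved"
    if not is_allocatable(reg):
        return "unallocatable"
    return "normal"
```

In this model `flags` is unallocatable yet still has its liveness tracked, while `sp` is reserved and therefore both unallocatable and always live.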
> On Feb 26, 2016, at 11:41 AM, Matt Arsenault via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On 02/25/2016 06:14 PM, Matthias Braun via llvm-dev wrote:
>> 1) The value read from a reserved register cannot be predicted. Reading
>> a reserved register twice may each time produce a different result.
> This seems broken to me that treating another copy should be assumed to
> produce a different result. This seems like it should be optimized, and have a
> special volatile_copy instruction for the special cases where the reserved
> register may randomly change.
>
> -Matt
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Quentin Colombet via llvm-dev

2016-Feb-27 01:09 UTC

> On Feb 26, 2016, at 1:54 PM, Matthias Braun <mbraun at apple.com> wrote:
> 
> Let's try this again after some longer offline discussions:
> 
> = Reserved Registers =
> The primary use of reserved registers is to hold values required by runtime
> conventions. Typical examples are the stack pointer, the frame pointer, and
> possibly the TLS base address or the GOT address.
> Zero registers and program counters are an odd special case for which we may
> be able to provide looser rules.
>
> == Rules ==
> 1) Reserved registers are always live: they are live-in and live-out of
> functions. There are no dead defs, and they are still live after being
> clobbered by a regmask.
> 2) It is not legal to add or remove definitions of a reserved register. This
> implies that we cannot replace it with a different register or temporarily
> spill/reload it.
Zero registers don’t follow that rule. AArch64 has a pass that sets XZR for
unused results, and this is fine.
I.e., we need to add a note like you did for #4.
> 3) Calls are considered uses of reserved registers. That means you cannot
> reorder a write to a reserved register over a call, even if there is no
> explicit use operand on the call.
> 4) The value of a reserved register can only change at instructions with a
> Def operand or regmask clobbering the register. This rule is just for
> clarification; all registers behave like this. See [1] for a note on program
> counter/time stamp registers.
Hmm, I don’t see how pc can fit this rule.
> 
> == Implications ==
> - We skip liveness analysis because we know a reserved register is live
> anyway.
> - Register allocators cannot use a reserved register: it is never free and
> therefore considered unallocatable.
> - Scheduling has to consider the implicit use on calls.
> - No special considerations are necessary for copy propagation.
Ditto for pc.
> - Writes to a reserved register are not dead code, because the value is
> always live-out!
> 
> == Examples ==
> Assume r0 is a normal register and r1 is a reserved register:
>
> - We can remove the 2nd COPY here:
>     r0 = COPY r1
>         ... use r0
>     r0 = COPY r1
>         ... use r0
If r1 is pc, r0 is a different value now.
> - We can remove this COPY because r0 is unused:
>     r0 = COPY r1
>     return
> - We cannot remove this COPY because r1 is live-out:
>     r1 = COPY r0
>     return
> - We cannot reorder the add before the call. The call reads r1, so there is
> an anti-dependency between the call and the add.
>     call foobar
>     r1 = add r1, 10
> 
> == [1] Constant Registers ==
> The rules above are designed for the case of normal registers which are
> reserved for runtime conventions. We also have the case of zero registers.
> For them we have the concept of a constant register, which allows us to
> ignore any reordering constraints and assume all uses read the same value.
>
> We should even be able to fit the program counter into the class of
> constant registers: the only practical use of reading the program counter is
> to find relative positions in position independent code (PIC). It is always
> used in combination with a relocation, which is adjusted to the actual
> position of the instruction. The value after adding this relocation is
> constant in the function!
That sounds like quite a stretch to me. How can pc be considered constant?
To summarize my thoughts, I believe reserved registers were introduced to fill
the gap of what we don’t model. For pc, for instance, each instruction
should implicitly define it; then the actual uses are predictable. Since we don’t
do that, we need to conservatively assume that the value of a reserved register
is unknown and that rule #4 is not true.

Cheers,
-Quentin

Krzysztof Parzyszek via llvm-dev

2016-Feb-27 15:46 UTC

On 2/26/2016 7:09 PM, Quentin Colombet via llvm-dev wrote:
> To summarize my thoughts, I believe reserved registers were introduced
> to fill the gap of what we don’t model. For pc, for instance, each
> instruction should implicitly define it; then the actual uses are
> predictable. Since we don’t do that, we need to conservatively assume
> that the value of a reserved register is unknown and that rule #4 is not
> true.
Maybe we need to extend the model then?  From the point of view of 
register allocation, none of these registers are allocatable, so nothing 
would change, but from the point of view of scheduling, for example, 
certain moves involving reserved registers are legal, while others are not.

For example:
1. Status (flags) register, such as the one in x86: it's a special 
register, but it cannot be modified except by an instruction that is 
known to alter it. (I know there was no direct copy from flags to a 
register on x86, but I can't think of a better example.)  Dependencies 
on this register need to be respected.
   a. Special cases: Hexagon has USR (user status register), where one 
of the bits indicates "overflow". This bit is sticky and can only go 
from 0 to 1, except where it's cleared explicitly. Stores to this bit of 
the register can be reordered (except the "clear bit" case).[1]

2. Timer/cycle count registers, PC, etc.: these registers are "volatile"
in the sense that they are modified in a way that does not depend on the
semantics of the executed code in a predictable or controllable manner.
Dependencies on such registers can be ignored, and two subsequent reads
of their value are not guaranteed to be equal.
   a. Special case: one could imagine a special register, the reading of
which in itself can have side effects. In such a case the reads should not
be eliminated or duplicated.

[1] In our local repository we have a hook in the subtarget info, which
is called with the DAG as its argument after its construction, and where
we remove the write-write dependencies on the overflow bit (which we
model as a subregister of USR). This is really important for us for
performance reasons, and it would be great to have an "official" way of
handling such cases.
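
As a rough illustration of what such a hook does, here is a toy sketch in Python. It is not the Hexagon implementation: the edge encoding `(src, dst, kind, reg)`, the subregister name `usr.ovf`, and the function name are all hypothetical.

```python
STICKY = "usr.ovf"  # hypothetical name for the sticky overflow subregister


def drop_sticky_output_deps(edges, clears):
    """Remove write-write (output) edges on the sticky bit.

    `edges` holds (src, dst, kind, reg) tuples; `clears` is the set of
    nodes that explicitly clear the bit and must keep their ordering.
    """
    return [
        (src, dst, kind, reg)
        for src, dst, kind, reg in edges
        if not (
            kind == "output"
            and reg == STICKY
            and src not in clears
            and dst not in clears
        )
    ]
```

Two stores that can only set the bit lose their ordering edge, while any edge touching a node in `clears`, and every edge on other registers, is left alone.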

-Krzysztof

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation
