Quentin Colombet via llvm-dev
2016-Feb-26 18:56 UTC
[llvm-dev] Reserved/Unallocatable Registers
Hi Matthias,

Thanks for doing this. Each time we talk about it, it takes us 10 min to rebuild those rules from our recollection, so it is definitely useful to write them down.

I am in agreement with what you wrote down.

I just think we need additional rules for the constant registers like Jakob mentioned:
- Their value is constant (i.e., copy propagation is fine, unlike regular reserved registers).
- In particular, writing to them does not change their value.

Cheers,
-Quentin

> On Feb 26, 2016, at 8:59 AM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> Hi Matthias,
>
> This pretty much matches my memory. I think that the rules are a bit ad hoc and not followed to the letter everywhere. It would be good to codify something concrete.
>
> I thought that I added some way of distinguishing between constant registers and other reserved registers but I can’t find it now. We do some register coalescing that is not consistent with your rules here: If a virtual register is defined as a copy of a constant register, we will replace the virtual register with the constant register. See RegisterCoalescer::canJoinPhys(). This can mean that the register is read multiple times. This optimization was added for the ARM64 zero register.
>
> Thanks,
> /jakob
>
>> On Feb 25, 2016, at 18:14, Matthias Braun <mbraun at apple.com> wrote:
>>
>> Lately I have had a few discussions of what it means for a register to be unallocatable or reserved. As this comes up every now and again and I often struggled answering such questions, I decided to write down some definite rules and codify the current usage and assumptions. I plan to put the rules below into the doxygen comments of MachineRegisterInfo etc. And I also hope that people will correct me if I am wrong or miss something here!
>>
>> = Reserved Registers
>>
>> == Rules ==
>> 1) The value read from a reserved register cannot be predicted. Reading a reserved register twice may each time produce a different result.
>> 2) Writing to a reserved register may affect more than just the register.
>> 3) Nonetheless, reading/writing reserved registers imposes no constraints on reordering instructions.
>>
>> == Motivation ==
>> Generic backend code, especially register allocators, makes assumptions about how registers behave. These include things like "the value in a register only changes in an instruction with a def or regmask/clobber operand for that register" or "writing to the register changes its value but has no further effects". There are often cases where we need exceptions to these rules; typical examples are:
>> - Zero registers (e.g. SPARC g0): They always stay zero even if we write other values.
>> - Program counters (e.g. ARM PC): Their value changes with every instruction and writing into them causes control flow to change.
>> - Stack pointer, frame pointer: They mostly behave like normal registers but we do not want to impose scheduling constraints; even if the stack pointer's value changes because we reordered an instruction, we can usually fix this up by adjusting offsets in load/store operations.
>>
>> So obviously we exclude these registers from register allocation and cannot make too many assumptions about them. However, regardless of the alien semantics, we still want to model them as registers because that is how they are modeled in most instruction encodings.
>>
>> == Implications ==
>> - Register allocators will never assign reserved registers to a virtual register. A reserved register is always unallocatable, but an unallocatable register is not necessarily a reserved one!
>> - Liveness analysis makes no sense for reserved registers.
>> - The rules above do not free us from modeling the instruction effects properly! Instructions writing to PC must be marked as terminators, we need to add barrier flags if we want to restrict the reordering of time stamp counters, ...
>>
>> == Examples ==
>> Assume r0 is a normal register and r1 is a reserved register:
>>
>> - We cannot remove the 2nd COPY here because we may read a different value from r1:
>>     r0 = COPY r1
>>     ... use r0
>>     r0 = COPY r1
>>     ... use r0
>>
>> - We can remove this COPY because r0 is unused:
>>     r0 = COPY r1
>>     return
>>
>> - We cannot remove this COPY even if r1 appears unused afterwards. We also cannot replace r1 with a different register.
>>     r1 = COPY r0
>>
>> - We can reorder these instructions in any way:
>>     STORE r0 to r1+12
>>     STORE r0 to r1+8
>>     ... = LOAD from r1 + 20
>>
>> = Unallocatable Registers
>>
>> A reserved register is not allocatable. However, there are also registers which are unallocatable just because they are explicitly excluded from allocation orders; they are not reserved registers. This can be confusing, so I added this section talking about those!
>>
>> == Rules ==
>> They behave like normal registers, the only difference is that:
>> 1) The register allocator will never assign an unallocatable register to a virtual register.
>>
>> == Motivation ==
>> Typical examples of unallocatable but not reserved registers are:
>> - A CPU's flag register: The scheduler has to respect the order! We are interested in liveness, but we do not necessarily want to spill/reload them or perform register allocation on a single register.
>> - X87 floating point stack registers need special handling on top of the generic register allocation.
>>
>> == Implications ==
>> Except for the register allocator not using them, they behave like normal registers:
>> - We track the liveness of unallocatable registers.
>> - The scheduler respects data/output dependencies for unallocatable registers.
>>
>> == Examples ==
>> Assume r0 is a normal register and r1 is an unallocatable register (but not a reserved one):
>>
>> - We can remove the 2nd COPY here:
>>     r0 = COPY r1
>>     ... use r0
>>     r0 = COPY r1
>>     ... use r0
>>
>> - We can remove the following two COPYs because r0 and r1 are not used afterwards:
>>     r0 = COPY r1
>>     r1 = COPY r0
>>     return
>>
>> - We can replace r1 with a different (normal) register here (provided we replace all following uses):
>>     r1 = ...
>>     // ...
>>     = use r1
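
The reserved/unallocatable rules quoted above can be condensed into a small property table. The following is only a sketch under stated assumptions: `RegKind`, `RegProperties`, `propertiesOf`, `canRemoveRepeatedCopyFrom` and `canRemoveDeadDefOf` are hypothetical names invented for illustration and are not part of LLVM. The sketch encodes the proposal as written in the thread, not the behaviour of any actual LLVM pass.

```cpp
#include <cassert>

// Toy model of the proposed rules. All names here are hypothetical
// and do not correspond to LLVM API.
enum class RegKind { Normal, Unallocatable, Reserved };

struct RegProperties {
  bool allocatable;         // may the allocator assign it to a virtual register?
  bool livenessTracked;     // is liveness analysis meaningful for it?
  bool readsPredictable;    // do two reads with no def in between agree?
  bool orderingConstrained; // does the scheduler respect data/output deps on it?
};

inline RegProperties propertiesOf(RegKind kind) {
  switch (kind) {
  case RegKind::Normal:
    return {true, true, true, true};
  case RegKind::Unallocatable:
    // Behaves like a normal register, except the allocator never picks it.
    return {false, true, true, true};
  case RegKind::Reserved:
    // Rules 1-3: unpredictable reads, writes may have further effects,
    // and no reordering constraints are implied by the register itself.
    return {false, false, false, false};
  }
  return {false, false, false, false};
}

// "r0 = COPY r1; ...; r0 = COPY r1": the second COPY is redundant only
// if reading the source twice is guaranteed to give the same value.
inline bool canRemoveRepeatedCopyFrom(RegKind source) {
  return propertiesOf(source).readsPredictable;
}

// "r1 = COPY r0" with r1 apparently unused afterwards: the write may be
// dropped only if liveness of the destination is meaningful.
inline bool canRemoveDeadDefOf(RegKind destination) {
  return propertiesOf(destination).livenessTracked;
}
```

With this table, `canRemoveRepeatedCopyFrom(RegKind::Reserved)` is false while the same query for `RegKind::Unallocatable` is true, matching the two "Examples" sections in the quoted proposal. Constant registers, as raised by Jakob and Quentin, would need a fourth kind whose reads are predictable even though it stays unallocatable.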
Matthias Braun via llvm-dev
2016-Feb-26 19:07 UTC
[llvm-dev] Reserved/Unallocatable Registers
There is MachineRegisterInfo::isConstantPhysReg(). In the current implementation this just returns true if it cannot find any def operand for the register (or one of its aliases). I think we also write to zero registers at times, and then this function would return false... For this to work reliably, targets would need to provide the constant information explicitly.

For the "writing to them does not change their value": As long as we do not make any assumptions about the values in the register anyway (rule 1 for reserved registers), knowing this fact doesn't help... Though knowing that we have a zero register would indeed allow us to do some copy propagation, coalescing and removing unnecessary data dependencies.

- Matthias
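
The concern about isConstantPhysReg() in the message above can be illustrated with a toy model. This is a sketch under assumptions, not LLVM code: `Instr`, `looksConstantByScan` and `isConstantByTarget` are hypothetical names, registers are plain integers, and register aliases (which the thread says the real function also checks) are ignored. The first function mimics the "no def operand found" heuristic as described in the thread; the second mimics the suggested alternative where the target states explicitly which registers are constant.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy instruction: at most one defined register (-1 means none) plus uses.
// Hypothetical type for illustration; not an LLVM data structure.
struct Instr {
  int def;
  std::vector<int> uses;
};

// Heuristic as described in the thread: a physical register is treated
// as constant if no instruction in the function defines it.
inline bool looksConstantByScan(int reg, const std::vector<Instr> &function) {
  for (const Instr &instr : function)
    if (instr.def == reg)
      return false;
  return true;
}

// Suggested alternative: the target lists its constant registers
// explicitly, so writes to them do not affect the answer.
inline bool isConstantByTarget(int reg, const std::set<int> &constantRegs) {
  return constantRegs.count(reg) != 0;
}
```

If register 0 stands for a zero register and some instruction writes to it (for example to discard a result), `looksConstantByScan(0, fn)` returns false even though the register's value never changes, whereas `isConstantByTarget(0, {0})` still returns true. That is the gap Matthias points out.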
Quentin Colombet via llvm-dev
2016-Feb-26 19:13 UTC
[llvm-dev] Reserved/Unallocatable Registers
> On Feb 26, 2016, at 11:07 AM, Matthias Braun <mbraun at apple.com> wrote:
>
> There is MachineRegisterInfo::isConstantPhysReg(). In the current implementation this just returns true if it cannot find any def operand for the register (or one of its aliases). I think we also write to zero registers at times, and then this function would return false... For this to work reliably, targets would need to provide the constant information explicitly.
>
> For the "writing to them does not change their value": As long as we do not make any assumptions about the values in the register anyway (rule 1 for reserved registers), knowing this fact doesn't help…

That’s the thing: with a truly constant register, rule one does not apply, right? I.e., we could do dead code elimination and such.

That’s funny, I thought, like Jakob, that we had something different from the isConstantPhysReg thing, hmm…

> Though knowing that we have a zero register would indeed allow us to do some copy propagation, coalescing and removing unnecessary data dependencies.
>
> - Matthias