Mikael Holmén
2014-Aug-15 10:42 UTC
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi,

I have a problem regarding sub-register definitions and LiveIntervals on
our target. When a subregister is defined, other parts of the register
are always left untouched - they are neither read nor defined.

It however seems that Codegen treats subregister definitions as somehow
clobbering the whole register.

The SSA code looks like this after isel:

(Reg0 and Reg1 are 16-bit registers. Reg2, Reg3 and Reg4 are 32-bit
registers with 16-bit subregs, hi16 and lo16.)

    Reg0 = #imm0
    Reg1 = #imm1

    Reg2 = IMPLICIT_DEF
    Reg3 = INSERT_SUBREG Reg2, Reg0, hi16
    Reg4 = INSERT_SUBREG Reg3, Reg1, lo16

After TwoAddressInstructionPass it becomes:

    Reg5:hi16<def,read-undef> = Reg0
    Reg5:lo16<def> = Reg1

So, in my world this means a setting of the high 16 bits in Reg5 (not
affecting the low part) followed by a setting of the low 16 bits (not
affecting the high part). Is this how LLVM looks at it too?

(What does the "read-undef" part really mean, since in my
world, the setting of lo16 or hi16 does not in any way affect or
access the other part of the register.)

The question is: How should true subregister definitions be
expressed so that they do not interfere with each other? See the
detailed problem description below.

---

During RA it's decided Reg5 should be spilled and it's also decided Reg5
can be rematerialized:

    "Value Reg5:0 at 5000r may remat from Reg5:hi16<def,read-undef> = mv_any16 32766"

So it says Reg5 can be rematerialized by setting its high part...

We also get:

    reload:   5052r Reg5<def> = Load40FI <fi#2>
    rewrite:  5056r Reg5:lo16<def> = mv_nimm6_ar16 0

So it inserts a reload of the full Reg5 prior to the setting of
Reg5:lo16, because it thinks there is an implicit use of Reg5 when
writing the low part??? This seems very weird to me.

The decision is based on the fact that MachineOperand::readsReg()
returns true:

    /// readsReg - Returns true if this operand reads the previous value of its
    /// register.  A use operand with the <undef> flag set doesn't read its
    /// register.  A sub-register def implicitly reads the other parts of the
    /// register being redefined unless the <undef> flag is set.
    ///
    /// This refers to reading the register value from before the current
    /// instruction or bundle. Internal bundle reads are not included.
    bool readsReg() const {
      assert(isReg() && "Wrong MachineOperand accessor");
      return !isUndef() && !isInternalRead() && (isUse() || getSubReg());
    }

I don't get why we automatically should get an implicit use just
because we are writing a subreg.

Since Reg5:lo16 is defined with

    Reg5:lo16<def> = Reg1

isUndef() will return false and getSubReg() will be non-zero, and thus
readsReg() returns true and the reload is inserted.

Then we get

    *** Bad machine code: Instruction loads from dead spill slot ***

because the spill slot has not been written.

Most grateful for any help,
Mikael Holmén
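To make the readsReg() logic quoted above easier to follow, here is a minimal standalone sketch that evaluates the same boolean expression for the operands in this message. OperandModel, the sub-register index values and the three helper constructors are hypothetical stand-ins invented for illustration; they are not the real MachineOperand API.

```cpp
// Hypothetical stand-in for the few MachineOperand flags that readsReg()
// consults. This is not the real LLVM class.
struct OperandModel {
  bool IsDef;          // operand writes the register
  bool IsUndef;        // <undef> on a use, <read-undef> on a sub-register def
  bool IsInternalRead; // read is satisfied inside the same bundle
  unsigned SubReg;     // 0 = full register, non-zero = sub-register index
};

// Hypothetical sub-register indices for the example target.
enum : unsigned { NoSubReg = 0, hi16 = 1, lo16 = 2 };

// Same expression as the readsReg() body quoted above, with isUse()
// written as !IsDef and getSubReg() as a non-zero index test.
inline bool readsReg(const OperandModel &MO) {
  return !MO.IsUndef && !MO.IsInternalRead && (!MO.IsDef || MO.SubReg != 0);
}

// Reg5:hi16<def,read-undef> = Reg0
inline OperandModel hiDefReadUndef() { return {true, true, false, hi16}; }

// Reg5:lo16<def> = Reg1
inline OperandModel loDef() { return {true, false, false, lo16}; }

// Reg5<def> = ... (full-register def, for comparison)
inline OperandModel fullDef() { return {true, false, false, NoSubReg}; }
```

Under this model, readsReg(loDef()) is true while readsReg(hiDefReadUndef()) and readsReg(fullDef()) are false, which matches the observation that only the lo16 def triggers the reload.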
Quentin Colombet
2014-Aug-15 17:01 UTC
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi Mikael,

On Aug 15, 2014, at 3:42 AM, Mikael Holmén <mikael.holmen at ericsson.com> wrote:

> Hi,
>
> I have a problem regarding sub-register definitions and LiveIntervals on
> our target. When a subregister is defined, other parts of the register
> are always left untouched - they are neither read nor defined.
>
> It however seems that Codegen treats subregister definitions as somehow
> clobbering the whole register.
>
> The SSA code looks like this after isel:
>
> (Reg0 and Reg1 are 16-bit registers. Reg2, Reg3 and Reg4 are 32-bit
> registers with 16-bit subregs, hi16 and lo16.)
>
>     Reg0 = #imm0
>     Reg1 = #imm1
>
>     Reg2 = IMPLICIT_DEF
>     Reg3 = INSERT_SUBREG Reg2, Reg0, hi16
>     Reg4 = INSERT_SUBREG Reg3, Reg1, lo16
>
> After TwoAddressInstructionPass it becomes:
>
>     Reg5:hi16<def,read-undef> = Reg0
>     Reg5:lo16<def> = Reg1
>
> So, in my world this means a setting of the high 16 bits in Reg5 (not
> affecting the low part) followed by a setting of the low 16 bits (not
> affecting the high part). Is this how LLVM looks at it too?

Yes, it is.

> (What does the "read-undef" part really mean, since in my
> world, the setting of lo16 or hi16 does not in any way affect or
> access the other part of the register.)

read-undef means that we care only about the part we are defining.
Reg5:hi16<def,read-undef> means that we do not care about whatever is in
Reg5 other than hi16.
Reg5:lo16<def> means that we do care about the value of Reg5:hi16.

Here is the exact definition:

    /// IsUndef - True if this register operand reads an "undef" value, i.e. the
    /// read value doesn't matter.  This flag can be set on both use and def
    /// operands.  On a sub-register def operand, it refers to the part of the
    /// register that isn't written.  On a full-register def operand, it is a
    /// noop.  See readsReg().

> The question is: How should true subregister definitions be
> expressed so that they do not interfere with each other? See the
> detailed problem description below.

We do have a limitation in our current liveness tracking for
sub-registers. Therefore, I am not sure that is possible.

Conceptually, what you want is:

    Reg5:hi16<def,read-undef>
    Reg5:lo16<def,read-undef>

However, I guess this wouldn’t be supported, because it would mean that
we do not care about the value of hi16 at the definition of Reg5:lo16.
This is true, but after this definition we do care about hi16, and I am
afraid read-undef does not convey the right information for the
subsequent uses of Reg5.

You can give it a try and see how it goes.

> ---
>
> During RA it's decided Reg5 should be spilled and it's also decided Reg5
> can be rematerialized:
>
>     "Value Reg5:0 at 5000r may remat from Reg5:hi16<def,read-undef> = mv_any16 32766"
>
> So it says Reg5 can be rematerialized by setting its high part...
>
> We also get:
>
>     reload:   5052r Reg5<def> = Load40FI <fi#2>
>     rewrite:  5056r Reg5:lo16<def> = mv_nimm6_ar16 0
>
> So it inserts a reload of the full Reg5 prior to the setting of
> Reg5:lo16, because it thinks there is an implicit use of Reg5 when
> writing the low part??? This seems very weird to me.
>
> The decision is based on the fact that MachineOperand::readsReg()
> returns true:
>
>     /// readsReg - Returns true if this operand reads the previous value of its
>     /// register.  A use operand with the <undef> flag set doesn't read its
>     /// register.  A sub-register def implicitly reads the other parts of the
>     /// register being redefined unless the <undef> flag is set.
>     ///
>     /// This refers to reading the register value from before the current
>     /// instruction or bundle. Internal bundle reads are not included.
>     bool readsReg() const {
>       assert(isReg() && "Wrong MachineOperand accessor");
>       return !isUndef() && !isInternalRead() && (isUse() || getSubReg());
>     }
>
> I don't get why we automatically should get an implicit use just
> because we are writing a subreg.
>
> Since Reg5:lo16 is defined with
>
>     Reg5:lo16<def> = Reg1
>
> isUndef() will return false and getSubReg() true, and thus readsReg()
> true and the reload is inserted.
>
> Then we get
>
>     *** Bad machine code: Instruction loads from dead spill slot ***
>
> because the spill slot has not been written.

This is weird. Any chance you could share a test case?

Thanks,
-Quentin

> Most grateful for any help,
> Mikael Holmén
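The IsUndef definition quoted in this reply can be pictured with lane masks. The following is only a sketch under stated assumptions: the mask values, the LaneMask alias and the lanesReadByDef helper are made up for illustration, and "reads" is meant in the liveness sense (the old value of those lanes has to stay live up to the instruction), not as a hardware read.

```cpp
#include <cstdint>

// Hypothetical lane masks for the 32-bit register in this thread. The bit
// values are illustrative and not taken from any real target description.
using LaneMask = std::uint32_t;
constexpr LaneMask Lo16Lanes = 0x1;
constexpr LaneMask Hi16Lanes = 0x2;
constexpr LaneMask AllLanes = Lo16Lanes | Hi16Lanes;

// Lanes whose previous value a def operand is assumed to keep live,
// following the IsUndef comment: a sub-register def implicitly reads the
// lanes it does not write, unless <read-undef> is set. A full-register def
// writes every lane and therefore reads nothing.
constexpr LaneMask lanesReadByDef(LaneMask Written, bool ReadUndef) {
  return ReadUndef ? 0 : (AllLanes & ~Written);
}
```

With these assumptions, lanesReadByDef(Hi16Lanes, true) is empty for Reg5:hi16<def,read-undef>, while lanesReadByDef(Lo16Lanes, false) yields Hi16Lanes for Reg5:lo16<def>: the second def keeps the previously written high half alive.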
Mikael Holmén
2014-Aug-19 11:09 UTC
[LLVMdev] Help with definition of subregisters; spill, rematerialization and implicit uses
Hi Quentin,

On 08/15/14 19:01, Quentin Colombet wrote:
[...]
>> The question is: How should true subregister definitions be
>> expressed so that they do not interfere with each other? See the
>> detailed problem description below.
>
> We do have a limitation in our current liveness tracking for
> sub-register. Therefore, I am not sure that is possible.
>
> Conceptually, what you want is:
> Reg5:hi16<def,read-undef>
> Reg5:lo16<def,read-undef>
>
> However, I guess this wouldn’t be supported, because it would mean that
> we do not care about the value of hi16 at the definition of Reg5:lo16.
> This is true, but after this definition we do care about hi16 and I am
> afraid read-undef does not convey the right information for the
> subsequent uses of Reg5.
>
> You can give it a try and see how it goes.

I tried setting isUndef to true when handling INSERT_SUBREG in
TwoAddressInstructionPass.cpp, but then I run into stuff like this instead:

    832B %vreg50:hi16<def,read-undef> = COPY %vreg0
    848B ...
    864B %vreg19<def,dead> = COPY %vreg50
    880B %vreg19:lo16<def,read-undef> = COPY %vreg73
    896B ...
    912B mv_a32_r16_rmod1 %vreg19, %vreg20
    ...

    *** Bad machine code: Multiple connected components in live interval ***
    - function:    fixedconv
    - interval:    %vreg19 [864r,864d:0)[880r,1024r:1)  0 at 864r 1 at 880r
    0: valnos 0
    1: valnos 1

So here, the definitions of both the hi16 and lo16 parts are marked with
read-undef, as wanted. However

    864B %vreg19<def,dead> = COPY %vreg50

looks suspicious to me. It's like it thinks that

    880B %vreg19:lo16<def,read-undef> = COPY %vreg73

is redefining the whole of vreg19, not only the lo16 part, and since this
instruction has read-undef, it thinks no part of vreg19, not even hi16,
is live over instruction 880.

>> isUndef() will return false and getSubReg() true, and thus readsReg()
>> true and the reload is inserted.
>>
>> Then we get
>>
>> *** Bad machine code: Instruction loads from dead spill slot ***
>>
>> because the spill slot has not been written.
>
> This is weird. Any chance you could share a test case?

Unfortunately not. I'm running our out-of-tree backend and I've no idea
if anything like this ever happens in other backends :(

Thanks!
/Mikael
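The <def,dead> marking at 864B can be reproduced with a toy model. This is a sketch that assumes liveness is tracked per whole virtual register with no per-lane information, which is one reading of the limitation mentioned earlier in the thread; the Access enum and the deadDefs function are hypothetical names and do not correspond to LLVM code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical access kinds for one virtual register, in program order.
enum class Access {
  FullDef,         // %v<def>
  SubDef,          // %v:sub<def>, keeps the other lanes of the old value
  SubDefReadUndef, // %v:sub<def,read-undef>, ignores the old value
  Use              // any read of %v
};

// True if the access needs the value that was live before it.
inline bool readsPrevious(Access A) {
  return A == Access::SubDef || A == Access::Use;
}

inline bool isDef(Access A) { return A != Access::Use; }

// For each access, report whether it is a def whose value is never read
// before the whole register is redefined or the sequence ends.
inline std::vector<bool> deadDefs(const std::vector<Access> &Seq) {
  std::vector<bool> Dead(Seq.size(), false);
  for (std::size_t I = 0; I < Seq.size(); ++I) {
    if (!isDef(Seq[I]))
      continue;
    // With a single live range per register, the next access decides:
    // it either reads the current value or replaces it outright.
    Dead[I] = I + 1 == Seq.size() || !readsPrevious(Seq[I + 1]);
  }
  return Dead;
}
```

Feeding it the %vreg19 sequence from the listing (FullDef at 864B, SubDefReadUndef at 880B, Use at 912B) marks the first def as dead, whereas replacing the middle entry with a plain SubDef keeps every def live. In this model a read-undef sub-register def behaves like a redefinition of the entire register.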