thr3ads.net - llvm dev - [llvm-dev] Extending Register Rematerialization [Dec 2016]

If this information is useful, please help other people find it:
Share via:

Nirav Rana via llvm-dev

2016-Nov-27 20:37 UTC

[llvm-dev] Extending Register Rematerialization

Hello LLVM Developers,

We are working on extending currently available register rematerialization
to include cases where sequence of multiple instructions is required to
rematerialize a value.

We had a discussion on this in community mailing list and link is here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
>From the above discussion and studying the code we believe that extensioncan be implemented in same flow as current remat is implemented. What we
unterstood is RegAlloc<>.cpp will try to allocate register to live-range,
and if not possible, will call InlineSpiller.cpp to spill the live range.
InlineSpiller.cpp will try to first rematerialize the register value if
possible with help of LiveRangeEdit.cpp which provides various methods for
checking if value is rematable or not.

So we have added a new function in LiveRangeEdit that traverses sequence of
instruction in use-def chain recursively (instead of only current
instruction in consideration) upto depth 6 (arbitrarily taken for
testing) to check if value can be rematerialized with the sequence of
instruction or not.

Here is the code:
//New function added for checking complex multi-instruction-sequence
rematerializable
bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
                                          const MachineInstr *DefMI,
                                          unsigned int depth,
                                          AliasAnalysis *aa) {
  if(TII.isReMaterializablePossible(*DefMI, aa))
    return false;
  DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
  for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
    const MachineOperand &MO = DefMI->getOperand(i);

    if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
      continue;
    if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
      if (MRI.isConstantPhysReg(MO.getReg(),
*DefMI->getParent()->getParent()))
        continue;
      //If not constant then check its def
      if(depth > 6)
        return false;

      LiveInterval &li = LIS.getInterval(MO.getReg());
      SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
      VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);

      MachineInstr *NewDefMI = LIS.getInstructionFromIndex(UseVNInfo->def);
      if(!checkComplexRematerializable(UseVNInfo, NewDefMI, depth+1, aa))
        return false;
    }
  }
  Remattable.insert(VNI);     //May have to add new data structure
  return true;
}

In above function we are calling a new function
TII.isReMaterializablePossible(*DefMI, aa) which will act as early
heuristic and return false by checking if instruction is definitely not
rematerialize. We have found some cases
from TargetInstrInfo::isReallyTriviallyReMaterializableGeneric and code for
same is here:

bool TargetInstrInfo::isReMaterializablePossible(
    const MachineInstr &MI, AliasAnalysis *AA) const {
  const MachineFunction &MF = *MI.getParent()->getParent();
  const MachineRegisterInfo &MRI = MF.getRegInfo();

  // Remat clients assume operand 0 is the defined register.
  if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
    return false;
  unsigned DefReg = MI.getOperand(0).getReg();

  // A sub-register definition can only be rematerialized if the instruction
  // doesn't read the other parts of the register.  Otherwise it is really a
  // read-modify-write operation on the full virtual register which cannot
be
  // moved safely.
  if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
      MI.getOperand(0).getSubReg() && MI.readsVirtualRegister(DefReg))
    return false;

  // Avoid instructions obviously unsafe for remat.
  if (MI.isNotDuplicable() || MI.mayStore() || MI.hasUnmodeledSideEffects())
    return false;

  // Don't remat inline asm. We have no idea how expensive it is
  // even if it's side effect free.
  if (MI.isInlineAsm())
    return false;
}

We have following doubts and require guidance and suggestion to move ahead:
1. Is the approach we are following feasible?
2. What will be the suitable method to store the sequence of instruction
for recomputing value which will be used during transformation.
3. Suggestion for deciding termination condition for checking use-def chain
as it should be terminated when remat will be costly that spill.
4. What other cases or instruction could be included in
isReMaterializablePossible() function. Some suggestions for direction to
look in.

Any other suggestions will also be helpful for us to move in right
direction.

- Nirav
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161128/e94038db/attachment.html>

Hal Finkel via llvm-dev

2016-Dec-01 21:36 UTC

head link

[llvm-dev] Extending Register Rematerialization

----- Original Message -----
> From: "Nirav Rana" <nirav076 at gmail.com>
> To: hfinkel at anl.gov, llvm-dev at lists.llvm.org
> Cc: "Pandya Vivek" <h2015078 at pilani.bits-pilani.ac.in>,
> h2015089 at pilani.bits-pilani.ac.in, h2015172 at pilani.bits-pilani.ac.in
> Sent: Sunday, November 27, 2016 2:37:14 PM
> Subject: Extending Register Rematerialization
> Hello LLVM Developers,
> We are working on extending currently available register
> rematerialization to include cases where sequence of multiple
> instructions is required to rematerialize a value.
> We had a discussion on this in community mailing list and link is
> here:
> http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
> From the above discussion and studying the code we believe that
> extension can be implemented in same flow as current remat is
> implemented. What we unterstood is RegAlloc<>.cpp will try to
> allocate register to live-range, and if not possible, will call
> InlineSpiller.cpp to spill the live range. InlineSpiller.cpp will
> try to first rematerialize the register value if possible with help
> of LiveRangeEdit.cpp which provides various methods for checking if
> value is rematable or not.
> So we have added a new function in LiveRangeEdit that traverses
> sequence of instruction in use-def chain recursively (instead of
> only current instruction in consideration) upto depth 6 (arbitrarily
> taken for testing) to check if value can be rematerialized with the
> sequence of instruction or not.
> Here is the code:
> //New function added for checking complex multi-instruction-sequence
> rematerializable
> bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
> const MachineInstr *DefMI,
> unsigned int depth,
> AliasAnalysis *aa) {
> if(TII.isReMaterializablePossible(*DefMI, aa))
> return false;
> DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
> for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
> const MachineOperand &MO = DefMI->getOperand(i);
> if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
> continue;
> if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
> if (MRI.isConstantPhysReg(MO.getReg(),
> *DefMI->getParent()->getParent()))
> continue;
> //If not constant then check its def
> if(depth > 6)
> return false;
> LiveInterval &li = LIS.getInterval(MO.getReg());
> SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
> VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);
> MachineInstr *NewDefMI = LIS.getInstructionFromIndex(UseVNInfo->def);
> if(!checkComplexRematerializable(UseVNInfo, NewDefMI, depth+1, aa))
> return false;
> }
> }
> Remattable.insert(VNI); //May have to add new data structure
> return true;
> }
> In above function we are calling a new function
> TII.isReMaterializablePossible(*DefMI, aa) which will act as early
> heuristic and return false by checking if instruction is definitely
> not rematerialize. We have found some cases from
> TargetInstrInfo::isReallyTriviallyReMaterializableGeneric and code
> for same is here:
> bool TargetInstrInfo::isReMaterializablePossible(
> const MachineInstr &MI, AliasAnalysis *AA) const {
> const MachineFunction &MF = *MI.getParent()->getParent();
> const MachineRegisterInfo &MRI = MF.getRegInfo();
> // Remat clients assume operand 0 is the defined register.
> if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
> return false;
> unsigned DefReg = MI.getOperand(0).getReg();
> // A sub-register definition can only be rematerialized if the
> instruction
> // doesn't read the other parts of the register. Otherwise it is
> really a
> // read-modify-write operation on the full virtual register which
> cannot be
> // moved safely.
> if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
> MI.getOperand(0).getSubReg() && MI.readsVirtualRegister(DefReg))
> return false;
> // Avoid instructions obviously unsafe for remat.
> if (MI.isNotDuplicable() || MI.mayStore() ||
> MI.hasUnmodeledSideEffects())
> return false;
> // Don't remat inline asm. We have no idea how expensive it is
> // even if it's side effect free.
> if (MI.isInlineAsm())
> return false;
> }
> We have following doubts and require guidance and suggestion to move
> ahead:
> 1. Is the approach we are following feasible?
> 2. What will be the suitable method to store the sequence of
> instruction for recomputing value which will be used during
> transformation.
> 3. Suggestion for deciding termination condition for checking use-def
> chain as it should be terminated when remat will be costly that
> spill.
> 4. What other cases or instruction could be included in
> isReMaterializablePossible() function. Some suggestions for
> direction to look in.
> Any other suggestions will also be helpful for us to move in right
> direction.I think sounds feasible. Regarding the second question, I'm not sure what
you're asking.

Regarding isReMaterializablePossible(), if you're allowing instructions that
read from memory, and many of the use cases fall into that category, you'll
need to consider whether you'd be moving any potential load past some
aliasing store. We have code that does these kinds of checks in the instruction
scheduler (in ScheduleDAGInstrs). The instruction scheduler builds a graph
structure detailing these dependencies; maybe we should just keep it around for
RA?

One of the issues we obviously need to deal with is the relative cost of
rematerialization vs. the spill/restore. We have code that does this kind of
comparison today in the MachineCombiner, and I think that the same kind of
comparison is needed here.

-Hal 
> - Nirav-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161201/e11e0777/attachment.html>

Gerolf Hoflehner via llvm-dev

2016-Dec-02 00:14 UTC

head link

[llvm-dev] Extending Register Rematerialization

On which targets & apps/benchmarks do you expect a speed-up? In practice I
expect spills/fills to be hard to beat by longer remat sequences.

Thanks
Gerolf
> On Nov 27, 2016, at 12:37 PM, Nirav Rana via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Hello LLVM Developers,
> 
> We are working on extending currently available register rematerialization
to include cases where sequence of multiple instructions is required to
rematerialize a value.
> 
> We had a discussion on this in community mailing list and link is here:  
> http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
<http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777>
> 
> From the above discussion and studying the code we believe that extension
can be implemented in same flow as current remat is implemented. What we
unterstood is RegAlloc<>.cpp will try to allocate register to live-range,
and if not possible, will call InlineSpiller.cpp to spill the live range.
InlineSpiller.cpp will try to first rematerialize the register value if possible
with help of LiveRangeEdit.cpp which provides various methods for checking if
value is rematable or not.
> 
> So we have added a new function in LiveRangeEdit that traverses sequence of
instruction in use-def chain recursively (instead of only current instruction in
consideration) upto depth 6 (arbitrarily taken for testing) to check if value
can be rematerialized with the sequence of instruction or not.
> 
> Here is the code:
> //New function added for checking complex multi-instruction-sequence
rematerializable
> bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
>                                           const MachineInstr *DefMI,
>                                           unsigned int depth,
>                                           AliasAnalysis *aa) {
>   if(TII.isReMaterializablePossible(*DefMI, aa))
>     return false;
>   DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
>   for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
>     const MachineOperand &MO = DefMI->getOperand(i);
> 
>     if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
>       continue;
>     if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
>       if (MRI.isConstantPhysReg(MO.getReg(),
*DefMI->getParent()->getParent()))
>         continue;
>       //If not constant then check its def
>       if(depth > 6)
>         return false;
> 
>       LiveInterval &li = LIS.getInterval(MO.getReg());
>       SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
>       VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);
> 
>       MachineInstr *NewDefMI =
LIS.getInstructionFromIndex(UseVNInfo->def);
>       if(!checkComplexRematerializable(UseVNInfo, NewDefMI, depth+1, aa))
>         return false;
>     }
>   }
>   Remattable.insert(VNI);     //May have to add new data structure
>   return true;
> } 
> 
> In above function we are calling a new function
TII.isReMaterializablePossible(*DefMI, aa) which will act as early heuristic and
return false by checking if instruction is definitely not rematerialize. We have
found some cases from TargetInstrInfo::isReallyTriviallyReMaterializableGeneric
and code for same is here:
> 
> bool TargetInstrInfo::isReMaterializablePossible(
>     const MachineInstr &MI, AliasAnalysis *AA) const {
>   const MachineFunction &MF = *MI.getParent()->getParent();
>   const MachineRegisterInfo &MRI = MF.getRegInfo();
> 
>   // Remat clients assume operand 0 is the defined register.
>   if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
>     return false;
>   unsigned DefReg = MI.getOperand(0).getReg();
> 
>   // A sub-register definition can only be rematerialized if the
instruction
>   // doesn't read the other parts of the register.  Otherwise it is
really a
>   // read-modify-write operation on the full virtual register which cannot
be
>   // moved safely.
>   if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
>       MI.getOperand(0).getSubReg() &&
MI.readsVirtualRegister(DefReg))
>     return false;
> 
>   // Avoid instructions obviously unsafe for remat.
>   if (MI.isNotDuplicable() || MI.mayStore() ||
MI.hasUnmodeledSideEffects())
>     return false;
> 
>   // Don't remat inline asm. We have no idea how expensive it is
>   // even if it's side effect free.
>   if (MI.isInlineAsm())
>     return false;
> }
> 
> We have following doubts and require guidance and suggestion to move ahead:
> 1. Is the approach we are following feasible?
> 2. What will be the suitable method to store the sequence of instruction
for recomputing value which will be used during transformation.
> 3. Suggestion for deciding termination condition for checking use-def chain
as it should be terminated when remat will be costly that spill.
> 4. What other cases or instruction could be included in
isReMaterializablePossible() function. Some suggestions for direction to look
in.
> 
> Any other suggestions will also be helpful for us to move in right
direction.
> 
> - Nirav
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161201/5ac8d084/attachment.html>

Nirav Rana via llvm-dev

2016-Dec-02 17:13 UTC

head link

[llvm-dev] Extending Register Rematerialization

On Fri, Dec 2, 2016 at 3:06 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ------------------------------
>
> *From: *"Nirav Rana" <nirav076 at gmail.com>
> *To: *hfinkel at anl.gov, llvm-dev at lists.llvm.org
> *Cc: *"Pandya Vivek" <h2015078 at
pilani.bits-pilani.ac.in>,
> h2015089 at pilani.bits-pilani.ac.in, h2015172 at pilani.bits-pilani.ac.in
> *Sent: *Sunday, November 27, 2016 2:37:14 PM
> *Subject: *Extending Register Rematerialization
>
>
> Hello LLVM Developers,
>
> We are working on extending currently available register rematerialization
> to include cases where sequence of multiple instructions is required to
> rematerialize a value.
>
> We had a discussion on this in community mailing list and link is here:
> http://lists.llvm.org/pipermail/llvm-dev/2016-September/
> subject.html#104777
>
> From the above discussion and studying the code we believe that extension
> can be implemented in same flow as current remat is implemented. What we
> unterstood is RegAlloc<>.cpp will try to allocate register to
live-range,
> and if not possible, will call InlineSpiller.cpp to spill the live range.
> InlineSpiller.cpp will try to first rematerialize the register value if
> possible with help of LiveRangeEdit.cpp which provides various methods for
> checking if value is rematable or not.
>
> So we have added a new function in LiveRangeEdit that traverses sequence
> of instruction in use-def chain recursively (instead of only current
> instruction in consideration) upto depth 6 (arbitrarily taken for
> testing) to check if value can be rematerialized with the sequence of
> instruction or not.
>
> Here is the code:
> //New function added for checking complex multi-instruction-sequence
> rematerializable
> bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
>                                           const MachineInstr *DefMI,
>                                           unsigned int depth,
>                                           AliasAnalysis *aa) {
>   if(TII.isReMaterializablePossible(*DefMI, aa))
>     return false;
>   DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
>   for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
>     const MachineOperand &MO = DefMI->getOperand(i);
>
>     if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
>       continue;
>     if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
>       if (MRI.isConstantPhysReg(MO.getReg(),
> *DefMI->getParent()->getParent()))
>         continue;
>       //If not constant then check its def
>       if(depth > 6)
>         return false;
>
>       LiveInterval &li = LIS.getInterval(MO.getReg());
>       SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
>       VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);
>
>       MachineInstr *NewDefMI = LIS.getInstructionFromIndex(Us
> eVNInfo->def);
>       if(!checkComplexRematerializable(UseVNInfo, NewDefMI, depth+1, aa))
>         return false;
>     }
>   }
>   Remattable.insert(VNI);     //May have to add new data structure
>   return true;
> }
>
> In above function we are calling a new function
> TII.isReMaterializablePossible(*DefMI, aa) which will act as early
> heuristic and return false by checking if instruction is definitely not
> rematerialize. We have found some cases from TargetInstrInfo::isReally
> TriviallyReMaterializableGeneric and code for same is here:
>
> bool TargetInstrInfo::isReMaterializablePossible(
>     const MachineInstr &MI, AliasAnalysis *AA) const {
>   const MachineFunction &MF = *MI.getParent()->getParent();
>   const MachineRegisterInfo &MRI = MF.getRegInfo();
>
>   // Remat clients assume operand 0 is the defined register.
>   if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
>     return false;
>   unsigned DefReg = MI.getOperand(0).getReg();
>
>   // A sub-register definition can only be rematerialized if the
> instruction
>   // doesn't read the other parts of the register.  Otherwise it is
really
> a
>   // read-modify-write operation on the full virtual register which cannot
> be
>   // moved safely.
>   if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
>       MI.getOperand(0).getSubReg() &&
MI.readsVirtualRegister(DefReg))
>     return false;
>
>   // Avoid instructions obviously unsafe for remat.
>   if (MI.isNotDuplicable() || MI.mayStore() ||
> MI.hasUnmodeledSideEffects())
>     return false;
>
>   // Don't remat inline asm. We have no idea how expensive it is
>   // even if it's side effect free.
>   if (MI.isInlineAsm())
>     return false;
> }
>
> We have following doubts and require guidance and suggestion to move ahead:
> 1. Is the approach we are following feasible?
> 2. What will be the suitable method to store the sequence of instruction
> for recomputing value which will be used during transformation.
> 3. Suggestion for deciding termination condition for checking use-def
> chain as it should be terminated when remat will be costly that spill.
> 4. What other cases or instruction could be included in
> isReMaterializablePossible() function. Some suggestions for direction to
> look in.
>
> Any other suggestions will also be helpful for us to move in right
> direction.
>
> I think sounds feasible. Regarding the second question, I'm not sure
what
> you're asking.
>
Current rematerialization first checks if there is any remat possible or
not and stores this information in Remattable variable. Actual remat is
done by again getting the def, which is fine because we only need to get
def (only one step above). But because in this case we are traversing
use-def chain for multiple instructions, if we do not store this
information of sequence of instructions required to recompute value, doing
it again for actual remat would be redundant work. So my question is what
structure could be used for storing this information, that can be used to
clone and add those instruction for recomputing the value. Will a
SmallVector< TinyPtrVector<const MachineInstr *, 6>> would be write
choice?
where TinyPtrVector with store sequence of instructions for remat.

>
> Regarding isReMaterializablePossible(), if you're allowing instructions
> that read from memory, and many of the use cases fall into that category,
> you'll need to consider whether you'd be moving any potential load
past
> some aliasing store. We have code that does these kinds of checks in the
> instruction scheduler (in ScheduleDAGInstrs). The instruction scheduler
> builds a graph structure detailing these dependencies; maybe we should just
> keep it around for RA?
>
>One of the issues we obviously need to deal with is the relative cost
of> rematerialization vs. the spill/restore. We have code that does this kind
> of comparison today in the MachineCombiner, and I think that the same kind
> of comparison is needed here.
>
Ok we will study both these modules (ScheduleDAGInstrs and MachineCombiner).

>
>  -Hal
>
>
> - Nirav
>
>
>
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161202/87cc4552/attachment.html>

Nirav Rana via llvm-dev

2016-Dec-02 17:13 UTC

head link

[llvm-dev] Extending Register Rematerialization

On Fri, Dec 2, 2016 at 5:44 AM, Gerolf Hoflehner <ghoflehner at apple.com>
wrote:
> On which targets & apps/benchmarks do you expect a speed-up? In
practice I
> expect spills/fills to be hard to beat by longer remat sequences.
>
We think that this will be helpful in RISC architectures such as MIPS, ARM
and PowerPC. An example case is mentioned in first mail in mail-chain I
gave link of. But as far as benchmarks is concerned, we are still looking
for examples in testsuit.

>
> Thanks
> Gerolf
>
> On Nov 27, 2016, at 12:37 PM, Nirav Rana via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hello LLVM Developers,
>
> We are working on extending currently available register rematerialization
> to include cases where sequence of multiple instructions is required to
> rematerialize a value.
>
> We had a discussion on this in community mailing list and link is here:
> http://lists.llvm.org/pipermail/llvm-dev/2016-
> September/subject.html#104777
>
> From the above discussion and studying the code we believe that extension
> can be implemented in same flow as current remat is implemented. What we
> unterstood is RegAlloc<>.cpp will try to allocate register to
live-range,
> and if not possible, will call InlineSpiller.cpp to spill the live range.
> InlineSpiller.cpp will try to first rematerialize the register value if
> possible with help of LiveRangeEdit.cpp which provides various methods for
> checking if value is rematable or not.
>
> So we have added a new function in LiveRangeEdit that traverses sequence
> of instruction in use-def chain recursively (instead of only current
> instruction in consideration) upto depth 6 (arbitrarily taken for
> testing) to check if value can be rematerialized with the sequence of
> instruction or not.
>
> Here is the code:
> //New function added for checking complex multi-instruction-sequence
> rematerializable
> bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
>                                           const MachineInstr *DefMI,
>                                           unsigned int depth,
>                                           AliasAnalysis *aa) {
>   if(TII.isReMaterializablePossible(*DefMI, aa))
>     return false;
>   DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
>   for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
>     const MachineOperand &MO = DefMI->getOperand(i);
>
>     if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
>       continue;
>     if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
>       if (MRI.isConstantPhysReg(MO.getReg(), *DefMI->getParent()->
> getParent()))
>         continue;
>       //If not constant then check its def
>       if(depth > 6)
>         return false;
>
>       LiveInterval &li = LIS.getInterval(MO.getReg());
>       SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
>       VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);
>
>       MachineInstr *NewDefMI = LIS.getInstructionFromIndex(
> UseVNInfo->def);
>       if(!checkComplexRematerializable(UseVNInfo, NewDefMI, depth+1, aa))
>         return false;
>     }
>   }
>   Remattable.insert(VNI);     //May have to add new data structure
>   return true;
> }
>
> In above function we are calling a new function TII.
> isReMaterializablePossible(*DefMI, aa) which will act as early heuristic
> and return false by checking if instruction is definitely not
> rematerialize. We have found some cases from TargetInstrInfo::
> isReallyTriviallyReMaterializableGeneric and code for same is here:
>
> bool TargetInstrInfo::isReMaterializablePossible(
>     const MachineInstr &MI, AliasAnalysis *AA) const {
>   const MachineFunction &MF = *MI.getParent()->getParent();
>   const MachineRegisterInfo &MRI = MF.getRegInfo();
>
>   // Remat clients assume operand 0 is the defined register.
>   if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
>     return false;
>   unsigned DefReg = MI.getOperand(0).getReg();
>
>   // A sub-register definition can only be rematerialized if the
> instruction
>   // doesn't read the other parts of the register.  Otherwise it is
really
> a
>   // read-modify-write operation on the full virtual register which cannot
> be
>   // moved safely.
>   if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
>       MI.getOperand(0).getSubReg() &&
MI.readsVirtualRegister(DefReg))
>     return false;
>
>   // Avoid instructions obviously unsafe for remat.
>   if (MI.isNotDuplicable() || MI.mayStore() ||
> MI.hasUnmodeledSideEffects())
>     return false;
>
>   // Don't remat inline asm. We have no idea how expensive it is
>   // even if it's side effect free.
>   if (MI.isInlineAsm())
>     return false;
> }
>
> We have following doubts and require guidance and suggestion to move ahead:
> 1. Is the approach we are following feasible?
> 2. What will be the suitable method to store the sequence of instruction
> for recomputing value which will be used during transformation.
> 3. Suggestion for deciding termination condition for checking use-def
> chain as it should be terminated when remat will be costly that spill.
> 4. What other cases or instruction could be included in
> isReMaterializablePossible() function. Some suggestions for direction to
> look in.
>
> Any other suggestions will also be helpful for us to move in right
> direction.
>
> - Nirav
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161202/2c6ee4cb/attachment.html>

Hal Finkel via llvm-dev

2016-Dec-02 20:36 UTC

head link

[llvm-dev] Extending Register Rematerialization

----- Original Message -----
> From: "Gerolf Hoflehner" <ghoflehner at apple.com>
> To: "Nirav Rana" <nirav076 at gmail.com>
> Cc: hfinkel at anl.gov, llvm-dev at lists.llvm.org, "Pandya
Vivek"
> <h2015078 at pilani.bits-pilani.ac.in>,
> h2015089 at pilani.bits-pilani.ac.in, h2015172 at pilani.bits-pilani.ac.in
> Sent: Thursday, December 1, 2016 6:14:06 PM
> Subject: Re: [llvm-dev] Extending Register Rematerialization
> On which targets & apps/benchmarks do you expect a speed-up? In
> practice I expect spills/fills to be hard to beat by longer remat
> sequences.Why? 

Perhaps it depends on how you define "longer." A larger OOO core with
multiple pipelines can often execute a materialization sequence consisting of
several instructions faster than it can get data from the L1 cache. If the code
is already putting pressure on the load/store units then the extra spill/restore
code can be noticeably worse.

Note the following from AArch64InstrInfo.td: 

let isReMaterializable = 1, isCodeGenOnly = 1, isMoveImm = 1, 
isAsCheapAsAMove = 1 in { 
// FIXME: The following pseudo instructions are only needed because remat 
// cannot handle multiple instructions. When that changes, we can select 
// directly to the real instructions and get rid of these pseudos. 

def MOVi32imm 
: Pseudo<(outs GPR32:$dst), (ins i32imm:$src), 
[(set GPR32:$dst, imm:$src)]>, 
Sched<[WriteImm]>; 
def MOVi64imm 
: Pseudo<(outs GPR64:$dst), (ins i64imm:$src), 
[(set GPR64:$dst, imm:$src)]>, 
Sched<[WriteImm]>; 
} // isReMaterializable, isCodeGenOnly 

Also, I think that Ivan Baev's talk at the developers' meeting a couple
of years ago also provides some good hints of places to look for where this
might matter: http://llvm.org/devmtg/2014-10/#talk20

Thanks again, 
Hal 
> Thanks
> Gerolf
> > On Nov 27, 2016, at 12:37 PM, Nirav Rana via llvm-dev <
> > llvm-dev at lists.llvm.org > wrote:
> 
> > Hello LLVM Developers,
> 
> > We are working on extending currently available register
> > rematerialization to include cases where sequence of multiple
> > instructions is required to rematerialize a value.
> 
> > We had a discussion on this in community mailing list and link is
> > here:
> 
> >
http://lists.llvm.org/pipermail/llvm-dev/2016-September/subject.html#104777
> 
> > From the above discussion and studying the code we believe that
> > extension can be implemented in same flow as current remat is
> > implemented. What we unterstood is RegAlloc<>.cpp will try to
> > allocate register to live-range, and if not possible, will call
> > InlineSpiller.cpp to spill the live range. InlineSpiller.cpp will
> > try to first rematerialize the register value if possible with help
> > of LiveRangeEdit.cpp which provides various methods for checking if
> > value is rematable or not.
> 
> > So we have added a new function in LiveRangeEdit that traverses
> > sequence of instruction in use-def chain recursively (instead of
> > only current instruction in consideration) upto depth 6
> > (arbitrarily
> > taken for testing) to check if value can be rematerialized with the
> > sequence of instruction or not.
> 
> > Here is the code:
> 
> > //New function added for checking complex
> > multi-instruction-sequence
> > rematerializable
> 
> > bool LiveRangeEdit::checkComplexRematerializable(VNInfo *VNI,
> 
> > const MachineInstr *DefMI,
> 
> > unsigned int depth,
> 
> > AliasAnalysis *aa) {
> 
> > if(TII.isReMaterializablePossible(*DefMI, aa))
> 
> > return false;
> 
> > DEBUG(dbgs() << " ComplexRemat MI: " << *DefMI);
> 
> > for (unsigned i = 0, e = DefMI->getNumOperands(); i != e; ++i) {
> 
> > const MachineOperand &MO = DefMI->getOperand(i);
> 
> > if (!MO.isReg() || !MO.getReg() || !MO.readsReg())
> 
> > continue;
> 
> > if (TargetRegisterInfo::isPhysicalRegister(MO.getReg())) {
> 
> > if (MRI.isConstantPhysReg(MO.getReg(),
> > *DefMI->getParent()->getParent()))
> 
> > continue;
> 
> > //If not constant then check its def
> 
> > if(depth > 6)
> 
> > return false;
> 
> > LiveInterval &li = LIS.getInterval(MO.getReg());
> 
> > SlotIndex UseIdx = LIS.getInstructionIndex(*DefMI);
> 
> > VNInfo *UseVNInfo = li.getVNInfoAt(UseIdx);
> 
> > MachineInstr *NewDefMI > >
LIS.getInstructionFromIndex(UseVNInfo->def);
> 
> > if(!checkComplexRematerializable(UseVNInfo, NewDefMI, depth+1, aa))
> 
> > return false;
> 
> > }
> 
> > }
> 
> > Remattable.insert(VNI); //May have to add new data structure
> 
> > return true;
> 
> > }
> 
> > In above function we are calling a new function
> > TII.isReMaterializablePossible(*DefMI, aa) which will act as early
> > heuristic and return false by checking if instruction is definitely
> > not rematerialize. We have found some cases from
> > TargetInstrInfo::isReallyTriviallyReMaterializableGeneric and code
> > for same is here:
> 
> > bool TargetInstrInfo::isReMaterializablePossible(
> 
> > const MachineInstr &MI, AliasAnalysis *AA) const {
> 
> > const MachineFunction &MF = *MI.getParent()->getParent();
> 
> > const MachineRegisterInfo &MRI = MF.getRegInfo();
> 
> > // Remat clients assume operand 0 is the defined register.
> 
> > if (!MI.getNumOperands() || !MI.getOperand(0).isReg())
> 
> > return false;
> 
> > unsigned DefReg = MI.getOperand(0).getReg();
> 
> > // A sub-register definition can only be rematerialized if the
> > instruction
> 
> > // doesn't read the other parts of the register. Otherwise it is
> > really a
> 
> > // read-modify-write operation on the full virtual register which
> > cannot be
> 
> > // moved safely.
> 
> > if (TargetRegisterInfo::isVirtualRegister(DefReg) &&
> 
> > MI.getOperand(0).getSubReg() &&
MI.readsVirtualRegister(DefReg))
> 
> > return false;
> 
> > // Avoid instructions obviously unsafe for remat.
> 
> > if (MI.isNotDuplicable() || MI.mayStore() ||
> > MI.hasUnmodeledSideEffects())
> 
> > return false;
> 
> > // Don't remat inline asm. We have no idea how expensive it is
> 
> > // even if it's side effect free.
> 
> > if (MI.isInlineAsm())
> 
> > return false;
> 
> > }
> 
> > We have following doubts and require guidance and suggestion to
> > move
> > ahead:
> 
> > 1. Is the approach we are following feasible?
> 
> > 2. What will be the suitable method to store the sequence of
> > instruction for recomputing value which will be used during
> > transformation.
> 
> > 3. Suggestion for deciding termination condition for checking
> > use-def
> > chain as it should be terminated when remat will be costly that
> > spill.
> 
> > 4. What other cases or instruction could be included in
> > isReMaterializablePossible() function. Some suggestions for
> > direction to look in.
> 
> > Any other suggestions will also be helpful for us to move in right
> > direction.
> 
> > - Nirav _______________________________________________
> 
> > LLVM Developers mailing list
> 
> > llvm-dev at lists.llvm.org
> 
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
-- 

Hal Finkel 
Lead, Compiler Technology and Programming Languages 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161202/5ace6ba7/attachment.html>

llvm dev - Dec 2016 - Extending Register Rematerialization

[llvm-dev] Extending Register Rematerialization

[llvm-dev] Extending Register Rematerialization

[llvm-dev] Extending Register Rematerialization

[llvm-dev] Extending Register Rematerialization

[llvm-dev] Extending Register Rematerialization

[llvm-dev] Extending Register Rematerialization