thr3ads.net - llvm dev - [LLVMdev] Machine LICM and cheap instructions? [Jan 2015]

If this information is useful, please help other people find it:
Share via:

Hal Finkel

2015-Jan-08 21:45 UTC

[LLVMdev] Machine LICM and cheap instructions?

Hi everyone,

The MachineLICM pass has a heuristic such that, even in low-register-pressure
situations, it will refuse to hoist "cheap" instructions out of loops.
By default, when an itinerary is available, this means that all of the defined
operands are available in at most 1 cycle. ARM overrides this, and provides this
more-customized definition:

bool ARMBaseInstrInfo::
hasLowDefLatency(const InstrItineraryData *ItinData,
                 const MachineInstr *DefMI, unsigned DefIdx) const {
  if (!ItinData || ItinData->isEmpty())
    return false;

  unsigned DDomain = DefMI->getDesc().TSFlags & ARMII::DomainMask;
  if (DDomain == ARMII::DomainGeneral) {
    unsigned DefClass = DefMI->getDesc().getSchedClass();
    int DefCycle = ItinData->getOperandCycle(DefClass, DefIdx);
    return (DefCycle != -1 && DefCycle <= 2);
  }
  return false;
}

So it won't hoist instructions that have defined operands ready in at most
two cycles for general-domain instructions. Regardless, I don't understand
the logic behind this heuristic. High-register-pressure situations are one
thing, but why is it ever profitable in low-register-pressure situations not to
hoist even "cheap" instructions out of loop bodies?

Thanks again,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Quentin Colombet

2015-Jan-08 23:06 UTC

head link

[LLVMdev] Machine LICM and cheap instructions?

Hi Hal,

I haven’t looked at the implementation but could it be that we leave room for
expensive instructions to be hoisted?
The rational being that hoisting a cheap instruction will increase the register
pressure and then we may block expensive instruction (assuming we use some kind
of iterative algorithm).

Other than that, I would say that seem rather silly.

Cheers,
Q.
On Jan 8, 2015, at 1:45 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi everyone,
> 
> The MachineLICM pass has a heuristic such that, even in
low-register-pressure situations, it will refuse to hoist "cheap"
instructions out of loops. By default, when an itinerary is available, this
means that all of the defined operands are available in at most 1 cycle. ARM
overrides this, and provides this more-customized definition:
> 
> bool ARMBaseInstrInfo::
> hasLowDefLatency(const InstrItineraryData *ItinData,
>                 const MachineInstr *DefMI, unsigned DefIdx) const {
>  if (!ItinData || ItinData->isEmpty())
>    return false;
> 
>  unsigned DDomain = DefMI->getDesc().TSFlags & ARMII::DomainMask;
>  if (DDomain == ARMII::DomainGeneral) {
>    unsigned DefClass = DefMI->getDesc().getSchedClass();
>    int DefCycle = ItinData->getOperandCycle(DefClass, DefIdx);
>    return (DefCycle != -1 && DefCycle <= 2);
>  }
>  return false;
> }
> 
> So it won't hoist instructions that have defined operands ready in at
most two cycles for general-domain instructions. Regardless, I don't
understand the logic behind this heuristic. High-register-pressure situations
are one thing, but why is it ever profitable in low-register-pressure situations
not to hoist even "cheap" instructions out of loop bodies?
> 
> Thanks again,
> Hal
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Hal Finkel

2015-Jan-08 23:12 UTC

head link

[LLVMdev] Machine LICM and cheap instructions?

----- Original Message -----> From: "Quentin Colombet" <qcolombet at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Dev" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, January 8, 2015 5:06:24 PM
> Subject: Re: [LLVMdev] Machine LICM and cheap instructions?
> 
> Hi Hal,
> 
> I haven’t looked at the implementation but could it be that we leave
> room for expensive instructions to be hoisted?
> The rational being that hoisting a cheap instruction will increase
> the register pressure and then we may block expensive instruction
> (assuming we use some kind of iterative algorithm).
Interesting point. That cuts both ways, however, because hoisting a cheap
instruction can then make an expensive instruction loop invariant. Perhaps we
need two passes over the instructions to make a good decision here.

Thanks,
Hal
> 
> Other than that, I would say that seem rather silly.
> 
> Cheers,
> Q.
> On Jan 8, 2015, at 1:45 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > Hi everyone,
> > 
> > The MachineLICM pass has a heuristic such that, even in
> > low-register-pressure situations, it will refuse to hoist
"cheap"
> > instructions out of loops. By default, when an itinerary is
> > available, this means that all of the defined operands are
> > available in at most 1 cycle. ARM overrides this, and provides
> > this more-customized definition:
> > 
> > bool ARMBaseInstrInfo::
> > hasLowDefLatency(const InstrItineraryData *ItinData,
> >                 const MachineInstr *DefMI, unsigned DefIdx) const {
> >  if (!ItinData || ItinData->isEmpty())
> >    return false;
> > 
> >  unsigned DDomain = DefMI->getDesc().TSFlags &
ARMII::DomainMask;
> >  if (DDomain == ARMII::DomainGeneral) {
> >    unsigned DefClass = DefMI->getDesc().getSchedClass();
> >    int DefCycle = ItinData->getOperandCycle(DefClass, DefIdx);
> >    return (DefCycle != -1 && DefCycle <= 2);
> >  }
> >  return false;
> > }
> > 
> > So it won't hoist instructions that have defined operands ready in
> > at most two cycles for general-domain instructions. Regardless, I
> > don't understand the logic behind this heuristic.
> > High-register-pressure situations are one thing, but why is it
> > ever profitable in low-register-pressure situations not to hoist
> > even "cheap" instructions out of loop bodies?
> > 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Evan Cheng

2015-Jan-09 19:40 UTC

head link

[LLVMdev] Machine LICM and cheap instructions?

Some cheap and trivially remateriablizable instructions are free. It makes
little sense to hoist them and increase register pressure. In some cases, the
register allocator will end up rematerialize them at use sites again.

The heuristics is obviously conservative. Due to phase ordering reasons,
register pressure estimates cannot be 100% accurate. It’s here to prevent harm.
During my testings from way back then, hoisting these cheap instructions can
indeed harm performance. It probably can help in some cases as well. But I’d
wager it’s pretty much a wash at best.

Evan
> On Jan 8, 2015, at 1:45 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> Hi everyone,
> 
> The MachineLICM pass has a heuristic such that, even in
low-register-pressure situations, it will refuse to hoist "cheap"
instructions out of loops. By default, when an itinerary is available, this
means that all of the defined operands are available in at most 1 cycle. ARM
overrides this, and provides this more-customized definition:
> 
> bool ARMBaseInstrInfo::
> hasLowDefLatency(const InstrItineraryData *ItinData,
>                 const MachineInstr *DefMI, unsigned DefIdx) const {
>  if (!ItinData || ItinData->isEmpty())
>    return false;
> 
>  unsigned DDomain = DefMI->getDesc().TSFlags & ARMII::DomainMask;
>  if (DDomain == ARMII::DomainGeneral) {
>    unsigned DefClass = DefMI->getDesc().getSchedClass();
>    int DefCycle = ItinData->getOperandCycle(DefClass, DefIdx);
>    return (DefCycle != -1 && DefCycle <= 2);
>  }
>  return false;
> }
> 
> So it won't hoist instructions that have defined operands ready in at
most two cycles for general-domain instructions. Regardless, I don't
understand the logic behind this heuristic. High-register-pressure situations
are one thing, but why is it ever profitable in low-register-pressure situations
not to hoist even "cheap" instructions out of loop bodies?
> 
> Thanks again,
> Hal
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

Krzysztof Parzyszek

2015-Jan-12 14:25 UTC

head link

[LLVMdev] Machine LICM and cheap instructions?

On Hexagon it's generally better to hoist everything out of the loop and 
then rematerialize cheap instructions if necessary.  Hexagon executes 
instruction in packets, and each extra instruction in the loop adds the 
likelihood of having more packets than necessary.  In case of small 
loops this can be particularly expensive and predicting packetization is 
extremely difficult without actually performing it.
The "hasLowDefLatency" is not really a good tool to model this, since 
cheap instructions are still cheap, and yet they are better hoisted out.

-Krzysztof

On 1/8/2015 3:45 PM, Hal Finkel wrote:> Hi everyone,
>
> The MachineLICM pass has a heuristic such that, even in
low-register-pressure situations, it will refuse to hoist "cheap"
instructions out of loops. By default, when an itinerary is available, this
means that all of the defined operands are available in at most 1 cycle. ARM
overrides this, and provides this more-customized definition:
>
> bool ARMBaseInstrInfo::
> hasLowDefLatency(const InstrItineraryData *ItinData,
>                   const MachineInstr *DefMI, unsigned DefIdx) const {
>    if (!ItinData || ItinData->isEmpty())
>      return false;
>
>    unsigned DDomain = DefMI->getDesc().TSFlags & ARMII::DomainMask;
>    if (DDomain == ARMII::DomainGeneral) {
>      unsigned DefClass = DefMI->getDesc().getSchedClass();
>      int DefCycle = ItinData->getOperandCycle(DefClass, DefIdx);
>      return (DefCycle != -1 && DefCycle <= 2);
>    }
>    return false;
> }
>
> So it won't hoist instructions that have defined operands ready in at
most two cycles for general-domain instructions. Regardless, I don't
understand the logic behind this heuristic. High-register-pressure situations
are one thing, but why is it ever profitable in low-register-pressure situations
not to hoist even "cheap" instructions out of loop bodies?
>
> Thanks again,
> Hal
>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

llvm dev - Jan 2015 - [LLVMdev] Machine LICM and cheap instructions?

[LLVMdev] Machine LICM and cheap instructions?

[LLVMdev] Machine LICM and cheap instructions?

[LLVMdev] Machine LICM and cheap instructions?

[LLVMdev] Machine LICM and cheap instructions?

[LLVMdev] Machine LICM and cheap instructions?