Yin Ma
2013-Mar-13  23:37 UTC
[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
Hi All,
 
In the target I am working, we comes cross a situation that the loop
strength reduction
could deliver a better result but currently not, because 
1.       the algorithm narrows search space by winner registers without
considering 
the target preferred format. (NarrowSearchSpaceByPickingWinnerRegs)
2.       Cost comparison solely favors the number register without
considering other
Impacts. 
 
For the case one,
NarrowSearchSpaceByPickingWinnerRegs filters by most occurred registers.
ld(basereg, immediate) is a target preferred addressing mode. However, it
may
be deleted because basereg is very likely not to be the most occurred
register 
because the less opportunity in a combination. 
 
For the case two, by observing the cost comparison equation
bool Cost::operator<(const Cost &Other) const {
  if (NumRegs != Other.NumRegs)                            return NumRegs <
Other.NumRegs;
  if (AddRecCost != Other.AddRecCost)                  return AddRecCost <
Other.AddRecCost;
  if (NumIVMuls != Other.NumIVMuls)                   return NumIVMuls <
Other.NumIVMuls;
  if (NumBaseAdds != Other.NumBaseAdds)       return NumBaseAdds <
Other.NumBaseAdds;
  if (ImmCost != Other.ImmCost)                               return ImmCost
< Other.ImmCost;
  if (SetupCost != Other.SetupCost)                         return SetupCost
< Other.SetupCost;
  return false;
}
 
If we have a case to compare
Cost at 5 regs, with addrec cost 1, plus 15 base adds, plus 1 imm cost, plus
4 setup cost.
Cost at 4 regs, with addrec cost 1, plus 28 base adds, plus 1 imm cost, plus
2 setup cost.
The current mode will select 4 regs case even there are 14 more base adds.
And base
Adds matters in our targets.
 
So I think the current LSR should be pushing more decision into target
dependent backend.
Like calling new functions in TargetTransformInfo. At least, in narrow
search space and cost 
comparison phase, or more in cost rating phase. LSR can be tightly cooped
with the target
attributes in order to get the most beneficial result.
 
How do you guys think?
 
Thanks,
 
Yin 
 
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130313/72e8fb24/attachment.html>
Andrew Trick
2013-Mar-14  16:41 UTC
[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
On Mar 13, 2013, at 4:37 PM, Yin Ma <yinma at codeaurora.org> wrote:> Hi All, > > In the target I am working, we comes cross a situation that the loop strength reduction > could deliver a better result but currently not, because > 1. the algorithm narrows search space by winner registers without considering > the target preferred format. (NarrowSearchSpaceByPickingWinnerRegs) > 2. Cost comparison solely favors the number register without considering other > Impacts. > > For the case one, > NarrowSearchSpaceByPickingWinnerRegs filters by most occurred registers. > ld(basereg, immediate) is a target preferred addressing mode. However, it may > be deleted because basereg is very likely not to be the most occurred register > because the less opportunity in a combination. > > For the case two, by observing the cost comparison equation > bool Cost::operator<(const Cost &Other) const { > if (NumRegs != Other.NumRegs) return NumRegs < Other.NumRegs; > if (AddRecCost != Other.AddRecCost) return AddRecCost < Other.AddRecCost; > if (NumIVMuls != Other.NumIVMuls) return NumIVMuls < Other.NumIVMuls; > if (NumBaseAdds != Other.NumBaseAdds) return NumBaseAdds < Other.NumBaseAdds; > if (ImmCost != Other.ImmCost) return ImmCost < Other.ImmCost; > if (SetupCost != Other.SetupCost) return SetupCost < Other.SetupCost; > return false; > } > > If we have a case to compare > Cost at 5 regs, with addrec cost 1, plus 15 base adds, plus 1 imm cost, plus 4 setup cost. > Cost at 4 regs, with addrec cost 1, plus 28 base adds, plus 1 imm cost, plus 2 setup cost. > The current mode will select 4 regs case even there are 14 more base adds. And base > Adds matters in our targets. > > So I think the current LSR should be pushing more decision into target dependent backend. > Like calling new functions in TargetTransformInfo. At least, in narrow search space and cost > comparison phase, or more in cost rating phase. LSR can be tightly cooped with the target > attributes in order to get the most beneficial result.Yes. LSR decisions are tightly coupled with the target architecture and potentially the subtarget microarcthitecture. As you figured out, the way to handle it is to communicate more information to LSR through TTI. It's easy to do this to solve individual benchmarks on your target. It's hard to know if you have a general solution that works across targets. But if you can add hooks in a way that preserves existing behavior on other targets it shouldn't be a problem. We want to design general hooks, but leave it up to someone doing the benchmarking to tune them for a particular target. -Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130314/d9770336/attachment.html>
Yin Ma
2013-Mar-14  21:21 UTC
[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
Hi Andy,
 
Actually, if we just add hooks that preserves the existing behavior, 
It is not difficult. For example, 
 
For case one, we can define one function like
  virtual const SCEV* getTargetPreferredWinnerReg(const SCEV*& ScaledReg,
           SmallVector<const SCEV *, 4>& BaseRegs, GlobalValue*&
BaseGV)
const;
 
In NarrowSearchSpaceByPickingWinnerRegs, we can preserves the winner
reg from target and winner reg from the original algorithm if this function 
returns NULL, it is just like before 
 
For case two, we can define a general cost from TTI function, like
  virtual int getLSRFormulaCost(const unsigned NumRegs,
                            const unsigned AddRecCost, const unsigned
NumIVMuls,
                            const unsigned NumBaseAdds, const unsigned
ImmCost,
                            const unsigned SetupCost) const;
Then we do something like 
  int thisCost = TTI->getLSRFormulaCost(NumRegs, AddRecCost, NumIVMuls,
                                           NumBaseAdds, ImmCost, SetupCost);
  if (thisCost >= 0) {
    int otherCost = TTI->getLSRFormulaCost(Other.NumRegs, Other.AddRecCost,
                                            Other.NumIVMuls,
Other.NumBaseAdds,
                                            Other.ImmCost, Other.SetupCost);
    if (otherCost >= 0)
      return thisCost < otherCost;
  }
In bool Cost::operator<(const Cost &Other) const 
 
We could have more decision from target backend.
 
However, from the problem I am dealing with, which has a lot of branches in
multiple level
Loop nests. LSR is still not able to perform the best because 
1.       LSR is not control flow sensitive. It treats all USE equally, which
doesn't care which 
USE is on critical path and which USE is on a branch. Without these kind of
information,
We cannot predict AddRec precisely because we only can assume all USEs can
be post
Increment or all not.
2.       The most occurred winner regs pruning may not be the best approach.
Because target
may prefer certain regs than others, even some registers do occur more.
Specially, 
register with small computation is more likely to occur in formulas.
However, register
with small computation may not always be a best choice if the content in
register are
loop invariant.
 
Therefore,  We may need a systemic agreement or plan to address the existing
LSR problems. I
would like to ask if any party has any improvement plan about LSR? So we can
come together
to have an unified solution to handle all known problem in one round?
 
Thanks,
 
Yin 
 
 
From: Andrew Trick [mailto:atrick at apple.com] 
Sent: Thursday, March 14, 2013 9:42 AM
To: Yin Ma
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in
LSR Please
 
 
On Mar 13, 2013, at 4:37 PM, Yin Ma <yinma at codeaurora.org> wrote:
Hi All,
 
In the target I am working, we comes cross a situation that the loop
strength reduction
could deliver a better result but currently not, because
1.       the algorithm narrows search space by winner registers without
considering
the target preferred format. (NarrowSearchSpaceByPickingWinnerRegs)
2.       Cost comparison solely favors the number register without
considering other
Impacts.
 
For the case one,
NarrowSearchSpaceByPickingWinnerRegs filters by most occurred registers.
ld(basereg, immediate) is a target preferred addressing mode. However, it
may
be deleted because basereg is very likely not to be the most occurred
register
because the less opportunity in a combination.
 
For the case two, by observing the cost comparison equation
bool Cost::operator<(const Cost &Other) const {
  if (NumRegs != Other.NumRegs)                            return NumRegs <
Other.NumRegs;
  if (AddRecCost != Other.AddRecCost)                  return AddRecCost <
Other.AddRecCost;
  if (NumIVMuls != Other.NumIVMuls)                   return NumIVMuls <
Other.NumIVMuls;
  if (NumBaseAdds != Other.NumBaseAdds)       return NumBaseAdds <
Other.NumBaseAdds;
  if (ImmCost != Other.ImmCost)                               return ImmCost
< Other.ImmCost;
  if (SetupCost != Other.SetupCost)                         return SetupCost
< Other.SetupCost;
  return false;
}
 
If we have a case to compare
Cost at 5 regs, with addrec cost 1, plus 15 base adds, plus 1 imm cost, plus
4 setup cost.
Cost at 4 regs, with addrec cost 1, plus 28 base adds, plus 1 imm cost, plus
2 setup cost.
The current mode will select 4 regs case even there are 14 more base adds.
And base
Adds matters in our targets.
 
So I think the current LSR should be pushing more decision into target
dependent backend.
Like calling new functions in TargetTransformInfo. At least, in narrow
search space and cost
comparison phase, or more in cost rating phase. LSR can be tightly cooped
with the target
attributes in order to get the most beneficial result.
 
Yes. LSR decisions are tightly coupled with the target architecture and
potentially the subtarget microarcthitecture. As you figured out, the way to
handle it is to communicate more information to LSR through TTI. It's easy
to do this to solve individual benchmarks on your target. It's hard to know
if you have a general solution that works across targets. But if you can add
hooks in a way that preserves existing behavior on other targets it
shouldn't be a problem. We want to design general hooks, but leave it up to
someone doing the benchmarking to tune them for a particular target.
 
-Andy
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20130314/bce313bb/attachment.html>
Maybe Matching Threads
- [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
- [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
- [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
- [LLVMdev] Problems about developing LLVM pass on windows visual studio
- Current status on _outgoing_ Swedish/Dutch DTMF CLIP for TDM400 FXS interfaces?