Sergei,

I would say that each target has its own scheduling strategy that has
changed considerably over time. We try to maximize code reuse across
targets, but it's not easy and it has been done ad hoc. The result is
confusing code that makes it difficult to understand the strategy for any
particular target.

The right thing to do is:
1) Make it as easy as possible to understand how scheduling works for each
of the primary targets (x86 and ARM) independent of each other.
2) Make it easy for very similar targets to piggyback on one of those
implementations, without having to worry about other targets.
3) Allow dissimilar targets (e.g. VLIW) to completely bypass the scheduler
used by other targets and reuse only nicely self-contained parts of the
framework, such as the DAG builder and individual machine description
features.

We've recently moved further from this ideal scenario in that we're now
forcing targets to implement the bottom-up selection DAG scheduler. This is
not really so bad, because you can revert to "source order" scheduling,
-pre-RA-sched=source, and you don't need to implement many target hooks. It
burns compile time for no good reason, but you can probably live with it.
Then you're free to implement your own MI-level scheduler.

The next step in making it easier to maintain an LLVM scheduler for
"interesting" targets is to build an MI-level scheduling framework and move
at least one of the primary targets to this framework so it's well
supported. This would separate the nasty issues of serializing the
selection DAG from the challenge of microarchitecture-level scheduling, and
provide a suitable place to inject your own scheduling algorithm. It's
easier to implement a scheduler when starting from a valid instruction
sequence where all dependencies are resolved and no register interferences
exist.

To answer your question, there's no clear way to describe the current
overall scheduling strategy. For now, you'll need to ask porting questions
on llvm-dev. Maybe someone who's faced a similar problem will have a good
suggestion. We do want to improve that situation, and we intend to do that
by first providing a new scheduler framework. When we get to that point,
I'll make sure that the new direction can work for you and is easy to
understand. All I can say now is that the new design will allow a target to
compose a preRA scheduler from an MI-level framework combined with
target-specific logic for selecting the optimal instruction order. I don't
see any point in imposing a generic scheduling algorithm across all
targets.

-Andy

On Nov 29, 2011, at 11:20 AM, Sergei Larin wrote:

> Andy,
>
> Is there any good info/docs on scheduling strategy in LLVM? As I was
> complaining to you at the LLVM meeting, I end up reverse engineering /
> second-guessing more than I would like to... This thread shows that I am
> not exactly alone in this... Thanks.
>
> Sergei Larin
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.
>
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Andrew Trick
> Sent: Tuesday, November 29, 2011 11:48 AM
> To: Hal Finkel
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [llvm-commits] Bottom-Up Scheduling?
>
> ARM can reuse all the default scoreboard hazard recognizer logic, such as
> recede cycle (naturally, since it's the primary client). If you can do
> the same with PPC, that's great.
>
> Andy
>
> On Nov 29, 2011, at 8:51 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>>> Thanks! Since I have to change PPCHazardRecognizer for bottom-up
>>> support anyway, is there any reason not to have it derive from
>>> ScoreboardHazardRecognizer at this point? It looks like the custom
>>> bundling logic could be implemented on top of the scoreboard recognizer
>>> (that seems similar to what ARM's recognizer is doing).
>>
>> Also, how does the ARM hazard recognizer get away with not implementing
>> RecedeCycle?
>>
>> Thanks again,
>> Hal
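As a concrete illustration of the hazard-recognizer reuse discussed above,
here is a minimal sketch of a target recognizer deriving from
ScoreboardHazardRecognizer, assuming the LLVM 3.0-era interface. The class
name, the "my-sched" debug tag and the custom bundling hook are
hypothetical; this is not the actual PPC or ARM implementation.

    // A minimal sketch, assuming the LLVM 3.0-era hazard recognizer API.
    // MyTargetHazardRecognizer and hasCustomBundlingHazard() are
    // hypothetical; they only show how target-specific checks can be
    // layered on top of the generic scoreboard.
    #include "llvm/CodeGen/ScheduleDAG.h"
    #include "llvm/CodeGen/ScoreboardHazardRecognizer.h"

    namespace llvm {

    class MyTargetHazardRecognizer : public ScoreboardHazardRecognizer {
    public:
      MyTargetHazardRecognizer(const InstrItineraryData *ItinData,
                               const ScheduleDAG *DAG)
        : ScoreboardHazardRecognizer(ItinData, DAG, "my-sched") {}

      // Run the target-specific (e.g. bundling) check first, then defer
      // to the inherited scoreboard logic.
      virtual HazardType getHazardType(SUnit *SU, int Stalls) {
        if (hasCustomBundlingHazard(SU))
          return Hazard;
        return ScoreboardHazardRecognizer::getHazardType(SU, Stalls);
      }

      // EmitInstruction(), AdvanceCycle() and RecedeCycle() are inherited
      // unchanged; this is how ARM reuses the default scoreboard logic,
      // including RecedeCycle() for bottom-up scheduling.

    private:
      // Hypothetical placeholder for target-specific bundling constraints.
      bool hasCustomBundlingHazard(SUnit *) const { return false; }
    };

    } // end namespace llvm

The point of this layering is that only the check that differs from the
generic scoreboard needs to be written; everything else, including
RecedeCycle() for bottom-up scheduling, is simply inherited.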
Andy,

Thank you for the extended and prompt answer. Let me try to summarize my
current position so you (and everyone interested) will have a better view
of the world through my eyes ;)

1) LLVM's first robust VLIW target is currently in review. Its needs for
scheduling strategy/quality are rather different from what the current
scheduler(s) can provide.

2) My first attempt at porting (while I was on 2.9) resulted in a new
top-down pre-RA VLIW-enabled scheduler that I was hoping to upstream as
soon as our back end is accepted. I guess I have missed the window, since
our commit took a bit longer than planned. Now Evan has told me (and you
have confirmed) that it would need to change to a bottom-up version for
3.0. Moreover, the current "level" (exact placement in the DAG->DAG pass)
of pre-RA scheduling is less than optimal (and I agree with that, since I
have to bend over backwards to extract info readily available in MIs).

3) Your group is working on a "new" scheduler, and as best I understand it,
it would be the same general algorithm moved "closer" to RA. I also
understand that at first it would not have support for
"packets"/bundles/multiops in the VLIW sense (or will it?). If they are
supported, an interesting discussion on how subsequent passes will be
modified to recognize them would follow... but we had another thread on
this topic not that long ago.

So, IMHO the following would make sense:

1) It would be very nice if we could have some sort of write-up detailing
the proposed changes, and maybe defining the overall strategy for
instruction scheduling in LLVM __before__ major decisions are made. It
should later be converted into a "how to" or a simple doc chapter on
porting scheduler(s) to new targets. Public discussion should follow, and
we need to try to accommodate all needs (as much as possible).

2) Any attempt on my part to further the VLIW scheduler design for my
target would be unwise until such a discussion takes place. I also do not
separate this process from the bundle/packet representation. If you
perceive an overhead associated with this activity, I could volunteer to
help.

Also, please see my comments embedded below. Thanks.

Sergei Larin

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum.

> -----Original Message-----
> From: Andrew Trick [mailto:atrick at apple.com]
> Sent: Tuesday, November 29, 2011 3:16 PM
> To: Sergei Larin
> Cc: 'Hal Finkel'; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] [llvm-commits] Bottom-Up Scheduling?
>
> Sergei,
>
> I would say that each target has its own scheduling strategy that has
> changed considerably over time. We try to maximize code reuse across
> targets, but it's not easy and it has been done ad hoc. The result is
> confusing code that makes it difficult to understand the strategy for
> any particular target.
>
> The right thing to do is:
> 1) Make it as easy as possible to understand how scheduling works for
> each of the primary targets (x86 and ARM) independent of each other.

[Larin, Sergei] Sure, that could be achieved with the design
document/documentation set I am talking about.

> 2) Make it easy for very similar targets to piggyback on one of those
> implementations, without having to worry about other targets.

[Larin, Sergei] Yes, and having a robust VLIW scheduler would greatly help
here. It would also IMHO set LLVM apart from GCC and become an additional
selling point for us.

> 3) Allow dissimilar targets (e.g. VLIW) to completely bypass the
> scheduler used by other targets and reuse only nicely self-contained
> parts of the framework, such as the DAG builder and individual machine
> description features.

[Larin, Sergei] I think this is rather implementation dependent, and we
can finesse this once we have the framework better defined.

> We've recently moved further from this ideal scenario in that we're now
> forcing targets to implement the bottom-up selection DAG scheduler.

[Larin, Sergei] I really dislike this, especially because of the reason
that led to this decision. I think general "flexibility"/functionality was
sacrificed for tactical reasons.

> This is not really so bad, because you can revert to "source order"
> scheduling, -pre-RA-sched=source, and you don't need to implement many
> target hooks. It burns compile time for no good reason, but you can
> probably live with it. Then you're free to implement your own MI-level
> scheduler.

[Larin, Sergei] I am not 100% sure about this statement, but as I get
closer to re-implementing my scheduler I might get a better picture.

> The next step in making it easier to maintain an LLVM scheduler for
> "interesting" targets is to build an MI-level scheduling framework and
> move at least one of the primary targets to this framework so it's well
> supported. This would separate the nasty issues of serializing the
> selection DAG from the challenge of microarchitecture-level scheduling,
> and provide a suitable place to inject your own scheduling algorithm.
> It's easier to implement a scheduler when starting from a valid
> instruction sequence where all dependencies are resolved and no register
> interferences exist.

[Larin, Sergei] Agreed, and my whole point is that it needs to be done
with a preceding public discussion, not de facto via code drops.

> To answer your question, there's no clear way to describe the current
> overall scheduling strategy. For now, you'll need to ask porting
> questions on llvm-dev. Maybe someone who's faced a similar problem will
> have a good suggestion. We do want to improve that situation, and we
> intend to do that by first providing a new scheduler framework. When we
> get to that point, I'll make sure that the new direction can work for you

[Larin, Sergei] Any clue on the time frame?

> and is easy to understand. All I can say now is that the new design will
> allow a target to compose a preRA scheduler from an MI-level framework
> combined with target-specific logic for selecting the optimal
> instruction order. I don't see any point in imposing a generic
> scheduling algorithm across all targets.
>
> -Andy

[Larin, Sergei] Thank you again for the explanation. I am really looking
forward to digging into it.
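To make the option of implementing your own pre-RA scheduler a bit more
concrete, here is a rough sketch of how a custom SelectionDAG scheduler can
be registered, assuming the LLVM 3.0-era RegisterScheduler mechanism.
VLIWListScheduler, its factory and the "vliw-td" name are hypothetical, and
the actual scheduling logic is elided.

    // A rough sketch, assuming the LLVM 3.0-era SelectionDAG scheduler
    // registry. VLIWListScheduler, its factory and the "vliw-td" name are
    // hypothetical; only RegisterScheduler is the real hook. Like the
    // in-tree SD schedulers, this would live in lib/CodeGen/SelectionDAG.
    #include "ScheduleDAGSDNodes.h"   // private header in that directory
    #include "llvm/CodeGen/SchedulerRegistry.h"
    #include "llvm/CodeGen/SelectionDAGISel.h"
    using namespace llvm;

    namespace {
    // Hypothetical top-down, packet-forming list scheduler over SUnits.
    class VLIWListScheduler : public ScheduleDAGSDNodes {
    public:
      explicit VLIWListScheduler(MachineFunction &MF)
        : ScheduleDAGSDNodes(MF) {}

      virtual void Schedule() {
        BuildSchedGraph(0);  // build SUnits for the current DAG, no AA
        // ...top-down VLIW list scheduling / packet formation would go
        // here, producing the final instruction sequence...
      }
    };
    } // end anonymous namespace

    static ScheduleDAGSDNodes *
    createVLIWListScheduler(SelectionDAGISel *IS, CodeGenOpt::Level) {
      return new VLIWListScheduler(*IS->MF);
    }

    // Makes the scheduler selectable with -pre-RA-sched=vliw-td.
    static RegisterScheduler
      VLIWTDScheduler("vliw-td", "Hypothetical top-down VLIW list scheduler",
                      createVLIWListScheduler);

Like the in-tree SelectionDAG schedulers, such a class would be chosen at
llc time through the -pre-RA-sched option.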
On Nov 30, 2011, at 9:11 AM, Sergei Larin wrote:

>> This is not really so bad, because you can revert to "source order"
>> scheduling, -pre-RA-sched=source, and you don't need to implement many
>> target hooks. It burns compile time for no good reason, but you can
>> probably live with it. Then you're free to implement your own MI-level
>> scheduler.
>
> [Larin, Sergei] I am not 100% sure about this statement, but as I get
> closer to re-implementing my scheduler I might get a better picture.

One thing that would be nice to have ASAP is a SelectionDAG serialization
pass that satisfies dependencies and physical register interferences while
preserving the IR instruction order whenever possible. This should be
totally separate from the SelectionDAG scheduler. It should not work on
SUnits. I realize this is quite disjoint from the work needed to port a new
target. I'm just pointing out that it would be a welcome feature. If we had
that pass, I could tell you that it would be fairly straightforward to
reenable the top-down SD scheduler. At this point, since you'd rather
schedule MIs anyway, you may choose to focus on that strategy instead.

>> The next step in making it easier to maintain an LLVM scheduler for
>> "interesting" targets is to build an MI-level scheduling framework and
>> move at least one of the primary targets to this framework so it's well
>> supported. This would separate the nasty issues of serializing the
>> selection DAG from the challenge of microarchitecture-level scheduling,
>> and provide a suitable place to inject your own scheduling algorithm.
>> It's easier to implement a scheduler when starting from a valid
>> instruction sequence where all dependencies are resolved and no register
>> interferences exist.
>
> [Larin, Sergei] Agreed, and my whole point is that it needs to be done
> with a preceding public discussion, not de facto via code drops.

It will be an incremental process. I'm not going to design a complete
scheduling framework for all microarchitectures "on paper" before making
any changes. Design decisions will be deferred as late as they can be
without holding up progress. You'll know when they're being made and have
the opportunity to influence them. In fact, any new design will be strongly
influenced by the scheduler work that you and others have done recently.

I think you're reacting to the recent dropping of preRA top-down scheduling
without public discussion. As you know, it was not part of a planned
strategy, and not a desirable outcome for anyone. The fact is that we
couldn't wait to fix an existing design flaw in DAG serialization. The
bottom-up scheduler has the ability to overcome this problem, but
implementing a fix that doesn't require running the bottom-up scheduler
requires significant work. The right thing to do is to implement the SD
serialization pass I mentioned above. That solution would be preferable to
everyone, but someone needs to make the investment. Of course, anyone is
welcome to fix the existing top-down scheduler as well. It requires
implementing the inverse of the bottom-up scheduler's physical register
tracking (see LiveRegDefs), plus some really hairy logic for resolving
interferences that the SelectionDAG builder has created.

FWIW, we're not going to run into this issue with the MI scheduling
framework that I'm referring to, because no part of it will be imposed on
any targets.

>> To answer your question, there's no clear way to describe the current
>> overall scheduling strategy. For now, you'll need to ask porting
>> questions on llvm-dev. Maybe someone who's faced a similar problem will
>> have a good suggestion. We do want to improve that situation, and we
>> intend to do that by first providing a new scheduler framework. When we
>> get to that point, I'll make sure that the new direction can work for you
>
> [Larin, Sergei] Any clue on the time frame?

2012 :)

-Andy
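To visualize the composition Andy describes (a generic MI-level framework
combined with target-specific logic for choosing the instruction order),
here is a purely hypothetical sketch. None of these names are LLVM APIs,
and the actual design had not been settled at the time of this thread.

    // Purely hypothetical sketch: a generic MI-level driver owns the
    // dependence bookkeeping and the scheduling loop, while the target
    // plugs in only the logic that picks the next instruction.
    #include <cassert>
    #include <vector>

    namespace sketch {

    struct MINode {                 // stand-in for a machine instruction
      std::vector<MINode *> Succs;  // dependence successors
      unsigned NumPredsLeft;        // unscheduled predecessors
      MINode() : NumPredsLeft(0) {}
    };

    // Target-specific policy: choose the next node among the ready ones.
    class TargetOrderPolicy {
    public:
      virtual ~TargetOrderPolicy() {}
      virtual unsigned pick(const std::vector<MINode *> &Ready) = 0;
    };

    // Generic driver: the scheduling loop and dependence updates live
    // here, independent of any particular microarchitecture.
    class MIScheduler {
      TargetOrderPolicy &Policy;
    public:
      explicit MIScheduler(TargetOrderPolicy &P) : Policy(P) {}

      // Ready holds the nodes whose predecessors are already scheduled.
      std::vector<MINode *> schedule(std::vector<MINode *> Ready) {
        std::vector<MINode *> Order;
        while (!Ready.empty()) {
          unsigned Idx = Policy.pick(Ready);
          assert(Idx < Ready.size() && "policy returned a bad index");
          MINode *N = Ready[Idx];
          Ready.erase(Ready.begin() + Idx);
          Order.push_back(N);
          // Release successors whose last predecessor was just scheduled.
          for (unsigned i = 0, e = N->Succs.size(); i != e; ++i)
            if (--N->Succs[i]->NumPredsLeft == 0)
              Ready.push_back(N->Succs[i]);
        }
        return Order;
      }
    };

    } // namespace sketch

The only point is the split of responsibilities: the driver owns the
dependence DAG and the scheduling loop, while the target-supplied policy
owns the ordering heuristic.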