thr3ads.net - llvm dev - [llvm-dev] [GlobalISel] A Proposal for global instruction selection [Nov 2015]


Quentin Colombet via llvm-dev

2015-Nov-19 22:26 UTC

[llvm-dev] [GlobalISel] A Proposal for global instruction selection

Hi Eric,

> On Nov 19, 2015, at 12:46 PM, Eric Christopher <echristo at gmail.com> wrote:
> 
> Hi Quentin,
> 
> 
> *** Goals ***
> 
> The high level goals of the new instruction selector are:
> - Global instruction selector.
> - Fast instruction selector.
> 
> Are these separate or the same? It reads like two instruction selectors at
> the moment.
They are the same, sorry for the confusion. It should read: we want a global and
fast instruction selector, where producing code fast and producing good code
exercise the same basic path in the framework. I.e., producing code fast
is a trimmed-down version of producing good code. E.g., for fast, analyses are
less precise, fewer passes are run, etc.
>  
> - Shared code path for fast and good instruction selection.
> 
> But then I'm not sure starting here.
>  
> - IR that represents ISA concepts better.
> - More flexible instruction selector.
> 
> Some definitions here would be good.
For IR that represents ISA concepts better, this is in contrast to SDISel or
LLVM IR. In other words, the target should be able to insert target-specific
code (e.g., instructions, physical registers) at any time without needing some
extra crust to express that (e.g., intrinsics or custom SDNodes).

By more flexible, we mean that targets should be able to inject target-specific
passes between the generic passes or replace those passes with their own.
>  
> - Easier to maintain/understand framework, in particular legalization.
> - Self contained machine representation, no back links to LLVM IR.
> - No change to LLVM IR.
> 
> 
> These sound great. Would be good to get the assumptions of the legalization
> pass written down more explicitly as you go through this.
Agreed.
For now, the assumption is that there are no illegal types, just illegal pairs of
operation and type. But yeah, we may need to refine this when we get to the
legalization.
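
To make the "no illegal types, just illegal (operation, type) pairs" assumption concrete, here is a minimal sketch of a legality table keyed on the pair rather than on the type alone. Every name in it (Op, Ty, Action, LegalityTable, makeExampleTable) is made up for illustration and is not an existing or proposed LLVM API; the example target and its actions are assumptions too.

```cpp
#include <cassert>
#include <map>
#include <utility>

// All names here are hypothetical; this is not LLVM's legalizer API.
enum class Op { Add, Mul, Div };
enum class Ty { I8, I32, I64 };
enum class Action { Legal, Widen, Expand };

class LegalityTable {
  // Legality is recorded per (operation, type) pair, never per type alone.
  std::map<std::pair<Op, Ty>, Action> Actions;

public:
  // Record what to do for one (operation, type) pair.
  void set(Op O, Ty T, Action A) { Actions[{O, T}] = A; }

  // Pairs with no entry are treated as legal in this sketch.
  Action get(Op O, Ty T) const {
    auto It = Actions.find({O, T});
    return It == Actions.end() ? Action::Legal : It->second;
  }
};

// An imaginary target: i8 multiplies are widened and i64 divides are
// expanded, while every other pair is left legal.
inline LegalityTable makeExampleTable() {
  LegalityTable T;
  T.set(Op::Mul, Ty::I8, Action::Widen);
  T.set(Op::Div, Ty::I64, Action::Expand);
  // i8 as a type is not illegal: an i8 add needs no action.
  assert(T.get(Op::Add, Ty::I8) == Action::Legal);
  return T;
}
```

With a table like this, asking about a type in isolation is meaningless; a query always names the operation as well, e.g. makeExampleTable().get(Op::Mul, Ty::I8).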
>  
> *** Proposed Approach ***
> 
> In this section, I describe the approach I plan to pursue in the prototype
> and the roadmap to get there. The final design will flow out of it.
> 
> For this prototype, we purposely exclude any work to improve or use
> TableGen or
> 
> I'm getting the idea that you really don't want to work on
> TableGen? ;)
Heh, that’s more a pragmatic approach. I don’t want us to spend months improving
TableGen before we start working on GlobalISel.
That being said, I think we should push as much as possible into TableGen
when we are done with prototyping.
>  
> 
> ** Implications **
> 
> As part of the bring-up of the prototype, we need to extend some of the
> core MachineInstr-level APIs:
>   - Need to remember FastMath flags for each MachineInstr.
> 
> Not orthogonal to this proposal? I don't mind lumping it in as being
> able to do this is probably a good goal for the prototype at least, but it seems
> like being able to do this is something that could be done incrementally as a
> separate project?
That’s a good point and yes, it could be done as a separate project. The reason
it is here is that if we want to experiment with combines and such in
the prototype, this is the kind of information we would need.
>  
> At the end of M1, the prototype will not be able to produce code, since we
> would only have the beginning of the Global ISel pipeline. Instead, we will test
> the IRTranslator on the generic output that is produced from the tested IR.
> 
> 
> So this would be targeting Generic MachineInstr?
Yes.
> (Better name perhaps?).
Suggestions welcome :).
> Which means that it should be serializable and testable in isolation yes?
Partly. The lowering of the body of the function will be generic, but the ABI
lowering will be target specific and unless we create some kind of fake target,
the tests need to be bound to one target.
>  
> * Design Decisions *
> 
> - The IRTranslator is a final class. Its purpose is to move away from LLVM
>   IR to MachineInstr world [final].
> - Lower the ABI as part of the translation process [final].
> 
> * Design Questions the Prototype Addresses at the End of M1 *
> 
> - Handling of aggregate types during the translation.
> - Lowering of switches.
> - What about Module pass for Machine pass?
> 
> Could you elaborate a bit more here?
I briefly mentioned in my reply to Marcello why this may be interesting.
Let me rephrase my answer here.
Basically, we would like MachineInstr to be self-contained, i.e.,
get rid of those back links to LLVM IR. This implies that we would need to lower
globals (maybe directly to MC) as part of the translation process. Globals are
attached not to a function but to the module, so it seems to make sense to
introduce a concept of MachineModulePass.
>  
> - Introduce new APIs to have a clearer separation between:
>   - Legalization (setOperationAction, etc.)
>   - Cost/Combine related (isXXXFree, etc.)
>   - Lowering related (LowerFormal, etc.)
> - What is the contract with the backends? Is it still “should be able to
>   select any valid LLVM IR”?
> 
> Probably :) 
> 
> As far as the prototype I think you also need to address a few additional
> things:
> 
> a) Calls
>  Calls are probably the most important part of any new instruction selector
> and lowering machinery and I think that the design of the call lowering
> infrastructure is going to be a critical part of evaluating the prototype. You
> might have meant this earlier when you said Lowering related, but I wanted to
> make sure to call it out explicitly.
Yes, lowering of calls is definitely going to be evaluated in the prototype for
this first milestone and the “lowering related” stuff was about that :).
(You’re good at deciphering messages ;)).
> 
> b) Testing
>  It's been covered a bit before, but being able to serialize and use
> for testing the various IR constructs is important. In particular, I worry about
> the existing MIR code as I and a few others have tried to use it for testcases
> and failed. I'm very interested in whatever ideas you have here, all of mine
> are much more invasive than I think we'd like.
Honestly I haven’t used the MIR testing infrastructure yet, but yes, my
impression was that it is not really… mature. I would love to have some serialization
mechanism for MI that really works so that we can write those test cases more
easily.
For now, I haven’t looked into it, so I cannot share any ideas. I’ve
discussed it a bit with Matthias and he thinks that we might not be that far away
from having MIR testing usable, modulo bug fixes.

It would be helpful if you could file PRs for the cases where MIR was not working
for you so that we can look into them at some point.

My hope is that someone could look into it before we actually need proper MI
testing in place.

(Hidden message: If you are willing to work on the MIR testing or any other
mechanism that would allow us to do MI serialization and deserialization, please
come forward, we need you!! :D)

Indeed, for the translation part the MIR testing is not critical since we do
have the LLVM IR around.
Then, if we get rid of the LLVM IR back links, serialization should become
easier and maybe MIR testing could be leveraged. That being said, it is
possible that we will need to start from scratch, while taking into account what
we learned from the MIR testing.

Thanks for the feedback,
-Quentin
> 
> Thanks for tackling this project and being willing to put this out there
> for discussion and feedback. I'm looking forward to the code and future
> design.
> 
> -eric
>  

Eric Christopher via llvm-dev

2015-Nov-20 00:58 UTC


[llvm-dev] [GlobalISel] A Proposal for global instruction selection

On Thu, Nov 19, 2015 at 2:26 PM Quentin Colombet <qcolombet at apple.com>
wrote:
> Hi Eric,
>
>
> On Nov 19, 2015, at 12:46 PM, Eric Christopher <echristo at gmail.com> wrote:
>
> Hi Quentin,
>
>>
>>
>> *** Goals ***
>>
>> The high level goals of the new instruction selector are:
>> - Global instruction selector.
>> - Fast instruction selector.
>>
>
> Are these separate or the same? It reads like two instruction selectors at
> the moment.
>
>
> They are the same, sorry for the confusion. This reads, we want a global
> and fast instruction selector where producing the code fast and producing
> good code quality exercise the same basic path in the framework. I.e.,
> producing code fast is a trimmed down version of producing good code. E.g.,
> for fast, analysis are less precise, fewer passes are run, etc.
>
Excellent.

>
>
>
>> - Shared code path for fast and good instruction selection.
>>
>
> But then I'm not sure starting here.
>
>
>> - IR that represents ISA concepts better.
>> - More flexible instruction selector.
>>
>
> Some definitions here would be good.
>
>
> For IR that represents ISA concepts better, this is in opposition to
> SDISel or LLVM IR. In other words, the target should be able to insert
> target specific code (e.g., instruction, physical register) at anytime
> without needing some extra crust to express that (e.g., intrinsic or custom
> SDNode).
>
I'm not sure that this represents the concepts any better. Basically it
means that you have less and easier target independent handling, I'm
unconvinced this is that useful. Perhaps an example might help :)

> By more flexible we mean that targets should be able to inject target
> specific passes between the generic passes or replace those passes by their
> own.
>
It'll be interesting to see how this is going to be developed and how to
keep the target independence of the code generator with this new scheme.
I.e. this is basically turning (in my mind) into "every backend for
themselves" with very little target independent unification. Outside of
special purpose ports I don't see a lot of need for this, but we'll see. I
think it's going to take some discipline to avoid the "every backend is a
large C++ project that defines everything it needs custom".

>
>
>
>> - Easier to maintain/understand framework, in particular legalization.
>> - Self contained machine representation, no back links to LLVM IR.
>> - No change to LLVM IR.
>>
>>
> These sound great. Would be good to get the assumptions of the
> legalization pass written down more explicitly as you go through this.
>
>
> Agree.
> For now, the assumptions are there are no illegal types, just illegal pair
> of operation and type. But yeah, we may need to refine when we get to the
> legalization.
>
Also things like canonicalization, etc. Just something to think about.

>
>
>
>> *** Proposed Approach ***
>>
>> In this section, I describe the approach I plan to pursue in the
>> prototype and the roadmap to get there. The final design will flow out of
>> it.
>>
>> For this prototype, we purposely exclude any work to improve or use
>> TableGen or
>>
>
> I'm getting the idea that you really don't want to work on
> TableGen? ;)
>
>
> Heh, that’s more a pragmatic approach. I don’t want we spend months
> improving TableGen before we start working on GlobalISel.
> That being said, I think we should push as much thing as possible in
> tablegen when we are done with prototyping.
>
Sure.

>
>
>
>>
>> ** Implications **
>>
>> As part of the bring-up of the prototype, we need to extend some of the
>> core MachineInstr-level APIs:
>>   - Need to remember FastMath flags for each MachineInstr.
>>
>
> Not orthogonal to this proposal? I don't mind lumping it in as being able
> to do this is probably a good goal for the prototype at least, but it seems
> like being able to do this is something that could be done incrementally as
> a separate project?
>
>
> That’s a good point and yes, it could be done as a separate project. The
> reason why this is here is because if we want to experiment with combine
> and such in the prototype, this is the kind of information we would need.
>
Hmm, I thought you were avoiding combine? :)

>
>
>> At the end of M1, the prototype will not be able to produce code, since
>> we would only have the beginning of the Global ISel pipeline. Instead, we
>> will test the IRTranslator on the generic output that is produced from the
>> tested IR.
>>
>>
> So this would be targeting Generic MachineInstr?
>
>
> Yes.
>
> (Better name perhaps?).
>
>
> Suggestion welcome :).
>
Yeah. First suggestion: Let's leave off the r ;)

>
> Which means that it should be serializable and testable in isolation yes?
>
>
> Partly. The lowering of the body of the function will be generic, but the
> ABI lowering will be target specific and unless we create some kind of fake
> target, the tests need to be bound to one target.
>
That's reasonable.

>
>
>
>> * Design Decisions *
>>
>> - The IRTranslator is a final class. Its purpose is to move away from
>> LLVM IR to MachineInstr world *[final]*.
>> - Lower the ABI as part of the translation process *[final]*.
>>
>> * Design Questions the Prototype Addresses at the End of M1 *
>>
>> - Handling of aggregate types during the translation.
>> - Lowering of switches.
>> - What about Module pass for Machine pass?
>>
>
> Could you elaborate a bit more here?
>
>
> I have quickly mentioned in my reply to Marcello why this may be
> interesting. Let me rephrase my answer here.
> Basically, we would like to have the MachineInstr to be self contained,
> i.e., get rid of those back links to LLVM IR. This implies that we would
> need to lower globals (maybe directly to MC) as part of the translation
> process. Globals are not attached to function but module, therefore it
> seems to make sense to introduce a concept of MachineModulePass.
>
*nod* I'd like to do something about the AsmPrinter anyhow.

>
>
>
>> - Introduce new APIs to have a clearer separation between:
>>   - Legalization (setOperationAction, etc.)
>>   - Cost/Combine related (isXXXFree, etc.)
>>   - Lowering related (LowerFormal, etc.)
>> - What is the contract with the backends? Is it still “should be able to
>>   select any valid LLVM IR”?
>>
>
> Probably :)
>
> As far as the prototype I think you also need to address a few additional
> things:
>
> a) Calls
>  Calls are probably the most important part of any new instruction
> selector and lowering machinery and I think that the design of the call
> lowering infrastructure is going to be a critical part of evaluating the
> prototype. You might have meant this earlier when you said Lowering
> related, but I wanted to make sure to call it out explicitly.
>
>
> Yes, lowering of calls is definitely going to be evaluated in the
> prototype for this first milestone and the “lowering related” stuff was
> about that :).
> (You’re good at deciphering messages ;)).
>
I try. Anyhow, glad to hear about calls.

>
>
> b) Testing
>  It's been covered a bit before, but being able to serialize and use for
> testing the various IR constructs is important. In particular, I worry
> about the existing MIR code as I and a few others have tried to use it for
> testcases and failed. I'm very interested in whatever ideas you have here,
> all of mine are much more invasive than I think we'd like.
>
>
> Honestly I haven’t used the MIR testing infrastructure yet, but yes my
> impression was it is not really… mature. I would love to have some
> serialization mechanism for the MI that really work so that we can write
> those testcases more easily.
> As for now, I haven’t looked into it, so I cannot share any ideas. I’ve
> discussed a bit with Matthias and he thinks that we might not be that far
> away from having MIR testing useable modulo bug fixes.
>
> It would be helpful if you could file PR on the cases where MIR was not
> working for you so that we can look into it at some point.
>
> My hope is that someone could look into it before we actually need a
> proper MI testing in place.
>
> (*Hidden message:* If you are willing to work on the MIR testing or any
> other mechanism that would allow us to do MI serialization deserialization,
> please come forward, we need you!! :D)
>
> Indeed, for the translation part the MIR testing is not critical since we
> do have the LLVM IR around.
> Then, if we get rid of the LLVM IR back links, serialization should become
> easier and maybe MIR testing could be leverage. That being said, it may be
> possible that we need to start that from scratch, while taking into account
> what we learnt from the MIR testing.
>
Pretty much agree with this. I didn't file a bug because I wasn't sure what
to say other than "this serialization wasn't useful for making test cases".
Maybe you'll find it more so and we can get some best practices out of it.

Thanks!

-eric

>
> Thanks for the feedbacks,
> -Quentin
>
>
> Thanks for tackling this project and being willing to put this out there
> for discussion and feedback. I'm looking forward to the code and future
> design.
>
> -eric
>
>

Quentin Colombet via llvm-dev

2015-Nov-20 01:43 UTC


[llvm-dev] [GlobalISel] A Proposal for global instruction selection

> On Nov 19, 2015, at 4:58 PM, Eric Christopher <echristo at gmail.com> wrote:
> 
> 
> 
> On Thu, Nov 19, 2015 at 2:26 PM Quentin Colombet <qcolombet at apple.com> wrote:
> Hi Eric,
> 
> 
>> On Nov 19, 2015, at 12:46 PM, Eric Christopher <echristo at gmail.com> wrote:
>> 
>> Hi Quentin,
>> 
>> 
>> *** Goals ***
>> 
>> The high level goals of the new instruction selector are:
>> - Global instruction selector.
>> - Fast instruction selector.
>> 
>> Are these separate or the same? It reads like two instruction selectors
>> at the moment.
> 
> They are the same, sorry for the confusion. This reads, we want a global
> and fast instruction selector where producing the code fast and producing good
> code quality exercise the same basic path in the framework. I.e., producing code
> fast is a trimmed down version of producing good code. E.g., for fast, analysis
> are less precise, fewer passes are run, etc.
> 
> Excellent.
>  
> 
>>  
>> - Shared code path for fast and good instruction selection.
>> 
>> But then I'm not sure starting here.
>>  
>> - IR that represents ISA concepts better.
>> - More flexible instruction selector.
>> 
>> Some definitions here would be good.
> 
> For IR that represents ISA concepts better, this is in opposition to SDISel
> or LLVM IR. In other words, the target should be able to insert target specific
> code (e.g., instruction, physical register) at anytime without needing some
> extra crust to express that (e.g., intrinsic or custom SDNode).
> 
> 
> I'm not sure that this represents the concepts any better. Basically it
> means that you have less and easier target independent handling, I'm
> unconvinced this is that useful. Perhaps an example might help :)
I don’t have an example off hand, but basically, any time we have to create a
custom SDNode, it is useless.
Another thing, which I didn’t call out before because it is a strong statement
and I don’t want to commit to it for now, is that we can have a better
estimate of register pressure and of things like choosing addressing modes, since we
are at a much lower level and can directly emit the proper memory
operation or look at the actual register classes.

Those are the kind of opportunities that I envision by moving to MachineInstr
level.
>  
> By more flexible we mean that targets should be able to inject target
> specific passes between the generic passes or replace those passes by their own.
> 
> It'll be interesting to see how this is going to be developed and how
> to keep the target independentness of the code generator with this new scheme.
> I.e. this is basically turning (in my mind) into "every backend for
> themselves" with very little target independent unification. Outside of
> special purpose ports I don't see a lot of need for this, but we'll see.
> I think it's going to take some discipline to avoid the "every backend
> is a large C++ project that defines everything it needs custom".
At this point, the idea is to have the standard passes shared (i.e.,
IRTranslator, Legalizer, RegBankSelect, and Select) and let the targets create
their own passes if they want to do more stuff. Then, if we see room for
generalization, we can refactor :).

This is what happens right now with the IR passes that are target specific.
Sometimes those get factored out, like GlobalMerge.

I believe the same could happen with MachineInstr passes.
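
As a rough illustration of "shared standard passes plus target hooks", the sketch below models the pipeline as an ordered list of pass names that a target may extend or override. Only the four standard pass names come from this email; Pipeline, insertAfter, replacePass, and the "MyTarget..." pass names are hypothetical and say nothing about how the real pass manager would be wired.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch; this is not LLVM's pass manager API.
using Pipeline = std::vector<std::string>;

// The four shared passes named above, in order.
inline Pipeline standardPipeline() {
  return {"IRTranslator", "Legalizer", "RegBankSelect", "Select"};
}

// Insert a target pass right after the named generic pass.
// Returns false if the generic pass is not in the pipeline.
inline bool insertAfter(Pipeline &P, const std::string &Generic,
                        const std::string &TargetPass) {
  for (auto It = P.begin(); It != P.end(); ++It) {
    if (*It == Generic) {
      P.insert(It + 1, TargetPass);
      return true;
    }
  }
  return false;
}

// Replace a generic pass with a target's own version.
// Returns false if the generic pass is not in the pipeline.
inline bool replacePass(Pipeline &P, const std::string &Generic,
                        const std::string &TargetPass) {
  for (auto &Name : P) {
    if (Name == Generic) {
      Name = TargetPass;
      return true;
    }
  }
  return false;
}

// An imaginary target that adds one pass and overrides another.
inline Pipeline exampleTargetPipeline() {
  Pipeline P = standardPipeline();
  bool Inserted = insertAfter(P, "Legalizer", "MyTargetPostLegalize");
  bool Replaced = replacePass(P, "Select", "MyTargetSelect");
  assert(Inserted && Replaced);
  (void)Inserted;
  (void)Replaced;
  return P;
}
```

A target that calls neither hook gets exactly the shared pipeline, which is the default the email argues for.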

Now, regarding the standard passes themselves, I don’t see how the target
independence will be different from our current selector. If your concern is
about canonicalization, well, yes, if targets start to mix target-specific
opcodes in with the generic opcodes, we may lose some of it, but, I would say,
that is the point!
If targets want to mess with canonicalization, so be it. This is a problem we
have with SDISel: sometimes targets fight the canonicalization and this is very
hard. Now, that would be much easier :).

Anyhow, yes, this is something we need to keep an eye on, and at some point in
the prototype, it will be nice to have quick targeting of the framework for
other backends.
>  
> 
>>  
>> - Easier to maintain/understand framework, in particular legalization.
>> - Self contained machine representation, no back links to LLVM IR.
>> - No change to LLVM IR.
>> 
>> 
>> These sound great. Would be good to get the assumptions of the
>> legalization pass written down more explicitly as you go through this.
> 
> Agree.
> For now, the assumptions are there are no illegal types, just illegal pair
> of operation and type. But yeah, we may need to refine when we get to the
> legalization.
> 
> Also things like canonicalization, etc. Just something to think about.
Yeah, I just mentioned canonicalization in my previous paragraph, and the bottom
line is that I don’t think canonicalization should be required for correctness.
>  
> 
>>  
>> *** Proposed Approach ***
>> 
>> In this section, I describe the approach I plan to pursue in the
>> prototype and the roadmap to get there. The final design will flow out of it.
>> 
>> For this prototype, we purposely exclude any work to improve or use
>> TableGen or
>> 
>> I'm getting the idea that you really don't want to work on
>> TableGen? ;)
> 
> Heh, that’s more a pragmatic approach. I don’t want we spend months
> improving TableGen before we start working on GlobalISel.
> That being said, I think we should push as much thing as possible in
> tablegen when we are done with prototyping.
> 
> Sure.
>  
> 
>>  
>> 
>> ** Implications **
>> 
>> As part of the bring-up of the prototype, we need to extend some of the
>> core MachineInstr-level APIs:
>>   - Need to remember FastMath flags for each MachineInstr.
>> 
>> Not orthogonal to this proposal? I don't mind lumping it in as
>> being able to do this is probably a good goal for the prototype at least, but it
>> seems like being able to do this is something that could be done incrementally
>> as a separate project?
> 
> That’s a good point and yes, it could be done as a separate project. The
> reason why this is here is because if we want to experiment with combine and
> such in the prototype, this is the kind of information we would need.
> 
> 
> Hmm, I thought you were avoiding combine? :)
Heh, I figured someone may want to try it during the prototype timeframe :),
though what interests me here is to check how easy it is to propagate this kind
of information while going for the self-contained IR.
>  
>>  
>> At the end of M1, the prototype will not be able to produce code, since
>> we would only have the beginning of the Global ISel pipeline. Instead, we will
>> test the IRTranslator on the generic output that is produced from the tested IR.
>> 
>> 
>> So this would be targeting Generic MachineInstr? 
> 
> Yes.
> 
>> (Better name perhaps?).
> 
> Suggestion welcome :).
> 
> Yeah. First suggestion: Let's leave off the r ;)
Gene.ic :P
>  
> 
>> Which means that it should be serializable and testable in isolation
>> yes?
> 
> Partly. The lowering of the body of the function will be generic, but the
> ABI lowering will be target specific and unless we create some kind of fake
> target, the tests need to be bound to one target.
> 
> That's reasonable.
The fake target or be bound to one target?
>  
> 
>>  
>> * Design Decisions *
>> 
>> - The IRTranslator is a final class. Its purpose is to move away from
>>   LLVM IR to MachineInstr world [final].
>> - Lower the ABI as part of the translation process [final].
>> 
>> * Design Questions the Prototype Addresses at the End of M1 *
>> 
>> - Handling of aggregate types during the translation.
>> - Lowering of switches.
>> - What about Module pass for Machine pass?
>> 
>> Could you elaborate a bit more here?
> 
> I have quickly mentioned in my reply to Marcello why this may be
> interesting. Let me rephrase my answer here.
> Basically, we would like to have the MachineInstr to be self contained,
> i.e., get rid of those back links to LLVM IR. This implies that we would need to
> lower globals (maybe directly to MC) as part of the translation process. Globals
> are not attached to function but module, therefore it seems to make sense to
> introduce a concept of MachineModulePass.
> 
> *nod* I'd like to do something about the AsmPrinter anyhow.
>  
> 
>>  
>> - Introduce new APIs to have a clearer separation between:
>>   - Legalization (setOperationAction, etc.)
>>   - Cost/Combine related (isXXXFree, etc.)
>>   - Lowering related (LowerFormal, etc.)
>> - What is the contract with the backends? Is it still “should be able
>>   to select any valid LLVM IR”?
>> 
>> Probably :) 
>> 
>> As far as the prototype I think you also need to address a few
>> additional things:
>> 
>> a) Calls
>>  Calls are probably the most important part of any new instruction
>> selector and lowering machinery and I think that the design of the call lowering
>> infrastructure is going to be a critical part of evaluating the prototype. You
>> might have meant this earlier when you said Lowering related, but I wanted to
>> make sure to call it out explicitly.
> 
> Yes, lowering of calls is definitely going to be evaluated in the prototype
> for this first milestone and the “lowering related” stuff was about that
> :).
> (You’re good at deciphering messages ;)).
> 
> I try. Anyhow, glad to hear about calls.
>  
> 
>> 
>> b) Testing
>>  It's been covered a bit before, but being able to serialize and
>> use for testing the various IR constructs is important. In particular, I worry
>> about the existing MIR code as I and a few others have tried to use it for
>> testcases and failed. I'm very interested in whatever ideas you have here,
>> all of mine are much more invasive than I think we'd like.
> 
> Honestly I haven’t used the MIR testing infrastructure yet, but yes my
> impression was it is not really… mature. I would love to have some serialization
> mechanism for the MI that really work so that we can write those testcases more
> easily.
> As for now, I haven’t looked into it, so I cannot share any ideas. I’ve
> discussed a bit with Matthias and he thinks that we might not be that far away
> from having MIR testing useable modulo bug fixes.
> 
> It would be helpful if you could file PR on the cases where MIR was not
> working for you so that we can look into it at some point.
> 
> My hope is that someone could look into it before we actually need a proper
> MI testing in place.
> 
> (Hidden message: If you are willing to work on the MIR testing or any other
> mechanism that would allow us to do MI serialization deserialization, please
> come forward, we need you!! :D)
> 
> Indeed, for the translation part the MIR testing is not critical since we
> do have the LLVM IR around.
> Then, if we get rid of the LLVM IR back links, serialization should become
> easier and maybe MIR testing could be leverage. That being said, it may be
> possible that we need to start that from scratch, while taking into account what
> we learnt from the MIR testing.
> 
> Pretty much agree with this. I didn't file a bug because I wasn't
> sure what to say other than "this serialization wasn't useful for
> making test cases". Maybe you'll find it more so and we can get some
> best practices out of it.
Fingers crossed!

Thanks for the additional feedback, I think it really helps to call all that
out!
Q.
> 
> Thanks!
> 
> -eric
>  
> 
> Thanks for the feedbacks,
> -Quentin
> 
>> 
>> Thanks for tackling this project and being willing to put this out
>> there for discussion and feedback. I'm looking forward to the code and
>> future design.
>> 
>> -eric

Krzysztof Parzyszek via llvm-dev

2015-Nov-30 15:30 UTC


[llvm-dev] [GlobalISel] A Proposal for global instruction selection

On 11/19/2015 6:58 PM, Eric Christopher via llvm-dev wrote:
> It'll be interesting to see how this is going to be developed and how to
> keep the target independentness of the code generator with this new
> scheme. I.e. this is basically turning (in my mind) into "every backend
> for themselves" with very little target independent unification.

I really don't mind the "every backend for themselves" approach.  The
instruction selection pass is about as target-specific as a common pass
can get, and the more work the generic code tries to do, the more
potential it has to be inflexible.  This is not to say that generic
code will necessarily be bad, but that a target-centric approach has a
better chance of working out well, even if it means that more work is
required to implement instruction selection for a new target.

As someone mentioned in another email, the canonicalization currently
done in the DAG combiner has a tendency to interfere with what
individual targets may prefer.  One example of it that I remember for
Hexagon was that the LLVM IR had a combination of shifts left and right
to extract a bitfield from a longer integer.  Hexagon has an instruction
to do that and it's quite simple to map the shifts into that
instruction.  The combiner, however, would fold the shifts, leaving only
the minimum sequence of operations necessary to get the bitfield.  This
seems to be better from the generic point of view, but it makes it
practically impossible for us to match it to the "extract" instruction,
and in practice the code turns out to be worse.  This is the only reason
why we have the HexagonGenExtract pass---we detect the patterns in the
LLVM IR and generate "extract" intrinsics before the combiner mangles
them into unrecognizable forms.  The same goes for replacing ADD with
OR when the bits in the operands do not overlap.  We have code that
specifically undoes that, since for us, if the original code had an ADD,
it is pretty much always better if it remains an ADD.
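
For readers who want to see the two rewrites in plain code, here is a small sketch on 32-bit values. The function names are made up, and the shift-then-mask form is only one plausible shape a folded extract could take, not necessarily what the DAG combiner actually produces.

```cpp
#include <cassert>
#include <cstdint>

// Extract Width bits starting at bit Offset, written as the shift-left /
// shift-right pair described above.
// Requires Width > 0 and Offset + Width <= 32.
inline uint32_t extractViaShifts(uint32_t X, unsigned Offset, unsigned Width) {
  assert(Width > 0 && Offset + Width <= 32);
  uint32_t Left = X << (32 - Offset - Width); // drop the bits above the field
  return Left >> (32 - Width);                // drop the bits below the field
}

// The same extraction as a shift followed by a mask (assumed folded form).
inline uint32_t extractViaMask(uint32_t X, unsigned Offset, unsigned Width) {
  assert(Width > 0 && Offset + Width <= 32);
  uint32_t Mask = Width == 32 ? ~0u : ((1u << Width) - 1u);
  return (X >> Offset) & Mask;
}

// When no bit is set in both operands, no carry can occur, so A + B and
// A | B give the same result. That is what makes the ADD -> OR rewrite legal.
inline bool bitsDisjoint(uint32_t A, uint32_t B) { return (A & B) == 0; }
```

Both extract functions compute the same value; the point of the complaint is that once the code is in the second shape, it no longer looks like the pattern a target's "extract" instruction was written to match.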

There were cases in the past when we had to disable parts of
CodeGenPrepare, or else it would happily promote i32 into i64 where it
wasn't strictly necessary.  The i64 type is legal on Hexagon, but it uses
pairs of registers which, in practical terms, means that our register
set is cut by half when 64-bit values are used.

On the other hand, having a relatively simple, generic IR makes it
easier to simplify code that is no longer subjected to the LLVM IR's
constraints (e.g. getelementptr expressed as +/*, etc.).  Hexagon has a
lot of very specific complex/compound instructions and a given piece of
code can be written in many different ways.  This makes it harder to
optimize code after the specific instructions have been selected.  For
example, a pass that would try to simplify arithmetic code would need to
deal with the dozens of variants of add/multiplication instructions,
instead of simply looking at some generic GADD/GMPY.
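
A toy version of that last point, assuming a made-up instruction record and only the two generic opcodes named above (GenericOp, Inst, and isIdentity are all hypothetical):

```cpp
#include <cassert>

// Hypothetical generic opcodes, named after the GADD/GMPY mentioned above.
enum class GenericOp { GADD, GMPY };

// A toy instruction: Dst = Op(Src, Imm).
struct Inst {
  GenericOp Op;
  int Dst;
  int Src;
  long Imm;
};

// True if the instruction merely copies Src to Dst (x + 0 or x * 1).
// On generic opcodes this takes two cases; after selection, the same
// check would need one case per target add/multiply variant.
inline bool isIdentity(const Inst &I) {
  switch (I.Op) {
  case GenericOp::GADD:
    return I.Imm == 0;
  case GenericOp::GMPY:
    return I.Imm == 1;
  }
  return false;
}

// Usage sketch: x + 0 is an identity, x * 0 is not.
inline void demoIdentity() {
  assert(isIdentity({GenericOp::GADD, 1, 2, 0}));
  assert(!isIdentity({GenericOp::GMPY, 1, 2, 0}));
}
```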

-Krzysztof

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation
