Hal Finkel via llvm-dev
2017-Nov-11 19:03 UTC
[llvm-dev] RFC: [GlobalISel] Towards a generic MI combiner framework
On 11/11/2017 12:44 PM, Amara Emerson wrote:> >> On Nov 10, 2017, at 10:04 PM, Aditya Nandakumar <proaditya at gmail.com >> <mailto:proaditya at gmail.com>> wrote: >>> >>> The current DAGCombine, being constructed on top of SDAG, has a kind >>> of built-in CSE and automatic DCE. How will things change, if >>> they'll change, in this new model? >> Hi Hal, >> >> I suspect one option is to have a separate CSE pass, and the backends >> get to choose where exactly they plug in their pipeline. I think DCE >> should be part of the combine pass (and the legalizer is about to >> start doing that as well). > For SSA form MIR there’s already the MachineCSE pass. How important > the CSE/DCE is at the combine stage I don’t know. As an approximation > perhaps we can get an idea by disabling the behavior in the > DAGCombiner and seeing the effects.My impression is that the automated CSE and DCE is very important to the current implementation. There's a lot of code that depends on his happening in order to have the expected effects. Otherwise, the use-count checks won't do the right thing (because the old unused uses of things won't immediately go away). I'm not entirely sure you can just turn off the uniquing in SDAG and get a sensible result. -Hal> >> On Nov 10, 2017, at 9:54 PM, Daniel Sanders >> <daniel_l_sanders at apple.com <mailto:daniel_l_sanders at apple.com>> wrote: >> >> My thinking on this is that (with a few exceptions that I'll get to), >> combine and select are basically the same thing. You match some MIR, >> and replace it with other MIR. The main difference being that combine >> doesn't have to constrain to register classes (unless it wants to) >> while select does. >> >> With that in mind, I was thinking that it makes sense to put a lot of >> effort into the optimization of the tablegen-erated selection table >> (as has been started in Quentin's recent patch) and then re-use it >> for combines too. We'll need to be careful how we define GlobalISel's >> counterpart to SelectionDAG patterns to make it expressive enough to >> support combines but that's essentially a second frontend (the other >> being the SelectionDAG importer) on a common backend > Agreed that combine and selection are similar processes. It sounds > like this is something we should look at prototyping. > >> >> Req 2 becomes simple to implement in this approach. You can either >> use the existing feature-bits mechanism to enable/disable combine >> rules as a group, or add an equivalent mechanism in tablegen to >> decide whether a rule makes it into the emitted table or not and have >> multiple tables which you can run/not-run at will. With the new >> coverage feedback mechanism, we could potentially organize our tables >> semi-automatically by highlighting combine rules that never or rarely >> fire in a particular pass. >> >> One feature I think we ought to have that isn't on the requirements >> list already, is that I think we should have a means to support rules >> with more than one match root. For example (using SelectionDAG patterns): >> (set $dst1:GPR32, (i32 (load $ptr:GPR64))) >> (set $dst2:GPR32, (i32 (load (add $ptr:GPR64 4)))) >> into: >> (set $tmp:GPR64, (v2s32 (load $ptr:GPR64))) >> (set $dst1, (extractelt $tmp:GPR64, 0)) >> (set $dst2, (extractelt $tmp:GPR64, 1)) >> Or something along those lines (such as fusing div/mod together). The >> combiner should be smart enough to make the root the $ptr, and follow >> the use of $ptr into the load/add, then follow the def to the 4. > This seems like a nice feature, but I wonder about the impact this > will have on the speed of the matching algorithm. I don’t know enough > about it to say though. IMO complex features can be done in C++ code > if they’re uncommon, in preference for fast handling of the common > cases. Maybe a few more use cases are needed. > > Thanks, > Amara-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171111/4557e350/attachment.html>
Quentin Colombet via llvm-dev
2017-Nov-12 05:31 UTC
[llvm-dev] RFC: [GlobalISel] Towards a generic MI combiner framework
> On Nov 11, 2017, at 11:03 AM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > On 11/11/2017 12:44 PM, Amara Emerson wrote: >> >>> On Nov 10, 2017, at 10:04 PM, Aditya Nandakumar <proaditya at gmail.com <mailto:proaditya at gmail.com>> wrote: >>>> >>>> The current DAGCombine, being constructed on top of SDAG, has a kind of built-in CSE and automatic DCE. How will things change, if they'll change, in this new model? >>> Hi Hal, >>> >>> I suspect one option is to have a separate CSE pass, and the backends get to choose where exactly they plug in their pipeline. I think DCE should be part of the combine pass (and the legalizer is about to start doing that as well). >> For SSA form MIR there’s already the MachineCSE pass. How important the CSE/DCE is at the combine stage I don’t know. As an approximation perhaps we can get an idea by disabling the behavior in the DAGCombiner and seeing the effects. > > My impression is that the automated CSE and DCE is very important to the current implementation.IIRC Aditya turned off CSE in SDISel for experimentation purposes (we wanted to quantify how useful that is), and the results were surprising in the sense that it pretty much didn’t affect the quality of the generated code. Aditya, could you share what you found when you did this experiment? Admittedly, this was a while and I may misremember!> There's a lot of code that depends on his happening in order to have the expected effects. Otherwise, the use-count checks won't do the right thing (because the old unused uses of things won't immediately go away). > > I'm not entirely sure you can just turn off the uniquing in SDAG and get a sensible result. > > -Hal > >> >>> On Nov 10, 2017, at 9:54 PM, Daniel Sanders <daniel_l_sanders at apple.com <mailto:daniel_l_sanders at apple.com>> wrote: >>> >>> My thinking on this is that (with a few exceptions that I'll get to), combine and select are basically the same thing. You match some MIR, and replace it with other MIR. The main difference being that combine doesn't have to constrain to register classes (unless it wants to) while select does. >>> >>> With that in mind, I was thinking that it makes sense to put a lot of effort into the optimization of the tablegen-erated selection table (as has been started in Quentin's recent patch) and then re-use it for combines too. We'll need to be careful how we define GlobalISel's counterpart to SelectionDAG patterns to make it expressive enough to support combines but that's essentially a second frontend (the other being the SelectionDAG importer) on a common backend >> Agreed that combine and selection are similar processes. It sounds like this is something we should look at prototyping. >> >>> >>> Req 2 becomes simple to implement in this approach. You can either use the existing feature-bits mechanism to enable/disable combine rules as a group, or add an equivalent mechanism in tablegen to decide whether a rule makes it into the emitted table or not and have multiple tables which you can run/not-run at will. With the new coverage feedback mechanism, we could potentially organize our tables semi-automatically by highlighting combine rules that never or rarely fire in a particular pass. >>> >>> One feature I think we ought to have that isn't on the requirements list already, is that I think we should have a means to support rules with more than one match root. For example (using SelectionDAG patterns): >>> (set $dst1:GPR32, (i32 (load $ptr:GPR64))) >>> (set $dst2:GPR32, (i32 (load (add $ptr:GPR64 4)))) >>> into: >>> (set $tmp:GPR64, (v2s32 (load $ptr:GPR64))) >>> (set $dst1, (extractelt $tmp:GPR64, 0)) >>> (set $dst2, (extractelt $tmp:GPR64, 1)) >>> Or something along those lines (such as fusing div/mod together). The combiner should be smart enough to make the root the $ptr, and follow the use of $ptr into the load/add, then follow the def to the 4. >> This seems like a nice feature, but I wonder about the impact this will have on the speed of the matching algorithm. I don’t know enough about it to say though. IMO complex features can be done in C++ code if they’re uncommon, in preference for fast handling of the common cases. Maybe a few more use cases are needed. >> >> Thanks, >> Amara > > -- > Hal Finkel > Lead, Compiler Technology and Programming Languages > Leadership Computing Facility > Argonne National Laboratory > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171111/81bdeedf/attachment.html>
Aditya Nandakumar via llvm-dev
2017-Nov-12 06:05 UTC
[llvm-dev] RFC: [GlobalISel] Towards a generic MI combiner framework
> On Nov 11, 2017, at 9:31 PM, Quentin Colombet <qcolombet at apple.com> wrote: > >> >> On Nov 11, 2017, at 11:03 AM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> >> On 11/11/2017 12:44 PM, Amara Emerson wrote: >>> >>>> On Nov 10, 2017, at 10:04 PM, Aditya Nandakumar <proaditya at gmail.com <mailto:proaditya at gmail.com>> wrote: >>>>> >>>>> The current DAGCombine, being constructed on top of SDAG, has a kind of built-in CSE and automatic DCE. How will things change, if they'll change, in this new model? >>>> Hi Hal, >>>> >>>> I suspect one option is to have a separate CSE pass, and the backends get to choose where exactly they plug in their pipeline. I think DCE should be part of the combine pass (and the legalizer is about to start doing that as well). >>> For SSA form MIR there’s already the MachineCSE pass. How important the CSE/DCE is at the combine stage I don’t know. As an approximation perhaps we can get an idea by disabling the behavior in the DAGCombiner and seeing the effects. >> >> My impression is that the automated CSE and DCE is very important to the current implementation. > > IIRC Aditya turned off CSE in SDISel for experimentation purposes (we wanted to quantify how useful that is), and the results were surprising in the sense that it pretty much didn’t affect the quality of the generated code. > > Aditya, could you share what you found when you did this experiment? > Admittedly, this was a while and I may misremember!IIRC (from a while back), I made doNotCSE method return true always for a small set of tests and noticed based on crude metrics that there wasn’t a major performance degradation (~2-3% increase in instructions). From what I hear now, I suspect that alone is not sufficient to completely turn off CSE.> >> There's a lot of code that depends on his happening in order to have the expected effects. Otherwise, the use-count checks won't do the right thing (because the old unused uses of things won't immediately go away). >> >> I'm not entirely sure you can just turn off the uniquing in SDAG and get a sensible result. >> >> -Hal >> >>> >>>> On Nov 10, 2017, at 9:54 PM, Daniel Sanders <daniel_l_sanders at apple.com <mailto:daniel_l_sanders at apple.com>> wrote: >>>> >>>> My thinking on this is that (with a few exceptions that I'll get to), combine and select are basically the same thing. You match some MIR, and replace it with other MIR. The main difference being that combine doesn't have to constrain to register classes (unless it wants to) while select does. >>>> >>>> With that in mind, I was thinking that it makes sense to put a lot of effort into the optimization of the tablegen-erated selection table (as has been started in Quentin's recent patch) and then re-use it for combines too. We'll need to be careful how we define GlobalISel's counterpart to SelectionDAG patterns to make it expressive enough to support combines but that's essentially a second frontend (the other being the SelectionDAG importer) on a common backend >>> Agreed that combine and selection are similar processes. It sounds like this is something we should look at prototyping. >>> >>>> >>>> Req 2 becomes simple to implement in this approach. You can either use the existing feature-bits mechanism to enable/disable combine rules as a group, or add an equivalent mechanism in tablegen to decide whether a rule makes it into the emitted table or not and have multiple tables which you can run/not-run at will. With the new coverage feedback mechanism, we could potentially organize our tables semi-automatically by highlighting combine rules that never or rarely fire in a particular pass. >>>> >>>> One feature I think we ought to have that isn't on the requirements list already, is that I think we should have a means to support rules with more than one match root. For example (using SelectionDAG patterns): >>>> (set $dst1:GPR32, (i32 (load $ptr:GPR64))) >>>> (set $dst2:GPR32, (i32 (load (add $ptr:GPR64 4)))) >>>> into: >>>> (set $tmp:GPR64, (v2s32 (load $ptr:GPR64))) >>>> (set $dst1, (extractelt $tmp:GPR64, 0)) >>>> (set $dst2, (extractelt $tmp:GPR64, 1)) >>>> Or something along those lines (such as fusing div/mod together). The combiner should be smart enough to make the root the $ptr, and follow the use of $ptr into the load/add, then follow the def to the 4. >>> This seems like a nice feature, but I wonder about the impact this will have on the speed of the matching algorithm. I don’t know enough about it to say though. IMO complex features can be done in C++ code if they’re uncommon, in preference for fast handling of the common cases. Maybe a few more use cases are needed. >>> >>> Thanks, >>> Amara >> >> -- >> Hal Finkel >> Lead, Compiler Technology and Programming Languages >> Leadership Computing Facility >> Argonne National Laboratory >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171111/d2710239/attachment.html>
Possibly Parallel Threads
- RFC: [GlobalISel] Towards a generic MI combiner framework
- RFC: [GlobalISel] Towards a generic MI combiner framework
- RFC: [GlobalISel] Towards a generic MI combiner framework
- [GlobalISel] [X86] unable to legalize instruction
- RFC: [GlobalISel] Towards a generic MI combiner framework