Amara Emerson via llvm-dev
2017-Nov-28 16:09 UTC
[llvm-dev] RFC: [GlobalISel] Towards a generic MI combiner framework
Thanks for the suggestions Vedant. Synthetic debug info is an interesting idea that sounds worthwhile. Could this be implemented as a “wrapper” pass that automatically decorates debug info before and after a specific pass run in opt (or pipeline of passes)? It might be useful to be able to easily enable this for a wide range of tests without having to manually modify each run line, perhaps as an environment variable/build time flag.

Cheers,
Amara

> On Nov 27, 2017, at 6:18 PM, Vedant Kumar <vsk at apple.com> wrote:
>
>> On Nov 17, 2017, at 4:41 PM, Gerolf Hoflehner <ghoflehner at apple.com> wrote:
>>
>>> On Nov 13, 2017, at 11:53 AM, Vedant Kumar via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>> Hi Amara,
>>>
>>>> On Nov 10, 2017, at 9:12 AM, Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> This RFC concerns the design and architecture of a generic machine instruction combiner/optimizer framework to be developed as part of the GISel pipeline. As we transition from correctness and reducing the fallback rate to SelectionDAG at -O0, we’re now starting to think about using GlobalISel with optimizations enabled. There are obviously many parts to this story as optimizations happen at various stages of the codegen pipeline. The focus of this RFC is the replacement of the equivalent of the DAGCombiner in SDAG land. Despite the focus on the DAGCombiner, since there aren’t perfect 1-1 mappings between SDAG and GlobalISel components, this may also include features that are currently implemented as part of the target lowerings, and tablegen isel patterns. As we’re starting from a blank slate, we have an opportunity here to think about what we might need from such a framework without the legacy cruft (although we still have the high performance bar to meet).
>>>>
>>>> I want to poll the community about what future requirements we have for the GISel G_MI optimizer/combiner. The following are the general requirements we have so far:
>>>>
>>>> 1. It should have at least equivalent, but hopefully better, runtime/compile time trade-off than the DAGCombiner.
>>>> 2. There needs to be flexibility in the design to allow targets to run subsets of the overall optimizer. For example, some targets may want to avoid trying to run certain types of optimizations like vector or FP combines if they’re either not applicable, or not worth the compile time.
>>>> 3. Have a reasonably concise way to write most optimizations. Hand written C++ will always be an option, but there’s value in having easy to read and reason about descriptions of transforms.
>>>>
>>>> These requirements aren’t set in stone nor complete, but using them as a starting point: a single monolithic “Generic MI combiner” component doesn’t look like the right approach. Our current thinking is that, like we’ve done with the Legalizer, the specific mechanics of the actual optimization should be separated into its own unit. This would allow the combines to be re-used at different stages of the pipeline according to target needs. Using the current situation with instcombine as an example, there is no way to explicitly pick and choose a specific subset of IC; it’s only available as a whole pass with all the costs that entails.
>>>>
>>>> The reasoning behind req 3 is that there may be compile time savings available if we can describe in a declarative style the combines we want to do, as is currently possible with tablegen patterns. This hasn’t been proven out yet, but consider an alternative where we use the machine instruction equivalent of the IR/PatternMatch tooling which allows easy and expressive matching of IR sub-trees. A concern I have with using that as the main approach to writing combines is that it’s easy to add new matchers in a routine which re-computes information that’s previously been computed in previous match() attempts. This form of back-tracking might be avoided if we can reason about a group of combines together automatically (or perhaps we could add caching capabilities to PatternMatch).
>>>>
>>>> What would everyone else like to see from this?
>>>
>>> It would be great to provide first-class support for maintaining debug value information as a part of the new combine framework.
>>>
>>> With SelectionDAG, we don't have a systematic way of preserving debug locations and values across combines. This is a source of optimized debugging bugs. If, as a part of the new framework, we could concisely express that a RAUW-style combine simply transfers debug values from A to B, we might define away some of these bugs [1].
>>>
>>> Adrian put in place some infrastructure to do this in SelectionDAG (r317825). However, auditing/fixing debug value transfer issues in hand-written combines is time-consuming. I think it should be a goal of the new framework to make this a bit easier.
>>
>> +1
>>
>> Do you and/or Adrian also have thoughts on testing debug values? Verification and testing strategies should/could be part of the design also.
>
> I have two ideas on how to approach testing.
>
> 1) Recycle existing targeted lit tests to test debug info preservation.
>
> Consider a targeted lit test which looks like this:
>
>   RUN: opt -S -loop-reduce addrec-gep.ll -o - | FileCheck %s
>
> We can repurpose this test by attaching synthetic debug information to the IR, and then checking how much of it survives LSR. I prototyped this idea for IR-level tests over the break. Adding a debug info test looks like this:
>
>   RUN: opt -S -debugify -loop-reduce -check-debugify addrec-gep.ll -o - | FileCheck ...
>
> The check-debugify pass can determine which DILocations and DIVariables went missing:
>
>> CheckDebugify: Instruction with empty DebugLoc -- %lsr.iv1 = bitcast double* %lsr.iv to i1*
>> CheckDebugify: Missing line 3
>> CheckDebugify: Missing line 4
>> ...
>> CheckDebugify: Missing line 33
>
> This could be a handy way to create new targeted test cases at the IR/MIR level. Something like this would have helped triage issues like llvm.org/PR25630 as well.
>
> 2) Assert that combines preserve debug info.
>
> If there's an important API like "CombineTo(From, To)", it could be useful to assert that the To node has at least as much debug info as the From node. I'm experimenting with this in SelectionDAG (llvm.org/PR35338). I don't yet know where these asserts belong, how strict they should be, or if they need exception lists for certain combines.
>
> ---
>
> Adrian and others have thought about these issues much more, so take these ideas with a grain of salt :).
>
> thanks,
> vedant
>
>>> best,
>>> vedant
>>>
>>> [1] To pick one at random, the '(zext (zextload x)) -> (zext (truncate (zextload x)))' combine should transfer debug values from N to the new ExtLoad, and from N0 to the new (trunc ExtLoad), but currently doesn't.
>>>
>>>> Thanks,
>>>> Amara
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
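The debugify / check-debugify flow and the "wrapper" idea from this message can be sketched with a toy model. This is only a minimal C++ sketch under stated assumptions: `ToyInst`, `ToyFunction`, `ToyPass`, and `runWithDebugify` are hypothetical names invented for illustration, not LLVM classes or opt options. Only the wording of the report strings follows the CheckDebugify output quoted in the thread.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; NOT an LLVM class.
struct ToyInst {
  std::string Text;
  std::optional<unsigned> DebugLine; // empty means "no DebugLoc"
};
using ToyFunction = std::vector<ToyInst>;
using ToyPass = std::function<void(ToyFunction &)>;

// "debugify" step: give every instruction a unique synthetic line 1..N.
// Returns N so the checker knows which lines to expect afterwards.
unsigned debugify(ToyFunction &F) {
  unsigned Line = 0;
  for (ToyInst &I : F)
    I.DebugLine = ++Line;
  return Line;
}

// "check-debugify" step: report instructions without a location, then
// every synthetic line that no longer appears anywhere in the function.
std::vector<std::string> checkDebugify(const ToyFunction &F,
                                       unsigned NumLines) {
  std::vector<std::string> Report;
  std::set<unsigned> Seen;
  for (const ToyInst &I : F) {
    if (I.DebugLine)
      Seen.insert(*I.DebugLine);
    else
      Report.push_back("Instruction with empty DebugLoc -- " + I.Text);
  }
  for (unsigned L = 1; L <= NumLines; ++L)
    if (!Seen.count(L))
      Report.push_back("Missing line " + std::to_string(L));
  return Report;
}

// Hypothetical wrapper: decorate, run the pass under test, then check.
std::vector<std::string> runWithDebugify(ToyFunction &F, const ToyPass &P) {
  unsigned NumLines = debugify(F);
  P(F);
  return checkDebugify(F, NumLines);
}
```

A caller would hand `runWithDebugify` the function and the pass under test and inspect the returned report. The point of the wrapper shape is that the pass itself does not need to know it is being checked.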
Daniel Sanders via llvm-dev
2017-Nov-28 18:56 UTC
[llvm-dev] RFC: [GlobalISel] Towards a generic MI combiner framework
I like this idea too but I'd like to see it work in the backend passes as well (like -verify-machineinstrs). It doesn't necessarily tell you if the information ends up in the right place but I think that detecting the loss is likely to be just as good in a fair portion of the backend (e.g. ISel when emitting one instruction) and when it isn't, it's still a good start. The one thing it wouldn't detect is when information is preserved but put in the wrong place.
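The gap described in this message, that a loss check cannot see misplaced information, can be illustrated with a toy model. This is a rough C++ sketch and not LLVM code: `Lines`, `noLinesLost`, and `linesInPlace` are hypothetical names, and the positional check only makes sense under the assumption that the transform maps instructions one-to-one.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy model: one entry per instruction holding its synthetic line number,
// with 0 meaning "no location". Not an LLVM data structure.
using Lines = std::vector<unsigned>;

// Loss check: every line present before the pass still appears somewhere
// afterwards. This ignores WHERE the line ended up.
bool noLinesLost(const Lines &Before, const Lines &After) {
  std::set<unsigned> Remaining(After.begin(), After.end());
  for (unsigned L : Before)
    if (L != 0 && !Remaining.count(L))
      return false;
  return true;
}

// Placement check: assumes a 1:1 transform, so instruction i after the
// pass should carry the same line as instruction i before it.
bool linesInPlace(const Lines &Before, const Lines &After) {
  return Before == After;
}
```

With `Before = {1, 2, 3}` and a transform that yields `{1, 3, 2}`, `noLinesLost` is satisfied while `linesInPlace` is not, which is the case the loss check misses.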
Vedant Kumar via llvm-dev
2017-Nov-28 19:27 UTC
[llvm-dev] RFC: [GlobalISel] Towards a generic MI combiner framework
> On Nov 28, 2017, at 10:56 AM, Daniel Sanders <daniel_l_sanders at apple.com> wrote:
>
> I like this idea too but I'd like to see it work in the backend passes as well (like -verify-machineinstrs).

+1, I haven't prototyped this because I don't know exactly what it would look like, but I've made a note about it.

> It doesn't necessarily tell you if the information ends up in the right place but I think that detecting the loss is likely to be just as good in a fair portion of the backend (e.g. ISel when emitting one instruction) and when it isn't, it's still a good start. The one thing it wouldn't detect is when information is preserved but put in the wrong place.

Right, showing that debug info is preserved is good, but it doesn't show that the info is preserved correctly. I'm not sure how to create tests for that in an automated way. One idea is to add asserts where possible. For example, it might be correct/useful to assert that in CombineTo(A, B), the debug locations for A and B are the same.

>> On 28 Nov 2017, at 08:09, Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Thanks for the suggestions Vedant. Synthetic debug info is an interesting idea that sounds worthwhile. Could this be implemented as a “wrapper” pass that automatically decorates debug info before and after a specific pass run in opt (or pipeline of passes)? It might be useful to be able to easily enable this for a wide range of tests without having to manually modify each run line, perhaps as an environment variable/build time flag.

Enabling this sort of testing for a wide range of tests sounds useful. I'll take a stab at adding an option and environment variable to opt to enable this as a follow-up to D40512 <https://reviews.llvm.org/D40512>.

thanks,
vedant
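The two assertion strengths mentioned in the thread ("at least as much debug info" and "same debug location") can be compared side by side in a toy model. This is a hedged C++ sketch: `ToyLoc`, `ToyNode`, and `combineTo` are hypothetical stand-ins rather than SelectionDAG or GlobalISel types, and the toy `combineTo` performs no real replacement. Where such asserts belong and whether they need exception lists is left open, as in the thread.

```cpp
#include <cassert>
#include <optional>

// Toy debug location and node; stand-ins, not SelectionDAG types.
struct ToyLoc {
  unsigned Line;
  unsigned Col;
  bool operator==(const ToyLoc &O) const {
    return Line == O.Line && Col == O.Col;
  }
};

struct ToyNode {
  int Opcode;
  std::optional<ToyLoc> Loc; // empty means "no debug location"
};

// Weaker property: if From had a location, To has one too.
bool hasAtLeastAsMuchDebugInfo(const ToyNode &From, const ToyNode &To) {
  return !From.Loc.has_value() || To.Loc.has_value();
}

// Stricter property: both nodes carry identical locations
// (two nodes without any location also compare equal).
bool hasSameDebugLoc(const ToyNode &From, const ToyNode &To) {
  return From.Loc == To.Loc;
}

// Hypothetical combine helper: asserts the stricter property and then
// hands back the replacement node. The assert vanishes under NDEBUG.
ToyNode &combineTo(const ToyNode &From, ToyNode &To) {
  assert(hasSameDebugLoc(From, To) && "combine changed the debug location");
  return To;
}
```

The stricter check would reject a combine that keeps a location but changes it, which the weaker check accepts. That difference is one reason the right strictness may vary per combine.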