search for: expensivecombines

Displaying 8 results from an estimated 8 matches for "expensivecombines".

2017 Mar 17
7
Saving Compile Time in InstCombine
...often use it just as a clean-up pass: it's scheduled 6 times in the current pass pipeline, and each time it's invoked it checks all known patterns. It sounds ok for O3, where we try to squeeze as much performance as possible, but it is too excessive for other opt-levels. InstCombine has an ExpensiveCombines parameter to address that - but I think it's underused at the moment. Trying to find out, which patterns are important, and which are rare, I profiled clang using CTMark and got the following coverage report: (beware, the file is ~6MB). Guided by this profile I moved some patterns under the...
2017 Mar 18
4
Saving Compile Time in InstCombine
...t's scheduled 6 times in the current pass pipeline, >> and each time it's invoked it checks all known patterns. It sounds ok >> for O3, where we try to squeeze as much performance as possible, but >> it is too excessive for other opt-levels. InstCombine has an >> ExpensiveCombines parameter to address that - but I think it's >> underused at the moment. > > Yes, the “ExpensiveCombines” has been added recently (4.0? 3.9?) but I > believe has always been intended to be extended the way you’re doing > it. So I support this effort :) +1 Also, did your p...
2017 Mar 20
2
Saving Compile Time in InstCombine
...: canonicalizations may duel endlessly if we get this wrong; the order of the combines is also important for exactly the same reason (SelectionDAG deals with this problem in a different way with its pattern complexity field). > > Another concern with moving seemingly arbitrary combines under ExpensiveCombines is that it will make it that much harder to understand what is and is not canonical at a given point during the execution of the optimizer. If a canonicalization is too costly to achieve, maybe it is not a reasonable one? It is also not clear to me that canonicalizations that are using complex ana...
2017 Mar 21
2
Saving Compile Time in InstCombine
...s is out of curiosity. There must be verifiers that check that this cannot happen. Or an implementation strategy that guarantees that. Global isel will run into the same/similar question when it gets far enough to replace SD. > > Another concern with moving seemingly arbitrary combines under ExpensiveCombines is that it will make it that much harder to understand what is and is not canonical at a given point during the execution of the optimizer. > > I'd be much more interested in a patch which caches the result of frequently called ValueTracking functionality like ComputeKnownBits, ComputeS...
2017 Mar 22
3
Saving Compile Time in InstCombine
...here must be verifiers that check that this cannot > happen. Or an implementation strategy that guarantees that. Global isel > will run into the same/similar question when it gets far enough to replace > SD. > > > Another concern with moving seemingly arbitrary combines under > ExpensiveCombines is that it will make it that much harder to understand > what is and is not canonical at a given point during the execution of the > optimizer. > > > > I'd be much more interested in a patch which caches the result of > frequently called ValueTracking functionality like Com...
2017 Mar 23
2
Saving Compile Time in InstCombine
In my testing results are not that impressive, but that's because I'm now focusing on Os. For me even complete disabling of all KnownBits-related patterns in InstCombine places the results very close to the noise level. In my original patch I also had some extra patterns moved under ExpensiveCombines - and that seems to make a difference too (without this part, or without the KnownBits part I get results below 1%, which are not reported as regressions/improvements). Personally I think caching results of KnownBits is a good idea, and should probably help O3 compile time (and obviously the cases...
2016 Sep 20
2
Inferring nsw/nuw flags for increment/decrement based on relational comparisons
...ion that was > shot down in the discussion you linked? The link in question is the following: http://lists.llvm.org/pipermail/llvm-dev/2015-January/080666.html Perhaps these operations are cheaper than I think and such caching is not needed? Alternatively they could be put behind -O3 i.e. the ExpensiveCombines variable Sanjay pointed out. > 2. InstCombiner::visitAdd only calls into ValueTracking for the > unsigned case, i.e. computeOverflowForUnsignedAdd. There are no > computeOverflowFor*Sub functions that InstCombiner::visitSub even > could make use of. Instead, InstCombiner has its ow...
2017 Apr 14
5
Saving Compile Time in InstCombine
...>> In my testing results are not that impressive, but that's because I'm now focusing on Os. For me even complete disabling of all KnownBits-related patterns in InstCombine places the results very close to the noise level. In my original patch I also had some extra patterns moved under ExpensiveCombines - and that seems to make a difference too (without this part, or without the KnownBits part I get results below 1%, which are not reported as regressions/improvements). >> > > Have you profiled a single InstCombine run to see where we actually > spend our cycles (as Sanjay did for...