thr3ads.net - search: "expensivecombines"

Displaying 8 results from an estimated 8 matches for "expensivecombines".

2017 Mar 17

Saving Compile Time in InstCombine

...often use it just as a clean-up pass: it's scheduled 6 times in the current pass pipeline, and each time it's invoked it checks all known patterns. It sounds ok for O3, where we try to squeeze as much performance as possible, but it is too excessive for other opt-levels. InstCombine has an ExpensiveCombines parameter to address that - but I think it's underused at the moment. Trying to find out, which patterns are important, and which are rare, I profiled clang using CTMark and got the following coverage report: (beware, the file is ~6MB). Guided by this profile I moved some patterns under the...
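The idea of moving patterns under an expensive-combines switch can be sketched in a few lines of standalone C++. This is only an illustration, not LLVM code: `ToyCombiner`, `knownZeroBits`, and `foldsToZero` are hypothetical names, and the "expensive" analysis is a trivial stand-in for a costly query.

```cpp
#include <cstdint>

// Hypothetical stand-in for a combiner; this is NOT LLVM's InstCombiner.
struct ToyCombiner {
  bool ExpensiveCombines;        // assumed to be true only at O3 in this sketch
  unsigned ExpensiveQueries = 0; // counts how often the costly analysis ran

  explicit ToyCombiner(bool Expensive) : ExpensiveCombines(Expensive) {}

  // Stand-in for a costly analysis: the low `shiftAmount` bits of
  // (x << shiftAmount) are known to be zero.
  uint32_t knownZeroBits(uint32_t shiftAmount) {
    ++ExpensiveQueries;
    return shiftAmount >= 32 ? ~0u : ((1u << shiftAmount) - 1u);
  }

  // Returns true if "(x << shiftAmount) & mask" can be folded to 0.
  bool foldsToZero(uint32_t shiftAmount, uint32_t mask) {
    if (mask == 0)
      return true;  // cheap pattern, tried at every opt-level
    if (!ExpensiveCombines)
      return false; // the costly pattern below is skipped
    return (mask & ~knownZeroBits(shiftAmount)) == 0;
  }
};
```

With `ToyCombiner(false)` the analysis counter stays at zero, which is the compile-time saving; the cost is that `(x << 4) & 0x0F` is no longer folded.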

2017 Mar 18

Saving Compile Time in InstCombine

>> ...t's scheduled 6 times in the current pass pipeline, and each time it's invoked it checks all known patterns. It sounds ok for O3, where we try to squeeze as much performance as possible, but it is too excessive for other opt-levels. InstCombine has an ExpensiveCombines parameter to address that - but I think it's underused at the moment.
> Yes, the “ExpensiveCombines” has been added recently (4.0? 3.9?) but I believe has always been intended to be extended the way you’re doing it. So I support this effort :)
+1 Also, did your p...

2017 Mar 20

Saving Compile Time in InstCombine

...: canonicalizations may duel endlessly if we get this wrong; the order of the combines is also important for exactly the same reason (SelectionDAG deals with this problem in a different way with its pattern complexity field). > > Another concern with moving seemingly arbitrary combines under ExpensiveCombines is that it will make it that much harder to understand what is and is not canonical at a given point during the execution of the optimizer. If a canonicalization is too costly to achieve, maybe it is not a reasonable one? It is also not clear to me that canonicalizations that are using complex ana...

2017 Mar 21

Saving Compile Time in InstCombine

...s is out of curiosity. There must be verifiers that check that this cannot happen. Or an implementation strategy that guarantees that. Global isel will run into the same/similar question when it gets far enough to replace SD. > > Another concern with moving seemingly arbitrary combines under ExpensiveCombines is that it will make it that much harder to understand what is and is not canonical at a given point during the execution of the optimizer. > > I'd be much more interested in a patch which caches the result of frequently called ValueTracking functionality like ComputeKnownBits, ComputeS...

2017 Mar 22

Saving Compile Time in InstCombine

...here must be verifiers that check that this cannot > happen. Or an implementation strategy that guarantees that. Global isel > will run into the same/similar question when it gets far enough to replace > SD. > > > Another concern with moving seemingly arbitrary combines under > ExpensiveCombines is that it will make it that much harder to understand > what is and is not canonical at a given point during the execution of the > optimizer. > > > > I'd be much more interested in a patch which caches the result of > frequently called ValueTracking functionality like Com...

2017 Mar 23

Saving Compile Time in InstCombine

In my testing results are not that impressive, but that's because I'm now focusing on Os. For me even complete disabling of all KnownBits-related patterns in InstCombine places the results very close to the noise level. In my original patch I also had some extra patterns moved under ExpensiveCombines - and that seems to make a difference too (without this part, or without the KnownBits part I get results below 1%, which are not reported as regressions/improvements). Personally I think caching results of KnownBits is a good idea, and should probably help O3 compile time (and obviously the cases...
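Caching known-bits results, as suggested in the thread, amounts to memoizing the analysis per value. The following is a minimal sketch on a toy expression tree, assuming nothing about LLVM's actual ValueTracking interfaces: `Node`, `Known`, and `KnownBitsCache` are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>
#include <unordered_map>

// Toy expression node; a hypothetical stand-in, not llvm::Value.
struct Node {
  enum Kind { Const, And, Or, Unknown } kind;
  uint32_t value;  // used only when kind == Const
  const Node *lhs; // operands, used by And / Or
  const Node *rhs;
};

struct Known {
  uint32_t zeros; // bits known to be 0
  uint32_t ones;  // bits known to be 1
};

class KnownBitsCache {
  std::unordered_map<const Node *, Known> cache;

public:
  unsigned misses = 0; // number of times the analysis actually ran

  Known compute(const Node *n) {
    auto it = cache.find(n);
    if (it != cache.end())
      return it->second; // cached: no recursion needed
    ++misses;
    Known k{0, 0};       // default: nothing is known
    switch (n->kind) {
    case Node::Const:
      k = {~n->value, n->value};
      break;
    case Node::And: {
      Known a = compute(n->lhs), b = compute(n->rhs);
      k = {a.zeros | b.zeros, a.ones & b.ones};
      break;
    }
    case Node::Or: {
      Known a = compute(n->lhs), b = compute(n->rhs);
      k = {a.zeros & b.zeros, a.ones | b.ones};
      break;
    }
    case Node::Unknown:
      break;
    }
    cache.emplace(n, k);
    return k;
  }

  // A rewritten node must be dropped, or the cache returns stale facts.
  void invalidate(const Node *n) { cache.erase(n); }
};
```

Querying the same node twice, or a node shared by several users, runs the analysis once. A real cache would also have to invalidate every user of a rewritten value, which this sketch leaves out.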

2016 Sep 20

Inferring nsw/nuw flags for increment/decrement based on relational comparisons

> ...ion that was shot down in the discussion you linked?
The link in question is the following: http://lists.llvm.org/pipermail/llvm-dev/2015-January/080666.html Perhaps these operations are cheaper than I think and such caching is not needed? Alternatively they could be put behind -O3 i.e. the ExpensiveCombines variable Sanjay pointed out.
> 2. InstCombiner::visitAdd only calls into ValueTracking for the unsigned case, i.e. computeOverflowForUnsignedAdd. There are no computeOverflowFor*Sub functions that InstCombiner::visitSub even could make use of. Instead, InstCombiner has its ow...
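The reasoning behind the thread title can be illustrated without any LLVM API: if a comparison `x < y` is known to hold, `x` cannot be the maximum value, so `x + 1` cannot wrap; symmetrically, `x > y` rules out the minimum for `x - 1`. The sketch below is not the patch discussed in the thread; `Pred`, `Flags`, and the function names are hypothetical, and the rule is checked exhaustively on 8-bit values.

```cpp
#include <cstdint>

enum class Pred { ULT, UGT, SLT, SGT }; // the comparison known to be true
struct Flags { bool nuw; bool nsw; };   // no-unsigned-wrap / no-signed-wrap

// Given that "x pred y" holds for some y, which flags may "x + 1" carry?
// x <u y implies x != UINT_MAX (nuw); x <s y implies x != INT_MAX (nsw).
constexpr Flags flagsForIncrement(Pred p) {
  return Flags{p == Pred::ULT, p == Pred::SLT};
}

// Given that "x pred y" holds for some y, which flags may "x - 1" carry?
// x >u y implies x != 0 (nuw); x >s y implies x != INT_MIN (nsw).
constexpr Flags flagsForDecrement(Pred p) {
  return Flags{p == Pred::UGT, p == Pred::SGT};
}

// Exhaustive check of the four implications on 8-bit integers.
bool verifyOn8Bit() {
  for (int xi = 0; xi < 256; ++xi) {
    for (int yi = 0; yi < 256; ++yi) {
      uint8_t ux = static_cast<uint8_t>(xi), uy = static_cast<uint8_t>(yi);
      int8_t sx = static_cast<int8_t>(ux), sy = static_cast<int8_t>(uy);
      if (ux < uy && ux == 255) return false;  // x + 1 would wrap unsigned
      if (sx < sy && sx == 127) return false;  // x + 1 would wrap signed
      if (ux > uy && ux == 0) return false;    // x - 1 would wrap unsigned
      if (sx > sy && sx == -128) return false; // x - 1 would wrap signed
    }
  }
  return true;
}
```

The inference itself is a constant-time table lookup; any cost in a real optimizer would come from proving that the comparison holds at the point of the increment, which this sketch simply assumes.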

2017 Apr 14

Saving Compile Time in InstCombine

...>> In my testing results are not that impressive, but that's because I'm now focusing on Os. For me even complete disabling of all KnownBits-related patterns in InstCombine places the results very close to the noise level. In my original patch I also had some extra patterns moved under ExpensiveCombines - and that seems to make a difference too (without this part, or without the KnownBits part I get results below 1%, which are not reported as regressions/improvements).
> Have you profiled a single InstCombine run to see where we actually spend our cycles (as Sanjay did for...
