thr3ads.net - llvm dev - [llvm-dev] RFC: Moving DAG heuristic-based transforms to MI passes [Jan 2017]

Andrew V. Tischenko via llvm-dev

2017-Jan-27 16:30 UTC

[llvm-dev] RFC: Moving DAG heuristic-based transforms to MI passes

All llvm-devs,

We would like to propose a new implementation for optimizations such as 
using a reciprocal estimate instead of fdiv. In short, the fdiv 
instruction (which is very expensive on most CPUs) is replaced with an 
alternative instruction sequence that is usually cheaper but still has 
acceptable precision (see genReciprocalDiv in 
lib/Target/X86/X86InstrInfo.cpp for details). There are other similar 
optimizations, such as using rsqrt, but at the moment we are dealing 
with reciprocal estimation only - see https://reviews.llvm.org/D26855 for 
details.
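
To make the transform concrete, here is a minimal sketch of the general 
technique (a rough reciprocal estimate refined by Newton-Raphson steps, 
then a multiply). It is not the code from D26855 or genReciprocalDiv. 
The names roughReciprocal, refineReciprocal and divideViaReciprocal are 
hypothetical, and the estimate is simulated in software with an assumed 
relative error of 2^-12 rather than produced by a real instruction.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-estimate instruction.
// It returns 1/x with a deliberate relative error of 2^-12 (an assumption
// chosen for illustration, not a statement about any particular CPU).
inline float roughReciprocal(float x) {
  assert(x != 0.0f && "reciprocal of zero is not handled in this sketch");
  return (1.0f / x) * (1.0f + 1.0f / 4096.0f);
}

// One Newton-Raphson step for the root of f(e) = 1/e - x:
//   e' = e * (2 - x * e)
// Each step roughly squares the relative error of the estimate.
inline float refineReciprocal(float x, float est) {
  return est * (2.0f - x * est);
}

// Computes a / b as a * (refined estimate of 1/b).
inline float divideViaReciprocal(float a, float b, int steps = 1) {
  float est = roughReciprocal(b);
  for (int i = 0; i < steps; ++i)
    est = refineReciprocal(b, est);
  return a * est;
}
```

With the assumed starting error, one refinement step brings the result 
close to single-precision accuracy, while zero steps leaves the error of 
the raw estimate visible. The result is still not guaranteed to be 
bit-identical to a true division, which is why the transform is tied to 
fast-math.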

The current version of the optimization is done at the DAG Combiner level, 
where we don't yet know the exact target instructions that CodeGen will 
use. As a result, we don't know the real cost of the alternative 
sequence and can't compare it with the cost of a single fdiv. 
The decision to select the alternative sequence (made from compiler 
options only) can therefore be wrong, because modern CPUs have introduced 
a very cheap fdiv that we should use directly.

We suggest moving the implementation from DAG heuristics to 
MI-scheduler-based transformations (Machine Combiner). At that point we 
know the exact target instructions and can use the scheduler-based cost 
model. This knowledge allows us to select the proper code sequence for 
the final target code.
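
The kind of decision this enables can be sketched as follows. This is 
only an illustration and does not use the MachineCombiner API. The 
TargetCosts struct, its field names and the functions below are 
hypothetical, and in a real pass the latencies would come from the 
target's scheduling model rather than from hand-written numbers.

```cpp
#include <cassert>

// Hypothetical per-target instruction latencies, in cycles.
struct TargetCosts {
  unsigned fdiv; // single floating-point divide
  unsigned rcp;  // reciprocal estimate
  unsigned fmul; // multiply
  unsigned fsub; // subtract
};

// Latency of the dependent chain for a / b via a reciprocal estimate:
//   est = rcp(b)
//   per refinement step: t = b * est; u = 2 - t; est = est * u
//   result = a * est
// Every operation depends on the previous one, so latencies add up.
inline unsigned reciprocalSequenceLatency(const TargetCosts &c,
                                          unsigned steps) {
  assert(steps <= 4 && "more steps than this sketch expects");
  return c.rcp + steps * (2 * c.fmul + c.fsub) + c.fmul;
}

// Pick the alternative sequence only when it is strictly cheaper than
// the single divide on this particular target.
inline bool shouldUseReciprocal(const TargetCosts &c, unsigned steps) {
  return reciprocalSequenceLatency(c, steps) < c.fdiv;
}
```

For example, with made-up costs {fdiv = 20, rcp = 4, fmul = 4, fsub = 3} 
and one refinement step, the sequence costs 19 cycles and wins. With 
fdiv = 11 and the same other costs, the plain fdiv is kept. A 
compiler-option-only heuristic cannot distinguish these two targets.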

A possible disadvantage of the new implementation is an increase in 
compile time (as discussed in D26855), but we expect to make improvements 
in that area. For the initial change (the reciprocal transform), any 
difference is limited to fast-math compilations.

Any objections, suggestions, or comments?

Hal Finkel via llvm-dev

2017-Jan-27 20:56 UTC

Are you asking whether it is okay to commit the change first and then look 
at the MachineCombiner's worst-case performance in a follow-up? In general, 
I think that moving to the MachineCombiner for these kinds of 
transformations, where there are complex tradeoffs between latency, 
throughput, etc., is the right direction.

  -Hal
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Andrew V. Tischenko via llvm-dev

2017-Jan-28 10:19 UTC

In fact, committing the change before dealing with worst-case performance 
is a good idea, because these are two different issues. But the main 
purpose of this RFC is to show a better approach to these kinds of 
transformations and to suggest using it in the future.

At the same time, I'm trying to explain that this patch is not a 
performance patch, because the generated code is almost identical to what 
we have now. It is a suggestion to change the strategy for developing 
such transformations. If the community accepts this new strategy, 
we're ready to introduce new similar transformations, automate the 
framework, etc. But of course those will be topics for new RFCs and 
discussions.
