thr3ads.net - llvm dev - [llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code [Oct 2019]

If this information is useful, please help other people find it:
Share via:

Finkel, Hal J. via llvm-dev

2019-Oct-04 16:00 UTC

[llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code

On 10/4/19 5:46 AM, Joan Lluch via llvm-dev wrote:
Hi all,

As a continuation of this thread, I was about to fill a bug report requesting
the modification of DAGTypeLegalizer::ExpandIntRes_SIGN_EXTEND, in order to
avoid the creation of Shifts for targets with no native support. For example by
generating a ‘select' equivalent to  a<0 ? -1 : 0 instead of an
arithmetic shift right. For targets with no multiple shifts or word extension
instructions, the select is much cheaper.

However, I found that the early InstCombine pass spoils such optimisation by
creating shifts on their own as a transform of (supposedly) equivalent code.

...

I strongly suggest the above gets fixed. HOWEVER, even after the DAG combiner is
fixed, the issues will remain due to InstCombine doing essentially the same
thing much earlier. Consider code like this:

int test0( int a )
{
  return a<0 ? -1 : 0;
}

int test1( int a )
{
  return a<0 ? 1 : 0;
}

int test2( int a )
{
  return a<0 ? 2 : 0;
}

int test3( int a )
{
  return a<0 ? 3 : 0;
}

In all cases, InstCombine converts the above into Shifts, right into the IR,
which the backend can’t do anything about.


Why not? It seems like we could pattern match the shifts back into
selects/branches?

 -Hal


...






_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191004/28d09ebb/attachment.html>

Joan Lluch via llvm-dev

2019-Oct-05 13:09 UTC

head link

[llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code

> On 4 Oct 2019, at 18:00, Finkel, Hal J. <hfinkel at anl.gov> wrote:
> 
> Why not? It seems like we could pattern match the shifts back into
selects/branches?
I don’t think that reverse pattern matching is the optimal approach in all
cases. I have tried and certainly for some simple cases the undesired shifts can
be reverted into selects producing better code. However, this does not work well
for complex patterns or cases where second transforms have been applied on top
of it.

One typical case is a sequence of ‘selects’ that depend on a single common
condition. If one of these selects is combined into non-branching code, then the
backend has a very hard work to track back the origin of that common condition.
Therefore, for targets that convert all selects into branches, these two selects
get almost inevitably converted into either two branching sequences, or one
branch sequence followed or preceded by the non-branching code of the other
select.

Scenarios like this are not a problem for pipelined processors with expensive
branches and ’select’ like instructions because in this case the effort is put
in removing branches by the insertion of speculative code which is usually
cheaper. However, for simple processors where branches must be used and are
cheap, we often want to do the opposite because the execution of speculative
code or particular transforms is comparatively most costly than just branching.

Going back to the case, if these two selects arrived together to the backend,
then they could be identified and folded into a block preceded by a single
conditional branch, resulting in faster and shorter code. The latter is the
purpose of some function in the CodeGen prepare pass, but this doesn’t fire
because the two selects are not there to work with them to begin with.

Finally, generally speaking, I don’t regard complex reverse pattern matching as
an elegant solution, compared with providing the right LLVM hooks for the
affected targets, because this is like acknowledging that these targets are
inferior, or not worth the effort to improve LLVM code. Also, whatever the way I
look at it, I always find that it’s better to solve problems by working on the
root causes rather than trying to apply late patches or corrections, which
generally make things convoluted and difficult to maintain, not to mention
suboptimal.

John> 
> On 10/4/19 5:46 AM, Joan Lluch via llvm-dev wrote:
>> Hi all,
>> 
>> As a continuation of this thread, I was about to fill a bug report
requesting the modification of DAGTypeLegalizer::ExpandIntRes_SIGN_EXTEND, in
order to avoid the creation
>>  of Shifts for targets with no native support. For example by
generating a ‘select' equivalent to  a<0 ? -1 : 0
>> instead of an arithmetic shift right. For targets with no multiple
shifts or word extension instructions, the select is much cheaper.
>> 
>> However, I found that the early InstCombine pass spoils such
optimisation by creating shifts on their own as a transform of (supposedly)
equivalent code.
>> 
>> ...
>> 
>> I strongly suggest the above gets fixed. HOWEVER, even after the DAG
combiner is fixed, the issues will remain due to InstCombine doing essentially
the same thing much earlier. Consider code like this:
>> 
>> int test0( int a )
>> {
>>   return a<0 ? -1 : 0;
>> }
>> 
>> int test1( int a )
>> {
>>   return a<0 ? 1 : 0;
>> }
>> 
>> int test2( int a )
>> {
>>   return a<0 ? 2 : 0;
>> }
>> 
>> int test3( int a )
>> {
>>   return a<0 ? 3 : 0;
>> }
>> 
>> In all cases, InstCombine converts the above into Shifts, right into
the IR, which the backend can’t do anything about.
> 
> Why not? It seems like we could pattern match the shifts back into
selects/branches?
> 
>  -Hal
> 
>> ...
>> 
>>> 
>> 
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> -- 
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191005/d25a3135/attachment.html>

Mehdi AMINI via llvm-dev

2019-Oct-05 18:07 UTC

head link

[llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code

On Sat, Oct 5, 2019 at 6:10 AM Joan Lluch via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
>
> On 4 Oct 2019, at 18:00, Finkel, Hal J. <hfinkel at anl.gov> wrote:
>
>
> Why not? It seems like we could pattern match the shifts back into
> selects/branches?
>
>
> I don’t think that reverse pattern matching is the optimal approach in all
> cases. I have tried and certainly for some simple cases the undesired
> shifts can be reverted into selects producing better code. However, this
> does not work well for complex patterns or cases where second transforms
> have been applied on top of it.
>
> One typical case is a sequence of ‘selects’ that depend on a single common
> condition. If one of these selects is combined into non-branching code,
> then the backend has a very hard work to track back the origin of that
> common condition. Therefore, for targets that convert all selects into
> branches, these two selects get almost inevitably converted into either two
> branching sequences, or one branch sequence followed or preceded by the
> non-branching code of the other select.
>
> Scenarios like this are not a problem for pipelined processors with
> expensive branches and ’select’ like instructions because in this case the
> effort is put in removing branches by the insertion of speculative code
> which is usually cheaper. However, for simple processors where branches
> must be used and are cheap, we often want to do the opposite because the
> execution of speculative code or particular transforms is comparatively
> most costly than just branching.
>
> Going back to the case, if these two selects arrived together to the
> backend, then they could be identified and folded into a block preceded by
> a single conditional branch, resulting in faster and shorter code. The
> latter is the purpose of some function in the CodeGen prepare pass, but
> this doesn’t fire because the two selects are not there to work with them
> to begin with.
>
> Finally, generally speaking, I don’t regard complex reverse pattern
> matching as an elegant solution, compared with providing the right LLVM
> hooks for the affected targets, because this is like acknowledging that
> these targets are inferior, or not worth the effort to improve LLVM code.
> Also, whatever the way I look at it, I always find that it’s better to
> solve problems by working on the root causes rather than trying to apply
> late patches or corrections, which generally make things convoluted and
> difficult to maintain, not to mention suboptimal.
>
While it is true that reverse pattern matching isn't optimal / nice, there
is a inherent tradeoff at scale: the middle-end passes are supposed to
serve all the targets. For this purpose the "canonical" form of the IR
is
supposed to be the one-form that the middle-end understands and reason
about, and can be lowered towards the form a target prefers (the last part
of the middle-end is doing this kind of target specialization on the IR
before handing over to the backend).
If you don't canonicalize because one target prefers another form, you need
to teach all the middle-end to properly recognize, understand, and maintain
these two equivalent forms of the same program. Ultimately this is also
making "things convoluted and difficult to maintain, not to mention
suboptimal", not for the target anymore, but for the whole middle-end. The
current model allows people working on the middle-end to not have to know
about all the targets and their preference.

Also "preservation of X in the original code" is also at odd with
canonicalization: the optimizer is expected to be as efficient in an
independent way of how the user wrote their source code. Canonicalization
helps by allowing the middle end optimizations to operate on a single form
of the program, eliminating equivalent forms when possible.

I'm not saying it is ideal, but just trying to help by providing more
context on the mindset others may look at it, sorry if this was already all
obvious to you ;)

In practice if it can be showed that the canonical form of the IR does not
make it possible (!= convenient) for a target to generate good code, it is
probably worth revisiting the chosen form and either switch it to another
one that can suit all targets or add some TTI (but I'm not sure there is a
precedent for this though).

Best,

-- 
Mehdi


>
> On 10/4/19 5:46 AM, Joan Lluch via llvm-dev wrote:
>
> Hi all,
>
> As a continuation of this thread, I was about to fill a bug report
> requesting the modification of DAGTypeLegalizer::ExpandIntRes_SIGN_EXTEND,
> in order to avoid the creation of Shifts for targets with no native
> support. For example by generating a ‘select' equivalent to  a<0 ?
-1 : 0 instead
> of an arithmetic shift right. For targets with no multiple shifts or word
> extension instructions, the select is much cheaper.
>
> However, I found that the early InstCombine pass spoils such optimisation
> by creating shifts on their own as a transform of (supposedly) equivalent
> code.
>
> ...
>
> I strongly suggest the above gets fixed. HOWEVER, even after the DAG
> combiner is fixed, the issues will remain due to InstCombine doing
> essentially the same thing much earlier. Consider code like this:
>
> int test0( int a )
> {
>   return a<0 ? -1 : 0;
> }
>
> int test1( int a )
> {
>   return a<0 ? 1 : 0;
> }
>
> int test2( int a )
> {
>   return a<0 ? 2 : 0;
> }
>
> int test3( int a )
> {
>   return a<0 ? 3 : 0;
> }
>
> In all cases, InstCombine converts the above into Shifts, right into the
> IR, which the backend can’t do anything about.
>
>
> Why not? It seems like we could pattern match the shifts back into
> selects/branches?
>
>  -Hal
>
>
> ...
>
>>
>>
>
>
> _______________________________________________
> LLVM Developers mailing listllvm-dev at
lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20191005/27e0162b/attachment-0001.html>

llvm dev - Oct 2019 - [cfe-dev] CFG simplification question, and preservation of branching in the original code

[llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code

[llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code

[llvm-dev] [cfe-dev] CFG simplification question, and preservation of branching in the original code