thr3ads.net - llvm dev - [llvm-dev] [AVR] [MSP430] Code gen improvements for 8 bit and 16 bit targets [Nov 2019]

Joan Lluch via llvm-dev

2019-Oct-07 22:22 UTC

[llvm-dev] [AVR] [MSP430] Code gen improvements for 8 bit and 16 bit targets

Hi All,

While implementing a custom 16 bit target for academic and demonstration
purposes, I unexpectedly found that LLVM was not really ready for 8 bit and 16
bit targets. Let me explain why.

Target backends can be divided into two major categories, with essentially
nothing in between:

Type 1: The big 32 or 64 bit targets. Heavily pipelined with expensive branches,
running at clock frequencies up to the GHz range. Aimed at workstations,
computers or smartphones. For example PowerPC, x86 and ARM.

Type 2: The 8 or 16 bit targets. Non-pipelined processors, running at
frequencies in the MHz range, generally with fast access to memory, aimed at the
embedded market or low-power applications (they are virtually everywhere).
LLVM currently implements an experimental AVR target and the MSP430.

LLVM does a great job for Type 1 targets, but it can be improved for Type 2 targets.

The essential target feature that makes one way of code generation better for
either type 1 or type 2 targets is pipelining. For type 1 we want branching to
be avoided as much as possible. Turning branching code into sequential
instructions with the execution of speculative code is advantageous. These
targets have instruction sets that help with that goal, in particular cheap
‘shifts’ and ‘cmove’ type instructions.

Type 2 targets, on the contrary, have cheap branching. Their instruction set is
not particularly designed to assist branching avoidance because that’s not
required. In fact, branching on these targets is often desirable, as opposed to
transforms creating expensive speculative execution. ‘Shifts’ move only a
single bit per instruction, and conditional execution instructions other than branches are
not available.
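
As a rough illustration of the trade-off (a sketch, not code taken from the bug reports), the two functions below return the same value, one through a compare and select, the other through a shift and mask. The function names are made up for this example. The shift form assumes that right-shifting a negative value is an arithmetic shift, which holds on mainstream compilers and is guaranteed from C++20 on.

```cpp
#include <cstdint>

// Branching form: returns a when x is negative, otherwise 0.
// On a non-pipelined core this is typically one compare and one short jump.
constexpr std::int16_t select_if_negative(std::int16_t x, std::int16_t a) {
    return (x < 0) ? a : static_cast<std::int16_t>(0);
}

// Branch-free form: (x >> 15) is all ones when x is negative and zero
// otherwise, so the AND keeps or clears a.
// Assumption: >> on a negative value is an arithmetic shift.
// On a target with only single-bit shifts, a naive lowering of the shift
// by 15 needs up to fifteen shift instructions.
constexpr std::int16_t mask_if_negative(std::int16_t x, std::int16_t a) {
    return static_cast<std::int16_t>(a & (x >> 15));
}
```

Both forms are equivalent in value; which one is cheaper depends entirely on whether the target has cheap multi-bit shifts or cheap branches.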

The current situation is that some LLVM target-independent optimisations are not
really that ‘independent’ when we bring type 2 targets into the mix.
Unfortunately, LLVM was apparently designed with type 1 targets in mind alone,
which causes degraded code for type 2 targets. In relation to this, I posted a
couple of bug reports that show some of these issues:

https://bugs.llvm.org/show_bug.cgi?id=43542
https://bugs.llvm.org/show_bug.cgi?id=43559

The first bug has already been fixed by somebody who also suggested that I raise
this subject on the llvm-dev mailing list, which I’m doing now.

Incidentally, most code degradations happen in the DAGCombine code. It’s a bug
because LLVM may transform code into instructions that are not Legal for some
targets. Such transforms are detrimental on those targets. This bug won't
show for most targets, but it particularly affects targets with
no native shift support. The bug consists of transforming already
relatively cheap code into expensive code. The fix prevents that.

Still, although the above DAGCombine code gets fixed, the poor code generation
issue will REMAIN. In fact, the same kind of transformations are performed
earlier as part of the IR optimisations, in the InstCombine pass. The result is
that the IR /already/ incorporates the undesirable transformations for type 2
targets, which DAGCombine can't do anything about.

At this point, reverse pattern matching looks like the obvious solution, but I
think it’s not the right one because it would need to be implemented on every
single current or future (type 2) target. It is also difficult to get rid of
undesired transforms when they carry complexity, or are the result of
consecutive combinations. Delegating the whole solution to reverse pattern
matching code alone will just perpetuate the overall problem, which will continue
affecting future target developments. Some reverse pattern matching is
acceptable and desirable to deal with very specific target features, but not as
a global solution to this problem.

In a previous email, a statement was posted that in recent years attempts have
been made to remove code from InstCombine and port it to DAGCombiner. I agree
that this is a good thing to do, but it was reportedly difficult and associated
with potential problems or unanticipated regressions. I understand those
concerns and I acknowledge the involved work as challenging. However, in order
to solve the presented problem, some work is still required in InstCombine.

Therefore, I wondered if something in between could still be done, so this is my
proposal: There are already many command line compiler options that modify IR
output in several ways. Some options are even target dependent, and some targets
even explicitly set them (in RenderTargetOptions). The InstCombine code
itself has its own small set of options, for example
"instcombine-maxarray-size" or "instcombine-code-sinking". Command
line compiler options produce functionally equivalent IR output, while
respecting established canonicalizations. In all cases, the output is just valid
IR code in a proper form that depends on the selected options. As an example, -O0
produces a very different output than -O3 or -Os, yet all of them are valid as the
input to any target backend. My suggestion would be to incorporate a compiler
option acting on the InstCombine pass. The option would improve IR code aimed at
Type 2 targets. Of course, this option would not be enabled by default so the IR
output would remain exactly as it is today if not explicitly enabled.

What this option would need to do in practice is really easy and
straightforward: just bypass (avoid) certain transformations that might be
considered harmful for targets benefiting from it. I performed some simple
tests, specifically directed at the InstCombineSelect transformations, and I found
them to work well and to generate greatly improved code for both the MSP430 and
AVR targets.
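
To make the idea concrete, here is a toy model of an option that bypasses one transformation. It is standalone C++, not InstCombine code; `Form`, `CombineOptions`, `bypassShiftTransforms` and `combineSelect` are hypothetical names invented for this sketch.

```cpp
// Two equivalent shapes of the same computation.
enum class Form { SelectOnSign, ShiftAndMask };

// Hypothetical stand-in for a boolean command line option.
struct CombineOptions {
    bool bypassShiftTransforms;
};

// Models a single combine step: the select form is rewritten into the
// shift form unless the option asks for the rewrite to be skipped.
// Input that is already in shift form passes through unchanged.
constexpr Form combineSelect(Form in, CombineOptions opts) {
    return (in == Form::SelectOnSign && !opts.bypassShiftTransforms)
               ? Form::ShiftAndMask
               : in;
}
```

With the option off the model behaves as it does today; with it on, the select form survives to the backend. Note that in this model the option only stops new shifts from being created and does nothing about shifts already present in the input.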

Now, I am aware that this proposal might come as a bit unexpected and even be
regarded as inelegant or undesirable, but maybe after some careful balancing of
pros and cons, it is just what we need to do, if we really care about LLVM as a
viable platform for 8 and 16 bit targets. As stated earlier, it’s easy to
implement, it’s just an optional compiler setting not affecting major targets at
all, and its future extent can be gradually defined or agreed upon as it is put
into operation. Any views would be appreciated.

John.

Paul Robinson via llvm-dev

2019-Oct-08 14:43 UTC

> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Joan Lluch
> via llvm-dev
> Sent: Monday, October 07, 2019 6:22 PM
> To: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [llvm-dev] [AVR] [MSP430] Code gen improvements for 8 bit and 16
> bit targets
> 
> Therefore, I wondered if something in between could still be done, so this
> is my proposal: There are already many command line compiler options that
> modify IR output in several ways. Some options are even target dependent,
> and some targets even explicitly set them (In RenderTargetOptions). The
> InstCombine code itself has its own small set of options, for example
> "instcombine-maxarray-size" or "instcombine-code-sinking". Command line
> compiler options produce functionally equivalent IR output, while
> respecting established canonicalizations. In all cases, the output is just
> valid IR code in a proper form that depends on the selected options. As an
> example -O0 produces a very different output than -O3, or -Os, all of them
> are valid as the input to any target backend. My suggestion would be to
> incorporate a compiler option acting on the InstCombine pass. The option
> would improve IR code aimed at Type 2 targets. Of course, this option
> would not be enabled by default so the IR output would remain exactly as
> it is today if not explicitly enabled.

An option is certainly one way to get this effect; another would be to
add some sort of target-specific query, which would drive the same choices
in the IR transforms.  TargetTransformInfo appears to be full of these
sorts of queries.
--paulr
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Joan Lluch via llvm-dev

2019-Oct-09 06:33 UTC

Hi Paul,

TargetTransformInfo hooks are fine, as are the TargetLowering ones, to customise
backend code. They would certainly add flexibility compared with relying on
instruction Legality alone, and I would up-vote them along with the addition of
the missing legality checks in DAGCombine.  However, we shouldn’t apply any
target specific code to the frontend optimisations, because frontend code is
supposed to be mostly target-independent, and strong dependence on targets is
not desirable. This is why I proposed it the way I did.

John

> On 8 Oct 2019, at 16:43, paul.robinson at sony.com wrote:
> 
> An option is certainly one way to get this effect; another would be to
> add some sort of target-specific query, which would drive the same choices
> in the IR transforms.  TargetTransformInfo appears to be full of these
> sorts of queries.
> --paulr

Joan Lluch via llvm-dev

2019-Nov-13 09:25 UTC

Hi All,

In relation to the subject of this message, my first round of patches has been
successfully reviewed and committed. For reference, they are the
following:

https://reviews.llvm.org/D69116
https://reviews.llvm.org/D69120
https://reviews.llvm.org/D69326
https://reviews.llvm.org/D70042 

They provided hooks in TargetLowering and DAGCombine that enable interested
targets to implement a filter for expensive shift operations. The patches work
by preventing certain transformations that would result in expensive code for
these targets.
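
A minimal model of how such a hook can gate a combine, written as standalone C++ rather than against the LLVM API. `TargetModel`, `avoidShift` and `combineSelectOnSign` are hypothetical names for this sketch and do not appear in the patches; the real hooks live in TargetLowering and DAGCombine as described above.

```cpp
// Two equivalent shapes of the same computation.
enum class Form { SelectOnSign, ShiftAndMask };

// Hypothetical description of a target.
struct TargetModel {
    bool hasMultiBitShift;  // false when the target shifts one bit per instruction
};

// Stand-in for a target hook: should the combiner avoid creating a
// shift by `amount` bits on this target?
constexpr bool avoidShift(TargetModel t, unsigned amount) {
    return !t.hasMultiBitShift && amount > 1;
}

// The combine wants to turn a select on the sign bit into a shift by
// (bitWidth - 1). It asks the hook first and keeps the select form
// when the target reports that the shift is expensive.
constexpr Form combineSelectOnSign(TargetModel t, unsigned bitWidth) {
    return avoidShift(t, bitWidth - 1) ? Form::SelectOnSign
                                       : Form::ShiftAndMask;
}
```

In this model a target without multi-bit shifts keeps the select form for a 16 bit value, while a target with cheap shifts still gets the shift form, so other targets see no change.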

I want to express my gratitude to the LLVM community and particularly to members
@spatel and @asl who have directly followed, helped with, and reviewed these
patches.

This is half of what’s required to get the full benefits. As I explained before,
in order to get this fully functional, we need to do some work on InstCombine.
This is because some of the transformations that we want to avoid are created
earlier in InstCombine, thus defeating the patches above.

My general proposal when I started this (see my first message in this thread) was to
implement a command line option that would act on InstCombine by bypassing
(preventing) certain transformations. I still think that this is the easier and
safer way to obtain the desired goals, but I want to submit it to the
consideration of the community again to make sure I am on the right track.

My current concrete proposal is to add a command line option (boolean) that I
would name “enable-shift-relaxation” or just “relax-shifts”. This option would
act in several places in InstCombineCasts and in InstCombineSelect with the
described effects.

I also need to ask about the best way to present test cases for that. I learned
how to create test files for codegen transforms (IR to assembly), but now I will
be working on the “target independent” side. For my internal work, I have
manually been testing C-code to IR generation, but I do not know how to create
proper test cases for the LLVM project. Any help on this would be appreciated.

Thanks in advance

John

Roman Lebedev via llvm-dev

2019-Nov-13 09:39 UTC

On Wed, Nov 13, 2019 at 12:26 PM Joan Lluch via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> Hi All,
>
> My current concrete proposal is to add a command line option (boolean) that
> I would name “enable-shift-relaxation” or just “relax-shifts”. This option
> would act in several places in InstCombineCasts and in InstCombineSelect with
> the described effects.

I'm not really sold on this part, for the reasons previously discussed.

This is only going to avoid creating such shifts, in passes that will
be adjusted.
This will not completely ban such shifts, meaning they still can exist.
Which means this will only partially prevent 'degrading' existing IR.
What about the ones that were already present in the original input
(from C code, e.g.)?

I think you just want to add an inverse set of DAGCombine transforms,
also guarded with that target hook you added. That way there's no chance
to still end up with unfavorable shifts on your target, and no middle-end
impact from having more than one canonical representation.
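
The suggestion above can be pictured with a small standalone model (not LLVM code; `Form`, `TargetModel` and the function names are hypothetical). A forward combine creates the shift form when shifts are cheap, and an inverse combine, guarded by the same target query, undoes the shift form when shifts are expensive, whatever produced it.

```cpp
// Two equivalent shapes of the same computation.
enum class Form { SelectOnSign, ShiftAndMask };

// Hypothetical answer of the target hook.
struct TargetModel {
    bool shiftsAreExpensive;
};

// Forward direction: select form becomes shift form on targets
// where shifts are cheap.
constexpr Form forwardCombine(Form in, TargetModel t) {
    return (in == Form::SelectOnSign && !t.shiftsAreExpensive)
               ? Form::ShiftAndMask
               : in;
}

// Inverse direction: shift form becomes select form on targets where
// shifts are expensive, including shifts that were already present in
// the input IR or written by hand in the source.
constexpr Form inverseCombine(Form in, TargetModel t) {
    return (in == Form::ShiftAndMask && t.shiftsAreExpensive)
               ? Form::SelectOnSign
               : in;
}

// Both directions applied in sequence, as a backend combiner might.
constexpr Form combine(Form in, TargetModel t) {
    return inverseCombine(forwardCombine(in, t), t);
}
```

In this model a shift-averse target never ends up with the shift form regardless of the input, which is the property an option that only suppresses the forward direction cannot provide.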
>
> I also need to ask about the best way to present test cases for that. I
> learned how to create test files for codegen transforms (IR to Assembly), but
> now I will be working on the “target Independent” side. For my internal work, I
> have manually been testing C-code to IR generation, but I do not know how to
> create proper test cases for the llvm project. Any help on this would be
> appreciated.
>
> Thanks in advance
>
> John

Roman
