In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.

Is LLVM IR the right level for this? The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that. The most obvious advantage of using LLVM IR for legalization is that it's (currently) more developer-friendly. The most obvious advantage of using MachineInstrs is that they would make it easier to do low-level manipulations. Also, doing legalization on MachineInstrs would mean avoiding having LLVM-IR-level optimization passes which lower the IR, which has historically been a design goal of LLVM.

The attached pass operates on LLVM IR, and it's been educational to develop it this way, but I'm ok with rewriting it in MachineInstrs if that's the consensus. Given that the code I wrote operates on LLVM IR, it raises the following interesting issues.

Because the patch works at the IR level, it expands adds on illegal integer types into llvm.uadd_with_overflow intrinsics, for example. The intrinsics available in LLVM IR today aren't as expressive as the ISD operator set in SelectionDAG, so the generated code is quite a bit more verbose in some cases. Should we instead add new intrinsics, for add and for a bunch of other things? People I've talked to so far were hesitant to add new intrinsics unless the things they express are really prohibitive to do in other ways.
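For illustration only (a rough sketch, not code taken verbatim from the patch): on a 32-bit target where i64 is illegal, once a 64-bit add's operands have been split into i32 halves, the expansion looks roughly like this:

    ; %r = add i64 %a, %b, after %a and %b have been split into
    ; i32 halves %a.lo/%a.hi and %b.lo/%b.hi:
    %lo = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %a.lo, i32 %b.lo)
    %r.lo = extractvalue { i32, i1 } %lo, 0
    %carry = extractvalue { i32, i1 } %lo, 1
    %carry.zext = zext i1 %carry to i32
    %hi.sum = add i32 %a.hi, %b.hi
    %r.hi = add i32 %hi.sum, %carry.zext

That is already several instructions for something SelectionDAG expresses with an ISD::ADDC/ISD::ADDE pair, which is the verbosity concern mentioned above.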
How should we legalize function argument and return types? Because of LLVM IR rules, one can't just change the signature of a function without changing all its call sites, so as a consequence the code I wrote is a ModulePass. This is unfortunate, since it's a goal that most of the codegen passes be FunctionPasses. Modifying the Function types may also be incompatible with the ABI coordination dance between front-ends and backends on some targets. One alternative, which is also implemented, is to leave function signatures alone and simply insert conversions to and from legal types. In this case, instruction selection would need to know how to handle illegal types in these special circumstances, but presumably it would be easy enough to special-case them. However, if this pass is followed by an optimization pass analogous to DAGCombine, it may be tricky to keep the optimization pass from creating patterns which codegen isn't prepared to handle. Another alternative, which is not implemented yet, is to have the legalization pass create new Functions, make the original Functions simply call the legalized functions, and then have a late pass clean everything up.

We may already need some amount of special-casing for things like bitfield loads and stores. To implement the C++ memory model, some bitfield loads and stores actually need to load and store a precise number of bits, even if that number of bits doesn't correspond to a legal integer (register) size on the target machine. This isn't implemented yet, but I expect this will be handled by leaving those loads and stores alone, and simply putting the burden on subsequent passes to lower them properly. An alternative is to add new truncating-store and sign/zero-extending load intrinsics.

Another complication due to using LLVM IR is the interaction with DebugInfo. If AllocaInsts for illegal types are expanded, or if values for llvm.dbg.value intrinsics are expanded, there's currently no way to describe this (DWARF can describe it, but LLVM IR can't currently). I assume this could be fixed by extending LLVM IR's DebugInfo intrinsics, but I haven't investigated it yet.

Dan

[Attachment: legalize-integers.patch, 49922 bytes]
On Apr 24, 2013, at 5:01 PM, Dan Gohman <dan433584 at gmail.com> wrote:

> In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.

This is a great discussion to have.

> Is LLVM IR the right level for this?

IMO, no, definitely not.

> The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that. The most obvious advantage of using LLVM IR for legalization is that it's (currently) more developer-friendly. The most obvious advantage of using MachineInstrs is that they would make it easier to do low-level manipulations. Also, doing legalization on MachineInstrs would mean avoiding having LLVM-IR-level optimization passes which lower the IR, which has historically been a design goal of LLVM.

I think that you (in the rest of your email) identify a number of specific problems with using LLVM IR for legalization. These are a lot of specific issues caused by the fact that LLVM IR is intentionally not trying to model machine issues. I'm sure you *could* try to make this work by introducing a bunch of new intrinsics into LLVM IR which would model the union of the SelectionDAG ISD nodes along with the target-specific X86ISD nodes. However, at this point, you have only modeled the operations and haven't modeled the proper type system.

LLVM IR is just not the right level for this. You seem to think it is better than MachineInstrs because of developer friendliness, but it isn't clear to me that LLVM IR with the additions you're talking about would actually be friendly anymore :-)

Personally, I think that the right representation for legalization is MachineInstrs supplemented with a type system that allows MVTs as well as register classes. If you are seriously interested in pushing forward on this, we should probably discuss it in person, or over beer at the next social or something.

-Chris
> Is LLVM IR the right level for this? The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that.
> [...]

Hi Dan,

Thank you for working on this. You mentioned that the two alternatives for replacing SelectionDAG are to munch on it from the top (by legalizing the IR) or from the bottom (by removing the scheduler, ISel, and finally the legalization and lowering). You also mentioned some of the disadvantages of the approach that you are proposing, and I agree with you: this approach has many disadvantages. I think that the end goal should be legalization at the MI level.

The LLVM IR is nice and compact, but it is not verbose enough to allow lowering. You mentioned ext/load and trunc/store, but this problem is much worse for vectors. For example, we lower shuffle vector to the following ISD nodes: broadcast, insert_subvector, extract_subvector, concat_vectors, permute, blend, extract/insert_element, and a few others. Representing all of these nodes in IR would be inefficient and inconvenient. Every optimization that handles these intrinsics would need to set up std::vectors, etc., and I think that the compile time for this will not be great either.
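To make the verbosity concern concrete (this is purely an illustration, not a proposal): a blend that is a single shufflevector today, spelled out element by element in plain IR the way a node-by-node expansion would have to, turns into a long chain, and dedicated per-node intrinsics would be similarly verbose to build and match:

    ; today, one instruction:
    %blend = shufflevector <4 x float> %a, <4 x float> %b, <4 x i32> <i32 0, i32 5, i32 2, i32 7>

    ; spelled out element by element:
    %e0 = extractelement <4 x float> %a, i32 0
    %e1 = extractelement <4 x float> %b, i32 1
    %e2 = extractelement <4 x float> %a, i32 2
    %e3 = extractelement <4 x float> %b, i32 3
    %t0 = insertelement <4 x float> undef, float %e0, i32 0
    %t1 = insertelement <4 x float> %t0, float %e1, i32 1
    %t2 = insertelement <4 x float> %t1, float %e2, i32 2
    %blend.expanded = insertelement <4 x float> %t2, float %e3, i32 3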
But I don't think that this is the biggest problem. How do you plan to handle constants? Do you lower them to global variables and loads? How would you implement FastISel? Do you plan on having two instruction selectors (like we have today), or do you plan to lower IR to intrinsics and select that? I think that one of the goals of getting rid of SelectionDAG is to have one instruction selector.

Thanks,
Nadav
One question: "In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework" ... is this meant in general, or just with respect to legalization?

On 04/24/2013 05:01 PM, Dan Gohman wrote:
> In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.
> [...]
On Apr 24, 2013, at 5:53 PM, Reed Kotler <rkotler at mips.com> wrote:

> One question: "In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework" ... is this meant in general, or just with respect to legalization?

Everything. This includes all of the custom lowering code for all of the targets, all of DAGCombine, and maybe all of the patterns in the TD files.
On 04/24/2013 05:26 PM, Chris Lattner wrote:
> [...]
> LLVM IR is just not the right level for this. You seem to think it is better than MachineInstrs because of developer friendliness, but it isn't clear to me that LLVM IR with the additions you're talking about would actually be friendly anymore :-)
>
> Personally, I think that the right representation for legalization is MachineInstrs supplemented with a type system that allows MVTs as well as register classes. If you are seriously interested in pushing forward on this, we should probably discuss it in person, or over beer at the next social or something.
>
> -Chris

I would really push towards doing this in LLVM IR as the next step. It's possible that what you are proposing is the right "long term" solution, but I think it's not a good evolutionary approach; it's more revolutionary. I've already thought of many things that could be very clearly and easily done in IR that are done in very convoluted ways in SelectionDAG. This kind of migration could take place right now, and as we thin out the SelectionDAG portion of things to where it is almost nonexistent, making a jump to just eliminate it and replace it would be more practical. Something like soft float, for example, is nearly trivial to do in IR.
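As a sketch of what I mean (and assuming, as soft-float ABIs typically do, that the runtime routine takes and returns the raw i32 bit pattern; the declaration shown here is an assumption for illustration, not something from the patch), an IR-level soft-float pass could rewrite an fadd into a libcall roughly like this:

    declare i32 @__addsf3(i32, i32)

    ; before:
    ;   %sum = fadd float %x, %y
    ; after:
    %x.bits = bitcast float %x to i32
    %y.bits = bitcast float %y to i32
    %sum.bits = call i32 @__addsf3(i32 %x.bits, i32 %y.bits)
    %sum = bitcast i32 %sum.bits to float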
At the risk of appearing stupid, I can say that I've really struggled to understand SelectionDAG and all its facets and interaction with TableGen patterns, and this after having done a whole port from scratch already by myself. Part of it is the lack of documentation, but also there are too many illogical things (to me) and special cases and hacks surrounding SelectionDAG and TableGen. On the other hand, I recently started to write some IR-level passes, and it was nearly trivial for me to understand how to use the IR and transform it. All the classes are more or less very clean, logical, and regular. I was writing transformation passes on the first day with no issues.

I think that LLVM IR could be extended to allow all of legalization, and many other parts of lowering, to take place at that level, i.e. lowering to an IR which has additional lower-level operations.

Reed
Hi Dan,

Others have weighed in on the merits of IR vs. MI legalization; I thought I'd chip in on a different area:

+ /// Legal roughly means there's a physical register class on the target
+ /// machine for a type, and there's a reasonable set of instructions
+ /// which operate on registers of this class and interpret their contents
+ /// as instances of the type. For convenience, Legal is also used for
+ /// types which are not legalized by this pass (vectors, floats, etc.)
+ Legal,

I don't think this is the right definition of a legal type. I know that that's how SelectionDAG currently defines it, and I think that definition is behind a lot of the difficulty in retargeting LLVM to something that doesn't look like the intersection of X86 and ARM.

I think the correct answer (credit to Chris for this description) is that a legal type is one that (more or less) corresponds to a set of physical registers, and which the target is capable of loading, storing, and copying (possibly also inserting/extracting elements, for vector types).

--Owen

On Apr 24, 2013, at 5:01 PM, Dan Gohman <dan433584 at gmail.com> wrote:
> In the spirit of the (long-term) intent to migrate away from the SelectionDAG framework, it is desirable to implement legalization passes as discrete passes. Attached is a patch which implements the beginning of a new type legalization pass, to help motivate discussion.
> [...]
Hi Dan,

On 25 Apr 2013, at 01:01, Dan Gohman <dan433584 at gmail.com> wrote:

> The main alternative approach that's been discussed is to do FastISel to a target-independent opcode set on MachineInstrs, and then do legalization and ultimately the last phase of instruction selection proper after that. The most obvious advantage of using LLVM IR for legalization is that it's (currently) more developer-friendly. The most obvious advantage of using MachineInstrs is that they would make it easier to do low-level manipulations. Also, doing legalization on MachineInstrs would mean avoiding having LLVM-IR-level optimization passes which lower the IR, which has historically been a design goal of LLVM.

The approach taken in WHIRL, which has a lot of advantages, is exactly to lower the IR. It seems strange that in the back end we have Machine* classes that correspond very closely to IR equivalents, but which don't share any code and often have subtly different interfaces. The approach taken in WHIRL is to progressively replace machine-independent bits of the IR with machine-dependent ones, with abstract instructions being replaced by machine instructions, abstract registers by machine registers, and so on.

I would be interested to know the rationale behind the design choice to avoid this, as it seems the obvious way of designing a compiler. The down side would be that you couldn't take any random pass that expected target-independent IR and run it, but you never actually want to do this once you've handed off to the codegen infrastructure anyway.

David
On Thu, Apr 25, 2013 at 4:50 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:

> The approach taken in WHIRL, which has a lot of advantages, is exactly to lower the IR. It seems strange that in the back end we have Machine* classes that correspond very closely to IR equivalents, but which don't share any code and often have subtly different interfaces. The approach taken in WHIRL is to progressively replace machine-independent bits of the IR with machine-dependent ones, with abstract instructions being replaced by machine instructions, abstract registers by machine registers, and so on.

Couldn't we first lower LLVM IR to a (mostly) target-independent sequence of MachineInstrs, and then progressively lower those? That seems to be very close to what you describe, and makes a great deal of sense to me. The MachineInstrs could model operations and types that are not legal for the current target, and passes could lower those until everything is legal and all opcodes are target-specific. Targets can still lower function arguments/returns as they please. And most of the infrastructure for this is already present (no new IR).

> I would be interested to know the rationale behind the design choice to avoid this, as it seems the obvious way of designing a compiler. The down side would be that you couldn't take any random pass that expected target-independent IR and run it, but you never actually want to do this once you've handed off to the codegen infrastructure anyway.

--
Thanks,
Justin Holewinski
On Wed, Apr 24, 2013 at 11:33 PM, Owen Anderson <resistor at mac.com> wrote:

> Hi Dan,

Hi Owen,

> Others have weighed in on the merits of IR vs. MI legalization; I thought I'd chip in on a different area:
>
> + /// Legal roughly means there's a physical register class on the target
> + /// machine for a type, and there's a reasonable set of instructions
> + /// which operate on registers of this class and interpret their contents
> + /// as instances of the type. For convenience, Legal is also used for
> + /// types which are not legalized by this pass (vectors, floats, etc.)
> + Legal,
>
> I don't think this is the right definition of a legal type. I know that that's how SelectionDAG currently defines it, and I think that definition is behind a lot of the difficulty in retargeting LLVM to something that doesn't look like the intersection of X86 and ARM.

Do you have a particular target in mind that we could discuss? Not all variances from the intersection of x86 and ARM are of the same nature; it's hard to talk in full generality here.

> I think the correct answer (credit to Chris for this description) is that a legal type is one that (more or less) corresponds to a set of physical registers, and which the target is capable of loading, storing, and copying (possibly also inserting/extracting elements, for vector types).

If the target doesn't actually have a copy for a register class which will be register-allocated, it will just need to pretend it has one at this level and lower it later somehow, otherwise a lot of other stuff won't work. I don't see why load and store are special at this level though. Or insert/extract element?

Dan
On Wed, Apr 24, 2013 at 5:26 PM, Chris Lattner <clattner at apple.com> wrote:

> I think that you (in the rest of your email) identify a number of specific problems with using LLVM IR for legalization. These are a lot of specific issues caused by the fact that LLVM IR is intentionally not trying to model machine issues. I'm sure you *could* try to make this work by introducing a bunch of new intrinsics into LLVM IR which would model the union of the SelectionDAG ISD nodes along with the target-specific X86ISD nodes. However, at this point, you have only modeled the operations and haven't modeled the proper type system.

I don't wish to argue about this, and am fine following your suggestion. However, I would like to understand your reasons better.

I don't think the type system is really the issue. The only thing SelectionDAG's type system has which LLVM IR's lacks which is useful here is "untyped", and that's a special-purpose thing that we can probably handle in other ways.

You and others are right that there could be a fair number of new intrinsics, especially considering all the X86ISD ones and all the rest. Is this a significant concern for you? Targets already have large numbers of target-specific intrinsics; would adding a relatively moderate number of new intrinsics really be a problem? There's also the problem of keeping callers and callees consistent, and it's indeed quite a dickens, but it need not be a show-stopper.

> LLVM IR is just not the right level for this. You seem to think it is better than MachineInstrs because of developer friendliness, but it isn't clear to me that LLVM IR with the additions you're talking about would actually be friendly anymore :-)

As I see it, people working in codegen are going to have to deal with lots of codegeny instructions regardless of whether we call them instructions or intrinsics. Is it really better one way or the other?

> Personally, I think that the right representation for legalization is MachineInstrs supplemented with a type system that allows MVTs as well as register classes. If you are seriously interested in pushing forward on this, we should probably discuss it in person, or over beer at the next social or something.

Ok.

Dan
On Apr 25, 2013, at 1:50 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:

> The approach taken in WHIRL, which has a lot of advantages, is exactly to lower the IR. It seems strange that in the back end we have Machine* classes that correspond very closely to IR equivalents, but which don't share any code and often have subtly different interfaces. The approach taken in WHIRL is to progressively replace machine-independent bits of the IR with machine-dependent ones, with abstract instructions being replaced by machine instructions, abstract registers by machine registers, and so on.
>
> I would be interested to know the rationale behind the design choice to avoid this, as it seems the obvious way of designing a compiler. The down side would be that you couldn't take any random pass that expected target-independent IR and run it, but you never actually want to do this once you've handed off to the codegen infrastructure anyway.

There definitely are strong advantages to using one data structure to represent multiple levels of IR: you have less code in the compiler, more shared concepts, etc. I have seen and worked with several compilers that tried to do this. Even GCC does this (in the opposite direction) with "tree-ssa", which repurposes some front-end data structures for their mid-level IR.

While there are advantages, it also means that you get fewer invariants, and that the data structures are a worse fit for each level. To give you one simple example: LLVM IR is simplified greatly based on the assumption that it is always in SSA and that each instruction produces one result value, and exceptions to that rule (like some intrinsics) can easily be modeled with extractvalue operations. This doesn't work for MachineInstrs, which have the following additional complexity:

- Not everything is in SSA; you have to model physical registers, even very early.
- Lots of things return N values, and extract-value doesn't work.

I consider it unacceptable to project complexity from MachineInstrs into LLVM IR. There are wins, but there are also unacceptably high costs. Some of those include:

- LLVM IR is our stable IR format; MachineInstr is not. The latter *needs* to evolve rapidly, where the former has settled down (mostly).
- The reasons people like to work with LLVM IR are often directly because of the simplifications we get from having a simple model. Jeopardizing stability in the IR and making LLVM IR worse to work with is not acceptable to me.

-Chris