thr3ads.net - llvm dev - [llvm-dev] The Trouble with Triples [Sep 2015]

Daniel Sanders via llvm-dev

2015-Sep-17 13:21 UTC

[llvm-dev] The Trouble with Triples

I think we need to take a step further back and re-enter from the right starting
point. The thing that's bothering me about the push back so far is that
it's trying to discuss and understand the consequences of resolving the core
problem while seemingly ignoring the core problem itself. The reason I've
been steering everything back to GNU Triples being ambiguous and
inconsistent is because it's the root of all the problems and the fixes to
the various issues fall out naturally once this core point has been addressed.

Here's the line of thought that I'd like people to start with:

* Triples don't describe the target. They look like they should, but they
  don't. They're really just arbitrary strings.

* LLVM relies on Triple as a description of the target. It defines the
  backend to use, the binary format to use, OS and Vendor specific quirks to
  enable/disable, the default CPU, the default ABI, the endianness, and
  countless other details about the target.

* If LLVM is built on top of an incorrect concept we should fix that, but we
  can't abandon Triples at the user level since every toolchain uses them.

* But we can't keep using Triples inappropriately either. If the
  information feeding into LLVM is faulty then the resulting behaviour will
  be faulty too.

* So let's start with a Triple, and convert it to a not-broken equivalent as
  early as possible. We'll call it TargetTuple.

Are there any disagreements on this part of the thinking? If there are, then we
should resolve these before proceeding to the rest since everything else depends
on accepting that this core problem exists and can be fixed in this way.

If we have agreement on this, then I think that this by itself is ample reason
for phases 1-4 and 6 of the plan. The justification for the IR serialization in
phase 5 is simply that we need to deliver the Triple/TargetTuple to LTO for it
to operate correctly and we currently do this by serializing Triple in the IR.
If Triple has been replaced by TargetTuple then TargetTuple must be serializable
in the IR somehow.

Hopefully, we are agreed so far. Let's assume for the rest of this
explanation that Phases 1-6 are complete and we now have const TargetTuple
throughout the API. I'd like to draw particular attention to TargetMachine
which, like everything else, has had its Triple member (called TargetTriple)
replaced with a TargetTuple member (named TheTargetTuple). This member is used
in all the same ways it used to be used when it was a Triple (named
TargetTriple).

At this point, in the MC layer we have a number of classes that need to know the
ABI but lack this information. Our TargetMachine has an accurate TargetTuple
object that describes the invariants of the desired target. The desired ABI is
an invariant too so why not have it in the TargetTuple which is already plumbed
in everywhere we need it? After all, it's a property of the target
OS/Environment. If we have the ABI in the TargetTuple, then we don't need
any other means to set the ABI: tools can set it up front in the TargetTuple and
we don't need any command-line option handling for it in the backend.

Meanwhile, in clang we have a number of command line options that change the
desired target. Let's say we've constructed a Triple and resolved it to
TargetTuple (more on that below). We're now processing the -EL option. At
the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
triple, construct a Triple object from it and resolve the new Triple to a
TargetTuple. But why do we need to bother with that kind of weird hackery when
we can simply do Obj.setEndian(Little)? This is what Phase 7 of the plan is
about. We end up with a cleaner way to process target changes that, until now,
have required weird triple hacking to handle.
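
As a rough sketch of the difference, the two ways of handling -EL might look
like this. None of this is LLVM code: TargetTuple, Endian, flipArchToLittle and
applyLittleEndianOption are hypothetical names for illustration, with only the
setEndian(Little) call taken from the description above, and the string rewrite
assumes only the mips/mipsel and mips64/mips64el spellings.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; these are not LLVM's classes.
enum class Endian { Big, Little };

struct TargetTuple {
  std::string Arch;   // e.g. "mips"; endianness is not encoded in the name
  Endian Endianness;  // held as explicit data instead
  void setEndian(Endian E) { Endianness = E; }
};

// Current style: handle -EL by rewriting the arch component of the triple
// string. The caller must then re-parse and re-resolve the whole triple.
std::string flipArchToLittle(const std::string &TripleStr) {
  std::string::size_type Dash = TripleStr.find('-');
  std::string Arch = TripleStr.substr(0, Dash);
  std::string Rest =
      Dash == std::string::npos ? std::string() : TripleStr.substr(Dash);
  if (Arch == "mips")
    Arch = "mipsel";
  else if (Arch == "mips64")
    Arch = "mips64el";
  return Arch + Rest;
}

// Proposed style: handle -EL by changing the one field that is affected.
void applyLittleEndianOption(TargetTuple &Obj) {
  Obj.setEndian(Endian::Little);
}
```

With the string approach, the option handling has to know how endianness is
spelled in every architecture name. With the setter, that knowledge stays out
of the option handling entirely.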

I skipped the Triple -> TargetTuple resolution a moment ago and I should
address that now. We already know that mapping Triple to TargetTuple is a
many-to-many mapping. One Triple has many possible TargetTuples depending on the
environment. One TargetTuple can be formed from multiple possible Triples. In an
ideal world, we'd like to bake in all of these mappings so that one clang
binary supports everything. Unfortunately, being a many-to-many mapping, some of
these mappings are mutually exclusive. Note that this isn't a new problem
resulting from this project. The problem has always been there but has been
ignored until now. To resolve this, we need to provide configure-time and
possibly run-time controls for how this conversion is disambiguated. This
resolution is performed as early as possible so that the middle/back-ends
don't need to know anything about the ambiguity problem.
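
To make the many-to-many point concrete, here is a minimal sketch of an early
resolution step driven by a configure-time policy. Everything in it is an
assumption for illustration: TargetTuple, ResolverConfig, resolveTriple and the
two example configurations are hypothetical, and the ABI and CPU values in the
tables are made up rather than a statement of what any real toolchain defaults
to.

```cpp
#include <map>
#include <string>

// Hypothetical illustration types; not part of LLVM.
struct TargetTuple {
  std::string Arch;
  std::string ABI;
  std::string CPU;
};

// Disambiguation policy chosen at configure time (or overridden at run time):
// for each triple string, the tuple this particular toolchain wants.
struct ResolverConfig {
  std::map<std::string, TargetTuple> Defaults;
};

// Resolve a triple once, as early as possible. Code downstream sees only the
// unambiguous tuple. Returns false if the policy has no entry for the triple.
bool resolveTriple(const std::string &TripleStr, const ResolverConfig &Config,
                   TargetTuple &Out) {
  auto It = Config.Defaults.find(TripleStr);
  if (It == Config.Defaults.end())
    return false;
  Out = It->second;
  return true;
}

// Toolchain A: two spellings of the triple resolve to the same tuple.
ResolverConfig makeConfigA() {
  ResolverConfig C;
  C.Defaults["mips64-linux-gnu"] = {"mips64", "n64", "mips64r2"};
  C.Defaults["mips64-unknown-linux-gnu"] = {"mips64", "n64", "mips64r2"};
  return C;
}

// Toolchain B: the same triple string resolves to a different tuple.
ResolverConfig makeConfigB() {
  ResolverConfig C;
  C.Defaults["mips64-linux-gnu"] = {"mips64", "n32", "mips64r2"};
  return C;
}
```

The two configurations cannot both be baked into one binary as the answer for
"mips64-linux-gnu", which is why the choice has to be an input to the
resolution rather than something the backend guesses.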

---

To reply more directly to your email:

> What can't be done to TargetMachine to avoid this serialization?

TargetMachine already has the serialization (see TargetMachine::TargetTriple).
We're not doing anything new here. We're simply replacing one object
holding faulty information with a new object holding reliable information.

> And a followup question: What can't be serialized at the function level
> in the IR to make certain things clear that aren't global? We already do
> this for a lot of command line options.

The data I want to fix is global. I think the bit you may be getting hung up on
here is that small portions of this global data can also be overridden at the
function level. Those overrides aren't a problem and continue to operate in
the same way as they do today.

> And one more: What global options do we need to consider here?

I'm not certain I understand this question. If you're talking about command
line options, it's things like -EL, -EB, -mips32, -mips32r[2356], -mips64,
-mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
mappings, there's quite a wide variety but the main ones for Mips are
endianness, architecture, default CPU, and default ABI.

> The goal of the configuration level of the TargetMachine is that it
> controls things that don't change at the object level.
> This is a fairly recently stated goal, but I think it makes sense for LLVM
> in general. TargetSubtargetInfo takes care of
> everything that resides under this (as much as possible, some bits are
> still in transition, e.g. TargetOptions). This is part
> of my suggestion to Daniel about the problems with MCSubtargetInfo and the
> assembler. Targets like Mips and ARM
> were unfortunately designed to change things on the fly during assembly and
> need to collate or at least change defaults
> as we're processing code. I definitely had to deal with a lot of the
> pain you're talking about when I was rewriting some
> of the handling there during the TargetSubtargetInfo work.

I generally agree with this. The key bit I need to draw attention to is that the
'defaults' don't change, but are instead overridden. These constant
defaults are stored in TargetMachine and particularly
TargetMachine::TargetTriple. These defaults are wrong for some toolchains since
the information stored in TargetMachine::TargetTriple is wrong. It's the
defaults I'm trying to fix rather than the overrides.
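
The relationship between defaults and overrides can be sketched as below. This
is only an illustration under assumed names: TargetDefaults, FunctionOverrides,
SubtargetState and makeSubtargetState are hypothetical and do not correspond to
the real TargetMachine or TargetSubtargetInfo interfaces.

```cpp
#include <string>

// Hypothetical illustration types; not LLVM's classes.
struct TargetDefaults {   // constant for the whole module
  std::string CPU;
  std::string ABI;
};

struct FunctionOverrides {  // per-function settings
  std::string CPU;          // empty means "not overridden"
};

struct SubtargetState {
  std::string CPU;
  std::string ABI;
};

// Overridable state is seeded from the defaults. An override replaces a value
// for one function but never changes the defaults themselves, so anything
// that is not overridden is only as correct as the defaults are.
SubtargetState makeSubtargetState(const TargetDefaults &Defaults,
                                  const FunctionOverrides &Overrides) {
  SubtargetState S{Defaults.CPU, Defaults.ABI};
  if (!Overrides.CPU.empty())
    S.CPU = Overrides.CPU;
  return S;
}
```

In this sketch a function that overrides the CPU still inherits the default
ABI, so a wrong default ABI stays wrong no matter how the override mechanism
is structured.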

I think I understand your proposed plan now and it's a few steps ahead of
where we are and where we need to be. I agree that overridable state should be
in TargetSubtargetInfo, however I can't initialize that state without the
default values which come from the faulty information in
TargetMachine::TargetTriple. This triple work is a prerequisite to your plan
and at first I don't need to override ABIs.

> Right now I see TargetTuple as trying to take over all of the various
> arguments to TargetMachine and encapsulate them into a single thing.
> I also don't see this as bad, but I also don't see it taking all of
> them right now and I'm not sure how it solves some of the existing problems
> with data sharing that we've got which is where the push back
> you're both getting is coming from here. Ultimately library-wise I can agree
> with some of the directions you're headed - I just don't see the
> unification and interactions right now.

I think we'll end up with TargetTuple taking over many arguments to
TargetMachine but that's not my goal at this stage. My goal is simply to fix
the faulty information currently held in Triple and use the now-accurate
information in TargetTuple to fix various blocking issues that prevent a proper
Mips toolchain product based on Clang/LLVM. At the end of Phase 7, it becomes
possible to fix a number of issues that are impossible to fix right now because
the available data we can consult at the moment is incorrect.

From: Eric Christopher [mailto:echristo at gmail.com]
Sent: 16 September 2015 23:52
To: Renato Golin; Jim Grosbach
Cc: Daniel Sanders; llvm-dev at lists.llvm.org
Subject: Re: The Trouble with Triples

Let's take a step back here.

It appears that you and Daniel are trying to solve some problems. I think
solving problems is good, I just want to make sure that we're solving them
in a way that gets us a decent API at the end. I also want to make sure
we're solving the right problems.

TargetTuple appears to be related to the TargetParser as you bring up in this
mail. They're two separate parts of similar problems - people trying to both
serialize command line options and communication from the front end to the
backend with respect to target information.

This leads me to a question: What can't be done to TargetMachine to avoid
this serialization?
And a followup question: What can't be serialized at the function level in
the IR to make certain things clear that aren't global? We already do this
for a lot of command line options.
And one more: What global options do we need to consider here?

The goal of the configuration level of the TargetMachine is that it controls
things that don't change at the object level. This is a fairly recently
stated goal, but I think it makes sense for LLVM in general. TargetSubtargetInfo
takes care of everything that resides under this (as much as possible, some bits
are still in transition, e.g. TargetOptions). This is part of my suggestion to
Daniel about the problems with MCSubtargetInfo and the assembler. Targets like
Mips and ARM were unfortunately designed to change things on the fly during
assembly and need to collate or at least change defaults as we're processing
code. I definitely had to deal with a lot of the pain you're talking about
when I was rewriting some of the handling there during the TargetSubtargetInfo
work.

Now a bit more on TargetParser + TargetTuple:

TargetParser appears to be trying to solve the parsing in Triple in a nice way
for ARM and also some of the "what kind of subtarget feature
canonicalization can we do in llvm that makes sense to communicate to the front
end". I like this particular idea and have often wanted a library of
feature handling, but it seems to have stabilized at an ARM specific set of code
with no defined interface. I can't even figure out how I'd use it in
lib/Basic right now for any target other than ARM. This isn't a condemnation
of TargetParser, but I think it's something that needs to be thought through
a bit more. It's been hooked up well before I'd expected it to be and right
now if we moved it to the ARM backend from Support it'd make just as much
sense as it does where it is now other than making clang depend on the ARM
backend as well as the X86 backend :)

Right now I see TargetTuple as trying to take over all of the various arguments
to TargetMachine and encapsulate them into a single thing. I also don't see
this as bad, but I also don't see it taking all of them right now and
I'm not sure how it solves some of the existing problems with data sharing
that we've got which is where the push back you're both getting is
coming from here. Ultimately library-wise I can agree with some of the
directions you're headed - I just don't see the unification and
interactions right now.

As a suggested way forward here, let's see if we can get my questions
above answered and also show some of how the interactions between llvm's
libraries are going to get fixed, moved to a better place, etc here.

Thanks!

-eric

On Wed, Sep 16, 2015 at 3:02 PM Renato Golin <renato.golin at linaro.org> wrote:
On 16 September 2015 at 21:56, Jim Grosbach <grosbach at apple.com> wrote:
> Why do we care about GAS? We have an assembler.

It's not that simple.

There is a lot of old code out there, including the Linux kernel,
which we do care a lot about, that only compiles with GAS. We're slowly
moving the legacy code up to modern standards, and specifically some
kernel folks are happy to move up not only the asm syntax, but the C
standard and move away from GNU-specific behaviour. But we're not
quite there yet, and might not be for a few more years. So, yes, we
still care about GAS.

But this is not just about GAS.

As I said in my previous email, this is about clearing the bloat in
target descriptions by both: removing the need for adding numerous CPU
names, target features, architecture names (xscale, strongarm, etc),
AND making sure all parties (front/middle/back-ends) speak the same
language, produced from the same source.

The TargetTuple is that common language, and the TargetParser created
from the TableGen files is the common source. The Triple becomes a
legacy constructor value for the Tuple. All other target information
classes are already (or should be) generated from the TableGen files,
so the ultimate source becomes the TableGen description, which I think
is what you were aiming at in your comment.

For simple architectures, like x86, you don't even need a
TargetParser. You can easily construct the Tuple from a triple and use
the Tuple as you've always used the triple. No harm done. But for the
complex ones like ARM and MIPS, having a common interface generated
from the same place the other interfaces are is important to avoid
more bridges between front and middle and back end interpretations of
the same target. Whatever legacy ARM or MIPS carry can be isolated in
their own implementation, leaving the rest of the targets with a clean
and simple interface.

cheers,
--renato

Renato Golin via llvm-dev

2015-Sep-17 13:31 UTC

On 17 September 2015 at 14:21, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
>> What can't be done to TargetMachine to avoid this serialization?
>
> TargetMachine already has the serialization (see
> TargetMachine::TargetTriple). We're not doing anything new here. We're
> simply replacing one object holding faulty information with a new object
> holding reliable information.

I'd like to point out that we can't *replace* the triple
serialization, or we'd have a serious backward compatibility issue. We
can add new stuff, either to the triple field or to a new field,
however, without serious problems.

What this new information will look like, I don't know. But we have to
keep compatibility with triples in IR for *at least* a few major
releases.

cheers,
--renato

Daniel Sanders via llvm-dev

2015-Sep-17 13:39 UTC

> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: 17 September 2015 14:32
> To: Daniel Sanders
> Cc: Eric Christopher; Jim Grosbach; llvm-dev at lists.llvm.org
> Subject: Re: The Trouble with Triples
> 
> On 17 September 2015 at 14:21, Daniel Sanders
> <Daniel.Sanders at imgtec.com> wrote:
> >> What can't be done to TargetMachine to avoid this serialization?
> >
> > TargetMachine already has the serialization (see
> > TargetMachine::TargetTriple). We're not doing anything new here. We're
> > simply replacing one object holding faulty information with a new object
> > holding reliable information.
> 
> I'd like to point out that we can't *replace* the triple
> serialization, or we'd have a serious backward compatibility issue. We
> can add new stuff, either to the triple field or to a new field,
> however, without serious problems.
> 
> What this new information will look like, I don't know. But we have to
> keep compatibility with triples in IR for *at least* a few major
> releases.
> 
> cheers,
> --renato

To clarify, I'm not talking about format in IR files here, just the
instantiated object held by TargetMachine. We've been talking about it as a
serialization of the target description so I stuck to the same terms.
TargetTuple has a Triple inside it at the moment and will probably have it for a
little while yet, even after making the predicates consult TargetTuple's own
data.

The serialization you're referring to is covered at:

> The justification for the IR serialization in phase 5 is simply that we
> need to deliver the Triple/TargetTuple to LTO for it to operate correctly
> and we currently do this by serializing Triple in the IR.

Daniel Sanders via llvm-dev

2015-Sep-22 13:06 UTC

The thread has gone quiet for a few days and I need to be making progress
towards a gcc-compatible toolchain (e.g. a mips-mti-linux-gnu toolchain that can
target MIPS32/MIPS64 and later for all appropriate ABIs and both endians),
so I need to chase this a bit earlier than I normally would.

> Here's the line of thought that I'd like people to start with:
> * Triples don't describe the target. They look like they should, but
>   they don't. They're really just arbitrary strings.
> * LLVM relies on Triple as a description of the target. It defines the
>   backend to use, the binary format to use, OS and Vendor specific quirks to
>   enable/disable, the default CPU, the default ABI, the endianness, and
>   countless other details about the target.
> * If LLVM is built on top of an incorrect concept we should fix that, but we
>   can't abandon Triples at the user level since every toolchain uses them.
> * But we can't keep using Triples inappropriately either. If the
>   information feeding into LLVM is faulty then the resulting behaviour will
>   be faulty too.
> * So let's start with a Triple, and convert it to a not-broken
>   equivalent as early as possible. We'll call it TargetTuple.
> Are there any disagreements on this part of the thinking?
> If we have agreement on this, then I think that this by itself is ample
> reason for phases 1-4 and 6 of the plan.
> The justification for the IR serialization in phase 5 is simply that we
> need to deliver the Triple/TargetTuple to LTO for it to operate correctly
> and we currently do this by serializing Triple in the IR. If Triple has
> been replaced by TargetTuple then TargetTuple must be serializable in the
> IR somehow.

Are we agreed on this much? If so, I think we should go ahead with this part of
the work and judge each follow-on task independently on its own merits.

From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Daniel
Sanders via llvm-dev
Sent: 17 September 2015 14:21
To: Eric Christopher; Renato Golin; Jim Grosbach
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] The Trouble with Triples

I think we need to take a step further back and re-enter from the right starting
point. The thing that's bothering me about the push back so far is that
it's trying to discuss and understand the consequences of resolving the core
problem while seemingly ignoring the core problem itself. The reason I've
been steering everything back to GNU Triple's being ambiguous and
inconsistent is because it's the root of all the problems and the fixes to
the various issues fall out naturally once this core point has been addressed.

Here's the line of thought that I'd like people to start with:

·         Triples don't describe the target. They look like they should, but
they don't. They're really just arbitrary strings.

·         LLVM relies on Triple as a description of the target. It defines the
backend to use, the binary format to use, OS and Vendor specific quirks to
enable/disable, the default CPU, the default ABI, the endian, and countless
other details about the target.

·         If LLVM is built on top of an incorrect concept we should fix that but
we can't abandon Triple's at the user level since every toolchain uses
them.

·         But we can't keep using Triples inappropriately either. If the
information feeding into LLVM is faulty then the resulting behaviour will be
faulty too.

·         So let's start with a Triple, and convert it to a not-broken
equivalent as early as possible. We'll call it TargetTuple.
Are there any disagreements on this part of the thinking? If there are, then we
should resolve these before proceeding to the rest since everything else depends
on accepting this core problem exists and can be fixed in this way.
If we have agreement on this, then I think that this by itself is ample reason
for phases 1-4, and 6 of the plan. The justification for the IR serialization in
phase 5 is simply that we need to deliver the Triple/TargetTuple to LTO for it
to operate correctly and we currently do this by serializing Triple in the IR.
If Triple has been replaced by TargetTuple then TargetTuple must be serializable
in the IR somehow.

Hopefully, we are agreed so far. Let's assume for the rest of this
explanation that Phases 1-6 are complete and we now have const TargetTuple
throughout the API. I'd like to draw particular attention to TargetMachine
which, like everything else, has had its Triple member (called TargetTriple)
replaced with a TargetTuple member (named TheTargetTuple). This member is used
in all the same ways it used to be used when it was a Triple (named
TargetTriple).

At this point, in the MC layer we have a number of classes that need to know the
ABI but lack this information. Our TargetMachine has an accurate TargetTuple
object that describes the invariants of the desired target. The desired ABI is
an invariant too so why not have it in the TargetTuple which is already plumbed
in everywhere we need it? After all, it's a property of the target
OS/Environment. If we have the ABI in the TargetTuple, then we don't need
any other means to set the ABI, tools can set it up front in the TargetTuple and
we don't need any command-line option handling for it in the backend.

Meanwhile, in clang we have a number of command line options that change the
desired target. Let's say we've constructed a Triple and resolved it to
TargetTuple (more on that below). We're now processing the –EL option. At
the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
triple, construct a Triple object from it and resolve the new Triple to a
TargetTuple. But why do we need to bother with that kind of weird hackery when
we can simply do Obj.setEndian(Little)? This is what Phase 7 of the plan is
about. We end up with a cleaner way to process target changes that, until now,
have required weird triple hacking to handle.

I skipped the Triple -> TargetTuple resolution a moment ago and I should
address that now. We already know that mapping Triple to TargetTuple is a many
to many mapping. One Triple has many possible TargetTuple's depending on the
environment. One TargetTuple can be formed from multiple possible Triples. In an
ideal world, we'd like to bake in all of these mappings so that one clang
binary supports everything. Unfortunately, being a many to many mapping, some of
these mappings are mutually exclusive. Note that this isn't a new problem
resulting from this project. The problem has always been there but has been
ignored until now. To resolve this, we need to provide configure-time and
possibly run-time controls for how this conversion is disambiguated. This
resolution is performed as early as possible so that the middle/back-ends
don't need to know anything about the ambiguity problem.

---

To reply more directly to your email:> What can't be done to TargetMachine to avoid this serialization?
TargetMachine already has the serialization (see TargetMachine::TargetTriple).
We're not doing anything new here. We're simply replacing one object
holding faulty information with a new object holding reliable information.
> And a followup question: What can't be serialized at the function level
in the IR to make certain things clear that aren't global? We already do
this for a lot of command line options.
The data I want to fix is global. I think the bit you may be getting hung up on
here is that small portions of this global data can also be overridden at the
function level. Those overrides aren't a problem and continue to operate in
the same way as they do today.
> And one more: What global options do we need to consider here?
I'm not certain I understand this question. If you're talking command
line options, it's things like –EL, -EB, -mips32, -mips32r[2356], -mips64,
-mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
mappings, there's quite a wide variety but the main ones for Mips are
endian, architecture, default CPU, and default ABI.
> The goal of the configuration level of the TargetMachine is that it
controls things that don't change at the object level.
> This is a fairly recently stated goal, but I think it makes sense for LLVM
in general. TargetSubtargetInfo takes care of
> everything that resides under this (as much as possible, some bits are
still in transition, e.g. TargetOptions). This is part
> of my suggestion to Daniel about the problems with MCSubtargetInfo and the
assembler. Targets like Mips and ARM
> were unfortunately designed to change things on the fly during assembly and
need to collate or at least change defaults
> as we're processing code. I definitely had to deal with a lot of the
pain you're talking about when I was rewriting some
> of the handling there during the TargetSubtargetInfo work.
I generally agree with this. The key bit I need to draw attention to is that the
'defaults' don't change, but are instead overridden. These constant
defaults are stored in TargetMachine and particularly
TargetMachine::TargetTriple. These defaults are wrong for some toolchains since
the information stored in TargetMachine::TargetTriple are wrong. It's the
defaults I'm trying to fix rather than the overrides.

I think I understand your proposed plan now and it's a few steps ahead of
where we are and where we need to be. I agree that overridable state should be
in TargetSubtargetInfo, however I can't initialize that state without the
default values which come from the faulty information in
TargetMachine::TargetTriple. This triple work is a pre-requisite to your plan
and at first I don't need to override ABI's.
> Right now I see TargetTuple as trying to take over all of the various
arguments to TargetMachine and encapsulate them into a single thing.
> I also don't see this is bad, but I also don't see it taking all of
them right now and I'm not sure how it solves some of the existing problems
> with data sharing that we've got which is where the push back
you're both getting is coming from here. Ultimately library-wise I can agree
> with some of the directions you're headed - I just don't see the
unification and interactions right now.
I think we'll end up with TargetTuple taking over many arguments to
TargetMachine but that's not my goal at this stage. My goal is simply to fix
the faulty information currently held in Triple and use the now-accurate
information in TargetTuple to fix various blocking issues that prevent a proper
Mips toolchain product based on Clang/LLVM. At the end of Phase 7, it become
possible to fix a number of issues that are impossible to fix right now because
the available data we can consult at the moment is incorrect.

From: Eric Christopher [mailto:echristo at gmail.com]
Sent: 16 September 2015 23:52
To: Renato Golin; Jim Grosbach
Cc: Daniel Sanders; llvm-dev at lists.llvm.org
Subject: Re: The Trouble with Triples

Let's take a step back here.

It appears that you and Daniel are trying to solve some problems. I think
solving problems is good, I just want to make sure that we're solving them
in a way that gets us a decent API at the end. I also want to make sure
we're solving the right problems.

TargetTuple appears to be related to the TargetParser as you bring up in this
mail. They're two separate parts of similar problems - people trying to both
serialize command line options and communication from the front end to the
backend with respect to target information.

This leads me to a question: What can't be done to TargetMachine to avoid
this serialization?
And a followup question: What can't be serialized at the function level in
the IR to make certain things clear that aren't global? We already do this
for a lot of command line options.
And one more: What global options do we need to consider here?

The goal of the configuration level of the TargetMachine is that it controls
things that don't change at the object level. This is a fairly recently
stated goal, but I think it makes sense for LLVM in general. TargetSubtargetInfo
takes care of everything that resides under this (as much as possible, some bits
are still in transition, e.g. TargetOptions). This is part of my suggestion to
Daniel about the problems with MCSubtargetInfo and the assembler. Targets like
Mips and ARM were unfortunately designed to change things on the fly during
assembly and need to collate or at least change defaults as we're processing
code. I definitely had to deal with a lot of the pain you're talking about
when I was rewriting some of the handling there during the TargetSubtargetInfo
work.

Now a bit more on TargetParser + TargetTuple:

TargetParser appears to be trying to solve the parsing in Triple in a nice way
for ARM and also some of the "what kind of subtarget feature
canonicalization can we do in llvm that makes sense to communicate to the front
end". I like this particular idea and have often wanted a library of
feature handling, but it seems to have stabilized at an ARM specific set of code
with no defined interface. I can't even figure out how I'd use it in
lib/Basic right now for any target other than ARM. This isn't a condemnation
of TargetParser, but I think it's something that needs to be thought through
a bit more. It's been hooked up well before I'd expected it to and right
now if we moved it to the ARM backend from Support it'd make just as much
sense as it does where it is now other than making clang depend on the ARM
backend as well as the X86 backend :)

Right now I see TargetTuple as trying to take over all of the various arguments
to TargetMachine and encapsulate them into a single thing. I also don't see
this is bad, but I also don't see it taking all of them right now and
I'm not sure how it solves some of the existing problems with data sharing
that we've got which is where the push back you're both getting is
coming from here. Ultimately library-wise I can agree with some of the
directions you're headed - I just don't see the unification and
interactions right now.

As a suggestion as a way forward here let's see if we can get my questions
above answered and also show some of how the interactions between llvm's
libraries are going to get fixed, moved to a better place, etc here.

Thanks!

-eric

On Wed, Sep 16, 2015 at 3:02 PM Renato Golin <renato.golin at
linaro.org<mailto:renato.golin at linaro.org>> wrote:
On 16 September 2015 at 21:56, Jim Grosbach <grosbach at
apple.com<mailto:grosbach at apple.com>> wrote:> Why do we care about GAS? We have an assembler.
It's not that simple.

There is a lot of old code out there, including the Linux kernel,
which we do care a lot about, that only compiles with GAS. We're slowly
moving the legacy code up to modern standards, and specifically some
kernel folks are happy to move up not only the asm syntax, but the C
standard and move away from GNU-specific behaviour. But we're not
quite there yet, and might not be for a few more years. So, yes, we
still care about GAS.

But this is not just about GAS.

As I said in my previous email, this is about clearing the bloat in
target descriptions by both: removing the need for adding numerous CPU
names, target features, architecture names (xscale, strongarm, etc),
AND making sure all parties (front/middle/back-ends) speak the same
language, produced from the same source.

The TargetTuple is that common language, and the TargetParser created
from the TableGen files is the common source. The Triple becomes a
legacy constructor value for the Tuple. All other target information
classes are already (or should be) generated from the TableGen files,
so the ultimate source becomes the TableGen description, which I think
is what you were aiming at in your comment.

For simple architectures, like x86, you don't even need a
TargetParser. You can easily construct the Tuple from a triple and use
the Tuple as you've always used the triple. No harm done. But for the
complex ones like ARM and MIPS, having a common interface generated
from the same place the other interfaces are is important to avoid
more bridges between front and middle and back end interpretations of
the same target. Whatever legacy ARM or MIPS carry can be isolated in
their own implementation, leaving the rest of the targets with a clean
and simple interface.

cheers,
--renato

Eric Christopher via llvm-dev

2015-Sep-22 19:40 UTC

[llvm-dev] The Trouble with Triples

On Thu, Sep 17, 2015 at 6:21 AM Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> I think we need to take a step further back and re-enter from the right
> starting point. The thing that's bothering me about the push back so far is
> that it's trying to discuss and understand the consequences of resolving
> the core problem while seemingly ignoring the core problem itself. The
> reason I've been steering everything back to GNU Triple's being ambiguous
> and inconsistent is because it's the root of all the problems and the fixes
> to the various issues fall out naturally once this core point has been
> addressed.
>

*sigh*

>
> Here's the line of thought that I'd like people to start with:
>
> ·         Triples don't describe the target. They look like they should,
> but they don't. They're really just arbitrary strings.
>
Triples are used as a starting point, but no more.

> ·         LLVM relies on Triple as a description of the target. It
> defines the backend to use, the binary format to use, OS and Vendor
> specific quirks to enable/disable, the default CPU, the default ABI, the
> endian, and countless other details about the target.
>
These two statements aren't necessarily true in whole.

a) We don't use the Triple to fully specify the target.
b) We don't use the Triple to fully specify the ABI.
c) We don't use the Triple to fully specify the CPU.
d) We do use the triple to handle endianness since most, if not all,
triples actually bother to encode endianness.
e) The rest of the "countless details" may or may not be relevant, you
haven't given an example of what you care about.

From here on your email relies on all of these assumptions being true. So
I'm going to skip past that part and go to where you answer some of my
questions.

> At this point, in the MC layer we have a number of classes that need to
> know the ABI but lack this information. Our TargetMachine has an accurate
> TargetTuple object that describes the invariants of the desired target. The
> desired ABI is an invariant too so why not have it in the TargetTuple which
> is already plumbed in everywhere we need it? After all, it's a property of
> the target OS/Environment. If we have the ABI in the TargetTuple, then we
> don't need any other means to set the ABI, tools can set it up front in the
> TargetTuple and we don't need any command-line option handling for it in
> the backend.
>

This isn't sufficient anyways as I don't want to depend on a weird
serialization format to deal with something a simple command line can deal
with (or you've said this in a way that's confused me). I see you saying
you want:

-tuple mips-linux-gnu-abio32-el

to specify on a command line to, say, llvm-mc or a new assembler interface,
or heck, to clang itself, that you want to compile for:

-triple mipsel-linux-gnu -mabi=o32

right? Basically? (Bikeshedding of how to actually serialize things aside?)
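
A minimal sketch of the equivalence being asked about, assuming a purely
hypothetical -tuple spelling (the string above is only a guess at a
serialization). TargetDesc, fromTuple and fromTripleAndABI are illustrative
names invented for this sketch, not LLVM APIs, and the "el" suffix handling
only covers the MIPS-style arch names used in this thread:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Illustrative target description; NOT an LLVM class.
struct TargetDesc {
  std::string Arch, OS, Env, ABI;
  bool LittleEndian;
};

bool operator==(const TargetDesc &A, const TargetDesc &B) {
  return A.Arch == B.Arch && A.OS == B.OS && A.Env == B.Env &&
         A.ABI == B.ABI && A.LittleEndian == B.LittleEndian;
}

// Split "a-b-c" into {"a", "b", "c"}.
std::vector<std::string> splitOnDash(const std::string &S) {
  std::vector<std::string> Parts;
  std::stringstream SS(S);
  std::string Item;
  while (std::getline(SS, Item, '-'))
    Parts.push_back(Item);
  return Parts;
}

// Models "-triple mipsel-linux-gnu -mabi=o32": endianness is encoded in the
// arch name, the ABI arrives as a separate option.
TargetDesc fromTripleAndABI(const std::string &Triple, const std::string &ABI) {
  TargetDesc D{"", "", "", ABI, false};
  std::vector<std::string> P = splitOnDash(Triple);
  if (P.size() != 3)
    return D; // the sketch only handles arch-os-env
  D.Arch = P[0];
  D.OS = P[1];
  D.Env = P[2];
  // MIPS-style convention: a trailing "el" on the arch means little endian.
  if (D.Arch.size() > 2 && D.Arch.compare(D.Arch.size() - 2, 2, "el") == 0) {
    D.Arch.erase(D.Arch.size() - 2);
    D.LittleEndian = true;
  }
  return D;
}

// Models the hypothetical "-tuple mips-linux-gnu-abio32-el" spelling, where
// the ABI and endianness are explicit fields.
TargetDesc fromTuple(const std::string &Tuple) {
  TargetDesc D{"", "", "", "", false};
  std::vector<std::string> P = splitOnDash(Tuple);
  if (P.size() != 5)
    return D; // the sketch only handles arch-os-env-abi-endian
  D.Arch = P[0];
  D.OS = P[1];
  D.Env = P[2];
  if (P[3].compare(0, 3, "abi") == 0)
    D.ABI = P[3].substr(3);
  D.LittleEndian = (P[4] == "el");
  return D;
}
```

Under these assumptions, fromTuple("mips-linux-gnu-abio32-el") and
fromTripleAndABI("mipsel-linux-gnu", "o32") compare equal, which is the
sense in which the two command lines would describe the same target.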

> Meanwhile, in clang we have a number of command line options that change
> the desired target. Let's say we've constructed a Triple and resolved it to
> TargetTuple (more on that below). We're now processing the -EL option. At
> the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
> triple, construct a Triple object from it and resolve the new Triple to a
> TargetTuple. But why do we need to bother with that kind of weird hackery
> when we can simply do Obj.setEndian(Little)? This is what Phase 7 of the
> plan is about. We end up with a cleaner way to process target changes that,
> until now, have required weird triple hacking to handle.
>
>
>
This is something else I don't understand. This is the first time you've
started talking about APIs, which is what I was particularly asking about in
my earlier mails. I'd like to see how you plan on changing the TargetMachine
and MC level APIs to deal with this. It seems like the Tuple is going to be
a way to side-load information around to the MC layer and, while I agree
that something is necessary there, I don't think that this solution is the
right one (as I said earlier in the thread).
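
For readers following the -EL example quoted above, here is a rough sketch of
the two approaches being contrasted. It uses only the standard library;
TargetTupleSketch and flipTripleToLittle are hypothetical names made up for
illustration (only setEndian(Little) appears in the quoted proposal), and
neither reflects the actual clang or LLVM code:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the proposed TargetTuple; NOT LLVM's real API.
enum class Endian { Big, Little };

struct TargetTupleSketch {
  std::string Arch; // e.g. "mips"
  std::string OS;   // e.g. "linux"
  std::string Env;  // e.g. "gnu"
  Endian Endianness;

  // The setter style the quoted proposal describes for handling -EL.
  void setEndian(Endian E) { Endianness = E; }
};

// The string-rewriting style described as today's behaviour: swap the arch
// component of the triple for its little-endian spelling, then re-parse.
// Only the two MIPS spellings mentioned in this thread are handled here.
std::string flipTripleToLittle(const std::string &Triple) {
  std::string::size_type Dash = Triple.find('-');
  std::string Arch = Triple.substr(0, Dash);
  std::string Rest = (Dash == std::string::npos) ? "" : Triple.substr(Dash);
  if (Arch == "mips")
    Arch = "mipsel";
  else if (Arch == "mips64")
    Arch = "mips64el";
  return Arch + Rest;
}
```

With the string approach, every other consumer has to re-derive the
endianness from the arch name; with the setter, the arch field stays "mips"
and the endianness is carried as its own field.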

> I skipped the Triple -> TargetTuple resolution a moment ago and I should
> address that now. We already know that mapping Triple to TargetTuple is a
> many to many mapping. One Triple has many possible TargetTuple's depending
> on the environment. One TargetTuple can be formed from multiple possible
> Triples. In an ideal world, we'd like to bake in all of these mappings so
> that one clang binary supports everything. Unfortunately, being a many to
> many mapping, some of these mappings are mutually exclusive. Note that this
> isn't a new problem resulting from this project. The problem has always
> been there but has been ignored until now. To resolve this, we need to
> provide configure-time and possibly run-time controls for how this
> conversion is disambiguated. This resolution is performed as early as
> possible so that the middle/back-ends don't need to know anything about the
> ambiguity problem.
>

The minute you start talking about configure time controls we've already
lost. This, for me, is a non-starter. That said, I'd like to see the
examples you think show that things are impossible to deal with in the
current architecture.
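
To make the "many to many" claim in the quoted paragraph concrete, a small
sketch follows. Everything in it is an assumption for illustration: the
policy names, the table entries and resolveDefaults are invented, and the
CPU/ABI values are placeholders rather than a statement about what any real
toolchain defaults to:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Defaults a resolved target would carry; illustrative only.
struct ResolvedDefaults {
  std::string CPU;
  std::string ABI;
};

// (triple string, disambiguation policy). The policy stands in for the
// configure-time or run-time control mentioned in the quoted text.
using PolicyKey = std::pair<std::string, std::string>;

// Looks up the defaults for a triple under a given policy. Returns false if
// the combination is unknown. All table entries are placeholder values.
bool resolveDefaults(const std::string &Triple, const std::string &Policy,
                     ResolvedDefaults &Out) {
  static const std::map<PolicyKey, ResolvedDefaults> Table = {
      // One triple, two different results depending on the policy.
      {PolicyKey("mips-linux-gnu", "policy-a"), ResolvedDefaults{"mips32", "o32"}},
      {PolicyKey("mips-linux-gnu", "policy-b"), ResolvedDefaults{"mips32r2", "o32"}},
      // A different triple that resolves to the same result as policy-b above.
      {PolicyKey("mips-mti-linux-gnu", "policy-a"), ResolvedDefaults{"mips32r2", "o32"}},
  };
  std::map<PolicyKey, ResolvedDefaults>::const_iterator It =
      Table.find(PolicyKey(Triple, Policy));
  if (It == Table.end())
    return false;
  Out = It->second;
  return true;
}
```

The first two rows show one triple yielding two results, and the last two
rows show two triples yielding one result. The open question in this thread
is where the policy input should come from, not whether such a lookup can be
written.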

> ---
>
>
>
> To reply more directly to your email:
>
Thanks :)

> > What can't be done to TargetMachine to avoid this serialization?
>
>
>
> TargetMachine already has the serialization (see
> TargetMachine::TargetTriple). We're not doing anything new here. We're
> simply replacing one object holding faulty information with a new object
> holding reliable information.
>
>
>
This is side stepping my question and making it about Triple. I've
specifically said that TargetMachine does not and is not completely
dependent upon Triple.

> > And a followup question: What can't be serialized at the function level
> > in the IR to make certain things clear that aren't global? We already do
> > this for a lot of command line options.
>
>
>
> The data I want to fix is global. I think the bit you may be getting hung
> up on here is that small portions of this global data can also be
> overridden at the function level. Those overrides aren't a problem and
> continue to operate in the same way as they do today.
>
>

Examples please.

> > And one more: What global options do we need to consider here?
>
>
>
> I'm not certain I understand this question. If you're talking command line
> options, it's things like -EL, -EB, -mips32, -mips32r[2356], -mips64,
> -mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
> mappings, there's quite a wide variety but the main ones for Mips are
> endian, architecture, default CPU, and default ABI.
>

All of these are representable right now in the TargetMachine as far as I
can tell. What examples are you having problems with?

>
>
> > The goal of the configuration level of the TargetMachine is that it
> > controls things that don't change at the object level. This is a fairly
> > recently stated goal, but I think it makes sense for LLVM in general.
> > TargetSubtargetInfo takes care of everything that resides under this (as
> > much as possible, some bits are still in transition, e.g. TargetOptions).
> > This is part of my suggestion to Daniel about the problems with
> > MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> > unfortunately designed to change things on the fly during assembly and need
> > to collate or at least change defaults as we're processing code. I
> > definitely had to deal with a lot of the pain you're talking about when I
> > was rewriting some of the handling there during the TargetSubtargetInfo
> > work.
>
>
>
> I generally agree with this. The key bit I need to draw attention to is
> that the 'defaults' don't change, but are instead overridden. These
> constant defaults are stored in TargetMachine and particularly
> TargetMachine::TargetTriple. These defaults are wrong for some toolchains
> since the information stored in TargetMachine::TargetTriple are wrong. It's
> the defaults I'm trying to fix rather than the overrides.
>
>
>
I don't understand what you mean here.

> I think I understand your proposed plan now and it's a few steps ahead of
> where we are and where we need to be. I agree that overridable state should
> be in TargetSubtargetInfo, however I can't initialize that state without
> the default values which come from the faulty information in
> TargetMachine::TargetTriple. This triple work is a pre-requisite to your
> plan and at first I don't need to override ABI's.
>
>
>
Can you provide an example of using a tool that you're having problems with?

> > Right now I see TargetTuple as trying to take over all of the various
> > arguments to TargetMachine and encapsulate them into a single thing. I also
> > don't see this is bad, but I also don't see it taking all of them right now
> > and I'm not sure how it solves some of the existing problems with data
> > sharing that we've got which is where the push back you're both getting is
> > coming from here. Ultimately library-wise I can agree with some of the
> > directions you're headed - I just don't see the unification and
> > interactions right now.
>
>
>
> I think we'll end up with TargetTuple taking over many arguments to
> TargetMachine but that's not my goal at this stage. My goal is simply to
> fix the faulty information currently held in Triple and use the
> now-accurate information in TargetTuple to fix various blocking issues that
> prevent a proper Mips toolchain product based on Clang/LLVM. At the end of
> Phase 7, it become possible to fix a number of issues that are impossible
> to fix right now because the available data we can consult at the moment is
> incorrect.
>
>
>
Could you please provide some examples of things that are impossible right
now with command lines, how those interact with the TargetMachine, and how
you see it being impossible to deal with?

Thanks

-eric


Eric Christopher via llvm-dev

2015-Sep-22 19:41 UTC

[llvm-dev] The Trouble with Triples

I've been busy working on other things. I replied to your earlier mail
which is much longer and also encapsulates all of the stuff here.

Thanks! :)

-eric

On Tue, Sep 22, 2015 at 6:06 AM Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> The thread has gone quiet for a few days and I need to be making progress
> towards a gcc-compatible toolchain (e.g. a mips-mti-linux-gnu toolchain
> that can target MIPS32/MIPS64 and later for all appropriate ABI's and both
> endians) so I need to chase this earlier than I normally would.
>
>
>
> > Here's the line of thought that I'd like people to start with:
> >
> > * Triples don't describe the target. They look like they should, but
> > they don't. They're really just arbitrary strings.
> >
> > * LLVM relies on Triple as a description of the target. It defines the
> > backend to use, the binary format to use, OS and Vendor specific quirks to
> > enable/disable, the default CPU, the default ABI, the endian, and countless
> > other details about the target.
> >
> > * If LLVM is built on top of an incorrect concept we should fix that but
> > we can't abandon Triple's at the user level since every toolchain uses them.
> >
> > * But we can't keep using Triples inappropriately either. If the
> > information feeding into LLVM is faulty then the resulting behaviour will
> > be faulty too.
> >
> > * So let's start with a Triple, and convert it to a not-broken
> > equivalent as early as possible. We'll call it TargetTuple.
> >
> > Are there any disagreements on this part of the thinking?
> >
> > If we have agreement on this, then I think that this by itself is ample
> > reason for phases 1-4, and 6 of the plan.
> >
> > The justification for the IR serialization in phase 5 is simply that we
> > need to deliver the Triple/TargetTuple to LTO for it to operate correctly
> > and we currently do this by serializing Triple in the IR. If Triple has
> > been replaced by TargetTuple then TargetTuple must be serializable in the
> > IR somehow.
>
>
>
> Are we agreed on this much? If so, I think we should go ahead with this
> part of the work and judge each follow-on task independently on its own
> merits.
>
>
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of
*Daniel
> Sanders via llvm-dev
> *Sent:* 17 September 2015 14:21
> *To:* Eric Christopher; Renato Golin; Jim Grosbach
> *Cc:* llvm-dev at lists.llvm.org
> *Subject:* Re: [llvm-dev] The Trouble with Triples
>
>
>
> I think we need to take a step further back and re-enter from the right
> starting point. The thing that's bothering me about the push back so
far is
> that it's trying to discuss and understand the consequences of
resolving
> the core problem while seemingly ignoring the core problem itself. The
> reason I've been steering everything back to GNU Triple's being
ambiguous
> and inconsistent is because it's the root of all the problems and the
fixes
> to the various issues fall out naturally once this core point has been
> addressed.
>
>
>
> Here's the line of thought that I'd like people to start with:
>
> ·         Triples don't describe the target. They look like they
should,
> but they don't. They're really just arbitrary strings.
>
> ·         LLVM relies on Triple as a description of the target. It
> defines the backend to use, the binary format to use, OS and Vendor
> specific quirks to enable/disable, the default CPU, the default ABI, the
> endian, and countless other details about the target.
>
> ·         If LLVM is built on top of an incorrect concept we should fix
> that but we can't abandon Triple's at the user level since every
toolchain
> uses them.
>
> ·         But we can't keep using Triples inappropriately either. If
the
> information feeding into LLVM is faulty then the resulting behaviour will
> be faulty too.
>
> ·         So let's start with a Triple, and convert it to a not-broken
> equivalent as early as possible. We'll call it TargetTuple.
>
> Are there any disagreements on this part of the thinking? If there are,
> then we should resolve these before proceeding to the rest since everything
> else depends on accepting this core problem exists and can be fixed in this
> way.
>
> If we have agreement on this, then I think that this by itself is ample
> reason for phases 1-4, and 6 of the plan. The justification for the IR
> serialization in phase 5 is simply that we need to deliver the
> Triple/TargetTuple to LTO for it to operate correctly and we currently do
> this by serializing Triple in the IR. If Triple has been replaced by
> TargetTuple then TargetTuple must be serializable in the IR somehow.
>
>
>
> Hopefully, we are agreed so far. Let's assume for the rest of this
> explanation that Phases 1-6 are complete and we now have const TargetTuple
> throughout the API. I'd like to draw particular attention to
TargetMachine
> which, like everything else, has had its Triple member (called
> TargetTriple) replaced with a TargetTuple member (named TheTargetTuple).
> This member is used in all the same ways it used to be used when it was a
> Triple (named TargetTriple).
>
>
>
> At this point, in the MC layer we have a number of classes that need to
> know the ABI but lack this information. Our TargetMachine has an accurate
> TargetTuple object that describes the invariants of the desired target. The
> desired ABI is an invariant too so why not have it in the TargetTuple which
> is already plumbed in everywhere we need it? After all, it's a property
of
> the target OS/Environment. If we have the ABI in the TargetTuple, then we
> don't need any other means to set the ABI, tools can set it up front in
the
> TargetTuple and we don't need any command-line option handling for it
in
> the backend.
>
>
>
> Meanwhile, in clang we have a number of command line options that change
> the desired target. Let's say we've constructed a Triple and
resolved it to
> TargetTuple (more on that below). We're now processing the –EL option.
At
> the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
> triple, construct a Triple object from it and resolve the new Triple to a
> TargetTuple. But why do we need to bother with that kind of weird hackery
> when we can simply do Obj.setEndian(Little)? This is what Phase 7 of the
> plan is about. We end up with a cleaner way to process target changes that,
> until now, have required weird triple hacking to handle.
>
>
>
> I skipped the Triple -> TargetTuple resolution a moment ago and I should
> address that now. We already know that mapping Triple to TargetTuple is a
> many to many mapping. One Triple has many possible TargetTuple's
depending
> on the environment. One TargetTuple can be formed from multiple possible
> Triples. In an ideal world, we'd like to bake in all of these mappings
so
> that one clang binary supports everything. Unfortunately, being a many to
> many mapping, some of these mappings are mutually exclusive. Note that this
> isn't a new problem resulting from this project. The problem has always
> been there but has been ignored until now. To resolve this, we need to
> provide configure-time and possibly run-time controls for how this
> conversion is disambiguated. This resolution is performed as early as
> possible so that the middle/back-ends don't need to know anything about
the
> ambiguity problem.
>
>
>
> ---
>
>
>
> To reply more directly to your email:
>
> > What can't be done to TargetMachine to avoid this serialization?
>
>
>
> TargetMachine already has the serialization (see
> TargetMachine::TargetTriple). We're not doing anything new here.
We're
> simply replacing one object holding faulty information with a new object
> holding reliable information.
>
>
>
> > And a followup question: What can't be serialized at the function
level
> in the IR to make certain things clear that aren't global? We already
do
> this for a lot of command line options.
>
>
>
> The data I want to fix is global. I think the bit you may be getting hung
> up on here is that small portions of this global data can also be
> overridden at the function level. Those overrides aren't a problem and
> continue to operate in the same way as they do today.
>
>
>
> > And one more: What global options do we need to consider here?
>
>
>
> I'm not certain I understand this question. If you're talking
command line
> options, it's things like –EL, -EB, -mips32, -mips32r[2356], -mips64,
> -mips64r[2356], -mabi=…. If you're talking about Triple ->
TargetTuple
> mappings, there's quite a wide variety but the main ones for Mips are
> endian, architecture, default CPU, and default ABI.
>
>
>
> > The goal of the configuration level of the TargetMachine is that it
> controls things that don't change at the object level.
>
> > This is a fairly recently stated goal, but I think it makes sense for
> LLVM in general. TargetSubtargetInfo takes care of
>
> > everything that resides under this (as much as possible, some bits are
> still in transition, e.g. TargetOptions). This is part
>
> > of my suggestion to Daniel about the problems with MCSubtargetInfo and
> the assembler. Targets like Mips and ARM
>
> > were unfortunately designed to change things on the fly during
assembly
> and need to collate or at least change defaults
>
> > as we're processing code. I definitely had to deal with a lot of
the
> pain you're talking about when I was rewriting some
>
> > of the handling there during the TargetSubtargetInfo work.
>
>
>
> I generally agree with this. The key bit I need to draw attention to is
> that the 'defaults' don't change, but are instead overridden.
These
> constant defaults are stored in TargetMachine and particularly
> TargetMachine::TargetTriple. These defaults are wrong for some toolchains
> since the information stored in TargetMachine::TargetTriple are wrong.
It's
> the defaults I'm trying to fix rather than the overrides.
>
>
>
> I think I understand your proposed plan now and it's a few steps ahead
of
> where we are and where we need to be. I agree that overridable state should
> be in TargetSubtargetInfo, however I can't initialize that state
without
> the default values which come from the faulty information in
> TargetMachine::TargetTriple. This triple work is a pre-requisite to your
> plan and at first I don't need to override ABI's.
>
>
>
> > Right now I see TargetTuple as trying to take over all of the various
> arguments to TargetMachine and encapsulate them into a single thing.
>
> > I also don't see this is bad, but I also don't see it taking
all of them
> right now and I'm not sure how it solves some of the existing problems
>
> > with data sharing that we've got which is where the push back
you're
> both getting is coming from here. Ultimately library-wise I can agree
>
> > with some of the directions you're headed - I just don't see
the
> unification and interactions right now.
>
>
>
> I think we'll end up with TargetTuple taking over many arguments to
> TargetMachine but that's not my goal at this stage. My goal is simply
to
> fix the faulty information currently held in Triple and use the
> now-accurate information in TargetTuple to fix various blocking issues that
> prevent a proper Mips toolchain product based on Clang/LLVM. At the end of
> Phase 7, it become possible to fix a number of issues that are impossible
> to fix right now because the available data we can consult at the moment is
> incorrect.
>
>
>
>
>
> *From:* Eric Christopher [mailto:echristo at gmail.com]
> *Sent:* 16 September 2015 23:52
> *To:* Renato Golin; Jim Grosbach
> *Cc:* Daniel Sanders; llvm-dev at lists.llvm.org
> *Subject:* Re: The Trouble with Triples
>
>
>
> Let's take a step back here.
>
>
>
> It appears that you and Daniel are trying to solve some problems. I think
> solving problems is good, I just want to make sure that we're solving
them
> in a way that gets us a decent API at the end. I also want to make sure
> we're solving the right problems.
>
>
>
> TargetTuple appears to be related to the TargetParser as you bring up in
> this mail. They're two separate parts of similar problems - people
trying
> to both serialize command line options and communication from the front end
> to the backend with respect to target information.
>
>
>
> This leads me to a question: What can't be done to TargetMachine to
avoid
> this serialization?
>
> And a followup question: What can't be serialized at the function level
in
> the IR to make certain things clear that aren't global? We already do
this
> for a lot of command line options.
>
> And one more: What global options do we need to consider here?
>
>
>
> The goal of the configuration level of the TargetMachine is that it
> controls things that don't change at the object level. This is a fairly
> recently stated goal, but I think it makes sense for LLVM in general.
> TargetSubtargetInfo takes care of everything that resides under this (as
> much as possible, some bits are still in transition, e.g. TargetOptions).
> This is part of my suggestion to Daniel about the problems with
> MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> unfortunately designed to change things on the fly during assembly and need
> to collate or at least change defaults as we're processing code. I
> definitely had to deal with a lot of the pain you're talking about when
I
> was rewriting some of the handling there during the TargetSubtargetInfo
> work.
>
>
>
> Now a bit more on TargetParser + TargetTuple:
>
>
>
> TargetParser appears to be trying to solve the parsing in Triple in a nice
> way for ARM and also some of the "what kind of subtarget feature
> canonicalization can we do in llvm that makes sense to communicate to the
> front end". I like this particular idea and have often wanted a library of
> feature handling, but it seems to have stabilized at an ARM specific set of
> code with no defined interface. I can't even figure out how I'd use it in
> lib/Basic right now for any target other than ARM. This isn't a
> condemnation of TargetParser, but I think it's something that needs to be
> thought through a bit more. It's been hooked up well before I'd expected it
> to and right now if we moved it to the ARM backend from Support it'd make
> just as much sense as it does where it is now other than making clang
> depend on the ARM backend as well as the X86 backend :)
>
>
>
> Right now I see TargetTuple as trying to take over all of the various
> arguments to TargetMachine and encapsulate them into a single thing. I also
> don't see this is bad, but I also don't see it taking all of them right now
> and I'm not sure how it solves some of the existing problems with data
> sharing that we've got which is where the push back you're both getting is
> coming from here. Ultimately library-wise I can agree with some of the
> directions you're headed - I just don't see the unification and
> interactions right now.
>
>
>
> As a suggestion as a way forward here let's see if we can get my questions
> above answered and also show some of how the interactions between llvm's
> libraries are going to get fixed, moved to a better place, etc here.
>
>
>
> Thanks!
>
>
>
> -eric
>
>
>
>
>
> On Wed, Sep 16, 2015 at 3:02 PM Renato Golin <renato.golin at linaro.org>
> wrote:
>
> On 16 September 2015 at 21:56, Jim Grosbach <grosbach at apple.com> wrote:
> > Why do we care about GAS? We have an assembler.
>
> It's not that simple.
>
> There is a lot of old code out there, including the Linux kernel,
> which we do care about a lot, that only compiles with GAS. We're slowly
> moving the legacy code up to modern standards, and specifically some
> kernel folks are happy to move up not only the asm syntax, but the C
> standard and move away from GNU-specific behaviour. But we're not
> quite there yet, and might not be for a few more years. So, yes, we
> still care about GAS.
>
> But this is not just about GAS.
>
> As I said in my previous email, this is about clearing the bloat in
> target descriptions by both: removing the need for adding numerous CPU
> names, target features, architecture names (xscale, strongarm, etc),
> AND making sure all parties (front/middle/back-ends) speak the same
> language, produced from the same source.
>
> The TargetTuple is that common language, and the TargetParser created
> from the TableGen files is the common source. The Triple becomes a
> legacy constructor value for the Tuple. All other target information
> classes are already (or should be) generated from the TableGen files,
> so the ultimate source becomes the TableGen description, which I think
> is what you were aiming at in your comment.
>
> For simple architectures, like x86, you don't even need a
> TargetParser. You can easily construct the Tuple from a triple and use
> the Tuple as you've always used the triple. No harm done. But for the
> complex ones like ARM and MIPS, having a common interface generated
> from the same place the other interfaces are is important to avoid
> more bridges between front and middle and back end interpretations of
> the same target. Whatever legacy ARM or MIPS carry can be isolated in
> their own implementation, leaving the rest of the targets with a clean
> and simple interface.
>
> cheers,
> --renato
>

Daniel Sanders via llvm-dev

2015-Sep-22 22:23 UTC

[llvm-dev] The Trouble with Triples

>> Here's the line of thought that I'd like people to start with:
>> * Triples don't describe the target. They look like they should, but they
>>   don't. They're really just arbitrary strings.
>
> Triples are used as a starting point, but no more.

I disagree with this but for now let's assume it's true. The starting point is
incorrect because triples are ambiguous and inconsistent (as demonstrated in
this thread) and this should be fixed.

>> * LLVM relies on Triple as a description of the target. It defines the
>>   backend to use, the binary format to use, OS and Vendor specific quirks to
>>   enable/disable, the default CPU, the default ABI, the endian, and countless
>>   other details about the target.
>
> These two statements aren't necessarily true in whole.
>
> a) We don't use the Triple to fully specify the target.

The key word here is 'fully'. Triples do select an initial complete target
which we then modify with command line options or programmatically.

> b) We don't use the Triple to fully specify the ABI.

This statement makes the same mistake as in 'a)'.

> c) We don't use the Triple to fully specify the CPU.

This statement makes the same mistake as in 'a)'. Additionally, it's important
to note that I said _default_ CPU. You are correct that -target-cpu and similar
specify the CPU but the default CPU is often wrong because it doesn't account
for triple-customization in the GCC toolchain on the same system. X86, ARM, and
Mips examples have already been provided in this thread.

> d) We do use the triple to handle endianness since most, if not all, triples
>    actually bother to encode endianness.

And it is occasionally incorrect, as demonstrated in this thread in the
(perverse but sadly real) example where mips-linux-gnu was supposed to target
little endian by default. It should also be noted that the -EL and -EB options
currently operate by hacking the triple and replacing the first component.

> e) The rest of the "countless details" may or may not be relevant, you haven't
>    given an example of what you care about.

"grep 'TT\.\|TheTargetTuple\.'" reveals most of the details that LLVM's various
targets derive from the triple. I don't understand all the decisions being made
by all targets but they are definitely using information from the triple to make
behavioural decisions throughout the backends.

> From here on your email relies on all of these assumptions being true. So I'm
> going to skip past that part and go to where you answer some of my questions.
>
>> At this point, in the MC layer we have a number of classes that need to know
>> the ABI but lack this information. Our TargetMachine has an accurate
>> TargetTuple object that describes the invariants of the desired target. The
>> desired ABI is an invariant too so why not have it in the TargetTuple which
>> is already plumbed in everywhere we need it? After all, it's a property of
>> the target OS/Environment. If we have the ABI in the TargetTuple, then we
>> don't need any other means to set the ABI, tools can set it up front in the
>> TargetTuple and we don't need any command-line option handling for it in the
>> backend.
>
> This isn't sufficient anyways as I don't want to depend on a weird
> serialization format to deal with something a simple command line can deal
> with (or you've said this in a way that's confused me). I see you saying you
> want:
>
> -tuple mips-linux-gnu-abio32-el
>
> to specify on a command line to, say, llvm-mc or a new assembler interface,
> or heck, to clang itself, that you want to compile for:
>
> -triple mipsel-linux-gnu -mabi=o32
>
> right? Basically? (Bikeshedding of how to actually serialize things aside?)

I think you're still missing the central point so before I directly answer your
question let me ask you this: What do you believe '-triple mipsel-linux-gnu'
means? You're probably thinking something like "mips32r2 little endian,
obviously" but this is not actually correct all the time. The true answer is
'whatever I (the person who built the toolchain) want it to mean'. It could be
mips32r6, it could be mips4, it could even be big-endian mips64r5 with nan2008
and msa. It could even be octeon or p5600. In GCC toolchains, distributors
routinely use configure-time options to define the triple they wish to use.
Nothing is stopping anyone using the same string for completely different
meanings and indeed conflicting definitions are very common. To be compatible
with GCC toolchains we must be able to accommodate triple customization too.

To answer your question: The reason I wanted that is because a consequence of
supporting triple customization is that '-triple mipsel-linux-gnu' on your
toolchain might mean something different to what it does on my toolchain.
Being able to specify the tuple directly allows us to mitigate these toolchain
differences and avoid the pain of having to figure out which meaning of the
triple is in use.

>> Meanwhile, in clang we have a number of command line options that change the
>> desired target. Let's say we've constructed a Triple and resolved it to
>> TargetTuple (more on that below). We're now processing the -EL option. At the
>> moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu triple,
>> construct a Triple object from it and resolve the new Triple to a TargetTuple.
>> But why do we need to bother with that kind of weird hackery when we can
>> simply do Obj.setEndian(Little)? This is what Phase 7 of the plan is about. We
>> end up with a cleaner way to process target changes that, until now, have
>> required weird triple hacking to handle.
>
> This is something else I don't understand. Here is the first time you start
> talking about APIs which is what I'm particularly asking about in my earlier
> mails. I'd like to see how you plan on changing the TargetMachine and MC level
> APIs to deal with this. It seems like the Tuple is going to be a way to
> side-load information around to the MC layer and while I agree that something
> is necessary there, I don't think that this solution is the right one. (As I
> said earlier in the thread)

As far as this section is concerned, I don't intend to change the MC API or
the TargetMachine API at all except in so far as replacing Triple objects with a
TargetTuple. It's the Triple/TargetTuple API that changes to provide more
convenient setters and getters like setEndian()/getEndian(). As you can see the
API side is uninteresting. The important bit is that the values inside the
TargetTuple reflect the desired target and not the arbitrary triple string.
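
To make the setEndian()/getEndian() point concrete, here is a minimal sketch of the two styles of handling -EL. Everything in it is an assumption for illustration: the TargetTuple struct, the Endian enum, hackTripleForEL() and applyEL() are hypothetical names, not the real LLVM API, and the string rewrite only covers the plain "mips-" prefix.

```cpp
#include <cassert>
#include <string>

// Illustrative sketch only: these types and functions are hypothetical and
// are not the real LLVM Triple/TargetTuple API.
enum class Endian { Big, Little };

struct TargetTuple {
  std::string Arch; // e.g. "mips"
  std::string OS;   // e.g. "linux"
  std::string Env;  // e.g. "gnu"
  Endian Endianness = Endian::Big;

  Endian getEndian() const { return Endianness; }
  void setEndian(Endian E) { Endianness = E; }
};

// The "triple hacking" style: handle -EL by rewriting the first component of
// the triple string. Only the plain "mips-" prefix is handled in this sketch.
std::string hackTripleForEL(const std::string &Triple) {
  assert(!Triple.empty() && "expected a non-empty triple string");
  const std::string Prefix = "mips-";
  if (Triple.compare(0, Prefix.size(), Prefix) == 0)
    return "mipsel-" + Triple.substr(Prefix.size());
  return Triple; // Anything else is left unchanged.
}

// The setter style: -EL mutates the already-resolved tuple directly.
void applyEL(TargetTuple &TT) { TT.setEndian(Endian::Little); }
```

With the setter style the option handler never has to re-parse a string, so whatever the triple originally meant on a given toolchain has already been settled before -EL is applied.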

The ABI follow-on work is slightly different but not very much so. As above, I
don't intend to change the MC/TargetMachine API. By adding the ABI blob-of-data
to the TargetTuple we deliver it to the TargetMachine via the existing plumbing.
After that, targets without per-function ABI selection (which aside from MIPS16
includes MIPS for the foreseeable future) can just consult the TargetTuple since
it's already reachable from everything. Targets that need per-function ABI
selection can use the TargetTuple to initialize a TargetSubtargetInfo, override
it as desired, and have the MC layer consult the TargetSubtargetInfo.

>> I skipped the Triple -> TargetTuple resolution a moment ago and I should
>> address that now. We already know that mapping Triple to TargetTuple is a
>> many to many mapping. One Triple has many possible TargetTuple's depending
>> on the environment. One TargetTuple can be formed from multiple possible
>> Triples. In an ideal world, we'd like to bake in all of these mappings so
>> that one clang binary supports everything. Unfortunately, being a many to
>> many mapping, some of these mappings are mutually exclusive. Note that this
>> isn't a new problem resulting from this project. The problem has always been
>> there but has been ignored until now. To resolve this, we need to provide
>> configure-time and possibly run-time controls for how this conversion is
>> disambiguated. This resolution is performed as early as possible so that the
>> middle/back-ends don't need to know anything about the ambiguity problem.
>
> The minute you start talking about configure time controls we've already lost.
> This, for me, is a non-starter. That said, I'd like to see the examples you
> think show that things are impossible to deal with in the current architecture.

I don't like it either but we can either deal with the world as it is or attempt
to convince the entire computing industry that they're doing it wrong and need
to replace the multiverse with something sensible. I don't fancy my chances
with the latter so dealing with it is by far the most realistic option.

I've already given several examples in this thread but to quickly re-iterate
some of them:
* mips-linux-gnu means different things on different Linux distributions.
  * I've seen MIPS-II, MIPS32, MIPS32R2. They can't all be the one true meaning.
  * I've also seen one instance of it meaning little endian.
  * Modern toolchains will mean NAN2008, existing ones usually mean NAN1985.
    (IEEE754-2008 made MIPS's QNaN/SNaN encodings wrong)
  * Some use O32 FPXX instead of plain O32 FP32.
* The sysroot layout for mips-mti-linux-gnu significantly changed in recent
  toolchains. We can't hardcode both layouts since they're mutually exclusive
  and very different.
  * Likewise for mips-img-linux-gnu toolchains.
* mips-linux-gnu -mips64 should produce O32 ABI code on a MIPS64 CPU but
  currently crashes.
* mips64-linux-gnu -mips32 should produce O32 ABI code on a MIPS32 CPU but
  currently crashes.
* i386-linux-gnu means i486 on Debian Etch and i586 on more recent Debians.

Renato had a number of similar ARM examples too.

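
As a rough sketch of what a configure-time disambiguation of these cases could look like, the table below keys the resolution on both the triple string and a toolchain profile. The ResolvedTarget struct, the ResolutionTable alias, the profile names ("debian-etch", "llvm-default" and so on) and the resolve() helper are all hypothetical and exist only for this example. The rows are taken from the examples in this thread; the CPU spellings are illustrative.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not LLVM's API.
struct ResolvedTarget {
  std::string DefaultCPU;
  bool LittleEndian;
};

// Key: (triple string, toolchain profile chosen at configure time).
using ResolutionTable =
    std::map<std::pair<std::string, std::string>, ResolvedTarget>;

// The same triple string maps to different defaults under different profiles.
ResolutionTable makeExampleTable() {
  return {
      {{"i386-linux-gnu", "debian-etch"}, {"i486", true}},
      {{"i386-linux-gnu", "debian-recent"}, {"i586", true}},
      {{"mips-linux-gnu", "debian-jessie"}, {"mips2", false}},
      {{"mips-linux-gnu", "llvm-default"}, {"mips32r2", false}},
  };
}

// Returns true and fills Out when the (triple, profile) pair is known.
bool resolve(const ResolutionTable &Table, const std::string &Triple,
             const std::string &Profile, ResolvedTarget &Out) {
  assert(!Triple.empty() && "expected a non-empty triple string");
  auto It = Table.find({Triple, Profile});
  if (It == Table.end())
    return false;
  Out = It->second;
  return true;
}
```

The point of the sketch is only that the triple alone is not a sufficient key: some extra piece of environment has to pick between mutually exclusive meanings before the middle and back ends see anything.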
>>> What can't be done to TargetMachine to avoid this serialization?
>> TargetMachine already has the serialization (see TargetMachine::TargetTriple).
>> We're not doing anything new here. We're simply replacing one object holding
>> faulty information with a new object holding reliable information.
>
> This is side stepping my question and making it about Triple. I've
> specifically said that TargetMachine does not and is not completely dependent
> upon Triple.

Again, you said 'completely'. If any portion is dependent on the faulty
information in Triple then the behaviour is incorrect.

>>> And a followup question: What can't be serialized at the function level in
>>> the IR to make certain things clear that aren't global? We already do this
>>> for a lot of command line options.
>> The data I want to fix is global. I think the bit you may be getting hung up
>> on here is that small portions of this global data can also be overridden at
>> the function level. Those overrides aren't a problem and continue to operate
>> in the same way as they do today.
>
> Examples please.

Examples of what? The point was that you're on the wrong track. The data I want
to be correct is the data that is already in the triple but is incorrect.

>>> And one more: What global options do we need to consider here?
>> I'm not certain I understand this question. If you're talking command line
>> options, it's things like -EL, -EB, -mips32, -mips32r[2356], -mips64,
>> -mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
>> mappings, there's quite a wide variety but the main ones for Mips are endian,
>> architecture, default CPU, and default ABI.
>
> All of these are representable right now in the TargetMachine as far as I can
> tell. What examples are you having problems with?

Architecture = Stored in 'TargetMachine::TargetTriple'. As previously explained,
               it's also sometimes incorrect.
Endian       = Stored in 'TargetMachine::TargetTriple'. As previously explained,
               it's also sometimes incorrect.
CPU          = You're correct on this one.
Default CPU  = Implied by the triple but often hardcoded to a single value
               regardless of the correct behaviour of the triple on this target.
Default ABI  = Implied by the triple but often hardcoded to a single value
               regardless of the correct behaviour of the triple on this target.

Examples are earlier in this email.

>>> The goal of the configuration level of the TargetMachine is that it controls
>>> things that don't change at the object level. This is a fairly recently
>>> stated goal, but I think it makes sense for LLVM in general.
>>> TargetSubtargetInfo takes care of everything that resides under this (as
>>> much as possible, some bits are still in transition, e.g. TargetOptions).
>>> This is part of my suggestion to Daniel about the problems with
>>> MCSubtargetInfo and the assembler. Targets like Mips and ARM were
>>> unfortunately designed to change things on the fly during assembly and need
>>> to collate or at least change defaults as we're processing code. I
>>> definitely had to deal with a lot of the pain you're talking about when I
>>> was rewriting some of the handling there during the TargetSubtargetInfo
>>> work.
>>
>> I generally agree with this. The key bit I need to draw attention to is that
>> the 'defaults' don't change, but are instead overridden. These constant
>> defaults are stored in TargetMachine and particularly
>> TargetMachine::TargetTriple. These defaults are wrong for some toolchains
>> since the information stored in TargetMachine::TargetTriple is wrong. It's
>> the defaults I'm trying to fix rather than the overrides.
>
> I don't understand what you mean here.

I'm trying to clarify that the default CPU/ABI/etc. is defined by the triple.
There is no single default which is correct for every toolchain.

* Triple implies the default subject to triple customization.
* TargetTuple carries the default to TargetMachine constructor.
* TargetMachine holds the value to use for the compilation unit (except where
  overridden).
* TargetSubtargetInfo holds the value to use for the function.
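
The chain above can be sketched roughly as follows. The three structs are hypothetical stand-ins named after the layers in the list and are not the real LLVM classes; the empty-string-means-no-override convention is likewise just an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the layers described above; not LLVM classes.

// Resolved from the (possibly customized) triple as early as possible.
struct TargetTupleSketch {
  std::string DefaultABI;
};

// Holds the value used for the whole compilation unit.
struct TargetMachineSketch {
  std::string ABI;
  explicit TargetMachineSketch(const TargetTupleSketch &TT)
      : ABI(TT.DefaultABI) {
    assert(!ABI.empty() && "the tuple must carry a default ABI");
  }
};

// Holds the value used for one function. An empty override means "inherit
// the compilation-unit value".
struct SubtargetSketch {
  std::string ABI;
  SubtargetSketch(const TargetMachineSketch &TM, const std::string &FnOverride)
      : ABI(FnOverride.empty() ? TM.ABI : FnOverride) {}
};
```

Note that nothing in the lower two layers can repair a wrong default: if the value entering through the tuple is wrong, every function that does not override it inherits the error.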
>> I think I understand your proposed plan now and it's a few steps ahead of
>> where we are and where we need to be. I agree that overridable state should
>> be in TargetSubtargetInfo, however I can't initialize that state without the
>> default values which come from the faulty information in
>> TargetMachine::TargetTriple. This triple work is a pre-requisite to your plan
>> and at first I don't need to override ABI's.
>
> Can you provide an example of using a tool that you're having problems with?

I'm using this particular example because I have the machine to hand. There are
more examples above. On Debian Jessie (mips):

$ touch empty.c
$ gcc -c -o empty.gcc.o empty.c
$ clang-3.5 -c -o empty.clang.o empty.c
$ file empty.*.o
empty.clang.o: ELF 32-bit MSB relocatable, MIPS, MIPS32 rel2 version 1 (SYSV), not stripped
empty.gcc.o:   ELF 32-bit MSB relocatable, MIPS, MIPS-II version 1 (SYSV), not stripped

Jessie only has clang-3.5 but you'd get the same result with clang-3.7 and
clang-3.8.

> Could you please provide some examples of things that are impossible right now
> with command lines, how those interact with the TargetMachine, and how you see
> it being impossible to deal with?

There are some examples above but I'll give the detail in the morning. It's
11:30pm at the moment :-).

________________________________
From: Eric Christopher [echristo at gmail.com]
Sent: 22 September 2015 20:40
To: Daniel Sanders; Renato Golin; Jim Grosbach
Cc: llvm-dev at lists.llvm.org
Subject: Re: The Trouble with Triples

On Thu, Sep 17, 2015 at 6:21 AM Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
I think we need to take a step further back and re-enter from the right starting
point. The thing that's bothering me about the push back so far is that
it's trying to discuss and understand the consequences of resolving the core
problem while seemingly ignoring the core problem itself. The reason I've
been steering everything back to GNU Triple's being ambiguous and
inconsistent is because it's the root of all the problems and the fixes to
the various issues fall out naturally once this core point has been addressed.

*sigh*

Here's the line of thought that I'd like people to start with:

•         Triples don't describe the target. They look like they should, but
they don't. They're really just arbitrary strings.

Triples are used as a starting point, but no more.

•         LLVM relies on Triple as a description of the target. It defines the
backend to use, the binary format to use, OS and Vendor specific quirks to
enable/disable, the default CPU, the default ABI, the endian, and countless
other details about the target.

These two statements aren't necessarily true in whole.

a) We don't use the Triple to fully specify the target.
b) We don't use the Triple to fully specify the ABI.
c) We don't use the Triple to fully specify the CPU.
d) We do use the triple to handle endianness since most, if not all, triples
actually bother to encode endianness.
e) The rest of the "countless details" may or may not be relevant, you
haven't given an example of what you care about.

From here on your email relies on all of these assumptions being true. So
I'm going to skip past that part and go to where you answer some of my
questions.

At this point, in the MC layer we have a number of classes that need to know the
ABI but lack this information. Our TargetMachine has an accurate TargetTuple
object that describes the invariants of the desired target. The desired ABI is
an invariant too so why not have it in the TargetTuple which is already plumbed
in everywhere we need it? After all, it's a property of the target
OS/Environment. If we have the ABI in the TargetTuple, then we don't need
any other means to set the ABI, tools can set it up front in the TargetTuple and
we don't need any command-line option handling for it in the backend.

This isn't sufficient anyways as I don't want to depend on a weird
serialization format to deal with something a simple command line can deal with
(or you've said this in a way that's confused me). I see you saying you
want:

-tuple mips-linux-gnu-abio32-el

to specify on a command line to, say, llvm-mc or a new assembler interface, or
heck, to clang itself, that you want to compile for:

-triple mipsel-linux-gnu -mabi=o32

right? Basically? (Bikeshedding of how to actually serialize things aside?)

Meanwhile, in clang we have a number of command line options that change the
desired target. Let's say we've constructed a Triple and resolved it to
TargetTuple (more on that below). We're now processing the -EL option. At
the moment, we substitute our mips-linux-gnu triple for a mipsel-linux-gnu
triple, construct a Triple object from it and resolve the new Triple to a
TargetTuple. But why do we need to bother with that kind of weird hackery when
we can simply do Obj.setEndian(Little)? This is what Phase 7 of the plan is
about. We end up with a cleaner way to process target changes that, until now,
have required weird triple hacking to handle.

This is something else I don't understand. Here is the first time you start
talking about APIs which is what I'm particularly asking about in my earlier
mails. I'd like to see how you plan on changing the TargetMachine and MC
level APIs to deal with this. It seems like the Tuple is going to be a way to
side-load information around to the MC layer and while I agree that something is
necessary there, I don't think that this solution is the right one. (As I
said earlier in the thread)

I skipped the Triple -> TargetTuple resolution a moment ago and I should
address that now. We already know that mapping Triple to TargetTuple is a many
to many mapping. One Triple has many possible TargetTuple's depending on the
environment. One TargetTuple can be formed from multiple possible Triples. In an
ideal world, we'd like to bake in all of these mappings so that one clang
binary supports everything. Unfortunately, being a many to many mapping, some of
these mappings are mutually exclusive. Note that this isn't a new problem
resulting from this project. The problem has always been there but has been
ignored until now. To resolve this, we need to provide configure-time and
possibly run-time controls for how this conversion is disambiguated. This
resolution is performed as early as possible so that the middle/back-ends
don't need to know anything about the ambiguity problem.

The minute you start talking about configure time controls we've already
lost. This, for me, is a non-starter. That said, I'd like to see the
examples you think show that things are impossible to deal with in the current
architecture.

---

To reply more directly to your email:

Thanks :)
> What can't be done to TargetMachine to avoid this serialization?
TargetMachine already has the serialization (see TargetMachine::TargetTriple).
We're not doing anything new here. We're simply replacing one object
holding faulty information with a new object holding reliable information.

This is side stepping my question and making it about Triple. I've
specifically said that TargetMachine does not and is not completely dependent
upon Triple.
> And a followup question: What can't be serialized at the function level in
> the IR to make certain things clear that aren't global? We already do this
> for a lot of command line options.

The data I want to fix is global. I think the bit you may be getting hung up on
here is that small portions of this global data can also be overridden at the
function level. Those overrides aren't a problem and continue to operate in
the same way as they do today.

Examples please.
> And one more: What global options do we need to consider here?
I'm not certain I understand this question. If you're talking command
line options, it's things like -EL, -EB, -mips32, -mips32r[2356], -mips64,
-mips64r[2356], -mabi=…. If you're talking about Triple -> TargetTuple
mappings, there's quite a wide variety but the main ones for Mips are
endian, architecture, default CPU, and default ABI.

All of these are representable right now in the TargetMachine as far as I can
tell. What examples are you having problems with?

> The goal of the configuration level of the TargetMachine is that it controls
> things that don't change at the object level. This is a fairly recently
> stated goal, but I think it makes sense for LLVM in general.
> TargetSubtargetInfo takes care of everything that resides under this (as
> much as possible, some bits are still in transition, e.g. TargetOptions).
> This is part of my suggestion to Daniel about the problems with
> MCSubtargetInfo and the assembler. Targets like Mips and ARM were
> unfortunately designed to change things on the fly during assembly and need
> to collate or at least change defaults as we're processing code. I
> definitely had to deal with a lot of the pain you're talking about when I
> was rewriting some of the handling there during the TargetSubtargetInfo
> work.

I generally agree with this. The key bit I need to draw attention to is that the
'defaults' don't change, but are instead overridden. These constant
defaults are stored in TargetMachine and particularly
TargetMachine::TargetTriple. These defaults are wrong for some toolchains since
the information stored in TargetMachine::TargetTriple is wrong. It's the
defaults I'm trying to fix rather than the overrides.

I don't understand what you mean here.

I think I understand your proposed plan now and it's a few steps ahead of
where we are and where we need to be. I agree that overridable state should be
in TargetSubtargetInfo, however I can't initialize that state without the
default values which come from the faulty information in
TargetMachine::TargetTriple. This triple work is a pre-requisite to your plan
and at first I don't need to override ABI's.

Can you provide an example of using a tool that you're having problems with?

> Right now I see TargetTuple as trying to take over all of the various
> arguments to TargetMachine and encapsulate them into a single thing. I also
> don't see this is bad, but I also don't see it taking all of them right now
> and I'm not sure how it solves some of the existing problems with data
> sharing that we've got which is where the push back you're both getting is
> coming from here. Ultimately library-wise I can agree with some of the
> directions you're headed - I just don't see the unification and
> interactions right now.

I think we'll end up with TargetTuple taking over many arguments to
TargetMachine but that's not my goal at this stage. My goal is simply to fix
the faulty information currently held in Triple and use the now-accurate
information in TargetTuple to fix various blocking issues that prevent a proper
Mips toolchain product based on Clang/LLVM. At the end of Phase 7, it becomes
possible to fix a number of issues that are impossible to fix right now because
the available data we can consult at the moment is incorrect.

Could you please provide some examples of things that are impossible right now
with command lines, how those interact with the TargetMachine, and how you see
it being impossible to deal with?

Thanks

-eric

From: Eric Christopher [mailto:echristo at gmail.com]
Sent: 16 September 2015 23:52
To: Renato Golin; Jim Grosbach
Cc: Daniel Sanders; llvm-dev at lists.llvm.org
Subject: Re: The Trouble with Triples

Let's take a step back here.

It appears that you and Daniel are trying to solve some problems. I think
solving problems is good, I just want to make sure that we're solving them
in a way that gets us a decent API at the end. I also want to make sure
we're solving the right problems.

TargetTuple appears to be related to the TargetParser as you bring up in this
mail. They're two separate parts of similar problems - people trying to both
serialize command line options and communication from the front end to the
backend with respect to target information.

This leads me to a question: What can't be done to TargetMachine to avoid
this serialization?
And a followup question: What can't be serialized at the function level in
the IR to make certain things clear that aren't global? We already do this
for a lot of command line options.
And one more: What global options do we need to consider here?

The goal of the configuration level of the TargetMachine is that it controls
things that don't change at the object level. This is a fairly recently
stated goal, but I think it makes sense for LLVM in general. TargetSubtargetInfo
takes care of everything that resides under this (as much as possible, some bits
are still in transition, e.g. TargetOptions). This is part of my suggestion to
Daniel about the problems with MCSubtargetInfo and the assembler. Targets like
Mips and ARM were unfortunately designed to change things on the fly during
assembly and need to collate or at least change defaults as we're processing
code. I definitely had to deal with a lot of the pain you're talking about
when I was rewriting some of the handling there during the TargetSubtargetInfo
work.

Now a bit more on TargetParser + TargetTuple:

TargetParser appears to be trying to solve the parsing in Triple in a nice way
for ARM and also some of the "what kind of subtarget feature
canonicalization can we do in llvm that makes sense to communicate to the front
end". I like this particular idea and have often wanted a library of
feature handling, but it seems to have stabilized at an ARM specific set of code
with no defined interface. I can't even figure out how I'd use it in
lib/Basic right now for any target other than ARM. This isn't a condemnation
of TargetParser, but I think it's something that needs to be thought through
a bit more. It's been hooked up well before I'd expected it to and right
now if we moved it to the ARM backend from Support it'd make just as much
sense as it does where it is now other than making clang depend on the ARM
backend as well as the X86 backend :)

Right now I see TargetTuple as trying to take over all of the various arguments
to TargetMachine and encapsulate them into a single thing. I also don't see
this is bad, but I also don't see it taking all of them right now and
I'm not sure how it solves some of the existing problems with data sharing
that we've got which is where the push back you're both getting is
coming from here. Ultimately library-wise I can agree with some of the
directions you're headed - I just don't see the unification and
interactions right now.

As a suggestion as a way forward here let's see if we can get my questions
above answered and also show some of how the interactions between llvm's
libraries are going to get fixed, moved to a better place, etc here.

Thanks!

-eric

On Wed, Sep 16, 2015 at 3:02 PM Renato Golin <renato.golin at linaro.org> wrote:
On 16 September 2015 at 21:56, Jim Grosbach <grosbach at apple.com> wrote:
> Why do we care about GAS? We have an assembler.

It's not that simple.

There is a lot of old code out there, including the Linux kernel,
which we care a lot about, that only compiles with GAS. We're slowly
moving the legacy code up to modern standards, and specifically some
kernel folks are happy to move up not only the asm syntax, but the C
standard and move away from GNU-specific behaviour. But we're not
quite there yet, and might not be for a few more years. So, yes, we
still care about GAS.

But this is not just about GAS.

As I said in my previous email, this is about clearing the bloat in
target descriptions by both: removing the need for adding numerous CPU
names, target features, architecture names (xscale, strongarm, etc),
AND making sure all parties (front/middle/back-ends) speak the same
language, produced from the same source.

The TargetTuple is that common language, and the TargetParser created
from the TableGen files is the common source. The Triple becomes a
legacy constructor value for the Tuple. All other target information
classes are already (or should be) generated from the TableGen files,
so the ultimate source becomes the TableGen description, which I think
is what you were aiming at in your comment.

For simple architectures, like x86, you don't even need a
TargetParser. You can easily construct the Tuple from a triple and use
the Tuple as you've always used the triple. No harm done. But for the
complex ones like ARM and MIPS, having a common interface generated
from the same place the other interfaces are is important to avoid
more bridges between front and middle and back end interpretations of
the same target. Whatever legacy ARM or MIPS carry can be isolated in
their own implementation, leaving the rest of the targets with a clean
and simple interface.
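
To make the "Triple as a legacy constructor value" idea concrete, here is a minimal sketch. It is not LLVM's actual class: the struct layout, the field names, and the resolveTargetDetails() helper are all hypothetical, and the splitting logic assumes a well-formed four-component triple rather than doing the normalization a real implementation would need. The only target knowledge it encodes is that "mips" and "mips64" are big-endian while "mipsel" is little-endian.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch of the proposed TargetTuple; this is NOT LLVM's class.
struct TargetTuple {
  std::string Arch, Vendor, OS, Environment;
  bool BigEndian = false;

  // The triple string is only used here, as a legacy constructor value.
  explicit TargetTuple(const std::string &Triple) {
    assert(!Triple.empty() && "expected a non-empty triple");
    std::vector<std::string> Parts = split(Triple);
    Parts.resize(4); // missing trailing components become empty strings
    Arch = Parts[0];
    Vendor = Parts[1];
    OS = Parts[2];
    Environment = Parts[3];
    resolveTargetDetails();
  }

private:
  // Split on the first three '-' characters; the remainder is kept whole.
  // No normalization is attempted (assumption: input is arch-vendor-os-env).
  static std::vector<std::string> split(const std::string &Triple) {
    std::vector<std::string> Parts;
    std::string::size_type Start = 0;
    while (Parts.size() < 3) {
      std::string::size_type Dash = Triple.find('-', Start);
      if (Dash == std::string::npos)
        break;
      Parts.push_back(Triple.substr(Start, Dash - Start));
      Start = Dash + 1;
    }
    Parts.push_back(Triple.substr(Start));
    return Parts;
  }

  // Simple targets need nothing beyond the copied fields. Complex targets
  // resolve their quirks once, here, instead of in every consumer.
  void resolveTargetDetails() {
    if (Arch == "mips" || Arch == "mips64")
      BigEndian = true;
  }
};
```

With this sketch, TargetTuple("x86_64-pc-linux-gnu") just carries the four fields through unchanged, while TargetTuple("mips-unknown-linux-gnu") additionally records big-endian. Consumers would then query the tuple instead of re-interpreting the string.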

cheers,
--renato