thr3ads.net - llvm dev - [llvm-dev] The Trouble with Triples [Sep 2015]

Renato Golin via llvm-dev

2015-Sep-15 19:21 UTC

[llvm-dev] The Trouble with Triples

On 15 September 2015 at 19:34, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> We can go further with this analogy too. For example, let's say John Smith
> with the SSN Y also answers to the name Rameses. This is the problem that
> Renato is working on. Renato needs to be able to see the name Rameses and
> map this to the correct John Smith (or at least someone very much like him).
> This is the gist of what ARMTargetParser is/was doing.

A good example is "krait", a CPU design from Qualcomm.

Krait used to be mapped to Cortex-A15 because it has VFP4 and HDIV,
but architecturally, it is a lot closer to a Cortex-A9 than an A15. So
assuming that Krait == A15 means making a lot of bad optimisation
decisions in the back-end, and the code performed poorly.

This year we made the change, so that Krait == A9+HDIV+VFP4, but
neither the triple nor the CPU descriptions could cope with that.
So a hack was added to special-case "krait", changing the
internal options (triple and others) to explicitly say cpu=A9 + HDIV +
VFP4.
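
As a rough illustration of the kind of special-casing described above, here is a minimal Python sketch of a CPU alias table. It is not how Clang or LLVM implement it; `CpuSpec`, `CPU_ALIASES`, `resolve_cpu` and the lower-case feature names are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSpec:
    """A base CPU plus the extra features layered on top of it."""
    base_cpu: str
    extra_features: frozenset


# Hypothetical alias table: "krait" is expressed as A9 + HDIV + VFP4,
# mirroring the description in the message above.
CPU_ALIASES = {
    "krait": CpuSpec("cortex-a9", frozenset({"hwdiv", "vfp4"})),
}


def resolve_cpu(name: str) -> CpuSpec:
    """Expand a user-facing CPU name into a base CPU and extra features."""
    key = name.lower()
    if key in CPU_ALIASES:
        return CPU_ALIASES[key]
    # Names without an alias map to themselves with no extra features.
    return CpuSpec(key, frozenset())


spec = resolve_cpu("krait")
print(spec.base_cpu, sorted(spec.extra_features))
```

The point of the sketch is that the expansion lives in one table rather than being scattered through the driver and back-end.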

Logic like that is spread all over the compiler, not just in the
driver but also in the back-end. For example, the recent discussions
about GNUEABI not entirely being the same as EABI, and the need for
additional flags, such as "-meabi=gnu". This has direct consequences
for the back-end, which has knowledge that it should not have (when
gnueabi is really just eabi, and when it's not). The back-end should
be ignorant of such things and just have a flag: if "isEabi" do this,
"else" do that.
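
A minimal Python sketch of that idea, under stated assumptions: the environment string is classified once, up front, and later code only consults boolean flags. The names `AbiInfo`, `classify_environment`, `pick_runtime_call_style` and the set of environment strings are hypothetical and chosen for illustration; they are not LLVM's actual rules.

```python
from dataclasses import dataclass

# Assumption for illustration only: environment names treated as EABI-like.
EABI_ENVIRONMENTS = {"eabi", "eabihf", "gnueabi", "gnueabihf"}


@dataclass(frozen=True)
class AbiInfo:
    is_eabi: bool
    is_gnu: bool


def classify_environment(env: str) -> AbiInfo:
    """Do the string parsing once and hand out plain flags."""
    env = env.lower()
    return AbiInfo(is_eabi=env in EABI_ENVIRONMENTS,
                   is_gnu=env.startswith("gnu"))


def pick_runtime_call_style(abi: AbiInfo) -> str:
    """Stand-in for back-end code: it sees only the flag, never the string."""
    return "aeabi" if abi.is_eabi else "generic"


print(pick_runtime_call_style(classify_environment("gnueabi")))
```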

But triples can't do that at all. As Daniel demonstrated, they have a
very specific, irrational and sometimes perverse meaning. Any change
we make to the triples will impact compatibility with other toolchains
and we can't afford that.

So, instead of refactoring the Triple to carry all legacy + all decent
logic, we thought that by making the Triple into *just* legacy, and
having a Tuple as the new way forward, we'd achieve two main goals that
have been discussed in the community for a very long time:

1. Separate target-specific crud from the rest. That also includes
"legacy ARM" from "new ARM". All users of these interfaces should not
just be unaware of the underlying complexity, but also not have to pay
the price for that complexity. So all string-parsing, name-mangling,
legacy-wrapper should be the last action, not the first.

2. Be able to create complex target descriptions from the command
line, or a config file, or some database, or whatever. Especially in
the ARM world, target description databases are not uncommon, as they
solve the ambiguity problem in a clear way. But we don't want to move
everyone into a complex database just because ARM description is a
mess. So, the design is to have the Tuple constructed from a Triple
for legacy and simplicity reasons, but keeping an accurate and
unambiguous description for the more complex targets, while allowing
*other* methods of creating a Tuple for those of us who need it.

So, by default, or if the driver has a "-target foo", it uses the
triple to create the tuple, and no change in behaviour is observed.
OTOH, if the driver has a "-config arm-foo.yaml", it creates the Tuple
from that config file, while still keeping the same unique values
being passed down the target description classes to the back-end, and
*no change in behaviour is observed*.
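
To make the two construction paths concrete, here is a small Python sketch under stated assumptions. `TargetTuple`, `from_triple` and `from_config` are hypothetical names, the triple split is deliberately naive (real triple normalisation is more involved), and the config is shown as an already-parsed dict rather than a YAML file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetTuple:
    arch: str
    vendor: str
    os: str
    environment: str
    cpu: str = "generic"
    features: frozenset = frozenset()

    @classmethod
    def from_triple(cls, triple: str) -> "TargetTuple":
        """Legacy path: derive everything from the triple string."""
        parts = triple.split("-", 3)
        parts += ["unknown"] * (4 - len(parts))  # pad missing components
        arch, vendor, os_name, env = parts
        return cls(arch, vendor, os_name, env)

    @classmethod
    def from_config(cls, config: dict) -> "TargetTuple":
        """Richer path: start from a triple, then refine CPU and features."""
        base = cls.from_triple(config["triple"])
        return cls(base.arch, base.vendor, base.os, base.environment,
                   cpu=config.get("cpu", base.cpu),
                   features=frozenset(config.get("features", ())))


legacy = TargetTuple.from_triple("armv7-unknown-linux-gnueabihf")
rich = TargetTuple.from_config({
    "triple": "armv7-unknown-linux-gnueabihf",
    "cpu": "cortex-a9",
    "features": ["vfp4", "hwdiv"],
})
print(legacy)
print(rich)
```

Either way the consumer receives the same kind of object, which is what lets the legacy path keep its existing behaviour.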

Hope that helps,

cheers,
--renato

Jim Grosbach via llvm-dev

2015-Sep-15 20:58 UTC

[llvm-dev] The Trouble with Triples

> On Sep 15, 2015, at 12:21 PM, Renato Golin <renato.golin at linaro.org> wrote:
>
> On 15 September 2015 at 19:34, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
>> We can go further with this analogy too. For example, let's say John Smith
>> with the SSN Y also answers to the name Rameses. This is the problem that
>> Renato is working on. Renato needs to be able to see the name Rameses and
>> map this to the correct John Smith (or at least someone very much like him).
>> This is the gist of what ARMTargetParser is/was doing.
> 
> A good example is "krait", a CPU design from Qualcomm.
> 
> Krait used to be mapped to Cortex-A15 because it has VFP4 and HDIV,
> but architecturally, it is a lot closer to a Cortex-A9 than an A15. So
> assuming that Krait == A15 means making a lot of bad optimisation
> decisions in the back-end, and the code performed poorly.
> 
> This year we made the change, so that Krait == A9+HDIV+VFP4, but
> neither the triple, nor the CPU descriptions could cope with that.
> Then, a hack was made to treat "krait" especially, changing the
> internal options (triple and others) to explicitly say cpu=A9 + HDIV +
> VFP4.

That’s not quite accurate. It’s not A9+HDIV+VFP. It uses the A9 scheduling
model, yes, but has its own completely distinct list of subtarget features and
such:

def : ProcessorModel<"krait",       CortexA9Model,
                                    [ProcKrait, HasV7Ops,
                                     FeatureNEON, FeatureDB,
                                     FeatureDSPThumb2, FeatureHasRAS,
                                     FeatureAClass]>;  

def ProcKrait   : SubtargetFeature<"krait", "ARMProcFamily", "Krait",
                                   "Qualcomm ARM processors",
                                   [FeatureVMLxForwarding,
                                    FeatureT2XtPk, FeatureFP16,
                                    FeatureAvoidPartialCPSR,
                                    FeatureTrustZone,
                                    FeatureVFP4,
                                    FeatureHWDiv,
                                    FeatureHWDivARM]>;

Having the triple, or anything else outside this level of implementation
details, refer to krait in terms of cortex-a9 is a problem.


Daniel Sanders via llvm-dev

2015-Sep-16 10:13 UTC

[llvm-dev] The Trouble with Triples

> That’s not quite accurate. It’s not A9+HDIV+VFP. It uses the A9 scheduling
> model, yes, but has its own completely distinct list of sub target features and
> such:

Are you sure? I don't know ARM very well so I might be missing something,
but Krait's feature list is the same as A9's with the addition of
FeatureVFP4, FeatureHWDiv, and FeatureHWDivARM. This sounds like A9+HDIV+VFP.
For reference, here are the lists side-by-side:

    A9:    FeatureVMLxForwarding, FeatureT2XtPk, FeatureFP16,
           FeatureAvoidPartialCPSR, FeatureTrustZone
    Krait: FeatureVMLxForwarding, FeatureT2XtPk, FeatureFP16,
           FeatureAvoidPartialCPSR, FeatureTrustZone, FeatureVFP4,
           FeatureHWDiv, FeatureHWDivARM
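
The comparison can be checked mechanically. The short Python sketch below takes the two lists exactly as quoted in this message (not re-read from the TableGen files) and computes the set differences; the variable names are just for illustration.

```python
# Feature lists as quoted in this message.
A9_FEATURES = {
    "FeatureVMLxForwarding", "FeatureT2XtPk", "FeatureFP16",
    "FeatureAvoidPartialCPSR", "FeatureTrustZone",
}
KRAIT_FEATURES = {
    "FeatureVMLxForwarding", "FeatureT2XtPk", "FeatureFP16",
    "FeatureAvoidPartialCPSR", "FeatureTrustZone", "FeatureVFP4",
    "FeatureHWDiv", "FeatureHWDivARM",
}

# Features Krait has that A9 lacks, and the reverse.
only_in_krait = KRAIT_FEATURES - A9_FEATURES
only_in_a9 = A9_FEATURES - KRAIT_FEATURES

print(sorted(only_in_krait))  # ['FeatureHWDiv', 'FeatureHWDivARM', 'FeatureVFP4']
print(sorted(only_in_a9))     # []
```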

The other thing I want to mention is that some triples do legitimately contain
things that aren't architecture names. Processor names in the triple have
fallen out of favour in Mips triples (we used to have things like
mips64vr5900-* for the vr5900 processor) but we still have things like 'xc32',
which refers to the Mips-based PIC32. We don't want the backend to need to know
about 'xc32' and we don't want to register a Triple::xc32 target, but it's also
incorrect to treat it as a simple alias for 'mips'.

Renato Golin via llvm-dev

2015-Sep-16 10:15 UTC

[llvm-dev] The Trouble with Triples

On 15 September 2015 at 21:58, Jim Grosbach <grosbach at apple.com> wrote:
> That’s not quite accurate. It’s not A9+HDIV+VFP. It uses the A9 scheduling
> model, yes, but has its own completely distinct list of sub target features and
> such:

Well, this is the target description in the TableGen files, and not
exactly what I was talking about. When available, A9 has VFP3, while
Krait has VFP4.

Which brings up the other side of the discussion, around the
TargetParser: the information in Clang is completely disconnected
from the TableGen descriptions. But that's not relevant to this
discussion.

> Having the triple, or anything else outside this level of implementation
> details, refer to krait in terms of cortex-a9 is a problem.

The Krait case is special, because GAS doesn't support it, so we have
to use A9 + target-features or compilation breaks when using GAS.
We're trying to get it into GAS, but it could take a while. When it
lands, we'll change it.
But the ARM world is more complicated than that. Cores come in so many
variations that it would be impossible to give a name to all of them,
or more importantly, it would bloat our target descriptions to an
unmanageable point. Having a way to access the target features at that
level via a high-level target description is crucial to keep all that
mess to a minimum, while still having a complete and sane
implementation.

Bottom line is, we can't create a name for each CPU+Arch+Features
variation out there, for both technical (bloat in LLVM, maintenance
nightmare) and political (interaction with other tools and LLVM-based
products) reasons.

cheers,
--renato
