thr3ads.net - llvm dev - [llvm-dev] The Trouble with Triples [Aug 2015]

If this information is useful, please help other people find it:
Share via:

Eric Christopher

2015-Jul-29 22:47 UTC

[LLVMdev] The Trouble with Triples

>
> The Triple object will remain unchanged.
>
> The Tuple will be the API to handle getting/setting parameters
> depending on the Triple, compiler flags, attributes, etc.
>
>This part doesn't seem obvious from the direction the patches are going.

> There will be no string representation of all options, as that would
> be impossible, or at least, highly undesirable, especially in the IR.
>
>Yes.

> The Tuple is for the sole use of front-ends, middle-ends and back-ends
> to communicate and understand the *same* meaning regarding the *same*
> input.
>
>Definitely don't want this in the middle end at all. That all can be part
of the TargetMachine/TargetSubtargetInfo interface.

> Having a Tuple class that encodes details of the targets go a long way
> to ensure that, since you can directly pass the Tuple when you build
> the Target objects, and the information it provides will be identical,
> no matter where it is. Right now, we have multiple representations of
> the targets' information because the Triple object cannot encode every
> aspect of them, especially problematic between Clang and LLVM.
>
>This part I agree with.

> The decision to create a new class (Tuple) is because Triple already
> has a legacy-heavy meaning, which should not change.
>
>Agreed with at least the "because" part.

> This is not about the serialization of the target information, but
> rather the consistent manipulation of it inside the compiler.
>
>
> > My suggestion on a route forward here is that we should look at the
> > particular API and areas of the backend that you're having an
issue with
> and
> > figure out how to best communicate the data you'd like to the
appropriate
> > area.
>
> That is precisely what he's doing. :)
>
>OK. What's the general class design look like then? The text from the
original mail was fairly confusing then as I thought he was doing something
that you say he isn't doing :)

-eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150729/65e9019b/attachment.html>

Daniel Sanders via llvm-dev

2015-Aug-08 14:00 UTC

head link

[llvm-dev] The Trouble with Triples

Re-sent with the new mailing list CC'd.
________________________________________
From: Daniel Sanders
Sent: 08 August 2015 14:58
To: Renato Golin; Eric Christopher
Cc: LLVM Developers Mailing List (llvmdev at cs.uiuc.edu); Jim Grosbach
(grosbach at apple.com)
Subject: RE: The Trouble with Triples
> > OK. What's the general class design look like then? The text from
the
> > original mail was fairly confusing then as I thought he was doing
something
> > that you say he isn't doing :)
> Daniel, can you send your current plan for the Tuple class?
Sorry for being a bit slow to answer this bit. It's been a busy week due to
my sister's wedding.

The plan for TargetTuple's implementation is fairly brief since it is
heavily based on the current llvm::Triple but it has two main parts to it. The
first is fairly uninteresting and is to create a TargetTuple class with (almost)
the same interface as Triple and have this be the primary object throughout the
core libraries. This separates the GNU Triple from the target description and
gives us room to make changes to the target description without changing the
meaning of 'GNU Triple'. This is what the patch I posted is starting and
we've already covered this part of the plan so I'll focus on the second
part.

The second part is to refactor the representation and tweak the
interface/implementation of TargetTuple into something more appropriate and
complete for a target description. As you'll see, these
implementation/interface changes compared to llvm::Triple are fairly minor. The
key improvement in the addition of the TargetTuple is in the ability to
represent the real target after tool options are taken into account and
llvm::Triple's methods are generally good as they are.

The member variables will be similar in principle to llvm::Triple except that
the string form (if we have any at all following the discussion elsewhere in
this thread) will not be stored. This is because, unlike Triple, we can
reconstruct it from the fields. Member variables will be split where they
currently specify multiple things. For example, the architecture field sometimes
specifies endianness in values like 'mips'/'mipsel' and these
will be split into an architecture value 'mips' and a separate
endianness boolean. I don't think we need to handle mixed-endian but we can
use enums if necessary. I'm expecting to have the following members in the
base class by the end (all undefined types here are enums and most are in the
posted patch):

  // The architecture with no additional details such as endianness.
  // Note that this is not the same ArchType as in llvm::Triple. For example,
'arm' and 'armeb' are both
  // simply 'arm' in TargetTuple. Similarly for other endian variants
and things like 'mips'/'mips64' where
  // 'mips64' isn't a significantly different architecture from
'mips'
  ArchType Arch;

  // The Architecture revision or variant
  // Like Arch, this isn't the same as the one in llvm::Triple since more
architectures ought to be using it than is necessary for llvm::Triple.
  // For example, Mips architecture revisions should be represented here even
though GNU triples don't state it. For completeness,
  // I should mention that there are some unconventional triples that do mention
Mips architecture revisions but we don't implement them at the moment.
  SubArchType SubArch;

  // The vendor.
  // This doesn't have to be the same as the one in llvm::Triple but I see
little reason to differ.
  VendorType Vendor;

  // The OS.
  // This doesn't have to be the same as the one in llvm::Triple but I see
little reason to differ.
  OSType OS;

  // The environment.
  // This doesn't have to be the same as the one in llvm::Triple but I see
little reason to differ.
  EnvironmentType Environment;
  unsigned MajorVersion, MinorVersion, MicroVersion;

  // The object format (ELF, MachO, COFF, etc.) in use.
  ObjectFormatType ObjectFormat;

  // The endianness. If we ever need mixed-endian or other variants we can
change this to an enum.
  bool BigEndian;

  // The ABI. The value 'Standard' can be used for targets with only one
ABI.
  ABIType ABI;

  // The SubABI. As far as I know, few architectures need this so it may make
more sense in a target specific subclass
  SubABIType SubABI;

To be clear, the above list is based on the assumption that we can serialize a
TargetTuple in IR without including a GNU Triple. If we do want to always write
out the GNU Triple when writing IR then we will need to add it to this list of
members since we can't derive it from the rest of the TargetTuple (we can
have zero or multiple valid GNU Triples for a given TargetTuple).

As these members are introduced, we will re-implement the predicates that
we've carried over from llvm::Triple in terms of these members. The current
list of interface changes compared to llvm::Triple is below but note that this
list is subject to change:
* Add getters/setters for new members as necessary (e.g. for endian or ABI).
* Remove normalize() since the representation of a TargetTuple is already
normalized.
* Assuming we don't want to write GNU Triples to IR, remove getTriple() and
setTriple() since we won't need them. Otherwise, keep them but make them use
an llvm::Triple instead of a string.
* Assuming we don't want a string form, remove str() as well.
* Rename isArch*bit() to arePointers*bit() which is all they really test. The
current name is misleading on some architectures since it implies that
architectures are inherently 16/32/64-bit but it's really a software
decision.
* Remove the get(Big|Little)EndianArchVariant() factory methods in favour of a
setEndian(...) (or similar) mutator.
* Likewise, remove get*bitArchVariant() and have the TargetParser handle
-m32/-m64 instead since it needs to call setArch() or setABI() depending on the
architecture.
* Remove getArchTypePrefix() since there shouldn't be anything that truly
needs the original string if our representation is good enough.
There are a few other minor tweaks that could probably be made but I've left
them out of this list because they are very target specific (e.g. OS X version
numbers need a special comparison predicate at the moment).

For completeness, I should also mention that as a result of the introduction of
TargetTuple a number of member functions of Triple will become unused. These
will be deprecated and deleted.
Finally, at some point we'll need to introduce target specific TargetTuple
subclasses to add target specific data like ARM's FPU information or
MIPS's NaN encoding. At this point we'll have to add a factory method to
select the correct subclass based on the Arch member. This should be a fairly
easy change when we get to that point as there will be very few invocations of
the TargetTuple constructor left.

Hopefully, that gives you a better picture of the end point for TargetTuple.
Overall, it's just a mutable version of Triple with some additional target
information and without the ambiguity found in GNU Triples. Let me know if you
still have questions.
________________________________________
From: Renato Golin [renato.golin at linaro.org]
Sent: 30 July 2015 00:11
To: Eric Christopher
Cc: Daniel Sanders; LLVM Developers Mailing List (llvmdev at cs.uiuc.edu); Jim
Grosbach (grosbach at apple.com)
Subject: Re: The Trouble with Triples

On 29 July 2015 at 23:47, Eric Christopher <echristo at gmail.com>
wrote:> This part doesn't seem obvious from the direction the patches are
going.
Until now, most of what he has done was to refactor the Triple class,
with no functional change, and to create a thin layer around the
Triple (called Tuple) and pass those instead. This is on par with that
premise.

The current patch is the first one to actually have some more
substantial change, so it's good that you stopped it now, before we
start breaking everything.

Maybe, knowing what it is now, if you could have another quick look at
the patch, and see if the new light has helped understand the patch
for what it will be. Maybe it's still not good enough, so then we'll
have to resort to a new round of design discussions.

> Definitely don't want this in the middle end at all. That all can be
part of
> the TargetMachine/TargetSubtargetInfo interface.
Ah, yes! But the TargetMachine (& pals) are created from information
from the Triple and the other bits that front-ends keep for
themselves.

So, in theory, if the Tuple is universal, creating them with a Tuple
(instead of a Triple+stuff) will free the front-ends of keeping the
rest of the info on their own, and TargetMachine/SubTargetInfo/etc
will be more homogeneous across different tools / front-ends than it
is today.

Another strong point is: we're not trying to change any other machine
description / API. This is just about the user options and defaults,
that are used to *create* machine descriptions.

>> The decision to create a new class (Tuple) is because Triple already
>> has a legacy-heavy meaning, which should not change.
>
> Agreed with at least the "because" part.
There was also the name. Triple is very far from the truth. :)

But changing the Triple class could cause ripples in the mud that
would be hard to see at first, and hard to change later, after people
started relying on it.

The final goal is that the Triple class would end up as being nothing
more than a Triple *parser*, with the current legacy logic, setting up
the Tuple fields and using them to select the rest of the default
fields.

> OK. What's the general class design look like then? The text from the
> original mail was fairly confusing then as I thought he was doing something
> that you say he isn't doing :)
Daniel, can you send your current plan for the Tuple class?

cheers,
--renato

Eric Christopher via llvm-dev

2015-Sep-15 16:24 UTC

head link

[llvm-dev] Fwd: The Trouble with Triples

resending to the right address for the mailing list.

---------- Forwarded message ---------
From: Eric Christopher <echristo at gmail.com>
Date: Tue, Sep 15, 2015 at 9:11 AM
Subject: Re: The Trouble with Triples
To: Daniel Sanders <Daniel.Sanders at imgtec.com>, Renato Golin <
renato.golin at linaro.org>
Cc: LLVM Developers Mailing List (llvmdev at cs.uiuc.edu) <llvmdev at
cs.uiuc.edu>,
Jim Grosbach (grosbach at apple.com) <grosbach at apple.com>


I was writing a rather longer email with an actual plan for you, but as
you've committed I'll send it briefly.

While you've done a lot of (legitimate) complaining about mips I don't
see
how this is an effective long term strategy for how we communicate
information within llvm. This mostly duplicates a lot of data that we have
sitting around and puts it into an interface. Why aren't you fixing the
interfaces rather than duplicating the data? It's a complicated interface
right now and a lot of work to untangle it, but that's what's needed.

The largest part of the problem for the mips port is that the assembler
needs information that only the TargetMachine has in any useful way. The
better idea is to:

a) get rid of the is-a interface for MCSubtargetInfo/TargetSubtargetInfo
and replace that with a has-a interface so that you can look up information
directly and only IR level compilation information remains at the
TargetSubtargetInfo interface, while

b) probably adding a TargetMachine equivalent at the MC level. This could
handle a lot of the low level target interface you're concerned with that
we don't have at that level, but do at the TargetMachine level.

Does this make sense?

Until LTO the serialization issue isn't much of an issue as anything that
wants to deal with code generation in this way can create a TargetMachine
via some of the existing code generation interfaces, it's most assuredly
not ideal, but mostly hampered by the existing libLTO mechanism for passing
options to the code generator. A serialization here might make sense but I
haven't figured out if one is necessarily needed yet.

-eric


On Sat, Aug 8, 2015 at 6:58 AM Daniel Sanders <Daniel.Sanders at
imgtec.com>
wrote:
> > > OK. What's the general class design look like then? The text
from the
> > > original mail was fairly confusing then as I thought he was doing
> something
> > > that you say he isn't doing :)
> > Daniel, can you send your current plan for the Tuple class?
>
> Sorry for being a bit slow to answer this bit. It's been a busy week
due
> to my sister's wedding.
>
> The plan for TargetTuple's implementation is fairly brief since it is
> heavily based on the current llvm::Triple but it has two main parts to it.
> The first is fairly uninteresting and is to create a TargetTuple class with
> (almost) the same interface as Triple and have this be the primary object
> throughout the core libraries. This separates the GNU Triple from the
> target description and gives us room to make changes to the target
> description without changing the meaning of 'GNU Triple'. This is
what the
> patch I posted is starting and we've already covered this part of the
plan
> so I'll focus on the second part.
>
> The second part is to refactor the representation and tweak the
> interface/implementation of TargetTuple into something more appropriate and
> complete for a target description. As you'll see, these
> implementation/interface changes compared to llvm::Triple are fairly minor.
> The key improvement in the addition of the TargetTuple is in the ability to
> represent the real target after tool options are taken into account and
> llvm::Triple's methods are generally good as they are.
>
> The member variables will be similar in principle to llvm::Triple except
> that the string form (if we have any at all following the discussion
> elsewhere in this thread) will not be stored. This is because, unlike
> Triple, we can reconstruct it from the fields. Member variables will be
> split where they currently specify multiple things. For example, the
> architecture field sometimes specifies endianness in values like
> 'mips'/'mipsel' and these will be split into an
architecture value 'mips'
> and a separate endianness boolean. I don't think we need to handle
> mixed-endian but we can use enums if necessary. I'm expecting to have
the
> following members in the base class by the end (all undefined types here
> are enums and most are in the posted patch):
>
>   // The architecture with no additional details such as endianness.
>   // Note that this is not the same ArchType as in llvm::Triple. For
> example, 'arm' and 'armeb' are both
>   // simply 'arm' in TargetTuple. Similarly for other endian
variants and
> things like 'mips'/'mips64' where
>   // 'mips64' isn't a significantly different architecture from
'mips'
>   ArchType Arch;
>
>   // The Architecture revision or variant
>   // Like Arch, this isn't the same as the one in llvm::Triple since
more
> architectures ought to be using it than is necessary for llvm::Triple.
>   // For example, Mips architecture revisions should be represented here
> even though GNU triples don't state it. For completeness,
>   // I should mention that there are some unconventional triples that do
> mention Mips architecture revisions but we don't implement them at the
> moment.
>   SubArchType SubArch;
>
>   // The vendor.
>   // This doesn't have to be the same as the one in llvm::Triple but I
see
> little reason to differ.
>   VendorType Vendor;
>
>   // The OS.
>   // This doesn't have to be the same as the one in llvm::Triple but I
see
> little reason to differ.
>   OSType OS;
>
>   // The environment.
>   // This doesn't have to be the same as the one in llvm::Triple but I
see
> little reason to differ.
>   EnvironmentType Environment;
>   unsigned MajorVersion, MinorVersion, MicroVersion;
>
>   // The object format (ELF, MachO, COFF, etc.) in use.
>   ObjectFormatType ObjectFormat;
>
>   // The endianness. If we ever need mixed-endian or other variants we can
> change this to an enum.
>   bool BigEndian;
>
>   // The ABI. The value 'Standard' can be used for targets with
only one
> ABI.
>   ABIType ABI;
>
>   // The SubABI. As far as I know, few architectures need this so it may
> make more sense in a target specific subclass
>   SubABIType SubABI;
>
> To be clear, the above list is based on the assumption that we can
> serialize a TargetTuple in IR without including a GNU Triple. If we do want
> to always write out the GNU Triple when writing IR then we will need to add
> it to this list of members since we can't derive it from the rest of
the
> TargetTuple (we can have zero or multiple valid GNU Triples for a given
> TargetTuple).
>
> As these members are introduced, we will re-implement the predicates that
> we've carried over from llvm::Triple in terms of these members. The
current
> list of interface changes compared to llvm::Triple is below but note that
> this list is subject to change:
> * Add getters/setters for new members as necessary (e.g. for endian or
> ABI).
> * Remove normalize() since the representation of a TargetTuple is already
> normalized.
> * Assuming we don't want to write GNU Triples to IR, remove getTriple()
> and setTriple() since we won't need them. Otherwise, keep them but make
> them use an llvm::Triple instead of a string.
> * Assuming we don't want a string form, remove str() as well.
> * Rename isArch*bit() to arePointers*bit() which is all they really test.
> The current name is misleading on some architectures since it implies that
> architectures are inherently 16/32/64-bit but it's really a software
> decision.
> * Remove the get(Big|Little)EndianArchVariant() factory methods in favour
> of a setEndian(...) (or similar) mutator.
> * Likewise, remove get*bitArchVariant() and have the TargetParser handle
> -m32/-m64 instead since it needs to call setArch() or setABI() depending on
> the architecture.
> * Remove getArchTypePrefix() since there shouldn't be anything that
truly
> needs the original string if our representation is good enough.
> There are a few other minor tweaks that could probably be made but I've
> left them out of this list because they are very target specific (e.g. OS X
> version numbers need a special comparison predicate at the moment).
>
> For completeness, I should also mention that as a result of the
> introduction of TargetTuple a number of member functions of Triple will
> become unused. These will be deprecated and deleted.
> Finally, at some point we'll need to introduce target specific
TargetTuple
> subclasses to add target specific data like ARM's FPU information or
MIPS's
> NaN encoding. At this point we'll have to add a factory method to
select
> the correct subclass based on the Arch member. This should be a fairly easy
> change when we get to that point as there will be very few invocations of
> the TargetTuple constructor left.
>
> Hopefully, that gives you a better picture of the end point for
> TargetTuple. Overall, it's just a mutable version of Triple with some
> additional target information and without the ambiguity found in GNU
> Triples. Let me know if you still have questions.
> ________________________________________
> From: Renato Golin [renato.golin at linaro.org]
> Sent: 30 July 2015 00:11
> To: Eric Christopher
> Cc: Daniel Sanders; LLVM Developers Mailing List (llvmdev at cs.uiuc.edu);
> Jim Grosbach (grosbach at apple.com)
> Subject: Re: The Trouble with Triples
>
> On 29 July 2015 at 23:47, Eric Christopher <echristo at gmail.com>
wrote:
> > This part doesn't seem obvious from the direction the patches are
going.
>
> Until now, most of what he has done was to refactor the Triple class,
> with no functional change, and to create a thin layer around the
> Triple (called Tuple) and pass those instead. This is on par with that
> premise.
>
> The current patch is the first one to actually have some more
> substantial change, so it's good that you stopped it now, before we
> start breaking everything.
>
> Maybe, knowing what it is now, if you could have another quick look at
> the patch, and see if the new light has helped understand the patch
> for what it will be. Maybe it's still not good enough, so then
we'll
> have to resort to a new round of design discussions.
>
>
> > Definitely don't want this in the middle end at all. That all can
be
> part of
> > the TargetMachine/TargetSubtargetInfo interface.
>
> Ah, yes! But the TargetMachine (& pals) are created from information
> from the Triple and the other bits that front-ends keep for
> themselves.
>
> So, in theory, if the Tuple is universal, creating them with a Tuple
> (instead of a Triple+stuff) will free the front-ends of keeping the
> rest of the info on their own, and TargetMachine/SubTargetInfo/etc
> will be more homogeneous across different tools / front-ends than it
> is today.
>
> Another strong point is: we're not trying to change any other machine
> description / API. This is just about the user options and defaults,
> that are used to *create* machine descriptions.
>
>
> >> The decision to create a new class (Tuple) is because Triple
already
> >> has a legacy-heavy meaning, which should not change.
> >
> > Agreed with at least the "because" part.
>
> There was also the name. Triple is very far from the truth. :)
>
> But changing the Triple class could cause ripples in the mud that
> would be hard to see at first, and hard to change later, after people
> started relying on it.
>
> The final goal is that the Triple class would end up as being nothing
> more than a Triple *parser*, with the current legacy logic, setting up
> the Tuple fields and using them to select the rest of the default
> fields.
>
>
> > OK. What's the general class design look like then? The text from
the
> > original mail was fairly confusing then as I thought he was doing
> something
> > that you say he isn't doing :)
>
> Daniel, can you send your current plan for the Tuple class?
>
> cheers,
> --renato
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150915/75db9829/attachment.html>

Eric Christopher via llvm-dev

2015-Sep-15 16:30 UTC

head link

[llvm-dev] The Trouble with Triples

>
> I was writing a rather longer email with an actual plan for you, but as
> you've committed I'll send it briefly.
>
> While you've done a lot of (legitimate) complaining about mips I
don't see
> how this is an effective long term strategy for how we communicate
> information within llvm. This mostly duplicates a lot of data that we have
> sitting around and puts it into an interface. Why aren't you fixing the
> interfaces rather than duplicating the data? It's a complicated
interface
> right now and a lot of work to untangle it, but that's what's
needed.
>
> The largest part of the problem for the mips port is that the assembler
> needs information that only the TargetMachine has in any useful way. The
> better idea is to:
>
> a) get rid of the is-a interface for MCSubtargetInfo/TargetSubtargetInfo
> and replace that with a has-a interface so that you can look up information
> directly and only IR level compilation information remains at the
> TargetSubtargetInfo interface, while
>
> b) probably adding a TargetMachine equivalent at the MC level. This could
> handle a lot of the low level target interface you're concerned with
that
> we don't have at that level, but do at the TargetMachine level.
>
> Does this make sense?
>
>To be fair TargetTuple might be the answer, but I don't see how it's
necessarily solving the issues with the MC layer interfaces rather than
simply adding a blob of data to them? Could you explain what the future
direction is for how this is going to work?

-eric

> Until LTO the serialization issue isn't much of an issue as anything
that
> wants to deal with code generation in this way can create a TargetMachine
> via some of the existing code generation interfaces, it's most
assuredly
> not ideal, but mostly hampered by the existing libLTO mechanism for passing
> options to the code generator. A serialization here might make sense but I
> haven't figured out if one is necessarily needed yet.
>
> -eric
>
>
> On Sat, Aug 8, 2015 at 6:58 AM Daniel Sanders <Daniel.Sanders at
imgtec.com>
> wrote:
>
>> > > OK. What's the general class design look like then? The
text from the
>> > > original mail was fairly confusing then as I thought he was
doing
>> something
>> > > that you say he isn't doing :)
>> > Daniel, can you send your current plan for the Tuple class?
>>
>> Sorry for being a bit slow to answer this bit. It's been a busy
week due
>> to my sister's wedding.
>>
>> The plan for TargetTuple's implementation is fairly brief since it
is
>> heavily based on the current llvm::Triple but it has two main parts to
it.
>> The first is fairly uninteresting and is to create a TargetTuple class
with
>> (almost) the same interface as Triple and have this be the primary
object
>> throughout the core libraries. This separates the GNU Triple from the
>> target description and gives us room to make changes to the target
>> description without changing the meaning of 'GNU Triple'. This
is what the
>> patch I posted is starting and we've already covered this part of
the plan
>> so I'll focus on the second part.
>>
>> The second part is to refactor the representation and tweak the
>> interface/implementation of TargetTuple into something more appropriate
and
>> complete for a target description. As you'll see, these
>> implementation/interface changes compared to llvm::Triple are fairly
minor.
>> The key improvement in the addition of the TargetTuple is in the
ability to
>> represent the real target after tool options are taken into account and
>> llvm::Triple's methods are generally good as they are.
>>
>> The member variables will be similar in principle to llvm::Triple
except
>> that the string form (if we have any at all following the discussion
>> elsewhere in this thread) will not be stored. This is because, unlike
>> Triple, we can reconstruct it from the fields. Member variables will be
>> split where they currently specify multiple things. For example, the
>> architecture field sometimes specifies endianness in values like
>> 'mips'/'mipsel' and these will be split into an
architecture value 'mips'
>> and a separate endianness boolean. I don't think we need to handle
>> mixed-endian but we can use enums if necessary. I'm expecting to
have the
>> following members in the base class by the end (all undefined types
here
>> are enums and most are in the posted patch):
>>
>>   // The architecture with no additional details such as endianness.
>>   // Note that this is not the same ArchType as in llvm::Triple. For
>> example, 'arm' and 'armeb' are both
>>   // simply 'arm' in TargetTuple. Similarly for other endian
variants and
>> things like 'mips'/'mips64' where
>>   // 'mips64' isn't a significantly different architecture
from 'mips'
>>   ArchType Arch;
>>
>>   // The Architecture revision or variant
>>   // Like Arch, this isn't the same as the one in llvm::Triple
since more
>> architectures ought to be using it than is necessary for llvm::Triple.
>>   // For example, Mips architecture revisions should be represented
here
>> even though GNU triples don't state it. For completeness,
>>   // I should mention that there are some unconventional triples that
do
>> mention Mips architecture revisions but we don't implement them at
the
>> moment.
>>   SubArchType SubArch;
>>
>>   // The vendor.
>>   // This doesn't have to be the same as the one in llvm::Triple
but I
>> see little reason to differ.
>>   VendorType Vendor;
>>
>>   // The OS.
>>   // This doesn't have to be the same as the one in llvm::Triple
but I
>> see little reason to differ.
>>   OSType OS;
>>
>>   // The environment.
>>   // This doesn't have to be the same as the one in llvm::Triple
but I
>> see little reason to differ.
>>   EnvironmentType Environment;
>>   unsigned MajorVersion, MinorVersion, MicroVersion;
>>
>>   // The object format (ELF, MachO, COFF, etc.) in use.
>>   ObjectFormatType ObjectFormat;
>>
>>   // The endianness. If we ever need mixed-endian or other variants we
>> can change this to an enum.
>>   bool BigEndian;
>>
>>   // The ABI. The value 'Standard' can be used for targets with
only one
>> ABI.
>>   ABIType ABI;
>>
>>   // The SubABI. As far as I know, few architectures need this so it
may
>> make more sense in a target specific subclass
>>   SubABIType SubABI;
>>
>> To be clear, the above list is based on the assumption that we can
>> serialize a TargetTuple in IR without including a GNU Triple. If we do
want
>> to always write out the GNU Triple when writing IR then we will need to
add
>> it to this list of members since we can't derive it from the rest
of the
>> TargetTuple (we can have zero or multiple valid GNU Triples for a given
>> TargetTuple).
>>
>> As these members are introduced, we will re-implement the predicates
that
>> we've carried over from llvm::Triple in terms of these members. The
current
>> list of interface changes compared to llvm::Triple is below but note
that
>> this list is subject to change:
>> * Add getters/setters for new members as necessary (e.g. for endian or
>> ABI).
>> * Remove normalize() since the representation of a TargetTuple is
already
>> normalized.
>> * Assuming we don't want to write GNU Triples to IR, remove
getTriple()
>> and setTriple() since we won't need them. Otherwise, keep them but
make
>> them use an llvm::Triple instead of a string.
>> * Assuming we don't want a string form, remove str() as well.
>> * Rename isArch*bit() to arePointers*bit() which is all they really
test.
>> The current name is misleading on some architectures since it implies
that
>> architectures are inherently 16/32/64-bit but it's really a
software
>> decision.
>> * Remove the get(Big|Little)EndianArchVariant() factory methods in
favour
>> of a setEndian(...) (or similar) mutator.
>> * Likewise, remove get*bitArchVariant() and have the TargetParser
handle
>> -m32/-m64 instead since it needs to call setArch() or setABI()
depending on
>> the architecture.
>> * Remove getArchTypePrefix() since there shouldn't be anything that
truly
>> needs the original string if our representation is good enough.
>> There are a few other minor tweaks that could probably be made but
I've
>> left them out of this list because they are very target specific (e.g.
OS X
>> version numbers need a special comparison predicate at the moment).
>>
>> For completeness, I should also mention that as a result of the
>> introduction of TargetTuple a number of member functions of Triple will
>> become unused. These will be deprecated and deleted.
>> Finally, at some point we'll need to introduce target specific
>> TargetTuple subclasses to add target specific data like ARM's FPU
>> information or MIPS's NaN encoding. At this point we'll have to
add a
>> factory method to select the correct subclass based on the Arch member.
>> This should be a fairly easy change when we get to that point as there
will
>> be very few invocations of the TargetTuple constructor left.
>>
>> Hopefully, that gives you a better picture of the end point for
>> TargetTuple. Overall, it's just a mutable version of Triple with
some
>> additional target information and without the ambiguity found in GNU
>> Triples. Let me know if you still have questions.
>> ________________________________________
>> From: Renato Golin [renato.golin at linaro.org]
>> Sent: 30 July 2015 00:11
>> To: Eric Christopher
>> Cc: Daniel Sanders; LLVM Developers Mailing List (llvmdev at
cs.uiuc.edu);
>> Jim Grosbach (grosbach at apple.com)
>> Subject: Re: The Trouble with Triples
>>
>> On 29 July 2015 at 23:47, Eric Christopher <echristo at
gmail.com> wrote:
>> > This part doesn't seem obvious from the direction the patches
are going.
>>
>> Until now, most of what he has done was to refactor the Triple class,
>> with no functional change, and to create a thin layer around the
>> Triple (called Tuple) and pass those instead. This is on par with that
>> premise.
>>
>> The current patch is the first one to actually have some more
>> substantial change, so it's good that you stopped it now, before we
>> start breaking everything.
>>
>> Maybe, knowing what it is now, if you could have another quick look at
>> the patch, and see if the new light has helped understand the patch
>> for what it will be. Maybe it's still not good enough, so then
we'll
>> have to resort to a new round of design discussions.
>>
>>
>> > Definitely don't want this in the middle end at all. That all
can be
>> part of
>> > the TargetMachine/TargetSubtargetInfo interface.
>>
>> Ah, yes! But the TargetMachine (& pals) are created from
information
>> from the Triple and the other bits that front-ends keep for
>> themselves.
>>
>> So, in theory, if the Tuple is universal, creating them with a Tuple
>> (instead of a Triple+stuff) will free the front-ends of keeping the
>> rest of the info on their own, and TargetMachine/SubTargetInfo/etc
>> will be more homogeneous across different tools / front-ends than it
>> is today.
>>
>> Another strong point is: we're not trying to change any other
machine
>> description / API. This is just about the user options and defaults,
>> that are used to *create* machine descriptions.
>>
>>
>> >> The decision to create a new class (Tuple) is because Triple
already
>> >> has a legacy-heavy meaning, which should not change.
>> >
>> > Agreed with at least the "because" part.
>>
>> There was also the name. Triple is very far from the truth. :)
>>
>> But changing the Triple class could cause ripples in the mud that
>> would be hard to see at first, and hard to change later, after people
>> started relying on it.
>>
>> The final goal is that the Triple class would end up as being nothing
>> more than a Triple *parser*, with the current legacy logic, setting up
>> the Tuple fields and using them to select the rest of the default
>> fields.
>>
>>
>> > OK. What's the general class design look like then? The text
from the
>> > original mail was fairly confusing then as I thought he was doing
>> something
>> > that you say he isn't doing :)
>>
>> Daniel, can you send your current plan for the Tuple class?
>>
>> cheers,
>> --renato
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150915/d55ec30f/attachment-0001.html>

Daniel Sanders via llvm-dev

2015-Sep-15 18:34 UTC

head link

[llvm-dev] The Trouble with Triples

> While you've done a lot of (legitimate) complaining about mips I
don't see how
> this is an effective long term strategy for how we communicate information
> within llvm. This mostly duplicates a lot of data that we have sitting
around
> and puts it into an interface. Why aren't you fixing the interfaces
rather
> than duplicating the data? It's a complicated interface right now and a
lot
> of work to untangle it, but that's what's needed.
The interface isn't what's wrong, it's the data values. Given the
triple mips64-linux-gnu,
the data in Triple says Triple::mips64 even on targets where mips64-linux-gnu
_means_ Triple::mips. This particular example is taken from Debian Jessie
(mips/mipsel) running on an Octeon II. I have numerous other examples including
one
particularly disturbing case where mips-linux-gnu should be read as
Triple::mipsel.
This plan solves the problem by disambiguating the triple early and passing the
unambiguous target description throughout the backend.
In this example, Triple would say Triple::mips64 and TargetTuple would say
TargetTuple::mips.

You mentioned that I've duplicated the data and I can certainly see why it
looks that way. However,
it's important to realize that it's not the same data. The words are the
same (at the moment, there
will be differences later on), but the semantics differ. The semantics of
Triple::mips64 are 'the
CPU/Arch part of the triple was the string "mips64"', while the
semantics of TargetTuple::mips is 'the
target architecture is Mips'.

To give an analogy: Triple's are like peoples names. There are numerous
people with the name 'John Smith' but they are distinct people with
different traits. It's not correct to, for example, say that all 'John
Smith's' have green eyes since they really have various colours. LLVM is
currently making such a false statement when it says that all
mips-linux-gnu's refer to the same target.
Now, TargetTuples are like social security numbers. Let's say John Smith
with the social security number X has blue eyes. It's therefore correct to
say that person X has blue eyes. I want to be able to ask whether the person has
blue eyes and have the answer derived from the properties of person X, rather
than the incorrect generalization about the various John Smiths.
Does that make sense?

We can go further with this analogy too. For example, let's say John Smith
with the SSN Y also answers to the name Rameses. This is the problem that Renato
is working on. Renato needs to be able to see the name Rameses and map this to
the correct John Smith (or at least someone very much like him). This is the
gist of what ARMTargetParser is/was doing.

All that said, phase 4 does include interface changes. Triple will lose much of
it's
API on the grounds that it will be unused (everything will be consulting
TargetTuple). Additionally, parts of TargetTuple's interface will be
adjusted where appropriate.
> The largest part of the problem for the mips port is that the assembler
needs
> information that only the TargetMachine has in any useful way. The better
> idea is to:
>
> a) get rid of the is-a interface for MCSubtargetInfo/TargetSubtargetInfo
and
> replace that with a has-a interface so that you can look up information
> directly and only IR level compilation information remains at the
> TargetSubtargetInfo interface, while
>
> b) probably adding a TargetMachine equivalent at the MC level. This could
> handle a lot of the low level target interface you're concerned with
that
> we don't have at that level, but do at the TargetMachine level.
>
> Does this make sense?
Not really. The bit that doesn't make sense to me is 'a)'. The first
problem is that
MCSubtargetInfo/TargetSubtargetInfo cannot provide information it doesn't
possess.
Assuming we fix that, we still have the fundamental problem that any information
that
is derived from the Triple does not describe the desired target. It describes
LLVM's
mistaken opinion on the meaning of a string that has no universally accepted
meaning.

It's this incorrect belief that Triples have a 1-1 mapping to targets or
even a
universally accepted meaning that is by far the largest problem.
> Until LTO the serialization issue isn't much of an issue as anything
that wants to
> deal with code generation in this way can create a TargetMachine via some
of the
> existing code generation interfaces, it's most assuredly not ideal, but
mostly
> hampered by the existing libLTO mechanism for passing options to the code
generator.
> A serialization here might make sense but I haven't figured out if one
is necessarily
> needed yet.
>
> -eric
I agree that LTO is the only place where serialization really matters and that
everything
else can create a TargetMachine. However, it's important to re-iterate that
LLVM's
interpretation of the Triple string the TargetMachine is created from is
incorrect.

________________________________
From: Eric Christopher [echristo at gmail.com]
Sent: 15 September 2015 17:11
To: Daniel Sanders; Renato Golin
Cc: LLVM Developers Mailing List (llvmdev at cs.uiuc.edu); Jim Grosbach
(grosbach at apple.com)
Subject: Re: The Trouble with Triples

I was writing a rather longer email with an actual plan for you, but as
you've committed I'll send it briefly.

While you've done a lot of (legitimate) complaining about mips I don't
see how this is an effective long term strategy for how we communicate
information within llvm. This mostly duplicates a lot of data that we have
sitting around and puts it into an interface. Why aren't you fixing the
interfaces rather than duplicating the data? It's a complicated interface
right now and a lot of work to untangle it, but that's what's needed.

The largest part of the problem for the mips port is that the assembler needs
information that only the TargetMachine has in any useful way. The better idea
is to:

a) get rid of the is-a interface for MCSubtargetInfo/TargetSubtargetInfo and
replace that with a has-a interface so that you can look up information directly
and only IR level compilation information remains at the TargetSubtargetInfo
interface, while

b) probably adding a TargetMachine equivalent at the MC level. This could handle
a lot of the low level target interface you're concerned with that we
don't have at that level, but do at the TargetMachine level.

Does this make sense?

Until LTO the serialization issue isn't much of an issue as anything that
wants to deal with code generation in this way can create a TargetMachine via
some of the existing code generation interfaces, it's most assuredly not
ideal, but mostly hampered by the existing libLTO mechanism for passing options
to the code generator. A serialization here might make sense but I haven't
figured out if one is necessarily needed yet.

-eric

On Sat, Aug 8, 2015 at 6:58 AM Daniel Sanders <Daniel.Sanders at
imgtec.com<mailto:Daniel.Sanders at imgtec.com>>
wrote:> > OK. What's the general class design look like then? The text from
the
> > original mail was fairly confusing then as I thought he was doing
something
> > that you say he isn't doing :)
> Daniel, can you send your current plan for the Tuple class?
Sorry for being a bit slow to answer this bit. It's been a busy week due to
my sister's wedding.

The plan for TargetTuple's implementation is fairly brief since it is
heavily based on the current llvm::Triple but it has two main parts to it. The
first is fairly uninteresting and is to create a TargetTuple class with (almost)
the same interface as Triple and have this be the primary object throughout the
core libraries. This separates the GNU Triple from the target description and
gives us room to make changes to the target description without changing the
meaning of 'GNU Triple'. This is what the patch I posted is starting and
we've already covered this part of the plan so I'll focus on the second
part.

The second part is to refactor the representation and tweak the
interface/implementation of TargetTuple into something more appropriate and
complete for a target description. As you'll see, these
implementation/interface changes compared to llvm::Triple are fairly minor. The
key improvement in the addition of the TargetTuple is in the ability to
represent the real target after tool options are taken into account and
llvm::Triple's methods are generally good as they are.

The member variables will be similar in principle to llvm::Triple except that
the string form (if we have any at all following the discussion elsewhere in
this thread) will not be stored. This is because, unlike Triple, we can
reconstruct it from the fields. Member variables will be split where they
currently specify multiple things. For example, the architecture field sometimes
specifies endianness in values like 'mips'/'mipsel' and these
will be split into an architecture value 'mips' and a separate
endianness boolean. I don't think we need to handle mixed-endian but we can
use enums if necessary. I'm expecting to have the following members in the
base class by the end (all undefined types here are enums and most are in the
posted patch):

  // The architecture with no additional details such as endianness.
  // Note that this is not the same ArchType as in llvm::Triple. For example,
'arm' and 'armeb' are both
  // simply 'arm' in TargetTuple. Similarly for other endian variants
and things like 'mips'/'mips64' where
  // 'mips64' isn't a significantly different architecture from
'mips'
  ArchType Arch;

  // The Architecture revision or variant
  // Like Arch, this isn't the same as the one in llvm::Triple since more
architectures ought to be using it than is necessary for llvm::Triple.
  // For example, Mips architecture revisions should be represented here even
though GNU triples don't state it. For completeness,
  // I should mention that there are some unconventional triples that do mention
Mips architecture revisions but we don't implement them at the moment.
  SubArchType SubArch;

  // The vendor.
  // This doesn't have to be the same as the one in llvm::Triple but I see
little reason to differ.
  VendorType Vendor;

  // The OS.
  // This doesn't have to be the same as the one in llvm::Triple but I see
little reason to differ.
  OSType OS;

  // The environment.
  // This doesn't have to be the same as the one in llvm::Triple but I see
little reason to differ.
  EnvironmentType Environment;
  unsigned MajorVersion, MinorVersion, MicroVersion;

  // The object format (ELF, MachO, COFF, etc.) in use.
  ObjectFormatType ObjectFormat;

  // The endianness. If we ever need mixed-endian or other variants we can
change this to an enum.
  bool BigEndian;

  // The ABI. The value 'Standard' can be used for targets with only one
ABI.
  ABIType ABI;

  // The SubABI. As far as I know, few architectures need this so it may make
more sense in a target specific subclass
  SubABIType SubABI;

To be clear, the above list is based on the assumption that we can serialize a
TargetTuple in IR without including a GNU Triple. If we do want to always write
out the GNU Triple when writing IR then we will need to add it to this list of
members since we can't derive it from the rest of the TargetTuple (we can
have zero or multiple valid GNU Triples for a given TargetTuple).

As these members are introduced, we will re-implement the predicates that
we've carried over from llvm::Triple in terms of these members. The current
list of interface changes compared to llvm::Triple is below but note that this
list is subject to change:
* Add getters/setters for new members as necessary (e.g. for endian or ABI).
* Remove normalize() since the representation of a TargetTuple is already
normalized.
* Assuming we don't want to write GNU Triples to IR, remove getTriple() and
setTriple() since we won't need them. Otherwise, keep them but make them use
an llvm::Triple instead of a string.
* Assuming we don't want a string form, remove str() as well.
* Rename isArch*bit() to arePointers*bit() which is all they really test. The
current name is misleading on some architectures since it implies that
architectures are inherently 16/32/64-bit but it's really a software
decision.
* Remove the get(Big|Little)EndianArchVariant() factory methods in favour of a
setEndian(...) (or similar) mutator.
* Likewise, remove get*bitArchVariant() and have the TargetParser handle
-m32/-m64 instead since it needs to call setArch() or setABI() depending on the
architecture.
* Remove getArchTypePrefix() since there shouldn't be anything that truly
needs the original string if our representation is good enough.
There are a few other minor tweaks that could probably be made but I've left
them out of this list because they are very target specific (e.g. OS X version
numbers need a special comparison predicate at the moment).

For completeness, I should also mention that as a result of the introduction of
TargetTuple a number of member functions of Triple will become unused. These
will be deprecated and deleted.
Finally, at some point we'll need to introduce target specific TargetTuple
subclasses to add target specific data like ARM's FPU information or
MIPS's NaN encoding. At this point we'll have to add a factory method to
select the correct subclass based on the Arch member. This should be a fairly
easy change when we get to that point as there will be very few invocations of
the TargetTuple constructor left.

Hopefully, that gives you a better picture of the end point for TargetTuple.
Overall, it's just a mutable version of Triple with some additional target
information and without the ambiguity found in GNU Triples. Let me know if you
still have questions.
________________________________________
From: Renato Golin [renato.golin at linaro.org<mailto:renato.golin at
linaro.org>]
Sent: 30 July 2015 00:11
To: Eric Christopher
Cc: Daniel Sanders; LLVM Developers Mailing List (llvmdev at
cs.uiuc.edu<mailto:llvmdev at cs.uiuc.edu>); Jim Grosbach (grosbach at
apple.com<mailto:grosbach at apple.com>)
Subject: Re: The Trouble with Triples

On 29 July 2015 at 23:47, Eric Christopher <echristo at
gmail.com<mailto:echristo at gmail.com>> wrote:> This part doesn't seem obvious from the direction the patches are
going.
Until now, most of what he has done was to refactor the Triple class,
with no functional change, and to create a thin layer around the
Triple (called Tuple) and pass those instead. This is on par with that
premise.

The current patch is the first one to actually have some more
substantial change, so it's good that you stopped it now, before we
start breaking everything.

Maybe, knowing what it is now, if you could have another quick look at
the patch, and see if the new light has helped understand the patch
for what it will be. Maybe it's still not good enough, so then we'll
have to resort to a new round of design discussions.

> Definitely don't want this in the middle end at all. That all can be
part of
> the TargetMachine/TargetSubtargetInfo interface.
Ah, yes! But the TargetMachine (& pals) are created from information
from the Triple and the other bits that front-ends keep for
themselves.

So, in theory, if the Tuple is universal, creating them with a Tuple
(instead of a Triple+stuff) will free the front-ends of keeping the
rest of the info on their own, and TargetMachine/SubTargetInfo/etc
will be more homogeneous across different tools / front-ends than it
is today.

Another strong point is: we're not trying to change any other machine
description / API. This is just about the user options and defaults,
that are used to *create* machine descriptions.

>> The decision to create a new class (Tuple) is because Triple already
>> has a legacy-heavy meaning, which should not change.
>
> Agreed with at least the "because" part.
There was also the name. Triple is very far from the truth. :)

But changing the Triple class could cause ripples in the mud that
would be hard to see at first, and hard to change later, after people
started relying on it.

The final goal is that the Triple class would end up as being nothing
more than a Triple *parser*, with the current legacy logic, setting up
the Tuple fields and using them to select the rest of the default
fields.

> OK. What's the general class design look like then? The text from the
> original mail was fairly confusing then as I thought he was doing something
> that you say he isn't doing :)
Daniel, can you send your current plan for the Tuple class?

cheers,
--renato
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150915/f3ca97b3/attachment.html>

Renato Golin via llvm-dev

2015-Sep-15 19:21 UTC

head link

[llvm-dev] The Trouble with Triples

On 15 September 2015 at 19:34, Daniel Sanders <Daniel.Sanders at
imgtec.com> wrote:> We can go further with this analogy too. For example, let's say John
Smith
> with the SSN Y also answers to the name Rameses. This is the problem that
> Renato is working on. Renato needs to be able to see the name Rameses and
> map this to the correct John Smith (or at least someone very much like
him).
> This is the gist of what ARMTargetParser is/was doing.
A good example is "krait", a CPU design from Qualcomm.

Krait used to be mapped to Cortex-A15 because it has VFP4 and HDIV,
but architecturally, it is a lot closer to a Cortex-A9 than an A15. So
assuming that Krait == A15 means making a lot of bad optimisation
decisions in the back-end, and the code performed poorly.

This year we made the change, so that Krait == A9+HDIV+VFP4, but
neither the triple, nor the CPU descriptions could cope with that.
Then, a hack was made to treat "krait" especially, changing the
internal options (triple and others) to explicitly say cpu=A9 + HDIV +
VFP4.

Logic like that is spread all over the compiler, not just in the
driver, but the back-end. For example, the recent discussions about
GNUEABI not entirely being the same as EABI, and the need for
additional flags, such as "-meabi=gnu". This has direct consequences
to the back-end, which has knowledge that it should not have (when
gnueabi is really just eabi, and when it's not). The back-end should
be ignorant of such things and have a flag "isEabi" do this
"else" do
that.

But triples can't do that at all. As Daniel demonstrated, they have a
very specific, irrational and sometimes perverse meaning. Any change
we do to the triples will impact compatibility with other toolchains
and we can't afford that.

So, instead of refactoring the Triple to carry all legacy + all decent
logic, we thought that making the Triple into *just* legacy, and
having a Tuple as the new way forward we'd achieve two main goals that
was discussed in the community for a very long time:

1. Separate target-specific crud from the rest. That also includes
"legacy ARM" from "new ARM". All users of these interfaces
should not
just be unaware of the underlying complexity, but also not have to pay
the price for that complexity. So all string-parsing, name-mangling,
legacy-wrapper should be the last action, not the first.

2. Be able to create complex target descriptions from the command
line, or a config file, or some database, or whatever. Especially in
the ARM world, target description databases are not uncommon, as they
solve the ambiguity problem in a clear way. But we don't want to move
everyone into a complex database just because ARM description is a
mess. So, the design is to have the Tuple constructed from a Triple
for legacy and simplicity reasons, but keeping an accurate and
unambiguous description for the more complex targets, while allowing
*other* methods of creating a Tuple for those of us who need it.

So, by default, or if the driver has a "-target foo", it uses the
triple to create the tuple, and no change in behaviour is observed.
OTOH, if the driver has a "-config arm-foo.yaml", it creates the Tuple
from that config file, while still keeping the same unique values
being passed down the target description classes to the back-end, and
*no change in behaviour is observed*.

Hope that helps,

cheers,
--renato

llvm dev - Aug 2015 - The Trouble with Triples

[LLVMdev] The Trouble with Triples

[llvm-dev] The Trouble with Triples

[llvm-dev] Fwd: The Trouble with Triples

[llvm-dev] The Trouble with Triples

[llvm-dev] The Trouble with Triples

[llvm-dev] The Trouble with Triples