thr3ads.net - llvm dev - [LLVMdev] New TargetSpec 'llvmnote' [Feb 2011]

Chris Lattner

2011-Feb-23 02:46 UTC

[LLVMdev] New TargetSpec 'llvmnote'

Hi All,

There was recently a discussion on the LLDB list about how to deal with targets,
and our current mishmash of llvm::Triple and the various subclasses of
TargetSubtarget leaves a lot to be desired.  GNU target triples are really
important as input devices to the compiler (users want to specify them) but they
aren't detailed enough for internal clients.
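
To make the limitation concrete, here is a minimal Python sketch of what a GNU-style target string carries. It is an illustration only: `ParsedTriple` and `parse_triple` are hypothetical names, not the interface of llvm::Triple, and the sketch does no normalization. Note that nothing in the string identifies a CPU or its features.

```python
from typing import NamedTuple, Optional


class ParsedTriple(NamedTuple):
    """Hypothetical container for the fields of a GNU-style target string."""
    arch: str
    vendor: str
    os: str
    environment: Optional[str]  # optional fourth component, e.g. "gnu"


def parse_triple(text: str) -> ParsedTriple:
    """Split a target string on '-' into three or four fields."""
    # maxsplit=3 keeps any further dashes inside the fourth field.
    parts = text.split("-", 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least three components: {text!r}")
    environment = parts[3] if len(parts) == 4 else None
    return ParsedTriple(parts[0], parts[1], parts[2], environment)
```

For example, `parse_triple("x86_64-unknown-linux-gnu")` yields four fields, while `parse_triple("i386-apple-darwin10")` leaves `environment` as `None`.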

Anyway, in short, I think that we should unify the variety of code we have to
deal with this stuff into a new TargetSpec class.   I don't have any
short-term plan to implement this, but I wrote up some of my thoughts here:
http://nondot.org/sabre/LLVMNotes/TargetSpec.txt

Remember that this isn't intended to be something users deal with, it's
just an internal implementation detail of the compiler, debugger, nm
implementation, etc.

-Chris

David Given

2011-Feb-23 10:47 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On 02/23/11 02:46, Chris Lattner wrote:
[...]
> Remember that this isn't intended to be something users deal with,
> it's just an internal implementation detail of the compiler, debugger, nm
> implementation, etc.

Can I put in a plea to have as much of LLVM as possible *not* require
knowledge of a single, specific architecture to work?

I have various things I would like to do that work on abstract machines,
where I don't have a specific target or CPU in mind, but just want to
work at the bitcode level. Right now the only way I know of doing this
is to hardcode the datalayout into a new target and rebuild the whole
shooting match, LLVM and clang combined. I very much do not want to do this.

What would be really nice is to be able to specify a custom datalayout
on the command line and have as many tools as possible still work,
particularly clang --- trying to generate code with non-standard
datalayouts is kinda hard right now.

-- 
┌─── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ───── http://www.cowlark.com ─────
│ "Thou who might be our Father, who perhaps may be in Heaven, hallowed
│ be Thy Name, if Name Thou hast and any desire to see it hallowed..."
│ --- _Creatures of Light and Darkness_, Roger Zelazny


Sandeep Patel

2011-Feb-23 17:59 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On Wed, Feb 23, 2011 at 2:46 AM, Chris Lattner <clattner at apple.com> wrote:
>
> There was recently a discussion on the LLDB list about how to deal with
> targets, and our current mishmash of llvm::Triple and the various subclasses of
> TargetSubtarget leaves a lot to be desired.  GNU target triples are really
> important as input devices to the compiler (users want to specify them) but they
> aren't detailed enough for internal clients.
>
> Anyway, in short, I think that we should unify the variety of code we have
> to deal with this stuff into a new TargetSpec class.   I don't have any
> short-term plan to implement this, but I wrote up some of my thoughts here:
> http://nondot.org/sabre/LLVMNotes/TargetSpec.txt
>
> Remember that this isn't intended to be something users deal with,
> it's just an internal implementation detail of the compiler, debugger, nm
> implementation, etc.

Bitcode currently does not carry enough options information to handle
LTO. For example, if you use -O1 for a particular translation unit but
-O4 for the rest of them, that information isn't saved and provided to
LTO when the actual optimization is happening. Similarly, some options
like soft-float/hard-float aren't preserved. We should consider these
issues while solving this.

deep

Chris Lattner

2011-Feb-23 19:26 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On Feb 23, 2011, at 2:47 AM, David Given wrote:
> On 02/23/11 02:46, Chris Lattner wrote:
> [...]
>> Remember that this isn't intended to be something users deal with,
>> it's just an internal implementation detail of the compiler, debugger, nm
>> implementation, etc.
> 
> Can I put in a plea to have as much of LLVM as possible *not* require
> knowledge of a single, specific architecture to work?
> 
> I have various things I would like to do that work on abstract machines,
> where I don't have a specific target or CPU in mind, but just want to
> work at the bitcode level. Right now the only way I know of doing this
> is to hardcode the datalayout into a new target and rebuild the whole
> shooting match, LLVM and clang combined. I very much do not want to do
> this.

This request is completely orthogonal to the proposal.  If you generate target
independent LLVM IR, you don't have to put a triple into the IR.  This
isn't going to change.

-Chris

Dan Gohman

2011-Feb-23 21:43 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On Feb 22, 2011, at 6:46 PM, Chris Lattner wrote:
> This leads to a number of problems in LLVM:
>  - we have a bunch of duplication
>  - we have confusion about what a triple is (normalized or not)
>  - no good way to tell if a triple is normalized
>  - no good, centralized way to reason about which triples are allowed and
>    valid
>  - the MC assembler has to link in the entire X86 backend to get subtarget
>    info
>  - we don't have a good way to implement things like .code32 in the MC
>    assembler
>  - LLDB replicates a lot of this code and heuristics
>  - we don't have good interfaces to inquire about the host
>  - we do std::string manipulation in llvm::Triple
>  - linux triples are actually quadruples!
>  - darwin tools that take -arch have to map them onto something internally.

Most of these are motivations for refactoring and code cleanup, but not
really for inventing a new target mini-language to replace triples.

The main problems with triples IMHO which motivate this are:

  - The vendor field is vague and non-orthogonal.
  - Triples don't represent subtarget attributes, except in the way that
    subtarget attributes are sometimes mangled into the architecture field
    in confusing ways.

At an initial read, the targetspec proposal's solutions to these
problems seem reasonable.

It's a little surprising to have a dedicated "Byte Order" field. One
possible reason for it is that mips.le.* is marginally nicer than
mipsel.*, however that's not obviously worth burdening everyone else
for. Another possible reason is to allow otherwise
architecture-independent strings to encode an endianness. However,
that's not a concept that LLVM currently has. And without more
targetdata parts, it's not obvious how useful it is by itself.

On the other hand, if "Byte Order" makes sense to include, should
other parts of targetdata be included? Pointer size seems the next
most desirable -- endianness and pointer size would be sufficient for
many elf tools, for example. However, the other parts of
targetdata could conceivably be useful too.
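
As a concrete instance of the ELF case: an ELF file records both properties in its identification bytes, with EI_CLASS (offset 4) distinguishing 32-bit from 64-bit objects and EI_DATA (offset 5) giving the byte order. The following Python sketch reads just those two bytes. The function name is hypothetical, and treating the file class as the pointer size is a simplifying assumption that does not hold for every ABI.

```python
def elf_class_and_byte_order(header: bytes):
    """Return (bits, byte_order) from the leading bytes of an ELF file."""
    if len(header) < 6 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    # ELFCLASS32 = 1, ELFCLASS64 = 2
    bits = {1: 32, 2: 64}.get(ei_class)
    # ELFDATA2LSB = 1 (little endian), ELFDATA2MSB = 2 (big endian)
    byte_order = {1: "little", 2: "big"}.get(ei_data)
    if bits is None or byte_order is None:
        raise ValueError("unknown EI_CLASS or EI_DATA value")
    return bits, byte_order
```

A tool would typically call this on the first 16 bytes read from the file opened in binary mode.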

The "OS" field seems like it should be renamed to "ABI", since in the
description you discuss actual OS's that support multiple ABIs.

In the "Feature Delta" field, using "+" to add features but using
a character other than "-" to remove them is unfortunate. How about
just prohibiting "-" in CPU names? Or for another idea, how about
prefixing negative features with "no-", as in "core2+sse41+no-cmov"?

Dan


Stephen Wilson

2011-Feb-23 23:24 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On Wed, Feb 23, 2011 at 01:43:35PM -0800, Dan Gohman wrote:
> On Feb 22, 2011, at 6:46 PM, Chris Lattner wrote:
> > This leads to a number of problems in LLVM:
> >  - we have a bunch of duplication
> >  - we have confusion about what a triple is (normalized or not)
> >  - no good way to tell if a triple is normalized
> >  - no good, centralized way to reason about which triples are allowed
> >    and valid
> >  - the MC assembler has to link in the entire X86 backend to get
> >    subtarget info
> >  - we don't have a good way to implement things like .code32 in
> >    the MC assembler
> >  - LLDB replicates a lot of this code and heuristics
> >  - we don't have good interfaces to inquire about the host
> >  - we do std::string manipulation in llvm::Triple
> >  - linux triples are actually quadruples!
> >  - darwin tools that take -arch have to map them onto something
> >    internally.
> 
> Most of these are motivations for refactoring and code cleanup, but not
> really for inventing a new target mini-language to replace triples.
> 
> The main problems with triples IMHO which motivate this are:
> 
>   - The vendor field is vague and non-orthogonal.
>   - Triples don't represent subtarget attributes, except in the way that
>     subtarget attributes are sometimes mangled into the architecture field
>     in confusing ways.
> 
> At an initial read, the targetspec proposal's solutions to these
> problems seem reasonable.
> 
> It's a little surprising to have a dedicated "Byte Order" field. One
> possible reason for it is that mips.le.* is marginally nicer than
> mipsel.*, however that's not obviously worth burdening everyone else
> for. Another possible reason is to allow otherwise
> architecture-independent strings to encode an endianness. However,
> that's not a concept that LLVM currently has. And without more
> targetdata parts, it's not obvious how useful it is by itself.

In LLDB we currently have an "ArchSpec" class that llvm::TargetSpec
could eventually replace.  Currently, one of the main applications for
having a "byte order" bit in LLDB is to allow sensible construction of
default specifications:  for example ARM is almost always little endian,
but there are board configurations where this is not the case.  I think
with sensible default values most people will not find the extra flag a
burden.

Having a byte order bit just helps model bi-endian architectures that
much more accurately IMHO.  For example, it would help when implementing
support for debugging boot code that forces the processor to switch
modes (PowerPC for example).
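
The "sensible default plus override" idea can be sketched in a few lines of Python. This is a hypothetical model, not LLDB's ArchSpec or any proposed TargetSpec interface: the class name, field names, and the small defaults table are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed per-architecture defaults, for illustration only.
DEFAULT_BYTE_ORDER = {"arm": "little", "ppc": "big"}


@dataclass(frozen=True)
class ArchDescription:
    """Hypothetical arch record with an optional explicit byte order."""
    arch: str
    byte_order_override: Optional[str] = None

    def byte_order(self) -> str:
        # An explicit setting wins; otherwise fall back to the arch default.
        if self.byte_order_override is not None:
            return self.byte_order_override
        return DEFAULT_BYTE_ORDER[self.arch]
```

With this shape, `ArchDescription("arm")` reports little endian without the caller saying anything, and a big-endian board is described as `ArchDescription("arm", "big")`.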

> On the other hand, if "Byte Order" makes sense to include, should
> other parts of targetdata be included? Pointer size seems the next
> most desirable -- endianness and pointer size would be sufficient for
> many elf tools, for example. However, the other parts of
> targetdata could conceivably be useful too.
Possibly useful again from an LLDB perspective.  I could imagine
debugging x86_64 operating system code and needing a way to communicate
transitions between 64-bit mode and 32-bit compatibility mode seamlessly.
However, I must stress this is *possibly* useful -- I do not have a firm
conclusion to offer here.   Perhaps this is something that we could
support on an as-needed basis.

> The "OS" field seems like it should be renamed to "ABI", since in the
> description you discuss actual OS's that support multiple ABIs.
> 
> In the "Feature Delta" field, using "+" to add features but using
> a character other than "-" to remove them is unfortunate. How about
> just prohibiting "-" in CPU names? Or for another idea, how about
> prefixing negative features with "no-", as in "core2+sse41+no-cmov"?
> 
> Dan

-- 
steve

Chris Lattner

2011-Feb-24 06:29 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On Feb 23, 2011, at 9:59 AM, Sandeep Patel wrote:
>> Remember that this isn't intended to be something users deal with,
>> it's just an internal implementation detail of the compiler, debugger, nm
>> implementation, etc.
> 
> Bitcode currently does not carry enough options information to handle
> LTO. For example, if you use -O1 for a particular translation unit but
> -O4 for the rest of them, that information isn't saved and provided to
> LTO when the actual optimization is happening. Similarly, some options
> like soft-float/hard-float aren't preserved. We should consider these
> issues while solving this.
That's true, but the same is also true for a huge variety of other
codegen-level flags.  I don't think we want to encode every possible detail
at this level.  Specific things can be solved in different ways: for example,
-ffast-math is best solved by adding a flag onto individual fp ops.  Some things
(like mixed versions of -mpreferred-stack-boundary) are worth just punting on,
IMO.

In any case, I'm not interested in trying to tackle the long tail of weird
codegen options + LTO at this point.

-Chris

Chris Lattner

2011-Feb-24 06:36 UTC

[LLVMdev] New TargetSpec 'llvmnote'

On Feb 23, 2011, at 1:43 PM, Dan Gohman wrote:
> Most of these are motivations for refactoring and code cleanup, but not
> really for inventing a new target mini-language to replace triples.

That's all I'm proposing.  I'm not suggesting that the "mini language"
be exposed to users, it's just a "serialized for internal-to-LLVM
clients" data structure.  The string form would be persisted in .ll and
.bc files, that's all.

> It's a little surprising to have a dedicated "Byte Order" field. One
> possible reason for it is that mips.le.* is marginally nicer than
> mipsel.*, however that's not obviously worth burdening everyone else
> for. Another possible reason is to allow otherwise
> architecture-independent strings to encode an endianness. However,
> that's not a concept that LLVM currently has. And without more
> targetdata parts, it's not obvious how useful it is by itself.
It is useful for doing simple queries about the target, and these are things
that can be derived from .o files.
> On the other hand, if "Byte Order" makes sense to include, should
> other parts of targetdata be included? Pointer size seems the next
> most desirable -- endianness and pointer size would be sufficient for
> many elf tools, for example. However, the other parts of
> targetdata could conceivably be useful too.
I could be convinced about this.  The other approach would be to formalize this
as part of the arch spec and treat mips and mips-le as two different arches, and
have a predicate that generates the bit on demand.
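
The alternative described here, where byte order is implied by the arch name and computed on demand, might look like the Python sketch below. The arch spellings and the contents of both sets are assumptions for illustration ("mips-le" follows the spelling used in this message), and `is_little_endian` is a hypothetical name.

```python
# Hypothetical arch names; each name implies exactly one byte order.
LITTLE_ENDIAN_ARCHES = frozenset({"mips-le", "x86", "x86_64"})
BIG_ENDIAN_ARCHES = frozenset({"mips", "ppc"})


def is_little_endian(arch: str) -> bool:
    """Derive the byte-order bit from the arch name instead of storing it."""
    if arch in LITTLE_ENDIAN_ARCHES:
        return True
    if arch in BIG_ENDIAN_ARCHES:
        return False
    raise KeyError(f"unknown arch: {arch!r}")
```

The trade-off in this design is that a bi-endian processor needs one arch name per byte order, whereas a separate field keeps a single name.
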
> The "OS" field seems like it should be renamed to "ABI", since in the
> description you discuss actual OS's that support multiple ABIs.

It's really a cross product of OS's and ABIs.  For example, darwin10 vs
darwin9 is not an ABI, it is an OS.  I consider linux-eabi to be different than
linux-someotherabi because the entire OS has to be built that way.  *shrug*

> In the "Feature Delta" field, using "+" to add features but using
> a character other than "-" to remove them is unfortunate. How about
> just prohibiting "-" in CPU names? Or for another idea, how about
> prefixing negative features with "no-", as in "core2+sse41+no-cmov"?

Good idea! I changed it to use commas and "no", giving
"core2,sse41,nocmov".

-Chris
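
The comma-separated form settled on in the message above ("core2,sse41,nocmov") can be parsed as in the following Python sketch. It is only an illustration under stated assumptions: the function name is hypothetical, the first item is taken to be the CPU name, and the caller supplies the set of known feature names. Checking the whole item against the known set first is one way to avoid misreading a feature whose own name starts with "no".

```python
def parse_cpu_features(spec, known_features):
    """Split 'cpu,feat,nofeat' into (cpu, enabled, disabled) lists."""
    cpu, *items = spec.split(",")
    enabled, disabled = [], []
    for item in items:
        if item in known_features:
            # Exact match wins, even if the name begins with "no".
            enabled.append(item)
        elif item.startswith("no") and item[2:] in known_features:
            disabled.append(item[2:])
        else:
            raise ValueError(f"unknown feature: {item!r}")
    return cpu, enabled, disabled
```

Called as `parse_cpu_features("core2,sse41,nocmov", {"sse41", "cmov"})`, the sketch reports `sse41` as added and `cmov` as removed relative to the `core2` baseline.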
