Hi All,

There was recently a discussion on the LLDB list about how to deal with targets, and our current mishmash of llvm::Triple and the various subclasses of TargetSubtarget leaves a lot to be desired. GNU target triples are really important as input devices to the compiler (users want to specify them), but they aren't detailed enough for internal clients.

Anyway, in short, I think that we should unify the variety of code we have to deal with this stuff into a new TargetSpec class. I don't have any short-term plan to implement this, but I wrote up some of my thoughts here:
http://nondot.org/sabre/LLVMNotes/TargetSpec.txt

Remember that this isn't intended to be something users deal with; it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.

-Chris
On 02/23/11 02:46, Chris Lattner wrote:
[...]
> Remember that this isn't intended to be something users deal with, it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.

Can I put in a plea to have as much of LLVM as possible *not* require knowledge of a single, specific architecture to work?

I have various things I would like to do that work on abstract machines, where I don't have a specific target or CPU in mind, but just want to work at the bitcode level. Right now the only way I know of doing this is to hardcode the datalayout into a new target and rebuild the whole shooting match, LLVM and clang combined. I very much do not want to do this.

What would be really nice is to be able to specify a custom datalayout on the command line and have as many tools as possible still work, particularly clang --- trying to generate code with non-standard datalayouts is kinda hard right now.

-- 
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────
│ "Thou who might be our Father, who perhaps may be in Heaven, hallowed
│ be Thy Name, if Name Thou hast and any desire to see it hallowed..."
│ --- _Creatures of Light and Darkness_, Roger Zelazny
On Wed, Feb 23, 2011 at 2:46 AM, Chris Lattner <clattner at apple.com> wrote:
> There was recently a discussion on the LLDB list about how to deal with targets, and our current mishmash of llvm::Triple and the various subclasses of TargetSubtarget leaves a lot to be desired. GNU target triples are really important as input devices to the compiler (users want to specify them), but they aren't detailed enough for internal clients.
>
> Anyway, in short, I think that we should unify the variety of code we have to deal with this stuff into a new TargetSpec class. I don't have any short-term plan to implement this, but I wrote up some of my thoughts here:
> http://nondot.org/sabre/LLVMNotes/TargetSpec.txt
>
> Remember that this isn't intended to be something users deal with, it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.

Bitcode currently does not carry enough options information to handle LTO. For example, if you use -O1 for a particular translation unit but -O4 for the rest of them, that information isn't saved and provided to LTO when the actual optimization is happening. Similarly, some options like soft-float/hard-float aren't preserved. We should consider these issues while solving this.

deep
On Feb 23, 2011, at 2:47 AM, David Given wrote:
> On 02/23/11 02:46, Chris Lattner wrote:
> [...]
>> Remember that this isn't intended to be something users deal with, it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.
>
> Can I put in a plea to have as much of LLVM as possible *not* require
> knowledge of a single, specific architecture to work?
>
> I have various things I would like to do that work on abstract machines,
> where I don't have a specific target or CPU in mind, but just want to
> work at the bitcode level. Right now the only way I know of doing this
> is to hardcode the datalayout into a new target and rebuild the whole
> shooting match, LLVM and clang combined. I very much do not want to do this.

This request is completely orthogonal to the proposal. If you generate target-independent LLVM IR, you don't have to put a triple into the IR. This isn't going to change.

-Chris
On Feb 22, 2011, at 6:46 PM, Chris Lattner wrote:
> This leads to a number of problems in LLVM:
> - we have a bunch of duplication
> - we have confusion about what a triple is (normalized or not)
> - no good way to tell if a triple is normalized
> - no good, centralized way to reason about which triples are allowed and valid
> - the MC assembler has to link in the entire X86 backend to get subtarget info
> - we don't have a good way to implement things like .code32 in the MC assembler
> - LLDB replicates a lot of this code and heuristics
> - we don't have good interfaces to inquire about the host
> - we do std::string manipulation in llvm::Triple
> - linux triples are actually quadruples!
> - darwin tools that take -arch have to map them onto something internally.

Most of these are motivations for refactoring and code cleanup, but not really for inventing a new target mini-language to replace triples.

The main problems with triples IMHO which motivate this are:

- The vendor field is vague and non-orthogonal.
- Triples don't represent subtarget attributes, except in the way that subtarget attributes are sometimes mangled into the architecture field in confusing ways.

On an initial read, the targetspec proposal's solutions to these problems seem reasonable.

It's a little surprising to have a dedicated "Byte Order" field. One possible reason for it is that mips.le.* is marginally nicer than mipsel.*; however, that's not obviously worth burdening everyone else for. Another possible reason is to allow otherwise architecture-independent strings to encode an endianness. However, that's not a concept that LLVM currently has. And without more targetdata parts, it's not obvious how useful it is by itself.

On the other hand, if "Byte Order" makes sense to include, should other parts of targetdata be included? Pointer size seems the next most desirable -- endianness and pointer size would be sufficient for many elf tools, for example. However, the other parts of targetdata could conceivably be useful too.

The "OS" field seems like it should be renamed to "ABI", since in the description you discuss actual OSes that support multiple ABIs.

In the "Feature Delta" field, using "+" to add features but using a character other than "-" to remove them is unfortunate. How about just prohibiting "-" in CPU names? Or for another idea, how about prefixing negative features with "no-", as in "core2+sse41+no-cmov"?

Dan
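To make the remark about elf tools concrete: the identification bytes at the start of an ELF file already record a class (32-bit or 64-bit) and a data encoding (little or big endian). The following is a minimal sketch, not an LLVM API; `ByteOrder`, `BasicTargetData`, and `readElfIdent` are hypothetical names, and treating the ELF class as the pointer width is an approximation.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical types for illustration only; none of this exists in LLVM.
enum class ByteOrder { Little, Big };

struct BasicTargetData {
  ByteOrder Order;
  unsigned PointerBits; // approximated from the ELF class
};

// Decode the 16-byte e_ident prefix of an ELF file.
// Returns std::nullopt if the magic, class, or data encoding is not recognized.
std::optional<BasicTargetData>
readElfIdent(const std::array<std::uint8_t, 16> &Ident) {
  // ELF magic: 0x7f 'E' 'L' 'F'.
  if (Ident[0] != 0x7f || Ident[1] != 'E' || Ident[2] != 'L' || Ident[3] != 'F')
    return std::nullopt;

  BasicTargetData Result{};
  switch (Ident[4]) { // EI_CLASS
  case 1: Result.PointerBits = 32; break; // ELFCLASS32
  case 2: Result.PointerBits = 64; break; // ELFCLASS64
  default: return std::nullopt;
  }
  switch (Ident[5]) { // EI_DATA
  case 1: Result.Order = ByteOrder::Little; break; // ELFDATA2LSB
  case 2: Result.Order = ByteOrder::Big; break;    // ELFDATA2MSB
  default: return std::nullopt;
  }
  return Result;
}
```

A tool that only needs to walk headers and symbol tables could get by with these two facts, which is the sense in which they are "sufficient for many elf tools".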
On Wed, Feb 23, 2011 at 01:43:35PM -0800, Dan Gohman wrote:
> On Feb 22, 2011, at 6:46 PM, Chris Lattner wrote:
> > This leads to a number of problems in LLVM:
> > - we have a bunch of duplication
> > - we have confusion about what a triple is (normalized or not)
> > - no good way to tell if a triple is normalized
> > - no good, centralized way to reason about which triples are allowed and valid
> > - the MC assembler has to link in the entire X86 backend to get subtarget info
> > - we don't have a good way to implement things like .code32 in the MC assembler
> > - LLDB replicates a lot of this code and heuristics
> > - we don't have good interfaces to inquire about the host
> > - we do std::string manipulation in llvm::Triple
> > - linux triples are actually quadruples!
> > - darwin tools that take -arch have to map them onto something internally.
>
> Most of these are motivations for refactoring and code cleanup, but not
> really for inventing a new target mini-language to replace triples.
>
> The main problems with triples IMHO which motivate this are:
>
> - The vendor field is vague and non-orthogonal.
> - Triples don't represent subtarget attributes, except in the way that
>   subtarget attributes are sometimes mangled into the architecture field
>   in confusing ways.
>
> At an initial read, the targetspec proposal's solutions to these
> problems seem reasonable.
>
> It's a little surprising to have a dedicated "Byte Order" field. One
> possible reason for it is that mips.le.* is marginally nicer than
> mipsel.*, however that's not obviously worth burdening everyone else
> for. Another possible reason is to allow otherwise
> architecture-independent strings to encode an endianness. However,
> that's not a concept that LLVM currently has. And without more
> targetdata parts, it's not obvious how useful it is by itself.

In LLDB we currently have an "ArchSpec" class that llvm::TargetSpec could eventually replace. Currently, one of the main applications for having a "byte order" bit in LLDB is to allow sensible construction of default specifications: for example, ARM is almost always little endian, but there are board configurations where this is not the case. I think with sensible default values most people will not find the extra flag a burden.

Having a byte order bit just helps model bi-endian architectures that much more accurately IMHO. For example, it would help when implementing support for debugging boot code that forces the processor to switch modes (PowerPC for example).

> On the other hand, if "Byte Order" makes sense to include, should
> other parts of targetdata be included? Pointer size seems the next
> most desirable -- endianness and pointer size would be sufficient for
> many elf tools, for example. However, the other parts of
> targetdata could conceivably be useful too.

Possibly useful again from an LLDB perspective. I could imagine debugging x86_64 operating system code and needing a way to communicate transitions between 64-bit mode and 32-bit compatibility mode seamlessly. However, I must stress this is *possibly* useful -- I do not have a firm conclusion to offer here. Perhaps this is something that we could support on an as-needed basis.

> The "OS" field seems like it should be renamed to "ABI", since in the
> description you discuss actual OS's that support multiple ABIs.
>
> In the "Feature Delta" field, using "+" to add features but using
> a character other than "-" to remove them is unfortunate. How about
> just prohibiting "-" in CPU names? Or for another idea, how about
> prefixing negative features with "no-", as in "core2+sse41+no-cmov"?
>
> Dan
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

-- 
steve
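The "sensible default values" idea can be sketched as an optional byte-order field that falls back to a per-architecture default. This is only an illustration under stated assumptions: `ArchDescription` and `defaultByteOrder` are hypothetical names, not LLDB's ArchSpec or any proposed TargetSpec interface, and the default table is deliberately incomplete.

```cpp
#include <optional>
#include <string>

enum class ByteOrder { Little, Big };

// Hypothetical default table; a real one would live with each target.
inline ByteOrder defaultByteOrder(const std::string &Arch) {
  if (Arch == "ppc" || Arch == "ppc64" || Arch == "mips" || Arch == "sparc")
    return ByteOrder::Big;
  return ByteOrder::Little; // arm, x86, x86_64, mipsel, and anything unlisted
}

// Hypothetical stand-in for a target description with an overridable byte order.
struct ArchDescription {
  std::string Arch;
  std::optional<ByteOrder> ExplicitOrder; // left unset in the common case

  // Use the explicit setting if present, otherwise the architecture default.
  ByteOrder byteOrder() const {
    return ExplicitOrder.value_or(defaultByteOrder(Arch));
  }
};
```

With this shape, a plain ARM description needs no extra flag, and the unusual big-endian board sets `ExplicitOrder` only where it matters.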
On Feb 23, 2011, at 9:59 AM, Sandeep Patel wrote:
>> Remember that this isn't intended to be something users deal with, it's just an internal implementation detail of the compiler, debugger, nm implementation, etc.
>
> Bitcode currently does not carry enough options information to handle
> LTO. For example, if you use -O1 for a particular translation unit but
> -O4 for the rest of them, that information isn't saved and provided to
> LTO when the actual optimization is happening. Similarly, some options
> like soft-float/hard-float aren't preserved. We should consider these
> issues while solving this.

That's true, but the same is also true for a huge variety of other codegen-level flags. I don't think we want to encode every possible detail at this level. Specific things can be solved in different ways: for example, -ffast-math is best solved by adding a flag onto individual fp ops. Some things (like mixed versions of -mpreferred-stack-boundary) are worth just punting on, IMO.

In any case, I'm not interested in trying to tackle the long tail of weird codegen options + LTO at this point.

-Chris
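One way to picture the "flag onto individual fp ops" suggestion is a toy model in which each floating-point operation records whether it was compiled under fast-math, so that merging translation units for LTO loses nothing. Everything here is hypothetical (`FPOp`, `lowerUnit`, `mergeForLTO`); it is not LLVM IR and not how any real linker works.

```cpp
#include <string>
#include <vector>

// Toy model of an fp operation carrying its own fast-math flag.
struct FPOp {
  std::string Opcode; // e.g. "fadd"
  bool Fast;          // true if the defining unit was built with -ffast-math
};

using OpList = std::vector<FPOp>;

// Stamp every op of one translation unit with that unit's setting.
OpList lowerUnit(const std::vector<std::string> &Opcodes, bool FastMath) {
  OpList Ops;
  for (const std::string &Op : Opcodes)
    Ops.push_back(FPOp{Op, FastMath});
  return Ops;
}

// Merging is plain concatenation: no module-level option has to be
// reconciled, because each op already carries its own flag.
OpList mergeForLTO(const OpList &A, const OpList &B) {
  OpList Merged = A;
  Merged.insert(Merged.end(), B.begin(), B.end());
  return Merged;
}
```

The design point is that a per-operation flag composes across units built with different options, whereas a single per-module setting would force a choice at link time.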
On Feb 23, 2011, at 1:43 PM, Dan Gohman wrote:
> Most of these are motivations for refactoring and code cleanup, but not
> really for inventing a new target mini-language to replace triples.

That's all I'm proposing. I'm not suggesting that the "mini language" be exposed to users; it's just a "serialized form for internal-to-LLVM clients" data structure. The string form would be persisted in .ll and .bc files, that's all.

> It's a little surprising to have a dedicated "Byte Order" field. One
> possible reason for it is that mips.le.* is marginally nicer than
> mipsel.*, however that's not obviously worth burdening everyone else
> for. Another possible reason is to allow otherwise
> architecture-independent strings to encode an endianness. However,
> that's not a concept that LLVM currently has. And without more
> targetdata parts, it's not obvious how useful it is by itself.

It is useful for doing simple queries about the target, and these are things that can be derived from .o files.

> On the other hand, if "Byte Order" makes sense to include, should
> other parts of targetdata be included? Pointer size seems the next
> most desirable -- endianness and pointer size would be sufficient for
> many elf tools, for example. However, the other parts of
> targetdata could conceivably be useful too.

I could be convinced about this. The other approach would be to formalize this as part of the arch spec and treat mips and mips-le as two different arches, and have a predicate that generates the bit on demand.

> The "OS" field seems like it should be renamed to "ABI", since in the
> description you discuss actual OS's that support multiple ABIs.

It's really a cross product of OSes and ABIs. For example, darwin10 vs darwin9 is not an ABI, it is an OS. I consider linux-eabi to be different from linux-someotherabi because the entire OS has to be built that way. *shrug*

> In the "Feature Delta" field, using "+" to add features but using
> a character other than "-" to remove them is unfortunate. How about
> just prohibiting "-" in CPU names? Or for another idea, how about
> prefixing negative features with "no-", as in "core2+sse41+no-cmov"?

Good idea! I changed it to use commas and "no", giving "core2,sse41,nocmov".

-Chris
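For illustration, the comma-and-"no" form above could be split into a CPU name plus enabled and disabled features roughly as follows. This is a sketch, not the TargetSpec implementation: `CpuSpec` and `parseCpuSpec` are hypothetical names, and it assumes that the first field is always the CPU and that no real feature name itself begins with "no" (such a name would be misread as a removal).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for a parsed "cpu,feature,nofeature" string.
struct CpuSpec {
  std::string Cpu;
  std::vector<std::string> Enabled;
  std::vector<std::string> Disabled;
};

CpuSpec parseCpuSpec(const std::string &Text) {
  CpuSpec Spec;
  bool First = true;
  std::size_t Begin = 0;
  while (Begin <= Text.size()) {
    std::size_t End = Text.find(',', Begin);
    if (End == std::string::npos)
      End = Text.size();
    std::string Tok = Text.substr(Begin, End - Begin);

    if (First) {
      Spec.Cpu = Tok; // first field names the CPU
      First = false;
    } else if (Tok.size() > 2 && Tok.compare(0, 2, "no") == 0) {
      Spec.Disabled.push_back(Tok.substr(2)); // "nocmov" -> "cmov"
    } else if (!Tok.empty()) {
      Spec.Enabled.push_back(Tok);
    }
    Begin = End + 1;
  }
  return Spec;
}
```

The "no" prefix ambiguity is the cost of dropping the separator that "no-" or a leading "-" would have provided; a real implementation would presumably check tokens against a known feature list rather than rely on the prefix alone.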