Hi all,

There has been some controversy over the TargetTriple changes, and I want to explain them better on the list (to a wider audience) and also propose my plans for supporting the ARM platform better, especially cross-compilation in Clang.

All this discussion came as a spin-off of bug 8911 (http://llvm.org/bugs/show_bug.cgi?id=8957)...

Today we have three major problems in cross-compiling to ARM with Clang:

1. Some ARM triples ("arm*-none-eabi") are not properly recognized, so Clang doesn't generate correct AAPCS (soft and hard) calls and doesn't pass the correct triple to LLC.

2. Some options in Clang are chosen by parsing the triple directly, because triples don't have all the properties necessary to make such decisions.

3. Clang today has only a host triple, which is inadequate to describe a cross-compilation environment, where the difference between the host and the target matters when the driver is choosing what to run and what options to pass.

To fix that, I think we need three things:

1. Add the options to the triple, so EABI can be recognized and properly stored, avoiding string comparisons. This is not as simple as it seems, because "arm-none-eabi" actually puts EABI in the OS slot, and the triple validation breaks before it gets to parsing the environment, because the OS is invalid. There are some alternatives (such as splitting normalization from validation and doing special fiddling in between), but the main problem is that the triple format is not set in stone and different vendors use different styles.

2. Clang parses - if arch.contains("v6") - which is not ideal. Clang should have a localized parser for sub-architectures (only important for ARM, I believe) and fill in TargetData objects with all the information it can gather (also using the other triple properties), so that every IR / command line generation decision is based on that. I don't know why Clang calls LLC with command line options instead of just calling the back-end directly and passing the TargetData, but it doesn't make that much difference.

3. Once all that support is in place, implement a target triple and use it for codegen/cmd-line choices, while the host triple is still used by the driver to choose which binaries to call.

I hope all that makes sense, and if so, I plan on working on that.

-- 
cheers,
--renato
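To illustrate the "localized parse for sub-architectures" idea from point 2 above, here is a minimal sketch in Python (not Clang's actual C++ code). The `ArmSubArch` record, its field names, and `parse_arm_arch` are hypothetical names invented for this example. The point is only that the architecture string is parsed once into structured fields, so later decisions can test `version >= 6` rather than repeating a substring check such as `arch.contains("v6")`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: "arm" or "thumb", optionally followed by
# "v<digits>" and a free-form suffix (e.g. "armv6", "armv6t2", "thumbv7m").
_ARM_ARCH = re.compile(r"^(arm|thumb)(?:v(\d+)([a-z0-9]*))?$")


@dataclass(frozen=True)
class ArmSubArch:
    """Structured view of an ARM architecture component (illustrative only)."""
    thumb: bool              # True when the component starts with "thumb"
    version: Optional[int]   # architecture version, None for plain "arm"
    suffix: str              # whatever follows the version, e.g. "t2" or "m"


def parse_arm_arch(arch: str) -> Optional[ArmSubArch]:
    """Parse the arch component of a triple; return None if it is not ARM."""
    match = _ARM_ARCH.match(arch)
    if match is None:
        return None
    base, version, suffix = match.groups()
    return ArmSubArch(
        thumb=(base == "thumb"),
        version=int(version) if version else None,
        suffix=suffix or "",
    )
```

With such a record, every place that needs to know the sub-architecture reads the same parsed fields, which keeps the string handling in one place.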
Hi Renato,

> 1. Some ARM triples "arm*-none-eabi" don't get properly recognized,
> so Clang doesn't generate correct AAPCS (soft and hard) calls and
> don't pass the correct triple to LLC.

In order to have "eabi" be properly recognized by LLVM, it is enough to add "eabi" as a valid environment value. Then Triple::Normalize will automatically move eabi to the environment position, resulting in arm*-none--eabi. So I think this is easy to take care of.

> 2. Some options in Clang are chosen by parsing the triple directly,
> because triples don't have all properties necessary to make such
> decisions.

I think that's a good thing! Clang's needs are different from LLVM's, so clang should probably have its own ClangTriple class. Given a triple string, you probably first want to have Triple::Normalize crunch on it, permuting recognized components into the correct positions, and then have ClangTriple apply additional logic.

> 1. Adding the options to the triple, so EABI can be recognized and
> properly stored to avoid string comparisons. This is not as simple as
> it seems because "arm-none-eabi" actually puts EABI in the OS slot,

That's because Triple doesn't (or didn't) know that eabi is a valid value for environment. If it did, then Normalize would move it to the right position.

Ciao,
Duncan.
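The effect Duncan describes can be illustrated with a toy model in Python. This is a deliberately simplified sketch and not LLVM's Triple::Normalize: the component tables are made-up samples, and the placement rule (recognized components go to their own slot, unrecognized ones fill the first free slot from the left) is an assumption chosen to reproduce the behaviour discussed here.

```python
# Toy tables, assumed for illustration only; not LLVM's real lists.
ARCH_PREFIXES = ("arm", "thumb")
VENDORS = {"apple", "pc", "unknown"}
OSES = {"linux", "darwin", "freebsd"}

ARCH, VENDOR, OS, ENV = range(4)


def classify(component, environments):
    """Return the slot index a component belongs to, or None if unknown."""
    if component.startswith(ARCH_PREFIXES):
        return ARCH
    if component in VENDORS:
        return VENDOR
    if component in OSES:
        return OS
    if component in environments:
        return ENV
    return None


def normalize(triple, environments):
    """Move recognized components to their slots; keep the rest in order."""
    slots = ["", "", "", ""]
    leftovers = []
    for component in triple.split("-"):
        if not component:
            continue  # skip empty slots such as the one in "arm-none--eabi"
        index = classify(component, environments)
        if index is not None and not slots[index]:
            slots[index] = component
        else:
            leftovers.append(component)
    # Unrecognized components fill the first free slot, left to right.
    for component in leftovers:
        for i in range(len(slots)):
            if not slots[i]:
                slots[i] = component
                break
        else:
            slots.append(component)
    # Drop trailing empty slots so short triples stay short.
    while len(slots) > 1 and not slots[-1]:
        slots.pop()
    return "-".join(slots)
```

In this model, when "eabi" is not in the environment set it lands in the OS slot and the triple comes back unchanged; once it is added, it moves to the environment slot and the OS slot is left empty, which is where the double dash comes from.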
On 23 January 2011 13:33, Duncan Sands <baldrick at free.fr> wrote:
> in order to have "eabi" be properly recognized by LLVM, it is enough to
> add "eabi" as a valid environment value. Then Triple::Normalize will
> automatically move eabi to the environment position, resulting in
> arm*-none--eabi So I think this is easy to take care of.

That could work, if Clang/LLVM could work around the "--" issue. When looking for "arm-none-eabi-gcc", clang would never find an "arm-none--eabi-gcc".

We could add a "canonicalTriple" to print the proper form (with two dashes) and a "compressedTriple" to print the form GCC expects. That would make my change in the normalization redundant, but would require some cleanup in Clang (which I'm not that familiar with).

But I think this is also not the best way to fix the problem, see below...

> I think that's a good thing! Clang's needs are different to LLVM's, so
> probably clang should have it's own ClangTriple class. Given a triple
> string you probably first want to have Triple::Normalize crunch on it,
> permuting recognized components into the correct positions, and then
> have ClangTriple apply additional logic.

I agree that Clang has different needs and that ClangTriple should probably extend Triple in that way, but that's not what's happening. Clang parses independently and messes up when passing the triple back to LLC. It should either keep the triple intact or parse it completely.

And given that Clang only parses triples from the command line and LLVM only parses them from the IR, if the parsers differ (i.e., when Clang passes the triple via command line to llc), you might get different results from LLVM's parser (Triple) and Clang's parser (ClangTriple -> Triple).

Add the fact that Clang calls LLC instead of invoking the back-end directly using TargetData and setting the rest of the options directly, and we have a design decision on our hands that I was expecting to delay to the second cycle.

In my view, there are two ways of doing this:

1. Have a very basic Triple class (removing the Env-to-OS feature I've added) and extend ClangTriple to parse ARM sub-architectures and the Env-to-OS idiosyncrasy. ClangTriple would then create a TargetData object from the difference between ClangTriple and Triple, set all sub-architecture parameters (AAPCS, EABI, VFP, hardFP, softFP, etc.) directly on the back-end properties, and invoke the back-end directly.

2. Have Clang call LLC via command line, where we're forced to have the same triple parser for both cases (LLVM's Triple), or face a difference in attributes depending on whether the triple comes from Clang or from LLC directly via command line.

I prefer the first path, but I have actually coded the second. Why? Because I didn't want to break everything. I just wanted to add EABI to Environment, but that brought all the other problems. So I had a choice: either refactor the whole Clang-LLC interaction, or make the modification required to get it working with the current design, even if it was a bit ugly.

To be honest, this change doesn't fix our cross-compilation problems, but it was the first toe in the cold water, to get *precisely* the feedback you're giving us... ;)

Hope this makes things a bit more clear.

cheers,
--renato
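The "canonicalTriple" / "compressedTriple" idea mentioned earlier in this message can be sketched as follows. This is a hypothetical Python illustration, not existing Clang or LLVM code: the function names mirror the ones floated above, and the tuple layout (arch, vendor, OS, environment) and the `cross_tool_name` helper are assumptions made for the example.

```python
def canonical_triple(components):
    """Join all slots, keeping empty inner slots (e.g. "arm-none--eabi")."""
    parts = list(components)
    # Trailing empty slots carry no information, so drop them.
    while parts and not parts[-1]:
        parts.pop()
    return "-".join(parts)


def compressed_triple(components):
    """Join only the non-empty slots, the form GCC tool names use."""
    return "-".join(part for part in components if part)


def cross_tool_name(components, tool):
    """Build the name of a prefixed cross tool, e.g. "arm-none-eabi-gcc"."""
    return f"{compressed_triple(components)}-{tool}"


# (arch, vendor, os, environment) with the OS slot left empty.
triple = ("arm", "none", "", "eabi")
print(canonical_triple(triple))        # arm-none--eabi
print(cross_tool_name(triple, "gcc"))  # arm-none-eabi-gcc
```

The canonical form would be the one kept internally, while the compressed form is only used when looking up external binaries.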