Peter Smith via llvm-dev
2018-Mar-16 10:38 UTC
[llvm-dev] [RFC] Stop giving a default CPU to the LTO plugin?
Thanks for the example, that is very useful in working out the overall scope of the problem, which is now wider than I thought it was. I've put some comments inline. On 15 March 2018 at 19:12, Friedman, Eli <efriedma at codeaurora.org> wrote:> On 3/15/2018 9:43 AM, Peter Smith via llvm-dev wrote: >> >> Hello everyone, this is most likely Arm specific, but could affect >> other targets where there is a somewhat complex relationship between >> the triple and mcpu option. >> >> At present when clang is used as a linker driver for the gold-plugin >> and when using and an explicit -mcpu is not given to clang, then clang >> will always generate a -Wl,-plugin-opt=mcpu=<default CPU> where the >> default CPU is based on the triple. >> >> I think that this is causing more problems than it is worth and I'd >> like clang to not pass -Wl,-plugin-opt=mcpu=<CPU> unless the >> -mcpu=<CPU> is explicitly on the clang command line. >> >> I'd like to know if I'm missing something important, and if there >> would be any support for a patch to not pass an implicit mcpu via >> plugin? There may also be a way of handling an inappropriate >> cpu/triple combination more gracefully. >> >> The full story of a problem with the Arm target is available at >> https://bugs.llvm.org/show_bug.cgi?id=36542 . A short description of >> the problem is: >> - Source files are compiled with -mthumb >> --target=arm-linux-androideabi -mcpu=cortex-a7, >> this records a triple of thumbv7a-linux-androideabi in the bitcode file. >> - The command line for the link step only passes --target>> arm-linux-androideabi. >> - The default cpu passed to LTO for a triple arm-linux-androideabi is >> the very old arm7tdmi. >> - When the ARMAsmBackend is created from the combination of triple >> thumbv7a-linux-androideabi and cpu arm7tdmi the resultant target >> features only include Thumb1. Amongst other things this prevents MC >> from widening branch instructions to the Thumb2 wide branch. >> - Code generation prior to MC seems to prefer the attributes in the >> bitcode file (triple thumbv7a-linux-androideabi -mcpu=generic), this >> selects a narrow branch instruction for a tail call with the >> expectation it will be widened by MC. End result is a relocation out >> of range error. >> >> In summary the combination of -mcpu=arm7tdmi and a triple from the >> bitcode file of thumbv7a-linux-androideabi does not make sense, but >> when passed to LTO seems to: >> - Escape all the existing warnings about nonsensical triples [*]. >> - On Arm targets at least; having a final link step without a cpu is >> likely to be quite common so I don't think that this will be a corner >> case. >> - The Build Attributes of the output ELF file get the triple ARMv4t >> from -mcpu=arm7tdmi which is lying to the linker as the file contains >> ARMv7a instructions. > > > Having ARMv7a instructions in an ARMv4t file shouldn't be a problem: a > function should be allowed to override the CPU attributes to generate code > for a newer target. This is generally done using the "target" function > attribute. If this doesn't work correctly, we should fix it. It looks like > it's currently broken; testcase: > > void g(); > __attribute__((target("thumb,arch=cortex-a53"))) > void f() { g(); } >Hmmm, allowing that makes life much more complicated. For example I can also write: void g(); __attribute__((target("thumb,arch=cortex-m0"))) void f() { g(); } void i(); __attribute__((target("arm,arch=cortex-a53"))) void h() { i(); } With -mcpu=cortex-m0 and get ARM code within an object claiming to be Thumb only with no errors or warnings, with no chance of a linker detecting a mismatch either. I think that part of this is the same problem that is observed in PR36542 the ARMAsmBackend that is responsible for widening the tail call to a Thumb2 branch is created with ARMv4T which doesn't support Thumb1. There has been a recent change that threads through the existing SubtargetInfo instead of recreating it from the triple alone. It is worth mentioning that the object level BuildAttributes do not include Thumbv7a which is misleading to a linker as it will be expecting no ARMv7A in the object. Has there already been a discussion about what per function code-generation with BuildAttributes higher than the base object should mean in the context of capabilities of the ARMAsmBackend and BuildAttributes? My thoughts right now are that if ARMAsmBackend is to operate at an object level, rather than a per-function level then it has to use the capabilities of the highest architecture in the file. This also means giving the object BuildAttributes of the highest architecture in the file, and giving an error if they contradict, for example mixing Thumb Cortex-M0 and ARM Cortex-A53. If the ARMAsmBackend could be made to work on a per-function level then there is a chance that we could only widen the tail call to g() in f(), but not elsewhere. To honestly describe this in the BuildAttributes we would need to use per Section or per Function attributes though. My suggestion to move forward here is: - Recreate the SubtargetInfo based on the merge of all the Targets and CPU information that we have seen, or warn/error if they are incompatible. - Ouput the Tag_CPU_arch BuildAttributes based on the merge of all the Targets and CPU information that we have seen. It is probably worth moving any discussion of this particular part to PR36542 since it is somewhat Arm specific. I'll add this comment to there.>> >> My personal preference is that we shouldn't pass an implicit CPU >> derived from the triple, at least for Arm targets. If that isn't >> possible I think that we should make sure we detect inappropriate >> cpu/triple combinations and either warn or error. Does anyone have any >> strong opinions on which way to go? I'm happy to work on a patch to >> improve this. >> >> I know that this can be worked around by explicitly passing a cpu or >> more specific triple to the link line, however this is often not done >> in existing builds. >> >> Thanks in advance for any suggestions > > It doesn't really make sense to override the target specified in the > bitcode; I agree we shouldn't do that by default. But I don't think > changing that actually fixes the underlying bug. >It won't fix the underlying bug, but depending on we deal with nonsensical combinations of triple and CPU we may end up with a warning or an error for a common use case.> -Eli > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux > Foundation Collaborative Project >
Alex Bradbury via llvm-dev
2018-Mar-16 11:09 UTC
[llvm-dev] [RFC] Stop giving a default CPU to the LTO plugin?
On 16 March 2018 at 10:38, Peter Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote:> On 15 March 2018 at 19:12, Friedman, Eli <efriedma at codeaurora.org> wrote: >> Having ARMv7a instructions in an ARMv4t file shouldn't be a problem: a >> function should be allowed to override the CPU attributes to generate code >> for a newer target. This is generally done using the "target" function >> attribute. If this doesn't work correctly, we should fix it. It looks like >> it's currently broken; testcase: >> >> void g(); >> __attribute__((target("thumb,arch=cortex-a53"))) >> void f() { g(); } >> > > Hmmm, allowing that makes life much more complicated. For example I > can also write: > void g(); > __attribute__((target("thumb,arch=cortex-m0"))) > void f() { g(); } > > void i(); > __attribute__((target("arm,arch=cortex-a53"))) > void h() { i(); } > > With -mcpu=cortex-m0 and get ARM code within an object claiming to be > Thumb only with no errors or warnings, with no chance of a linker > detecting a mismatch either.I think we can all agree that there should be no real problem with instruction selection when adding these sorts of target attributes. As you point out below, the problems start occurring when it gets to the MC layer and object emission. As the author of the well-intentioned cleanup patch that unmasked this issue, I'd like to thank you for putting in the time to delve into things. The patches in question were: * https://reviews.llvm.org/rL321707 * https://reviews.llvm.org/rL321692> I think that part of this is the same problem that is observed in > PR36542 the ARMAsmBackend that is responsible for widening the tail > call to a Thumb2 branch is created with ARMv4T which doesn't support > Thumb1. There has been a recent change that threads through the > existing SubtargetInfo instead of recreating it from the triple alone. > It is worth mentioning that the object level BuildAttributes do not > include Thumbv7a which is misleading to a linker as it will be > expecting no ARMv7A in the object. > > Has there already been a discussion about what per function > code-generation with BuildAttributes higher than the base object > should mean in the context of capabilities of the ARMAsmBackend and > BuildAttributes? My thoughts right now are that if ARMAsmBackend is to > operate at an object level, rather than a per-function level then it > has to use the capabilities of the highest architecture in the file. > This also means giving the object BuildAttributes of the highest > architecture in the file, and giving an error if they contradict, for > example mixing Thumb Cortex-M0 and ARM Cortex-A53. If the > ARMAsmBackend could be made to work on a per-function level then there > is a chance that we could only widen the tail call to g() in f(), but > not elsewhere. To honestly describe this in the BuildAttributes we > would need to use per Section or per Function attributes though. > > My suggestion to move forward here is: > - Recreate the SubtargetInfo based on the merge of all the Targets and > CPU information that we have seen, or warn/error if they are > incompatible. > - Ouput the Tag_CPU_arch BuildAttributes based on the merge of all the > Targets and CPU information that we have seen. > > It is probably worth moving any discussion of this particular part to > PR36542 since it is somewhat Arm specific. I'll add this comment to > there.I'm not so sure this is ARM specific, as other targets might well encounter similar issues (even if there is no direct equivalent to build attributes, there are cases where information is encoded into ELF flags on a per-object basis). Best, Alex
Eric Christopher via llvm-dev
2018-Mar-20 04:59 UTC
[llvm-dev] [RFC] Stop giving a default CPU to the LTO plugin?
On Fri, Mar 16, 2018 at 4:09 AM Alex Bradbury via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On 16 March 2018 at 10:38, Peter Smith via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > On 15 March 2018 at 19:12, Friedman, Eli <efriedma at codeaurora.org> > wrote: > >> Having ARMv7a instructions in an ARMv4t file shouldn't be a problem: a > >> function should be allowed to override the CPU attributes to generate > code > >> for a newer target. This is generally done using the "target" function > >> attribute. If this doesn't work correctly, we should fix it. It looks > like > >> it's currently broken; testcase: > >> > >> void g(); > >> __attribute__((target("thumb,arch=cortex-a53"))) > >> void f() { g(); } > >> > > > > Hmmm, allowing that makes life much more complicated. For example I > > can also write: > > void g(); > > __attribute__((target("thumb,arch=cortex-m0"))) > > void f() { g(); } > > > > void i(); > > __attribute__((target("arm,arch=cortex-a53"))) > > void h() { i(); } > > > > With -mcpu=cortex-m0 and get ARM code within an object claiming to be > > Thumb only with no errors or warnings, with no chance of a linker > > detecting a mismatch either. > > I think we can all agree that there should be no real problem with > instruction selection when adding these sorts of target attributes. As > you point out below, the problems start occurring when it gets to the > MC layer and object emission. > > As the author of the well-intentioned cleanup patch that unmasked this > issue, I'd like to thank you for putting in the time to delve into > things. The patches in question were: > * https://reviews.llvm.org/rL321707 > * https://reviews.llvm.org/rL321692 > > > I think that part of this is the same problem that is observed in > > PR36542 the ARMAsmBackend that is responsible for widening the tail > > call to a Thumb2 branch is created with ARMv4T which doesn't support > > Thumb1. There has been a recent change that threads through the > > existing SubtargetInfo instead of recreating it from the triple alone. > > It is worth mentioning that the object level BuildAttributes do not > > include Thumbv7a which is misleading to a linker as it will be > > expecting no ARMv7A in the object. > > > > Has there already been a discussion about what per function > > code-generation with BuildAttributes higher than the base object > > should mean in the context of capabilities of the ARMAsmBackend and > > BuildAttributes? My thoughts right now are that if ARMAsmBackend is to > > operate at an object level, rather than a per-function level then it > > has to use the capabilities of the highest architecture in the file. > > This also means giving the object BuildAttributes of the highest > > architecture in the file, and giving an error if they contradict, for > > example mixing Thumb Cortex-M0 and ARM Cortex-A53. If the > > ARMAsmBackend could be made to work on a per-function level then there > > is a chance that we could only widen the tail call to g() in f(), but > > not elsewhere. To honestly describe this in the BuildAttributes we > > would need to use per Section or per Function attributes though. > > > > My suggestion to move forward here is: > > - Recreate the SubtargetInfo based on the merge of all the Targets and > > CPU information that we have seen, or warn/error if they are > > incompatible. > > - Ouput the Tag_CPU_arch BuildAttributes based on the merge of all the > > Targets and CPU information that we have seen. > > > > It is probably worth moving any discussion of this particular part to > > PR36542 since it is somewhat Arm specific. I'll add this comment to > > there. > > I'm not so sure this is ARM specific, as other targets might well > encounter similar issues (even if there is no direct equivalent to > build attributes, there are cases where information is encoded into > ELF flags on a per-object basis). > >FWIW I've followed up in the bug with both a high level description of how these things work (and should work) as well as what to do for things that are encoded on a per-object basis. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180320/47bdfb83/attachment.html>
Possibly Parallel Threads
- [RFC] Stop giving a default CPU to the LTO plugin?
- [RFC] Stop giving a default CPU to the LTO plugin?
- [RFC] Stop giving a default CPU to the LTO plugin?
- [LLVMdev] Strange i386 cross build error.
- [RFC] Making -mcpu=generic the default for ARM armv7a and arm8a rather than -mcpu=cortex-a8 or -mcpu=cortex-a53