Zakk via llvm-dev
2020-Jan-13 14:12 UTC
[llvm-dev] Encode target-abi into LLVM bitcode for LTO.
David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> 於 2020年1月11日 週六 上午2:03寫道:> Ah, OK - thanks for walking me through that. > > Fair enough, I think I understand the issue/tradeoff now - and that the > other module level metadata don't currently influence the target > configuration at this level? > >I'm not sure, I only know that the target-abi is decided and computed when instantiating the TargetMachine, and TragetMachine constructor doesn't have IR as argument. I guess this starts to come from the other direction for me, then: what> purpose does the "target triple" in the module textual IR serve? I guess > it's not used to create the backend, but at best used to validate that it's > correct/the same as what the backend's been configured with? (or perhaps > that is accessible before the Module proper is parsed?) >I'm not sure too. For RISC-V as Sam mention before, it's not easy to add ABI to the triple... but Yes, the target triple is accessible before the Module proper is parsed in current codegen flow. so in my PoC https://reviews.llvm.org/D72245#change-sHyISc6hOqcy, I need to changed the flow but I think static method is not a good idea.> > On Fri, Jan 10, 2020 at 9:09 AM Sam Elliott via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> I also work on the RISC-V backend, and have been doing a little work on >> the ELF psABI document for RISC-V. >> >> I agree that, conceptually, the psABI choice should be in the module >> metadata. >> >> Zakk, however, has discovered a phase ordering issue within LLVM that >> relates to this approach. The phase ordering problem is that the LTO >> backend is currently setup without interrogating the current module for any >> information which might affect its setup. He has two patches that propose >> two different ways forward, but he thinks neither is satisfactory. It would >> be useful to have some guidance as to which approach is preferred by LLVM. >> >> Of course, it would be great if any such adopted approach was compatible >> with Mips, given that backend seems to have a similar kind of problem >> (though they have also managed to solve it different ways). >> >> As an aside, adding the ABI to the triple is not particularly easy, as, >> for instance, there’s already half an ABI choice in the >> ‘riscv64-unknown-linux-gnu` “triple”. The gnu/musl ABI choice is orthogonal >> to the RISC-V calling convention choice around the use of software/hardware >> floating point ABIs, and saying you have to specify both in the triple >> seems like it will only confuse users. >> >> tldr: Module metadata seems like the right approach, feedback on which of >> the two of Zakk’s patches (from the original email) forms the better >> approach to get this information into the LTO backend setup would be >> helpful. >> >> Sam >> >> > On 10 Jan 2020, at 12:23 am, David Blaikie via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> > >> > On Wed, Jan 8, 2020 at 5:57 PM Eric Christopher <echristo at gmail.com> >> wrote: >> > Right. I think that's what we ended up doing rather than a more general >> attribute on the module itself. >> > >> > What sort of further generality did you have in mind? >> > >> > *shrugs* Probably ok? I'd probably prefer not to have to have target >> code to do the evaluation if possible, >> > >> > Target code to evaluate what, sorry? I'm not following (this whole >> conversation's started being a bit confusing to me - from what I can see, >> module metadata pretty well captures the feature request here (a module >> level target-specific attribute that can fail during IR linking if linking >> modules with inconsistent values for this attribute) >> > >> > but everything is weird and an edge case - mips abis more than some :) >> > >> > -eric >> > >> > On Wed, Jan 8, 2020 at 8:58 AM David Blaikie <dblaikie at gmail.com> >> wrote: >> > Oh, I should say - the module flags metadata also has support for >> "error if you try to merge two modules with different values for this flag". >> > >> > On Wed, Jan 8, 2020 at 8:57 AM David Blaikie <dblaikie at gmail.com> >> wrote: >> > >> > >> > On Tue, Jan 7, 2020 at 5:27 PM Eric Christopher <echristo at gmail.com> >> wrote: >> > >> > >> > On Tue, Jan 7, 2020 at 3:18 PM Daniel Sanders via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> > >> > >> >> On Jan 7, 2020, at 13:57, David Blaikie <dblaikie at gmail.com> wrote: >> >> >> >> >> >> >> >> On Mon, Jan 6, 2020 at 6:05 PM Daniel Sanders < >> daniel_l_sanders at apple.com> wrote: >> >> >> >> >> >>> On Jan 6, 2020, at 14:29, David Blaikie via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> >> >>> >> >>> >> >>> On Mon, Jan 6, 2020 at 5:58 AM Zakk <zakk0610 at gmail.com> wrote: >> >>> >> >>> >> >>> David Blaikie <dblaikie at gmail.com> 於 2020年1月6日 週一 下午2:23寫道: >> >>> If this is something that can vary per file in a compilation and >> resolve correctly when one object file is built with one ABI and another >> object file is built with a different ABI (that seems to be antithetical to >> the concept of "ABI" Though) - then it should be a subtarget feature. >> >>> >> >>> ABI is generally something that has to be agreed upon across object >> files - so it wouldn't make sense to link two object files with two >> different ABIs. What's going on here that makes that valid in this case? >> >>> >> >>> >> >>> Are you talking about that "[mips] Pass ABI name via -target-abi >> instead of target-features"? >> >>> >> >>> I'm not talking about that patch in particular (I have no specific >> knowledge of mips or its implementation) - but speaking about the general >> design of LLVM's subtarget features. >> >>> >> >>> Might be interesting to know why that change was made & may help >> explain what's going on here. >> >> >> >> It's been a while so I don't remember the detail but IIRC one of the >> reasons was that mips had a feature bit per ABI and had a lot of duplicated >> code sanity checking that only one bit was enabled and deriving the ABI >> from the feature bits. The -target-abi option already existed and using >> that prevented the possibility of having more than one ABI selected. >> >> >> >> There was a lot of code (some of which didn't have access to target >> features) in the backend that tried to derive the ABI from the arch >> component of the triple (e.g. mips64 => n64 ABI) even though there were >> multiple possible ABI's for each arch (mips64 => o32, n32, or n64 ABI's) >> and there isn't a canonical choice for any given triple (it varies between >> linux distributions and toolchains in general). Settling on -target-abi >> allowed us to sort out the inconsistencies in the backends opinion of what >> the selected ABI was. It also allowed us to move the selection of the ABI >> into the frontend where disagreements between distributions/toolchains on >> what each triple means was easier to deal with. >> >> >> >> Is this something that can vary per function in a program? (that seems >> confusing to me - ABI is usually, sort of by definition, the thing that all >> parts of the program have to agree with (at least on either side of any >> function call - I suppose different functions could have different ABIs as >> long as the function declarations carried ABI information so callers could >> cooperate, etc)) It sounds to me like that's what Zakk is >> suggesting/grappling with. >> > >> > No, it was a per-binary thing for mips and was stored in the ELF >> header. Ignoring a couple quirks*, every object in the program had to agree >> on the ABI in order to link. >> > >> > I'm not particularly familiar with LTO but going by the description of >> the problem it seems to me that the overall issue is that for 1, 2, and 5, >> each module fails to completely describe the contents. They each have a >> label saying it's riscv64, elf, etc. but it doesn't mention lp64d anywhere. >> As a result you can't check that you aren't trying to mix incompatible >> modules and can only trust (and require) the command line option. It's >> worth mentioning that DataLayout tends to change for different ABI's so the >> ABI is kind-of there but there isn't anything that really guarantees that >> there's a 1:1 relationship. >> > >> > 3 and 4 fix the problem of the missing labels but the snag with 4 is >> that target features are overridable at the function level too and that >> doesn't really make sense for ABI's (it's fine for calling conventions but >> that's only part of the ABI and calling conventions are described elsewhere >> in the IR anyway). Without changing the IR, 3 looks like the only one that >> solves the overall problem but then you have potential for problems where >> the official triple for a platform doesn't match what needs to be in the >> triple metadata in the IR. For example, mips64-linux-gnu can be N32 or N64 >> ABI (or more rarely O32) depending on the >> OS/distribution/toolchain/version. FWIW, back when I worked on it, we were >> generally moving towards the idea of canonical triples which contained the >> ABI and some lowering code on the user facing interfaces to disambiguate >> things like mips64-linux-gnu to mips64-linux-gnuabin32. >> > >> > >> > To reply here a bit: >> > >> > I worry about target triple being used, but I think I do/did agree that >> it's probably the best we can move to in the near term. My concern is that >> we will have diverged from more "canonical" triples that are used in other >> places just for our compilation model. I'd love to be able to encode ABI >> into the module in some way and make it an error to link two modules that >> have incompatible ABIs and am definitely up to ideas on how to encode >> target specific module level data into the module. I'd like to avoid >> metadata if possible. Any thoughts here on how you'd like to see it encoded >> for the long term? >> > >> > I was going to say - module metadata has those semantics (but, yeah, do >> have the "this is load bearing metadata" problem). >> > >> > But let's see what else is already there - NumRegisterParameters (no >> idea what that is, but that sounds like an ABI feature/not something you >> could drop & maintain correctness), Dwarf Version (kinda), CodeView >> (kinda), PIC level (probably load bearing), PIE level (similar), Code Model >> (load bearing). >> > >> > https://llvm.org/docs/LangRef.html#module-flags-metadata - yeah, I >> think this is probably the right tool for this now, given what else is >> already here. If someone wants to make this solution not metadata (but I >> think this "global flags metadata" is specifically treated to have all the >> semantics we want here, so I'm not sure what else would be created that >> didn't look basically the same) then it can be done & all these things can >> be ported over to whatever that new thing is. >> > >> > >> > *Just for completeness, the quirks I can remember off-hand were: >> > - IEEE754 1985 and 2008 would successfully cross-link unless you used a >> flag indicating that it mattered. This was because we wanted to omit the >> 1985 standard from newer chips but there were many ecosystems using it due >> to historical reasons. In practice, very few programs care about the tiny >> details (does negation trap, etc.) so we essentially force-migrated whole >> ecosystems by relaxing the link requirements and changing the default. >> > - Along the same lines, we also supported cross-linking specific >> variants of the O32 ABI. There was only supposed to be one O32 but an >> unfortunate mis-reading of the ABI spec coupled with a failure to catch it >> with conformance tests split it in two. Luckily, Matthew Fortune found a >> way to reunite them without breaking either one by adding a third that >> followed the original intent of the spec and was compatible with either one >> (but not both at once) and then migrating everyone to that. >> > >> > >> > *sigh* I remember that. :) >> > >> > Thanks for chiming in Daniel! >> > >> > -eric >> > >> >> If it can vary per function, then the ABI information shouldn't be >> used outside the per-function context (ie: no global variables/other output >> could depend on the ABI because which function's ABI would it depend on?). >> >> >> >> >> >>> I don't know WHY -target-abi is passing via different option, not via >> -mattr (subtarget feature) >> >>> maybe usually subtarget feature is used to manages different specific >> ISA. >> >>> >> >>> >> >>> On Sun, Jan 5, 2020 at 10:04 PM Zakk via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> Hi all. >> >>> >> >>> There are two steps in LTO codegen so the problem is how to pass ABI >> info into LTO code generator. >> >>> >> >>> The easier way is pass -target-abi via option to LTO codegen, but >> there is linking issue when linking two bitcodes generated by different >> -mabi option. (see https://reviews.llvm.org/D71387#1792169) >> >>> >> >>> Usually the ABI info for a file is derived from target triple, mcpu >> or -mabi, but in RISC-V, target-abi is only derived from -mabi and -mattr >> option, so the one of solutions is encoding target-abi in IR via LLVM >> module flags metadata. >> >>> >> >>> But there is an another issue in assembler. In current LLVM design, >> there is no mechanism to extract info from IR before AsmBackend >> construction, so I use some little weird approach to init target-abi option >> before construct AsmBackend[1] or reassign target-abi option in >> getSubtargetImpl and do some hack in backend[2]. >> >>> >> >>> 1. https://reviews.llvm.org/D72245#change-sHyISc6hOqcy (see llc.cpp) >> >>> 2. https://reviews.llvm.org/D72246 (see RISCVAsmBackend.h) >> >>> >> >>> I think [1] and [2] are not good enough, the other ideals like >> >>> >> >>> 3. encode target abi info in triple name. ex. >> riscv64-unknown-elf-lp64d >> >>> 4. encode target-abi into in target-feature (maybe it's not a good >> ideal because mips reverted this approach >> >>> before. >> http://llvm.org/viewvc/llvm-project?view=revision&revision=227583) >> >>> >> >>> 5. users should pass target-abi themselves. (append >> -Wl,-plugin-opt=-target-abi=ipl32f when compiling with -mabi=ilp32f) >> >>> >> >>> Is it a good idea to encode target-abi into bitcode? >> >>> If yes, is there another good approach to fix AsmBackend issue? >> >>> I’d appreciate any help or suggestions. >> >>> >> >>> Thanks. >> >>> >> >>> -- >> >>> Best regards, >> >>> Kuan-Hsu >> >>> >> >>> >> >>> >> >>> _______________________________________________ >> >>> LLVM Developers mailing list >> >>> llvm-dev at lists.llvm.org >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >>> >> >>> >> >>> -- >> >>> Best regards, >> >>> Kuan-Hsu >> >>> >> >>> >> >>> _______________________________________________ >> >>> LLVM Developers mailing list >> >>> llvm-dev at lists.llvm.org >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > >> > _______________________________________________ >> > LLVM Developers mailing list >> > llvm-dev at lists.llvm.org >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > _______________________________________________ >> > LLVM Developers mailing list >> > llvm-dev at lists.llvm.org >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >> -- >> Sam Elliott >> Software Developer - LLVM >> lowRISC CIC >> -- >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Best regards, Kuan-Hsu -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200113/820e3c65/attachment.html>
David Blaikie via llvm-dev
2020-Jan-13 18:15 UTC
[llvm-dev] Encode target-abi into LLVM bitcode for LTO.
On Mon, Jan 13, 2020 at 6:12 AM Zakk <zakk0610 at gmail.com> wrote:> > > David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> 於 2020年1月11日 週六 > 上午2:03寫道: > >> Ah, OK - thanks for walking me through that. >> >> Fair enough, I think I understand the issue/tradeoff now - and that the >> other module level metadata don't currently influence the target >> configuration at this level? >> >> > I'm not sure, I only know that the target-abi is decided and computed when > instantiating the TargetMachine, and TragetMachine constructor doesn't have > IR as argument. >Could you look into the other module level metadata to understand/explain that, as it may help inform the design here?> I guess this starts to come from the other direction for me, then: what >> purpose does the "target triple" in the module textual IR serve? I guess >> it's not used to create the backend, but at best used to validate that it's >> correct/the same as what the backend's been configured with? (or perhaps >> that is accessible before the Module proper is parsed?) >> > > I'm not sure too. For RISC-V as Sam mention before, it's not easy to add > ABI to the triple... but Yes, the target triple is accessible before the > Module proper is parsed in current codegen flow. > so in my PoC https://reviews.llvm.org/D72245#change-sHyISc6hOqcy, I need > to changed the flow but I think static method is not a good idea. >And this too. If the triple from the IR file is used in the places you need this new ABI parameter, it might inform how best to pass this other data through too.> On Fri, Jan 10, 2020 at 9:09 AM Sam Elliott via llvm-dev < >> llvm-dev at lists.llvm.org> wrote: >> >>> I also work on the RISC-V backend, and have been doing a little work on >>> the ELF psABI document for RISC-V. >>> >>> I agree that, conceptually, the psABI choice should be in the module >>> metadata. >>> >>> Zakk, however, has discovered a phase ordering issue within LLVM that >>> relates to this approach. The phase ordering problem is that the LTO >>> backend is currently setup without interrogating the current module for any >>> information which might affect its setup. He has two patches that propose >>> two different ways forward, but he thinks neither is satisfactory. It would >>> be useful to have some guidance as to which approach is preferred by LLVM. >>> >>> Of course, it would be great if any such adopted approach was compatible >>> with Mips, given that backend seems to have a similar kind of problem >>> (though they have also managed to solve it different ways). >>> >>> As an aside, adding the ABI to the triple is not particularly easy, as, >>> for instance, there’s already half an ABI choice in the >>> ‘riscv64-unknown-linux-gnu` “triple”. The gnu/musl ABI choice is orthogonal >>> to the RISC-V calling convention choice around the use of software/hardware >>> floating point ABIs, and saying you have to specify both in the triple >>> seems like it will only confuse users. >>> >>> tldr: Module metadata seems like the right approach, feedback on which >>> of the two of Zakk’s patches (from the original email) forms the better >>> approach to get this information into the LTO backend setup would be >>> helpful. >>> >>> Sam >>> >>> > On 10 Jan 2020, at 12:23 am, David Blaikie via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> > >>> > On Wed, Jan 8, 2020 at 5:57 PM Eric Christopher <echristo at gmail.com> >>> wrote: >>> > Right. I think that's what we ended up doing rather than a more >>> general attribute on the module itself. >>> > >>> > What sort of further generality did you have in mind? >>> > >>> > *shrugs* Probably ok? I'd probably prefer not to have to have target >>> code to do the evaluation if possible, >>> > >>> > Target code to evaluate what, sorry? I'm not following (this whole >>> conversation's started being a bit confusing to me - from what I can see, >>> module metadata pretty well captures the feature request here (a module >>> level target-specific attribute that can fail during IR linking if linking >>> modules with inconsistent values for this attribute) >>> > >>> > but everything is weird and an edge case - mips abis more than some :) >>> > >>> > -eric >>> > >>> > On Wed, Jan 8, 2020 at 8:58 AM David Blaikie <dblaikie at gmail.com> >>> wrote: >>> > Oh, I should say - the module flags metadata also has support for >>> "error if you try to merge two modules with different values for this flag". >>> > >>> > On Wed, Jan 8, 2020 at 8:57 AM David Blaikie <dblaikie at gmail.com> >>> wrote: >>> > >>> > >>> > On Tue, Jan 7, 2020 at 5:27 PM Eric Christopher <echristo at gmail.com> >>> wrote: >>> > >>> > >>> > On Tue, Jan 7, 2020 at 3:18 PM Daniel Sanders via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> > >>> > >>> >> On Jan 7, 2020, at 13:57, David Blaikie <dblaikie at gmail.com> wrote: >>> >> >>> >> >>> >> >>> >> On Mon, Jan 6, 2020 at 6:05 PM Daniel Sanders < >>> daniel_l_sanders at apple.com> wrote: >>> >> >>> >> >>> >>> On Jan 6, 2020, at 14:29, David Blaikie via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>> >>> >>> >>> >>> >>> >>> On Mon, Jan 6, 2020 at 5:58 AM Zakk <zakk0610 at gmail.com> wrote: >>> >>> >>> >>> >>> >>> David Blaikie <dblaikie at gmail.com> 於 2020年1月6日 週一 下午2:23寫道: >>> >>> If this is something that can vary per file in a compilation and >>> resolve correctly when one object file is built with one ABI and another >>> object file is built with a different ABI (that seems to be antithetical to >>> the concept of "ABI" Though) - then it should be a subtarget feature. >>> >>> >>> >>> ABI is generally something that has to be agreed upon across object >>> files - so it wouldn't make sense to link two object files with two >>> different ABIs. What's going on here that makes that valid in this case? >>> >>> >>> >>> >>> >>> Are you talking about that "[mips] Pass ABI name via -target-abi >>> instead of target-features"? >>> >>> >>> >>> I'm not talking about that patch in particular (I have no specific >>> knowledge of mips or its implementation) - but speaking about the general >>> design of LLVM's subtarget features. >>> >>> >>> >>> Might be interesting to know why that change was made & may help >>> explain what's going on here. >>> >> >>> >> It's been a while so I don't remember the detail but IIRC one of the >>> reasons was that mips had a feature bit per ABI and had a lot of duplicated >>> code sanity checking that only one bit was enabled and deriving the ABI >>> from the feature bits. The -target-abi option already existed and using >>> that prevented the possibility of having more than one ABI selected. >>> >> >>> >> There was a lot of code (some of which didn't have access to target >>> features) in the backend that tried to derive the ABI from the arch >>> component of the triple (e.g. mips64 => n64 ABI) even though there were >>> multiple possible ABI's for each arch (mips64 => o32, n32, or n64 ABI's) >>> and there isn't a canonical choice for any given triple (it varies between >>> linux distributions and toolchains in general). Settling on -target-abi >>> allowed us to sort out the inconsistencies in the backends opinion of what >>> the selected ABI was. It also allowed us to move the selection of the ABI >>> into the frontend where disagreements between distributions/toolchains on >>> what each triple means was easier to deal with. >>> >> >>> >> Is this something that can vary per function in a program? (that >>> seems confusing to me - ABI is usually, sort of by definition, the thing >>> that all parts of the program have to agree with (at least on either side >>> of any function call - I suppose different functions could have different >>> ABIs as long as the function declarations carried ABI information so >>> callers could cooperate, etc)) It sounds to me like that's what Zakk is >>> suggesting/grappling with. >>> > >>> > No, it was a per-binary thing for mips and was stored in the ELF >>> header. Ignoring a couple quirks*, every object in the program had to agree >>> on the ABI in order to link. >>> > >>> > I'm not particularly familiar with LTO but going by the description of >>> the problem it seems to me that the overall issue is that for 1, 2, and 5, >>> each module fails to completely describe the contents. They each have a >>> label saying it's riscv64, elf, etc. but it doesn't mention lp64d anywhere. >>> As a result you can't check that you aren't trying to mix incompatible >>> modules and can only trust (and require) the command line option. It's >>> worth mentioning that DataLayout tends to change for different ABI's so the >>> ABI is kind-of there but there isn't anything that really guarantees that >>> there's a 1:1 relationship. >>> > >>> > 3 and 4 fix the problem of the missing labels but the snag with 4 is >>> that target features are overridable at the function level too and that >>> doesn't really make sense for ABI's (it's fine for calling conventions but >>> that's only part of the ABI and calling conventions are described elsewhere >>> in the IR anyway). Without changing the IR, 3 looks like the only one that >>> solves the overall problem but then you have potential for problems where >>> the official triple for a platform doesn't match what needs to be in the >>> triple metadata in the IR. For example, mips64-linux-gnu can be N32 or N64 >>> ABI (or more rarely O32) depending on the >>> OS/distribution/toolchain/version. FWIW, back when I worked on it, we were >>> generally moving towards the idea of canonical triples which contained the >>> ABI and some lowering code on the user facing interfaces to disambiguate >>> things like mips64-linux-gnu to mips64-linux-gnuabin32. >>> > >>> > >>> > To reply here a bit: >>> > >>> > I worry about target triple being used, but I think I do/did agree >>> that it's probably the best we can move to in the near term. My concern is >>> that we will have diverged from more "canonical" triples that are used in >>> other places just for our compilation model. I'd love to be able to encode >>> ABI into the module in some way and make it an error to link two modules >>> that have incompatible ABIs and am definitely up to ideas on how to encode >>> target specific module level data into the module. I'd like to avoid >>> metadata if possible. Any thoughts here on how you'd like to see it encoded >>> for the long term? >>> > >>> > I was going to say - module metadata has those semantics (but, yeah, >>> do have the "this is load bearing metadata" problem). >>> > >>> > But let's see what else is already there - NumRegisterParameters (no >>> idea what that is, but that sounds like an ABI feature/not something you >>> could drop & maintain correctness), Dwarf Version (kinda), CodeView >>> (kinda), PIC level (probably load bearing), PIE level (similar), Code Model >>> (load bearing). >>> > >>> > https://llvm.org/docs/LangRef.html#module-flags-metadata - yeah, I >>> think this is probably the right tool for this now, given what else is >>> already here. If someone wants to make this solution not metadata (but I >>> think this "global flags metadata" is specifically treated to have all the >>> semantics we want here, so I'm not sure what else would be created that >>> didn't look basically the same) then it can be done & all these things can >>> be ported over to whatever that new thing is. >>> > >>> > >>> > *Just for completeness, the quirks I can remember off-hand were: >>> > - IEEE754 1985 and 2008 would successfully cross-link unless you used >>> a flag indicating that it mattered. This was because we wanted to omit the >>> 1985 standard from newer chips but there were many ecosystems using it due >>> to historical reasons. In practice, very few programs care about the tiny >>> details (does negation trap, etc.) so we essentially force-migrated whole >>> ecosystems by relaxing the link requirements and changing the default. >>> > - Along the same lines, we also supported cross-linking specific >>> variants of the O32 ABI. There was only supposed to be one O32 but an >>> unfortunate mis-reading of the ABI spec coupled with a failure to catch it >>> with conformance tests split it in two. Luckily, Matthew Fortune found a >>> way to reunite them without breaking either one by adding a third that >>> followed the original intent of the spec and was compatible with either one >>> (but not both at once) and then migrating everyone to that. >>> > >>> > >>> > *sigh* I remember that. :) >>> > >>> > Thanks for chiming in Daniel! >>> > >>> > -eric >>> > >>> >> If it can vary per function, then the ABI information shouldn't be >>> used outside the per-function context (ie: no global variables/other output >>> could depend on the ABI because which function's ABI would it depend on?). >>> >> >>> >> >>> >>> I don't know WHY -target-abi is passing via different option, not >>> via -mattr (subtarget feature) >>> >>> maybe usually subtarget feature is used to manages different >>> specific ISA. >>> >>> >>> >>> >>> >>> On Sun, Jan 5, 2020 at 10:04 PM Zakk via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>> Hi all. >>> >>> >>> >>> There are two steps in LTO codegen so the problem is how to pass ABI >>> info into LTO code generator. >>> >>> >>> >>> The easier way is pass -target-abi via option to LTO codegen, but >>> there is linking issue when linking two bitcodes generated by different >>> -mabi option. (see https://reviews.llvm.org/D71387#1792169) >>> >>> >>> >>> Usually the ABI info for a file is derived from target triple, mcpu >>> or -mabi, but in RISC-V, target-abi is only derived from -mabi and -mattr >>> option, so the one of solutions is encoding target-abi in IR via LLVM >>> module flags metadata. >>> >>> >>> >>> But there is an another issue in assembler. In current LLVM design, >>> there is no mechanism to extract info from IR before AsmBackend >>> construction, so I use some little weird approach to init target-abi option >>> before construct AsmBackend[1] or reassign target-abi option in >>> getSubtargetImpl and do some hack in backend[2]. >>> >>> >>> >>> 1. https://reviews.llvm.org/D72245#change-sHyISc6hOqcy (see llc.cpp) >>> >>> 2. https://reviews.llvm.org/D72246 (see RISCVAsmBackend.h) >>> >>> >>> >>> I think [1] and [2] are not good enough, the other ideals like >>> >>> >>> >>> 3. encode target abi info in triple name. ex. >>> riscv64-unknown-elf-lp64d >>> >>> 4. encode target-abi into in target-feature (maybe it's not a good >>> ideal because mips reverted this approach >>> >>> before. >>> http://llvm.org/viewvc/llvm-project?view=revision&revision=227583) >>> >>> >>> >>> 5. users should pass target-abi themselves. (append >>> -Wl,-plugin-opt=-target-abi=ipl32f when compiling with -mabi=ilp32f) >>> >>> >>> >>> Is it a good idea to encode target-abi into bitcode? >>> >>> If yes, is there another good approach to fix AsmBackend issue? >>> >>> I’d appreciate any help or suggestions. >>> >>> >>> >>> Thanks. >>> >>> >>> >>> -- >>> >>> Best regards, >>> >>> Kuan-Hsu >>> >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> >>> LLVM Developers mailing list >>> >>> llvm-dev at lists.llvm.org >>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>> >>> >>> >>> >>> -- >>> >>> Best regards, >>> >>> Kuan-Hsu >>> >>> >>> >>> >>> >>> _______________________________________________ >>> >>> LLVM Developers mailing list >>> >>> llvm-dev at lists.llvm.org >>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> > >>> > _______________________________________________ >>> > LLVM Developers mailing list >>> > llvm-dev at lists.llvm.org >>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> > _______________________________________________ >>> > LLVM Developers mailing list >>> > llvm-dev at lists.llvm.org >>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>> -- >>> Sam Elliott >>> Software Developer - LLVM >>> lowRISC CIC >>> -- >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> > > > -- > Best regards, > Kuan-Hsu > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200113/3cc7628a/attachment-0001.html>
Zakk via llvm-dev
2020-Jan-15 09:37 UTC
[llvm-dev] Encode target-abi into LLVM bitcode for LTO.
David Blaikie <dblaikie at gmail.com> 於 2020年1月14日 週二 上午2:15寫道:> > > On Mon, Jan 13, 2020 at 6:12 AM Zakk <zakk0610 at gmail.com> wrote: > >> >> >> David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> 於 2020年1月11日 週六 >> 上午2:03寫道: >> >>> Ah, OK - thanks for walking me through that. >>> >>> Fair enough, I think I understand the issue/tradeoff now - and that the >>> other module level metadata don't currently influence the target >>> configuration at this level? >>> >>> >> I'm not sure, I only know that the target-abi is decided and computed >> when instantiating the TargetMachine, and TragetMachine constructor doesn't >> have IR as argument. >> > > Could you look into the other module level metadata to understand/explain > that, as it may help inform the design here? > >I grep "addModuleFlag" in clang and list all module level metadata: "NumRegisterParameters": implement __attribute((regparm(3))) which is x86 specific feature and used in X86TargetLowering phase. "cfguard": implement Control Flow Guard (Check mechanism) in IR level pass. "StrictVTablePointers", "StrictVTablePointersRequirement": there is no code use this info in llvm source tree "min_enum_size", "wchar_size": indicate C type size, ARMAsmPrinter use it to emit corresponded asm attributes. "Cross-DSO CFI", "CFI Canonical Jump Tables": implement control flow integrity (CFI) schemes in IR level pass. "cf-protection-return" and "cf-protection-branch": implement control flow architecture protection in x86, used in X86ISelLowering and X86AsmPrinter phases. "nvvm-reflect-ftz": implement optimized code paths that flush subnormals to zero in NVPTX custom IR pass. "EnableSplitLTOUnit", "ThinLTO": LTO related stuff, I didn't look that because I think they will not influence the target configuration. I didn't dig in those "Dwarf Version", "CodeView", "CodeViewGHash", "Debug Info Version" and others Objective-C related info in module level metadata. But I found several metadata info are handled at AsmPrinter::EmitStartOfAsmFile(Module &M), but there is no any similar interface in MC layer. ex. MCAsmBackend or MCStreamer.> I guess this starts to come from the other direction for me, then: what >>> purpose does the "target triple" in the module textual IR serve? I guess >>> it's not used to create the backend, but at best used to validate that it's >>> correct/the same as what the backend's been configured with? (or perhaps >>> that is accessible before the Module proper is parsed?) >>> >> >> I'm not sure too. For RISC-V as Sam mention before, it's not easy to add >> ABI to the triple... but Yes, the target triple is accessible before the >> Module proper is parsed in current codegen flow. >> so in my PoC https://reviews.llvm.org/D72245#change-sHyISc6hOqcy, I need >> to changed the flow but I think static method is not a good idea. >> > > And this too. If the triple from the IR file is used in the places you > need this new ABI parameter, it might inform how best to pass this other > data through too. >> >> On Fri, Jan 10, 2020 at 9:09 AM Sam Elliott via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> I also work on the RISC-V backend, and have been doing a little work on >>>> the ELF psABI document for RISC-V. >>>> >>>> I agree that, conceptually, the psABI choice should be in the module >>>> metadata. >>>> >>>> Zakk, however, has discovered a phase ordering issue within LLVM that >>>> relates to this approach. The phase ordering problem is that the LTO >>>> backend is currently setup without interrogating the current module for any >>>> information which might affect its setup. He has two patches that propose >>>> two different ways forward, but he thinks neither is satisfactory. It would >>>> be useful to have some guidance as to which approach is preferred by LLVM. >>>> >>>> Of course, it would be great if any such adopted approach was >>>> compatible with Mips, given that backend seems to have a similar kind of >>>> problem (though they have also managed to solve it different ways). >>>> >>>> As an aside, adding the ABI to the triple is not particularly easy, as, >>>> for instance, there’s already half an ABI choice in the >>>> ‘riscv64-unknown-linux-gnu` “triple”. The gnu/musl ABI choice is orthogonal >>>> to the RISC-V calling convention choice around the use of software/hardware >>>> floating point ABIs, and saying you have to specify both in the triple >>>> seems like it will only confuse users. >>>> >>>> tldr: Module metadata seems like the right approach, feedback on which >>>> of the two of Zakk’s patches (from the original email) forms the better >>>> approach to get this information into the LTO backend setup would be >>>> helpful. >>>> >>>> Sam >>>> >>>> > On 10 Jan 2020, at 12:23 am, David Blaikie via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> > >>>> > On Wed, Jan 8, 2020 at 5:57 PM Eric Christopher <echristo at gmail.com> >>>> wrote: >>>> > Right. I think that's what we ended up doing rather than a more >>>> general attribute on the module itself. >>>> > >>>> > What sort of further generality did you have in mind? >>>> > >>>> > *shrugs* Probably ok? I'd probably prefer not to have to have target >>>> code to do the evaluation if possible, >>>> > >>>> > Target code to evaluate what, sorry? I'm not following (this whole >>>> conversation's started being a bit confusing to me - from what I can see, >>>> module metadata pretty well captures the feature request here (a module >>>> level target-specific attribute that can fail during IR linking if linking >>>> modules with inconsistent values for this attribute) >>>> > >>>> > but everything is weird and an edge case - mips abis more than some :) >>>> > >>>> > -eric >>>> > >>>> > On Wed, Jan 8, 2020 at 8:58 AM David Blaikie <dblaikie at gmail.com> >>>> wrote: >>>> > Oh, I should say - the module flags metadata also has support for >>>> "error if you try to merge two modules with different values for this flag". >>>> > >>>> > On Wed, Jan 8, 2020 at 8:57 AM David Blaikie <dblaikie at gmail.com> >>>> wrote: >>>> > >>>> > >>>> > On Tue, Jan 7, 2020 at 5:27 PM Eric Christopher <echristo at gmail.com> >>>> wrote: >>>> > >>>> > >>>> > On Tue, Jan 7, 2020 at 3:18 PM Daniel Sanders via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> > >>>> > >>>> >> On Jan 7, 2020, at 13:57, David Blaikie <dblaikie at gmail.com> wrote: >>>> >> >>>> >> >>>> >> >>>> >> On Mon, Jan 6, 2020 at 6:05 PM Daniel Sanders < >>>> daniel_l_sanders at apple.com> wrote: >>>> >> >>>> >> >>>> >>> On Jan 6, 2020, at 14:29, David Blaikie via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> >>> >>>> >>> >>>> >>> >>>> >>> On Mon, Jan 6, 2020 at 5:58 AM Zakk <zakk0610 at gmail.com> wrote: >>>> >>> >>>> >>> >>>> >>> David Blaikie <dblaikie at gmail.com> 於 2020年1月6日 週一 下午2:23寫道: >>>> >>> If this is something that can vary per file in a compilation and >>>> resolve correctly when one object file is built with one ABI and another >>>> object file is built with a different ABI (that seems to be antithetical to >>>> the concept of "ABI" Though) - then it should be a subtarget feature. >>>> >>> >>>> >>> ABI is generally something that has to be agreed upon across object >>>> files - so it wouldn't make sense to link two object files with two >>>> different ABIs. What's going on here that makes that valid in this case? >>>> >>> >>>> >>> >>>> >>> Are you talking about that "[mips] Pass ABI name via -target-abi >>>> instead of target-features"? >>>> >>> >>>> >>> I'm not talking about that patch in particular (I have no specific >>>> knowledge of mips or its implementation) - but speaking about the general >>>> design of LLVM's subtarget features. >>>> >>> >>>> >>> Might be interesting to know why that change was made & may help >>>> explain what's going on here. >>>> >> >>>> >> It's been a while so I don't remember the detail but IIRC one of the >>>> reasons was that mips had a feature bit per ABI and had a lot of duplicated >>>> code sanity checking that only one bit was enabled and deriving the ABI >>>> from the feature bits. The -target-abi option already existed and using >>>> that prevented the possibility of having more than one ABI selected. >>>> >> >>>> >> There was a lot of code (some of which didn't have access to target >>>> features) in the backend that tried to derive the ABI from the arch >>>> component of the triple (e.g. mips64 => n64 ABI) even though there were >>>> multiple possible ABI's for each arch (mips64 => o32, n32, or n64 ABI's) >>>> and there isn't a canonical choice for any given triple (it varies between >>>> linux distributions and toolchains in general). Settling on -target-abi >>>> allowed us to sort out the inconsistencies in the backends opinion of what >>>> the selected ABI was. It also allowed us to move the selection of the ABI >>>> into the frontend where disagreements between distributions/toolchains on >>>> what each triple means was easier to deal with. >>>> >> >>>> >> Is this something that can vary per function in a program? (that >>>> seems confusing to me - ABI is usually, sort of by definition, the thing >>>> that all parts of the program have to agree with (at least on either side >>>> of any function call - I suppose different functions could have different >>>> ABIs as long as the function declarations carried ABI information so >>>> callers could cooperate, etc)) It sounds to me like that's what Zakk is >>>> suggesting/grappling with. >>>> > >>>> > No, it was a per-binary thing for mips and was stored in the ELF >>>> header. Ignoring a couple quirks*, every object in the program had to agree >>>> on the ABI in order to link. >>>> > >>>> > I'm not particularly familiar with LTO but going by the description >>>> of the problem it seems to me that the overall issue is that for 1, 2, and >>>> 5, each module fails to completely describe the contents. They each have a >>>> label saying it's riscv64, elf, etc. but it doesn't mention lp64d anywhere. >>>> As a result you can't check that you aren't trying to mix incompatible >>>> modules and can only trust (and require) the command line option. It's >>>> worth mentioning that DataLayout tends to change for different ABI's so the >>>> ABI is kind-of there but there isn't anything that really guarantees that >>>> there's a 1:1 relationship. >>>> > >>>> > 3 and 4 fix the problem of the missing labels but the snag with 4 is >>>> that target features are overridable at the function level too and that >>>> doesn't really make sense for ABI's (it's fine for calling conventions but >>>> that's only part of the ABI and calling conventions are described elsewhere >>>> in the IR anyway). Without changing the IR, 3 looks like the only one that >>>> solves the overall problem but then you have potential for problems where >>>> the official triple for a platform doesn't match what needs to be in the >>>> triple metadata in the IR. For example, mips64-linux-gnu can be N32 or N64 >>>> ABI (or more rarely O32) depending on the >>>> OS/distribution/toolchain/version. FWIW, back when I worked on it, we were >>>> generally moving towards the idea of canonical triples which contained the >>>> ABI and some lowering code on the user facing interfaces to disambiguate >>>> things like mips64-linux-gnu to mips64-linux-gnuabin32. >>>> > >>>> > >>>> > To reply here a bit: >>>> > >>>> > I worry about target triple being used, but I think I do/did agree >>>> that it's probably the best we can move to in the near term. My concern is >>>> that we will have diverged from more "canonical" triples that are used in >>>> other places just for our compilation model. I'd love to be able to encode >>>> ABI into the module in some way and make it an error to link two modules >>>> that have incompatible ABIs and am definitely up to ideas on how to encode >>>> target specific module level data into the module. I'd like to avoid >>>> metadata if possible. Any thoughts here on how you'd like to see it encoded >>>> for the long term? >>>> > >>>> > I was going to say - module metadata has those semantics (but, yeah, >>>> do have the "this is load bearing metadata" problem). >>>> > >>>> > But let's see what else is already there - NumRegisterParameters (no >>>> idea what that is, but that sounds like an ABI feature/not something you >>>> could drop & maintain correctness), Dwarf Version (kinda), CodeView >>>> (kinda), PIC level (probably load bearing), PIE level (similar), Code Model >>>> (load bearing). >>>> > >>>> > https://llvm.org/docs/LangRef.html#module-flags-metadata - yeah, I >>>> think this is probably the right tool for this now, given what else is >>>> already here. If someone wants to make this solution not metadata (but I >>>> think this "global flags metadata" is specifically treated to have all the >>>> semantics we want here, so I'm not sure what else would be created that >>>> didn't look basically the same) then it can be done & all these things can >>>> be ported over to whatever that new thing is. >>>> > >>>> > >>>> > *Just for completeness, the quirks I can remember off-hand were: >>>> > - IEEE754 1985 and 2008 would successfully cross-link unless you used >>>> a flag indicating that it mattered. This was because we wanted to omit the >>>> 1985 standard from newer chips but there were many ecosystems using it due >>>> to historical reasons. In practice, very few programs care about the tiny >>>> details (does negation trap, etc.) so we essentially force-migrated whole >>>> ecosystems by relaxing the link requirements and changing the default. >>>> > - Along the same lines, we also supported cross-linking specific >>>> variants of the O32 ABI. There was only supposed to be one O32 but an >>>> unfortunate mis-reading of the ABI spec coupled with a failure to catch it >>>> with conformance tests split it in two. Luckily, Matthew Fortune found a >>>> way to reunite them without breaking either one by adding a third that >>>> followed the original intent of the spec and was compatible with either one >>>> (but not both at once) and then migrating everyone to that. >>>> > >>>> > >>>> > *sigh* I remember that. :) >>>> > >>>> > Thanks for chiming in Daniel! >>>> > >>>> > -eric >>>> > >>>> >> If it can vary per function, then the ABI information shouldn't be >>>> used outside the per-function context (ie: no global variables/other output >>>> could depend on the ABI because which function's ABI would it depend on?). >>>> >> >>>> >> >>>> >>> I don't know WHY -target-abi is passing via different option, not >>>> via -mattr (subtarget feature) >>>> >>> maybe usually subtarget feature is used to manages different >>>> specific ISA. >>>> >>> >>>> >>> >>>> >>> On Sun, Jan 5, 2020 at 10:04 PM Zakk via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> >>> Hi all. >>>> >>> >>>> >>> There are two steps in LTO codegen so the problem is how to pass >>>> ABI info into LTO code generator. >>>> >>> >>>> >>> The easier way is pass -target-abi via option to LTO codegen, but >>>> there is linking issue when linking two bitcodes generated by different >>>> -mabi option. (see https://reviews.llvm.org/D71387#1792169) >>>> >>> >>>> >>> Usually the ABI info for a file is derived from target triple, mcpu >>>> or -mabi, but in RISC-V, target-abi is only derived from -mabi and -mattr >>>> option, so the one of solutions is encoding target-abi in IR via LLVM >>>> module flags metadata. >>>> >>> >>>> >>> But there is an another issue in assembler. In current LLVM design, >>>> there is no mechanism to extract info from IR before AsmBackend >>>> construction, so I use some little weird approach to init target-abi option >>>> before construct AsmBackend[1] or reassign target-abi option in >>>> getSubtargetImpl and do some hack in backend[2]. >>>> >>> >>>> >>> 1. https://reviews.llvm.org/D72245#change-sHyISc6hOqcy (see >>>> llc.cpp) >>>> >>> 2. https://reviews.llvm.org/D72246 (see RISCVAsmBackend.h) >>>> >>> >>>> >>> I think [1] and [2] are not good enough, the other ideals like >>>> >>> >>>> >>> 3. encode target abi info in triple name. ex. >>>> riscv64-unknown-elf-lp64d >>>> >>> 4. encode target-abi into in target-feature (maybe it's not a good >>>> ideal because mips reverted this approach >>>> >>> before. >>>> http://llvm.org/viewvc/llvm-project?view=revision&revision=227583) >>>> >>> >>>> >>> 5. users should pass target-abi themselves. (append >>>> -Wl,-plugin-opt=-target-abi=ipl32f when compiling with -mabi=ilp32f) >>>> >>> >>>> >>> Is it a good idea to encode target-abi into bitcode? >>>> >>> If yes, is there another good approach to fix AsmBackend issue? >>>> >>> I’d appreciate any help or suggestions. >>>> >>> >>>> >>> Thanks. >>>> >>> >>>> >>> -- >>>> >>> Best regards, >>>> >>> Kuan-Hsu >>>> >>> >>>> >>> >>>> >>> >>>> >>> _______________________________________________ >>>> >>> LLVM Developers mailing list >>>> >>> llvm-dev at lists.llvm.org >>>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>> >>>> >>> >>>> >>> -- >>>> >>> Best regards, >>>> >>> Kuan-Hsu >>>> >>> >>>> >>> >>>> >>> _______________________________________________ >>>> >>> LLVM Developers mailing list >>>> >>> llvm-dev at lists.llvm.org >>>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> > >>>> > _______________________________________________ >>>> > LLVM Developers mailing list >>>> > llvm-dev at lists.llvm.org >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> > _______________________________________________ >>>> > LLVM Developers mailing list >>>> > llvm-dev at lists.llvm.org >>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>>> -- >>>> Sam Elliott >>>> Software Developer - LLVM >>>> lowRISC CIC >>>> -- >>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >> >> >> -- >> Best regards, >> Kuan-Hsu >> >> >>-- Best regards, Kuan-Hsu -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200115/abe5d054/attachment.html>