Mark Johnston via llvm-dev
2018-Aug-21 13:47 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
Hello,

We've recently started using ifuncs in the x86(_64) FreeBSD kernel. Currently lld will emit a PLT entry for each ifunc, so ifunc calls are more expensive than those of regular functions. In our kernel, this overhead isn't really necessary: if lld instead emits PC-relative relocations for each ifunc call site, where each relocation references a symbol of type GNU_IFUNC, then during boot we can resolve each call site and apply the relocation before mapping the kernel text read-only. Then, ifunc calls have the same overhead as regular function calls.

To implement this optimization, I wrote an lld patch to add "-z ifunc-noplt". When this option is specified, lld does not create PLT entries for ifuncs and instead passes the existing PC-relative relocation through to the output file. The patch is below; I tested it with lld 7.0 and the patch applied without modifications to the sources in trunk.

I'm wondering if such an option would be acceptable in upstream lld, and whether anyone had comments on my implementation. The patch is lacking tests, and I had some questions:
- How should "-z ifunc-noplt" interact with "-z text"? Should the invoker be required to additionally specify "-z notext"?
- Could "-z ifunc-noplt" be subsumed by a more general mechanism which tells lld not to apply constant relocations and instead pass them through to the output file? I could imagine using such a mechanism to make it possible to dynamically enable retpoline at boot time. It could also be useful for implementing static DTrace trace points.

Thanks,
-Mark

diff --git a/ELF/Config.h b/ELF/Config.h
index 5dc7f5321..b5a3d3266 100644
--- a/ELF/Config.h
+++ b/ELF/Config.h
@@ -182,6 +182,7 @@ struct Configuration {
   bool ZCopyreloc;
   bool ZExecstack;
   bool ZHazardplt;
+  bool ZIfuncnoplt;
   bool ZInitfirst;
   bool ZKeepTextSectionPrefix;
   bool ZNodelete;
diff --git a/ELF/Driver.cpp b/ELF/Driver.cpp
index aced1edca..e7896cedf 100644
--- a/ELF/Driver.cpp
+++ b/ELF/Driver.cpp
@@ -340,7 +340,8 @@ static bool getZFlag(opt::InputArgList &Args, StringRef K1, StringRef K2,
 
 static bool isKnown(StringRef S) {
   return S == "combreloc" || S == "copyreloc" || S == "defs" ||
-         S == "execstack" || S == "hazardplt" || S == "initfirst" ||
+         S == "execstack" || S == "hazardplt" || S == "ifunc-noplt" ||
+         S == "initfirst" ||
          S == "keep-text-section-prefix" || S == "lazy" || S == "muldefs" ||
          S == "nocombreloc" || S == "nocopyreloc" || S == "nodelete" ||
          S == "nodlopen" || S == "noexecstack" ||
@@ -834,6 +835,7 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args) {
   Config->ZCopyreloc = getZFlag(Args, "copyreloc", "nocopyreloc", true);
   Config->ZExecstack = getZFlag(Args, "execstack", "noexecstack", false);
   Config->ZHazardplt = hasZOption(Args, "hazardplt");
+  Config->ZIfuncnoplt = hasZOption(Args, "ifunc-noplt");
   Config->ZInitfirst = hasZOption(Args, "initfirst");
   Config->ZKeepTextSectionPrefix = getZFlag(
       Args, "keep-text-section-prefix", "nokeep-text-section-prefix", false);
diff --git a/ELF/Relocations.cpp b/ELF/Relocations.cpp
index 8f60aa3d2..a54d87e43 100644
--- a/ELF/Relocations.cpp
+++ b/ELF/Relocations.cpp
@@ -361,6 +361,10 @@ static bool isStaticLinkTimeConstant(RelExpr E, RelType Type, const Symbol &Sym,
             R_TLSLD_HINT>(E))
     return true;
 
+  // The computation involves output from the ifunc resolver.
+  if (Sym.isGnuIFunc() && Config->ZIfuncnoplt)
+    return false;
+
   // These never do, except if the entire file is position dependent or if
   // only the low bits are used.
   if (E == R_GOT || E == R_PLT || E == R_TLSDESC)
@@ -808,6 +812,10 @@ static void processRelocAux(InputSectionBase &Sec, RelExpr Expr, RelType Type,
     Sec.Relocations.push_back({Expr, Type, Offset, Addend, &Sym});
     return;
   }
+  if (Sym.isGnuIFunc() && Config->ZIfuncnoplt) {
+    InX::RelaDyn->addReloc(Type, &Sec, Offset, &Sym, Addend, R_ADDEND, Type);
+    return;
+  }
   bool CanWrite = (Sec.Flags & SHF_WRITE) || !Config->ZText;
   if (CanWrite) {
     // R_GOT refers to a position in the got, even if the symbol is preemptible.
@@ -977,7 +985,7 @@ static void scanReloc(InputSectionBase &Sec, OffsetGetter &GetOffset, RelTy *&I,
   // all dynamic symbols that can be resolved within the executable will
   // actually be resolved that way at runtime, because the main exectuable
   // is always at the beginning of a search list. We can leverage that fact.
-  if (Sym.isGnuIFunc())
+  if (Sym.isGnuIFunc() && !Config->ZIfuncnoplt)
     Expr = toPlt(Expr);
   else if (!Sym.IsPreemptible && Expr == R_GOT_PC && !isAbsoluteValue(Sym))
     Expr = Target->adjustRelaxExpr(Type, RelocatedAddr, Expr);
diff --git a/ELF/Writer.cpp b/ELF/Writer.cpp
index 90462ecc7..418133ebd 100644
--- a/ELF/Writer.cpp
+++ b/ELF/Writer.cpp
@@ -1570,8 +1570,11 @@ template <class ELFT> void Writer<ELFT>::finalizeSections() {
   applySynthetic({InX::EhFrame},
                  [](SyntheticSection *SS) { SS->finalizeContents(); });
 
-  for (Symbol *S : Symtab->getSymbols())
+  for (Symbol *S : Symtab->getSymbols()) {
     S->IsPreemptible |= computeIsPreemptible(*S);
+    if (S->isGnuIFunc() && Config->ZIfuncnoplt)
+      S->ExportDynamic = true;
+  }
 
   // Scan relocations. This must be done after every symbol is declared so that
   // we can correctly decide if a dynamic relocation is needed.

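
As a rough illustration of the boot-time step described in the message above (this is not the FreeBSD kernel's actual code), the sketch below walks a list of relocations, picks out the PC-relative ones that reference an ifunc symbol, calls the resolver, and stores S + A - P into the call site. The `IfuncSym` and `TextRela` structs, the function name, and the choice to carry the resolver as a function pointer are all assumptions made for the example; a real kernel would read Elf64_Sym and Elf64_Rela records instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Numeric values taken from the ELF gABI extensions and the x86-64 psABI.
constexpr unsigned kSttGnuIfunc = 10; // STT_GNU_IFUNC
constexpr uint32_t kRelPc32 = 2;      // R_X86_64_PC32
constexpr uint32_t kRelPlt32 = 4;     // R_X86_64_PLT32

// Hypothetical, simplified stand-in for a symbol table entry.
struct IfuncSym {
  unsigned type;           // ELF symbol type (STT_*)
  uint64_t (*resolver)();  // returns the address of the chosen implementation
};

// Hypothetical, simplified stand-in for a RELA record.
struct TextRela {
  uint64_t offset;    // offset of the 32-bit field inside the text image
  uint32_t type;      // relocation type
  uint32_t symIndex;  // index into the symbol table
  int64_t addend;
};

// Patches every PC-relative relocation that references an ifunc symbol.
// `text` is the still-writable text image and `textBase` its run-time address.
// Returns the number of call sites patched.
std::size_t applyIfuncRelocs(std::vector<uint8_t> &text, uint64_t textBase,
                             const std::vector<TextRela> &relocs,
                             const std::vector<IfuncSym> &syms) {
  std::size_t patched = 0;
  for (const TextRela &r : relocs) {
    assert(r.symIndex < syms.size());
    const IfuncSym &sym = syms[r.symIndex];
    if (sym.type != kSttGnuIfunc || sym.resolver == nullptr)
      continue;
    if (r.type != kRelPc32 && r.type != kRelPlt32)
      continue;
    if (r.offset > text.size() || text.size() - r.offset < 4)
      continue; // the 32-bit field would fall outside the image
    uint64_t s = sym.resolver();       // S: resolved implementation address
    uint64_t p = textBase + r.offset;  // P: address of the field being patched
    uint32_t value =
        static_cast<uint32_t>(s + static_cast<uint64_t>(r.addend) - p);
    std::memcpy(text.data() + r.offset, &value, sizeof(value));
    ++patched;
  }
  return patched;
}
```

After such a pass the text can be mapped read-only, and each patched call reaches the resolved implementation directly, which is the property the proposal is after.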
Peter Smith via llvm-dev
2018-Aug-21 16:47 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
Hello Mark,

On 21 August 2018 at 14:47, Mark Johnston via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hello,
>
> We've recently started using ifuncs in the x86(_64) FreeBSD kernel.
> Currently lld will emit a PLT entry for each ifunc, so ifunc calls are
> more expensive than those of regular functions. In our kernel, this
> overhead isn't really necessary: if lld instead emits PC-relative
> relocations for each ifunc call site, where each relocation references
> a symbol of type GNU_IFUNC, then during boot we can resolve each
> call site and apply the relocation before mapping the kernel text
> read-only. Then, ifunc calls have the same overhead as regular function
> calls.
>
> To implement this optimization, I wrote an lld patch to add
> "-z ifunc-noplt". When this option is specified, lld does not create
> PLT entries for ifuncs and instead passes the existing PC-relative
> relocation through to the output file. The patch is below; I tested it
> with lld 7.0 and the patch applied without modifications to the sources
> in trunk.
>
> I'm wondering if such an option would be acceptable in upstream lld, and
> whether anyone had comments on my implementation. The patch is lacking
> tests, and I had some questions:

I'm not the LLD maintainer so this is just a personal opinion. If I understand the optimisation correctly, if it is used on some program then either the loader for the program or the program itself is responsible for running the ifunc resolver and resolving the callsites. I think it would have to come with a big health warning in at least the help and documentation that platform/OS support is needed to run the program.

> - How should "-z ifunc-noplt" interact with "-z text"? Should the
>   invoker be required to additionally specify "-z notext"?

I think it could either be that -z text -z ifunc-noplt is an error, with -z ifunc-noplt implying -z notext; or that -z ifunc-noplt is an error without -z notext.

> - Could "-z ifunc-noplt" be subsumed by a more general mechanism which
>   tells lld not to apply constant relocations and instead pass them
>   through to the output file? I could imagine using such mechanism
>   to make it possible to dynamically enable retpoline at boot time.
>   It could also be useful for implementing static DTrace trace points.

In theory on RELA platforms emit-relocs gets you pretty close; it won't inhibit the generation of PLT or GOT entries though, but I think it would give enough information to alter the callsites to the results of the ifunc resolvers. I guess the problem here is where do you stop and how portable would the solution be across different targets. For example on Arm you would ideally only want to deal with a small subset of the instruction relocations at run/load time. I think it is a solvable problem but it does need some careful thought to avoid just implementing something that works for a specific target/OS.

Peter
Ed Maste via llvm-dev
2018-Aug-21 18:10 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
On 21 August 2018 at 12:47, Peter Smith via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hello Mark,
>
> I'm not the LLD maintainer so this is just a personal opinion. If I
> understand the optimisation correctly, if it is used on some program then
> either the loader for the program or the program itself is responsible
> for running the ifunc resolver and resolving the callsites. I think it
> would have to come with a big health warning in at least the help and
> documentation that platform/OS support is needed to run the program.

Yes - in our case we're using it for kernel ifuncs, and the kernel's early reloc code handles the resolver and relocations. We'll definitely want a cautionary note in the man page and other documentation.
Joerg Sonnenberger via llvm-dev
2018-Aug-21 21:14 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
On Tue, Aug 21, 2018 at 09:47:41AM -0400, Mark Johnston via llvm-dev wrote:
> We've recently started using ifuncs in the x86(_64) FreeBSD kernel.
> Currently lld will emit a PLT entry for each ifunc, so ifunc calls are
> more expensive than those of regular functions.

If you rewrite the PLT entry to be a plain jump whenever possible, the difference should be pretty small. Have you considered that?

Joerg
Mark Johnston via llvm-dev
2018-Aug-21 23:50 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
On Tue, Aug 21, 2018 at 11:14:02PM +0200, Joerg Sonnenberger wrote:
> On Tue, Aug 21, 2018 at 09:47:41AM -0400, Mark Johnston via llvm-dev wrote:
> > We've recently started using ifuncs in the x86(_64) FreeBSD kernel.
> > Currently lld will emit a PLT entry for each ifunc, so ifunc calls are
> > more expensive than those of regular functions.
>
> If you rewrite the PLT entry to be a plain jump whenever possible, the
> difference should be pretty small. Have you considered that?

I considered it, but don't like it as much: for each supported CPU architecture we would need code to find the PLT entry referencing the GOT entry being relocated, verify that the entry contains the instruction(s) that we expect, and write the plain jump. With my approach the kernel linker will just do the right thing for each CPU architecture without requiring any magic.

Having support in the static linker means that the optimization is less fragile, and we do not incur the cost of the extra jump. It may be that other projects can benefit as well, and as I mentioned, I think there are other use-cases for similar functionality.
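
To make the three steps in that objection concrete (find the entry, verify its instruction bytes, write the jump), here is a minimal sketch of what the PLT-rewriting alternative could look like on x86-64 alone. It assumes the PLT entry starts with the common `jmp *disp32(%rip)` encoding (bytes ff 25), a little-endian host, and a hypothetical function name and parameter set; nothing here is taken from lld or the FreeBSD kernel, and every other architecture would need its own variant.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Tries to turn a PLT entry of the form `jmp *disp32(%rip)` (ff 25 <disp32>)
// into a direct `jmp rel32` (e9 <rel32>).
//   entry       bytes of the PLT entry (at least 6 writable bytes)
//   entryAddr   run-time address of the entry
//   gotSlotAddr address of the GOT slot currently being relocated
//   target      address returned by the ifunc resolver
// Returns false, leaving the entry untouched, if the bytes are not the
// expected instruction, the entry refers to a different GOT slot, or the
// target is out of rel32 range.
bool rewritePltEntry(uint8_t *entry, uint64_t entryAddr, uint64_t gotSlotAddr,
                     uint64_t target) {
  assert(entry != nullptr);
  if (entry[0] != 0xff || entry[1] != 0x25)
    return false;
  int32_t disp;
  std::memcpy(&disp, entry + 2, sizeof(disp));
  // RIP-relative operands are relative to the end of the 6-byte instruction.
  uint64_t slot = entryAddr + 6 + static_cast<uint64_t>(static_cast<int64_t>(disp));
  if (slot != gotSlotAddr)
    return false;
  // A rel32 jump is relative to the end of the 5-byte instruction.
  int64_t rel = static_cast<int64_t>(target - (entryAddr + 5));
  if (rel < std::numeric_limits<int32_t>::min() ||
      rel > std::numeric_limits<int32_t>::max())
    return false;
  int32_t rel32 = static_cast<int32_t>(rel);
  entry[0] = 0xe9;
  std::memcpy(entry + 1, &rel32, sizeof(rel32));
  entry[5] = 0xcc; // leftover byte of the old instruction, never executed
  return true;
}
```

Even in this form every call still goes through one extra jump, which is the residual cost mentioned in the reply above.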
Mark Johnston via llvm-dev
2018-Aug-21 23:56 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
On Tue, Aug 21, 2018 at 05:47:59PM +0100, Peter Smith wrote:
> Hello Mark,
>
> On 21 August 2018 at 14:47, Mark Johnston via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Hello,
> >
> > We've recently started using ifuncs in the x86(_64) FreeBSD kernel.
> > Currently lld will emit a PLT entry for each ifunc, so ifunc calls are
> > more expensive than those of regular functions. In our kernel, this
> > overhead isn't really necessary: if lld instead emits PC-relative
> > relocations for each ifunc call site, where each relocation references
> > a symbol of type GNU_IFUNC, then during boot we can resolve each
> > call site and apply the relocation before mapping the kernel text
> > read-only. Then, ifunc calls have the same overhead as regular function
> > calls.
> >
> > To implement this optimization, I wrote an lld patch to add
> > "-z ifunc-noplt". When this option is specified, lld does not create
> > PLT entries for ifuncs and instead passes the existing PC-relative
> > relocation through to the output file. The patch is below; I tested it
> > with lld 7.0 and the patch applied without modifications to the sources
> > in trunk.
> >
> > I'm wondering if such an option would be acceptable in upstream lld, and
> > whether anyone had comments on my implementation. The patch is lacking
> > tests, and I had some questions:
>
> I'm not the LLD maintainer so this is just a personal opinion. If I
> understand the optimisation correctly, if it is used on some program then
> either the loader for the program or the program itself is responsible
> for running the ifunc resolver and resolving the callsites. I think it
> would have to come with a big health warning in at least the help and
> documentation that platform/OS support is needed to run the program.

That's a good point. For FreeBSD I had documented the option in the man page, and will amend it as you suggest.

> > - How should "-z ifunc-noplt" interact with "-z text"? Should the
> >   invoker be required to additionally specify "-z notext"?
>
> I think it could either be -z text -z ifunc-noplt = error, with -z
> ifunc-noplt implying -z notext; or -z ifunc-noplt is an error without -z
> notext.

I think the latter option is preferable for such a rarely used option, since it's more explicit.
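
The "more explicit" variant preferred above could be expressed as a small validation step over the -z arguments. The sketch below is only an illustration: the function name, the error text, and the assumption that "text" is the default (so that text relocations are rejected unless "-z notext" is given) are mine, and this is not how lld's driver is actually structured.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Checks a list of -z values (for example {"ifunc-noplt", "notext"}).
// Returns an empty string when the combination is accepted, otherwise a
// message describing the error. For text/notext the last occurrence wins.
std::string checkIfuncNoplt(const std::vector<std::string> &zArgs) {
  bool ifuncNoplt = false;
  bool text = true; // assumed default: text relocations are rejected
  for (const std::string &arg : zArgs) {
    if (arg == "ifunc-noplt")
      ifuncNoplt = true;
    else if (arg == "text")
      text = true;
    else if (arg == "notext")
      text = false;
  }
  if (ifuncNoplt && text)
    return "-z ifunc-noplt requires -z notext";
  return "";
}
```

With this rule, "-z ifunc-noplt" on its own is rejected, and the person invoking the linker has to acknowledge the resulting text relocations by also passing "-z notext".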
Rui Ueyama via llvm-dev
2018-Aug-22 08:27 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
Hi Mark,

Although I do understand your motivation to add this feature, because the proposed change works only with a specific loader, I'd explore other options before adding a new feature to the linker.

So the problem for the kernel loader is to know all locations from which control jumps to ifunc PLT entries. Usually, once lld is done with linking, all traces of such relocations are discarded because they are no longer needed.

However, if you pass the -emit-relocs option to the linker, lld keeps all relocations that have already been resolved in an output executable. By analyzing a relocation table in a resulting executable, you could find all locations where the ifunc PLT is called. Then, you can construct a new table for your linker, embed it in the executable using objcopy or something like that, and then let the kernel loader interpret it.

Have you considered that?
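
The post-processing step suggested in this message, scanning the relocations kept by -emit-relocs and building a table of ifunc call sites, might look roughly like the following. The `SymEntry`, `RelaEntry`, and `CallSite` types and the function name are hypothetical simplifications; a real tool would parse the ELF symbol and relocation sections, and the format of the table embedded in the executable would be up to the kernel loader.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr unsigned kSymTypeIfunc = 10; // STT_GNU_IFUNC

// Hypothetical, simplified views of a symbol and a retained relocation.
struct SymEntry {
  unsigned type; // ELF symbol type (STT_*)
};

struct RelaEntry {
  uint64_t offset;   // location the relocation applies to
  uint32_t symIndex; // index into the symbol table
  int64_t addend;
};

// One row of the table handed to the loader.
struct CallSite {
  uint64_t offset;
  uint32_t symIndex;
  int64_t addend;
};

// Keeps only the relocations whose symbol is an ifunc.
std::vector<CallSite> collectIfuncCallSites(const std::vector<RelaEntry> &relocs,
                                            const std::vector<SymEntry> &syms) {
  std::vector<CallSite> sites;
  for (const RelaEntry &r : relocs) {
    assert(r.symIndex < syms.size());
    if (syms[r.symIndex].type == kSymTypeIfunc)
      sites.push_back({r.offset, r.symIndex, r.addend});
  }
  return sites;
}
```

The resulting table would then have to be embedded in the executable in a separate step, for example with objcopy as the message proposes, which is the multi-stage aspect discussed in the follow-up.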
Ed Maste via llvm-dev
2018-Aug-22 14:11 UTC
[llvm-dev] [lld] avoid emitting PLT entries for ifuncs
On 22 August 2018 at 04:27, Rui Ueyama via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> However, if you pass the -emit-relocs option to the linker, lld keeps all
> relocations that have already been resolved in an output executable. By
> analyzing a relocation table in a resulting executable, you could find all
> locations where the ifunc PLT is called. Then, you can construct a new table
> for your linker, embed it to the executable using objcopy or something like
> that, and then let the kernel loader interpret it.
>
> Have you considered that?

I've thought about alternative ways to achieve the same thing, including something like the above. My concern with that approach is that it's rather cumbersome and can be error-prone, and introduces a requirement for awkward multi-stage linking. In comparison Mark's patch is a relatively tiny tweak in lld. Despite the disadvantages I much prefer the proposed approach.

On 22 August 2018 at 09:20, Joerg Sonnenberger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On the linker side, it is a custom hack for
> something that is generally considered very bad nowadays: text
> relocations.

True, although the argument against .text relocations doesn't hold for our kernel use case.