Tim Northover
2014-Jul-01 14:35 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
Hi all,

I've been thinking about how best to represent MachO's LC_REEXPORT_DYLIB (used even by libSystem.dylib to provide its various sub-components[*]).

It looks like this functionality would naturally fall into the InputGraph, by analogy with Groups and Archives. Unfortunately, it's rather more dynamic than the existing cases: we don't know the needed files before parsing the top-level one, and need to open multiple files. Essentially, we'd need to create new MachOFileNodes based on the contents of the parent.

It seems there are two obvious ways to do this:

1. Create them while we still have the MachONormalizedFile around; I think this would mean extending the InputGraph::parse interface to allow new InputNodes to be passed back.
2. Add an atom type to represent the dependency and create the actual nodes when we get back to MachOFileNode::parse.

I'm still very new to lld, so which of these fits in better with our goals? Or has someone else already thought about it and come up with a cunning plan? I'm happy to implement anyone's idea if it's the neatest way to go.

Cheers.

Tim.

[*] It's the last barrier to "lld -flavor darwin -arch x86_64 -macosx_version_min 10.9 hello_world.o /usr/lib/libSystem.dylib -ohello_world" working, I think! Using /usr/lib/system/libsystem_c.dylib already does.
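As a rough illustration of option 1, the sketch below shows a parse step that hands newly discovered files back to its caller, which keeps them on a worklist. Everything in it is hypothetical: `ParseResult`, `Parser` and `parseAll` are stand-ins invented for this note, not lld's InputGraph or MachOFileNode interfaces, and the queue order is not a claim about how ordinals should be assigned.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical result of parsing one input: the paths named by its
// LC_REEXPORT_DYLIB load commands. Not an lld type.
struct ParseResult {
  std::vector<std::string> reexports;
};

// Hypothetical callback standing in for MachOFileNode::parse.
using Parser = std::function<ParseResult(const std::string &)>;

// Parses every command-line input and queues any file discovered along the
// way. Returns the paths in the order they were parsed; each is parsed once.
std::vector<std::string> parseAll(const std::vector<std::string> &commandLine,
                                  const Parser &parse) {
  assert(parse && "a parser callback is required");
  std::deque<std::string> worklist(commandLine.begin(), commandLine.end());
  std::set<std::string> seen;
  std::vector<std::string> order;
  while (!worklist.empty()) {
    std::string path = worklist.front();
    worklist.pop_front();
    if (!seen.insert(path).second)
      continue; // already parsed, so skip the duplicate
    order.push_back(path);
    // New nodes are "passed back" by the parser and appended to the queue.
    for (const std::string &dep : parse(path).reexports)
      worklist.push_back(dep);
  }
  return order;
}
```

The `seen` set is what keeps a library that is named twice, or that re-exports something already loaded, from being opened again.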
Shankar Easwaran
2014-Jul-01 16:25 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
Hi Tim,

Are you referring to the dependencies of the dynamic library that need to be traversed to resolve shared library atoms?

You could build a dynamic library node and have it return the symbol when the shared library is asked for a symbol that needs to be resolved, using the API below:

const SharedLibraryAtom *exports(StringRef name, bool dataSymbolOnly)

Will this work?

I think symbol resolution is more involved in MachO (from previous conversations) with the current lld model, where symbols are resolved from archives and dynamic libraries after all the object files are processed too. Will this simplify things?

Thanks,
Shankar Easwaran
Tim Northover
2014-Jul-01 16:41 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
Hi Shankar,

On 1 July 2014 17:25, Shankar Easwaran <shankare at codeaurora.org> wrote:
> You could build a Dynamic library node and have the symbol returned when the
> shared library is called for a symbol, that needs to be resolved using the
> below API.
>
> const SharedLibraryAtom *exports(StringRef name, bool dataSymbolOnly)
>
> Will this work ?

I did actually consider something along those lines, but it seemed like even more of a hack so I didn't mention it in my message.

It could be made to work, but would involve reading new files in either MachONormalizedFileToAtoms or the exports function itself. Both of those seem like they're at the wrong level: we'd need to largely re-implement the FileNode I/O handling and graph descent that already exists.

I'm also not convinced the lifetime and ownership issues work out well in that scheme. lld as a whole seems to keep the MemoryBuffers associated with files around, which makes that location even less pleasant from a layering point of view.

Cheers.

Tim.
Nick Kledzik
2014-Jul-03 00:09 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
On Jul 1, 2014, at 7:35 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> It seems there are two obvious ways to do this:
>
> 1. Create them while we still have the MachONormalizedFile around; I
> think this would mean extending the InputGraph::parse interface to
> allow new InputNodes to be passed back.

ld64 does it in two phases. The first phase just loads the dylibs directly specified on the command line. The second phase loads any “indirect” dylibs. Perhaps at the end of DarwinLdDriver::parse(), after the nodes are created for all the command line files, the driver can iterate over the nodes and instantiate any indirect dylibs needed?

You don’t want to load the indirect dylibs as each direct dylib is loaded, because one of the indirect ones may later turn out to be a direct one, and the order determines the two-level-namespace ordinal used, which we want to remain deterministic.

In ld64 the processing of indirect dylibs has two purposes:

1) to support LC_REEXPORT_DYLIB,
2) to support flat_namespace linking of a main executable, wherein the linker must check that all undefines in all dylibs are resolved.

The second case is also needed for ELF linkers (--no-allow-shlib-undefined), which means an ELF linker would need to load indirect dylibs too.

Shankar, does lld for ELF support loading indirect DSOs?

-Nick
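The two-phase scheme described above can be sketched roughly as follows. This is only a model under stated assumptions: `ReexportTable`, `LoadPlan`, `planLoads` and `ordinalOf` are hypothetical names invented for this note, not ld64 or lld code, and ordinals are assumed to be 1-based and to follow command-line order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: dylib path -> paths it lists in LC_REEXPORT_DYLIB.
using ReexportTable = std::map<std::string, std::vector<std::string>>;

struct LoadPlan {
  std::map<std::string, int> ordinal; // direct dylibs only, 1-based
  std::vector<std::string> indirect;  // found in phase two, no ordinal
};

LoadPlan planLoads(const std::vector<std::string> &direct,
                   const ReexportTable &reexports) {
  LoadPlan plan;
  // Phase one: ordinals depend only on command-line order.
  for (const std::string &path : direct)
    plan.ordinal.emplace(path, static_cast<int>(plan.ordinal.size()) + 1);

  // Phase two: walk re-exports, skipping anything that is already direct
  // or already queued, so the phase-one ordinals are never disturbed.
  std::set<std::string> queued;
  std::vector<std::string> pending(direct.begin(), direct.end());
  for (std::size_t i = 0; i < pending.size(); ++i) {
    auto it = reexports.find(pending[i]);
    if (it == reexports.end())
      continue; // nothing re-exported by this dylib
    for (const std::string &dep : it->second) {
      if (plan.ordinal.count(dep) || !queued.insert(dep).second)
        continue;
      plan.indirect.push_back(dep);
      pending.push_back(dep);
    }
  }
  return plan;
}

// Looks up the ordinal of a directly specified dylib.
int ordinalOf(const LoadPlan &plan, const std::string &path) {
  auto it = plan.ordinal.find(path);
  assert(it != plan.ordinal.end() && "only direct dylibs have an ordinal");
  return it->second;
}
```

Because ordinals are fixed before any re-export is followed, a dylib that is both re-exported and named on the command line keeps the ordinal its command-line position gives it.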
Shankar Easwaran
2014-Jul-03 15:22 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
On 7/2/2014 7:09 PM, Nick Kledzik wrote:
> Shankar, Does lld for ELF support loading indirect DSOs?

The Gnu flavor does not try to read dependent libraries (indirect DSOs) to resolve symbols unless the dependent library is also added to the link line.

Test case:

cat > 1.c << \!
int fn();
int main() { fn(); return 0; }
!
cat > fn.c << \!
int fn1();
int fn() { return fn1(); }
!
cat > fn1.c << \!
int fn2();
int fn1() { return fn2(); }
!
gcc -c fn.c fn1.c -fPIC 1.c
ld -shared fn1.o -o libfn1.so
ld -shared fn.o -L. -lfn1 -o libfn.so
ld 1.o -L. -lfn -t --no-allow-shlib-undefined
=> Does not read libfn1 at all.

I'm not used to this behaviour, and I don't know why Gnu follows it or what the reasoning behind it is. I'm not sure if this is by design or a bug that was never fixed.

In the case of lld, for the Gnu flavor, we don't need to support indirect DSOs :)

Thanks,
Shankar Easwaran
Joerg Sonnenberger
2014-Jul-05 20:24 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
On Wed, Jul 02, 2014 at 05:09:14PM -0700, Nick Kledzik wrote:
> Shankar, Does lld for ELF support loading indirect DSOs?

It doesn't, which is a bug.

Joerg
Tim Northover
2014-Jul-07 12:34 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
> Perhaps at the end of DarwinLdDriver::parse() after the nodes are created
> for all the command line files, the driver can iterate over the nodes and
> instantiate any indirect dylibs needed?

That seems to be taking over from Driver::link, which very carefully dispatches object parsing to a bunch of tasks to do it in parallel (we obviously don't know what's LC_REEXPORT_DYLIBed until we have parsed the input file). It would seem better if we could find a way that left it doing that job.

> You don’t want to load the
> indirect dylibs as each direct dylib is loaded because one of the indirect
> ones may later turn out to be a direct one, and the order determines
> the two-level-namespace ordinal used which we want to remain deterministic.

Ah, I'd not considered anything like that. Presumably this issue goes beyond the directly specified libraries too (i.e. it matters which directly specified library a symbol gets associated with, even if it doesn't come from one). That probably rules out a simple depth-first InputGraph, but makes constructing a correct one a bit tricky. I'll have to play around, I think.

> Shankar, Does lld for ELF support loading indirect DSOs?

I've looked into the ELF situation a bit more (thanks Joerg, for supplying the initial hints!). It seems to date back to the thread in [1] (with some background at [2]), where they changed the default from automatically copying DT_NEEDED entries to requiring a command-line override for it (personally, I think that was probably the right decision).

But either way it means that ELF will probably need this ability eventually, exposed via a "--copy-dt-needed-entries" option if nothing else.

Cheers.

Tim.

[1] https://sourceware.org/ml/binutils/2011-08/msg00129.html
[2] http://fedoraproject.org/wiki/UnderstandingDSOLinkChange
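To make the ELF side of the discussion concrete, here is a small model of an undefined-symbol check over shared objects, with and without following DT_NEEDED entries. It is a sketch under assumptions, not a description of what GNU ld or lld actually does: `Dso`, `DsoTable` and `unresolvedInDsos` are hypothetical names, and each shared object is reduced to sets of symbol names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one shared object; not an lld or binutils type.
struct Dso {
  std::set<std::string> defined;
  std::set<std::string> undefined;
  std::vector<std::string> needed; // DT_NEEDED entries
};

using DsoTable = std::map<std::string, Dso>;

// Returns the symbols left undefined across the loaded shared objects.
// With followNeeded set, DT_NEEDED entries are loaded transitively;
// otherwise only the objects named on the link line are considered.
std::set<std::string> unresolvedInDsos(const std::vector<std::string> &linkLine,
                                       const DsoTable &table,
                                       bool followNeeded) {
  std::set<std::string> loaded;
  std::vector<std::string> stack(linkLine.rbegin(), linkLine.rend());
  while (!stack.empty()) {
    std::string name = stack.back();
    stack.pop_back();
    if (!loaded.insert(name).second)
      continue; // already loaded, which also guards against cycles
    auto it = table.find(name);
    assert(it != table.end() && "every named DSO must be in the table");
    if (followNeeded)
      for (const std::string &dep : it->second.needed)
        stack.push_back(dep);
  }

  // Collect every definition first, then report what is still missing.
  std::set<std::string> defined, unresolved;
  for (const std::string &name : loaded) {
    const Dso &dso = table.at(name);
    defined.insert(dso.defined.begin(), dso.defined.end());
  }
  for (const std::string &name : loaded)
    for (const std::string &sym : table.at(name).undefined)
      if (!defined.count(sym))
        unresolved.insert(sym);
  return unresolved;
}
```

In this model, whether a symbol is reported as unresolved depends on whether the indirect objects were loaded at all, which is why such a check needs the ability to load them.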
Tim Northover
2014-Jul-10 12:56 UTC
[LLVMdev] [lld] [mach-o]: RFC: representing LC_REEXPORT_DYLIB
Hi Nick,

On 3 July 2014 01:09, Nick Kledzik <kledzik at apple.com> wrote:
> You don’t want to load the
> indirect dylibs as each direct dylib is loaded because one of the indirect
> ones may later turn out to be a direct one, and the order determines
> the two-level-namespace ordinal used which we want to remain deterministic.

I've finally got back to this issue and I'm not sure what you mean here. My tests suggest that ld64 performs a depth-first search of the libraries and we *do* want to load them at the same time (or at least make sure they're considered at the same time for resolution purposes).

For example (reproduced by tmp.sh attached):

$ cat foo.c
int foo() { return 'f'; }
$ cat main.c
extern int foo();
int main() { return foo(); }
$ cat wrapper.c
$ clang -shared foo.c -olibfoo.dylib
$ clang wrapper.c -shared -Wl,-reexport_library,libfoo.dylib -o libwrapper.dylib
$ clang main.c libwrapper.dylib libfoo.dylib -omain
$ nm -nm main
(undefined) external _foo (from libwrapper)

It looks like it would correspond reasonably well with sticking everything within the one directly specified InputElement somehow. Avoiding cycles seems like it might be the trickiest part.

Cheers.

Tim.

Attachment: tmp.sh (application/x-sh, 327 bytes)
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140710/3a4fd45f/attachment.sh>
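The behaviour observed in this experiment, together with the cycle concern, can be modelled with a depth-first walk that carries a visited set. The sketch below is one way to reproduce the observed result and is not claimed to be ld64's algorithm; `Dylib`, `DylibTable`, `providesSymbol` and `owningDylib` are hypothetical names introduced for this note.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one dylib; not an lld or ld64 type.
struct Dylib {
  std::set<std::string> exports;      // symbols defined by this dylib
  std::vector<std::string> reexports; // LC_REEXPORT_DYLIB targets
};

using DylibTable = std::map<std::string, Dylib>;

// Depth-first search of `name` and everything it re-exports. The visited
// set stops a re-export cycle from recursing forever.
bool providesSymbol(const DylibTable &table, const std::string &name,
                    const std::string &symbol, std::set<std::string> &visited) {
  if (!visited.insert(name).second)
    return false; // already seen on this walk
  auto it = table.find(name);
  assert(it != table.end() && "every named dylib must be in the table");
  if (it->second.exports.count(symbol))
    return true;
  for (const std::string &child : it->second.reexports)
    if (providesSymbol(table, child, symbol, visited))
      return true;
  return false;
}

// Returns the first directly specified dylib that provides `symbol`, either
// itself or through its re-exports; returns an empty string if none does.
std::string owningDylib(const DylibTable &table,
                        const std::vector<std::string> &direct,
                        const std::string &symbol) {
  for (const std::string &name : direct) {
    std::set<std::string> visited; // fresh walk for each direct dylib
    if (providesSymbol(table, name, symbol, visited))
      return name;
  }
  return std::string();
}
```

With libwrapper.dylib listed before libfoo.dylib, the walk from the wrapper reaches `_foo` first, so the symbol is attributed to the wrapper, matching the `nm` output above.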