Michael Clark via llvm-dev
2017-Jun-06 23:08 UTC
[llvm-dev] LLD support for ld64 mach-o linker synthesised symbols
Hi Folks,

I have a question regarding LLD support for ld64 Mach-O linker synthesised symbols. I did a quick search of the LLD source and could not find support for them, so I thought I would ask before I start trying to use LLD.

I have found a couple of cases where they are essential, i.e. where there is no other way to get the required information. One example is getting the address of the Mach-O headers of the current process with ASLR enabled when the process is not dyld, because exec on macOS only provides the mach header address to dyld [1]. They are used inside dyld and I am now using them in “x86_64-xnu-musl”.

These special ld64 linker synthesised symbols make it possible to resolve a Mach-O segment offset or a Mach-O section offset. See resolveUndefines:

- https://opensource.apple.com/source/ld64/ld64-274.2/src/ld/Resolver.cpp.auto.html

There are 4 special symbol prefixes for the Mach-O linker synthesised symbols:

- segment$start$__SEGMENT
- segment$end$__SEGMENT
- section$start$__SEGMENT$__section
- section$end$__SEGMENT$__section

In asm:

    /* get imagebase and slide for static PIE and ASLR support in x86_64-xnu-musl */

            .align 3
    __image_base:
            .quad segment$start$__TEXT      /* static address of the __TEXT segment */
    __start_static:
            .quad start                     /* static address of start */
            .text
            .align 3
            .global start
    start:
            xor %rbp,%rbp                   /* clear the frame pointer */
            mov %rsp,%rdi                   /* arg 1: initial stack pointer */
            andq $-16,%rsp                  /* align the stack to 16 bytes */
            movq __image_base(%rip), %rsi   /* arg 2: static image base */
            leaq start(%rip), %rdx          /* run-time address of start */
            subq __start_static(%rip), %rdx /* arg 3: slide = run-time - static */
            call __start_c

In C:

    /* run C++ constructors in __libc_start_main for x86_64-xnu-musl */

    typedef void (*__init_fn)(int, char **, char **, char **);
    extern __init_fn __init_start __asm("section$start$__DATA$__mod_init_func");
    extern __init_fn __init_end __asm("section$end$__DATA$__mod_init_func");

    static void __init_mod(int argc, char **argv, char **envp, char **applep)
    {
        for (__init_fn *p = &__init_start; p < &__init_end; ++p) {
            (*p)(argc, argv, envp, applep);
        }
    }

Michael.
[1] https://github.com/opensource-apple/xnu/blob/dc0628e187c3148723505cf1f1d35bb948d3195b/bsd/kern/kern_exec.c#L1072-L1111
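The four name forms listed in the message above can be told apart with simple prefix matching. The following is a minimal sketch in portable C of how a linker might recognise them. It is not the logic of ld64's resolveUndefines or of LLD: the names `bound_kind`, `bound_sym`, `copy_name` and `classify_bound_symbol` are hypothetical, and the only external fact relied on is that Mach-O segment and section names are at most 16 bytes long.

```c
#include <stddef.h>
#include <string.h>

/* Kinds of boundary symbol named in the message (hypothetical enum). */
enum bound_kind { BOUND_NONE, SEGMENT_START, SEGMENT_END, SECTION_START, SECTION_END };

/* Result of classifying one symbol name (hypothetical type). */
struct bound_sym {
    enum bound_kind kind;
    char segment[17];   /* up to 16 bytes plus terminator */
    char section[17];   /* empty for segment$... symbols */
};

/* Copy len bytes of src into a 17-byte field; reject empty or over-long names. */
static int copy_name(char dst[17], const char *src, size_t len)
{
    if (len == 0 || len > 16)
        return 0;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 1;
}

/* Return BOUND_NONE unless name matches one of the four forms exactly. */
struct bound_sym classify_bound_symbol(const char *name)
{
    static const struct { const char *prefix; enum bound_kind kind; int has_section; } forms[] = {
        { "segment$start$", SEGMENT_START, 0 },
        { "segment$end$",   SEGMENT_END,   0 },
        { "section$start$", SECTION_START, 1 },
        { "section$end$",   SECTION_END,   1 },
    };
    struct bound_sym none = { BOUND_NONE, "", "" };

    for (size_t i = 0; i < sizeof forms / sizeof forms[0]; i++) {
        size_t plen = strlen(forms[i].prefix);
        if (strncmp(name, forms[i].prefix, plen) != 0)
            continue;

        struct bound_sym out = { forms[i].kind, "", "" };
        const char *rest = name + plen;       /* "__SEGMENT" or "__SEGMENT$__section" */
        const char *sep = strchr(rest, '$');  /* separator before the section name */

        if (forms[i].has_section) {
            if (sep == NULL
                || !copy_name(out.segment, rest, (size_t)(sep - rest))
                || !copy_name(out.section, sep + 1, strlen(sep + 1)))
                return none;
        } else {
            if (sep != NULL || !copy_name(out.segment, rest, strlen(rest)))
                return none;
        }
        return out;
    }
    return none;
}
```

Under these assumptions, `classify_bound_symbol("section$end$__DATA$__mod_init_func")` yields kind `SECTION_END` with segment `__DATA` and section `__mod_init_func`, while an ordinary name such as `_main` yields `BOUND_NONE`. A real linker would then bind the symbol to the start or end address of the matching output segment or section.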
Rui Ueyama via llvm-dev
2017-Jun-06 23:30 UTC
[llvm-dev] LLD support for ld64 mach-o linker synthesised symbols
Hi Michael,

The Mach-O version of LLD is not being developed actively, and if some feature is missing, it is likely that it's just not implemented. What is your motivation to use LLD instead of ld64?
Michael Clark via llvm-dev
2017-Jun-06 23:38 UTC
[llvm-dev] LLD support for ld64 mach-o linker synthesised symbols
Hi Rui,

The motivation is primarily that LLVM/Clang/LLD are community projects, so if I or someone else in the community added support for e.g. symbol aliases, it could be reviewed and potentially merged. ld64, on the other hand, does not have a community process for patch submission and code review that I am aware of, so it's unlikely that a community patch to support aliases would be merged.

In that case I might check out the LLD code and try linking “x86_64-xnu-musl” with it. My requirements are likely simpler than Apple’s; however, I do need symbol aliases and these are not supported by ld64.

The linker synthesised symbols are likely not too difficult to add if they are not present… now on my to do list…

Michael.
Michael Clark via llvm-dev
2017-Jun-07 00:22 UTC
[llvm-dev] LLD support for ld64 mach-o linker synthesised symbols
From the top of ld64’s ld.cpp:

    // start temp HACK for cross builds
    extern "C" double log2 ( double );
    //#define __MATH__
    // end temp HACK for cross builds

and a bit further down:

    //fprintf(stderr, "FinalSection(%16s, %16s) _segmentOrder=%3d, _sectionOrder=0x%08X\n",
    //        this->segmentName(), this->sectionName(), _segmentOrder, _sectionOrder);

and a bit further down again, ld64 uses qsort instead of std::sort:

    //fprintf(stderr, "UNSORTED final sections:\n");
    //for (std::vector<ld::Internal::FinalSection*>::iterator it = sections.begin(); it != sections.end(); ++it) {
    //    fprintf(stderr, "final section %p %s/%s\n", (*it), (*it)->segmentName(), (*it)->sectionName());
    //}
    qsort(&sections[0], sections.size(), sizeof(FinalSection*), &InternalState::FinalSection::sectionComparer);
    //fprintf(stderr, "SORTED final sections:\n");
    //for (std::vector<ld::Internal::FinalSection*>::iterator it = sections.begin(); it != sections.end(); ++it) {
    //    fprintf(stderr, "final section %p %s/%s\n", (*it), (*it)->segmentName(), (*it)->sectionName());
    //}

I doubt that would pass the LLVM projects’ code review.

If LLD Mach-O were supported and tracked in the LLVM Bugzilla, I could also raise an issue for, or fix, this ld64 bug: “Invalid zero page virtual address when linking with -static -image_base 0x7ffe00000000”.

- https://gist.github.com/michaeljclark/0a805652ec4be987a782afb902f06a99

I’m not using Objective-C so LLD may well fit my purposes. Now determined to try out LLD on macOS :-D
Michael Clark via llvm-dev
2017-Jun-08 00:55 UTC
[llvm-dev] LLD support for ld64 mach-o linker synthesised symbols
It seems I can find the static offset of the Mach-O header pre-initialisation in the crt without using the special dynamic linker synthesised symbols, using instead a statically synthesised symbol that I was previously unaware of, “__mh_execute_header”. I later add the slide to find the dynamic offset of the Mach-O headers.

            .align 3
    __image_base:
            .quad __mh_execute_header

I find the slide by subtracting a static pointer to a well known symbol from an RIP-relative access to the same symbol.

    __start_static:
            .quad start

            leaq start(%rip), %rdx
            subq __start_static(%rip), %rdx

The crt then gets the stack pointer, static image base and slide, so it can relocate the image and call constructors.

    void _start_c(long *p, uintptr_t image_base, uintptr_t slide)

I’m not sure about the second use case, the start and end of the “__mod_init_func” section, which would likely be required for linking dyld.
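The slide arithmetic described in the message above can be sketched in portable C. This is only an illustration: `compute_slide` and `apply_slide` are hypothetical helper names, and `uint64_t` stands in for the `uintptr_t` of the `_start_c` prototype so the example behaves the same on any host. In the real crt the run-time address comes from the RIP-relative `leaq` and the static address from the `.quad` stored at link time.

```c
#include <stdint.h>

/*
 * Slide = where a symbol really is at run time minus where the linker
 * placed it. Unsigned arithmetic wraps, so this also works when the
 * image is loaded below its link-time address.
 */
uint64_t compute_slide(uint64_t runtime_addr, uint64_t static_addr)
{
    return runtime_addr - static_addr;
}

/*
 * Apply the slide to any link-time address, for example the static
 * address of __mh_execute_header, to get its run-time address.
 */
uint64_t apply_slide(uint64_t static_addr, uint64_t slide)
{
    return static_addr + slide;
}
```

With made-up addresses: if `start` was linked at 0x100000f50 and is observed at 0x10a3c4f50, the slide is 0xa3c4000, and a header linked at 0x100000000 is found at 0x10a3c4000 at run time. The same slide applies to every address in the image because the whole image moves as one unit.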