Peter Collingbourne via llvm-dev
2017-Mar-07  02:33 UTC
[llvm-dev] [BUG Report] -dead_strip, strips prefix data unconditionally on macOS
Firstly, do you need "main.dsp" defined as an external symbol, or can all
external references go via "main"? If the answer is the latter, that will
make the solution simpler.

If you only need the latter, you will need to make a change to LLVM here:
http://llvm-cs.pcc.me.uk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#650

Basically you would need to add a hook to the TargetLoweringObjectFile class
that allows the object format to control how prefix data is emitted. For
Mach-O you would emit a label for a dummy internal symbol, followed by the
prefix data and then an alt_entry directive for the function symbol. All
other object formats would just emit the prefix data.

Peter

On Mon, Mar 6, 2017 at 6:16 PM, Moritz Angermann <moritz.angermann at gmail.com> wrote:
> Thank you Peter!
>
> That seems to do the trick!
>
> $ cat test.s
> .section __TEXT,__text
> .globl _main
>
> .long 1
> _main:
> inc %eax
> ret
>
> .alt_entry _main
> _main.dsp = _main-4
>
> .subsections_via_symbols
>
>
> $ clang test.s -dead_strip
> $ otool -vVtdj a.out
> a.out:
> _main.dsp:
> 0000000100000fb1 01 00 addl %eax, (%rax)
> 0000000100000fb3 00 00 addb %al, (%rax)
> _main:
> 0000000100000fb5 ff c0 incl %eax
> 0000000100000fb7 c3 retq
>
> However, now I need to figure out how to generate this from
> llvm via an alias.
>
> @.str = private unnamed_addr constant [8 x i8] c"p = %d\0A\00", align 1
>
> %struct.prefix = type { i32 }
> @main.dsp = alias i32, i32* getelementptr (%struct.prefix, %struct.prefix* bitcast (i32 ()* @main to %struct.prefix*), i32 -1, i32 0)
>
> declare i32 @printf(i8*, ...)
>
> define i32 @main() prefix %struct.prefix { i32 123 } {
> %main = bitcast i32 ()* @main to i32*
> %prefix_ptr = getelementptr inbounds i32, i32* %main, i32 -1
> %prefix_val = load i32, i32* %prefix_ptr
> %ret = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i32 0, i32 0), i32 %prefix_val)
> ret i32 0
> }
>
> generates the following for me:
>
> .section __TEXT,__text,regular,pure_instructions
> .macosx_version_min 10, 12
> .globl _main
> .p2align 4, 0x90
> .long 123 ## @main
> ## 0x7b
> _main:
> .cfi_startproc
> ## BB#0:
> pushq %rax
> Ltmp0:
> .cfi_def_cfa_offset 16
> movl _main-4(%rip), %esi
> leaq L_.str(%rip), %rdi
> xorl %eax, %eax
> callq _printf
> xorl %eax, %eax
> popq %rcx
> retq
> .cfi_endproc
>
> .section __TEXT,__cstring,cstring_literals
> L_.str: ## @.str
> .asciz "p = %d\n"
>
> .globl _main.dsp
> .alt_entry _main.dsp
> _main.dsp = _main-4
> .subsections_via_symbols
>
> Any ideas how to get the .alt_entry right?
>
> Cheers,
> Moritz
>
> > On Mar 7, 2017, at 10:02 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > On Mon, Mar 6, 2017 at 5:54 PM, Moritz Angermann <moritz.angermann at gmail.com> wrote:
> > Hi Peter,
> >
> > I’ve just experimented with this a bit:
> >
> > Say we would end up with the following assembly:
> >
> > .section __TEXT,__text
> > .globl _main
> >
> > .long 1
> > _main:
> > inc %eax
> > ret
> >
> > .globl _main.dsp
> > .alt_entry _main.dsp
> >
> > What happens if you try ".alt_entry _main" instead? The alt_entry is
> > supposed to be bound to the atom appearing *before* it.
> >
> > _main.dsp = _main-4
> >
> > .subsections_via_symbols
> >
> > (i.e. we inject the .alt_entry after the fact, pointing to the start of
> > the prefix data)
> >
> > this will yield:
> >
> > $ clang test.s -dead_strip
> > ld: warning: N_ALT_ENTRY bit set on first atom in section __TEXT/__text
> >
> > And the prefix data will be stripped again.
> >
> > I.e. what you end up getting is:
> >
> > $ otool -vVtdj a.out
> > a.out:
> > _main:
> > 0000000100000fb5 ff c0 incl %eax
> > 0000000100000fb7 c3 retq
> >
> > instead of what we’d like to get:
> >
> > $ otool -vVtdj a.out
> > a.out:
> > _main.dsp:
> > 0000000100000fb1 01 00 addl %eax, (%rax)
> > 0000000100000fb3 00 00 addb %al, (%rax)
> > _main:
> > 0000000100000fb5 ff c0 incl %eax
> > 0000000100000fb7 c3 retq
> >
> > .alt_entry symbols are not dead_strip protected, and this makes sense I
> > guess: if the alt_entry is never actually called visibly from anywhere,
> > it’s probably not needed. However there is the .no_dead_strip
> > directive. Thus if we graft this slightly differently:
> >
> > .section __TEXT,__text
> > .globl _main
> >
> > .long 1
> > _main:
> > inc %eax
> > ret
> >
> > .no_dead_strip _main.dsp
> > .alt_entry _main.dsp
> > _main.dsp = _main-4
> >
> > .subsections_via_symbols
> >
> > we still get a warning, but it won’t get stripped. At that point
> > however, we don’t need the .alt_entry anymore (and can drop the warning).
> >
> > Thus, I’d propose that for functions with prefix_data, a second symbol
> > with .no_dead_strip is emitted for the prefix data entry point.
> >
> > I don't think that is sufficient. I believe that the linker is allowed
> > to move the function away from the prefix data even if the function is
> > not dead stripped.
> >
> > Peter
> >
> > Cheers,
> > Moritz
> >
> > > On Mar 7, 2017, at 3:35 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> > >
> > > That is in theory what omitting the .subsections_via_symbols directive
> > > is supposed to do, but in an experiment I ran a year or two ago I found
> > > that the Mach-O linker was still dead stripping on symbol boundaries
> > > with this directive omitted.
> > >
> > > In any case, a more precise approach has more recently (~a few months
> > > ago) become possible. There is a relatively new asm directive called
> > > .alt_entry that, as I understand it, tells the linker to disregard a
> > > given symbol as a section boundary (LLVM already uses this for aliases
> > > pointing into the middle of a global). So what you would do is to use
> > > .alt_entry on the function symbol, with an internal symbol appearing
> > > before the prefix data to ensure that it is not considered part of the
> > > body of the previous function.
> > >
> > > Peter
> > >
> > > On Mon, Mar 6, 2017 at 11:19 AM, James Y Knight via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > > AFAIK, this cannot actually work on Apple platforms, because its
> > > object file format (Mach-O) doesn't use sections to determine the
> > > ranges of code/data to keep together, but instead _infers_ boundaries
> > > based on the range between global symbols in the symbol table.
> > >
> > > So, the symbol pointing to the beginning of @main *necessarily* makes
> > > that be a section boundary.
> > >
> > > I think the best that could be done in LLVM is to not emit the
> > > ".subsections_via_symbols" asm directive (effectively disabling dead
> > > stripping on that object) if any prefix data exists. Currently it emits
> > > that flag unconditionally for MachO.
> > >
> > > On Mon, Mar 6, 2017 at 4:40 AM, Moritz Angermann via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > > Hi,
> > >
> > > I just came across a rather annoying behavior with llvm 3.9. Assuming
> > > the following sample code in test.ll:
> > >
> > > ; Let's have some global int x = 10
> > > @x = global i32 10, align 4
> > > ; and two strings "p = %d\n" for the prefix data,
> > > ; as well as "x = %d\n" to print the (global) x value.
> > > @.str = private unnamed_addr constant [8 x i8] c"x = %d\0A\00", align 1
> > > @.str2 = private unnamed_addr constant [8 x i8] c"p = %d\0A\00", align 1
> > >
> > > ; declare printf, we'll use this later for printf style debugging.
> > > declare i32 @printf(i8*, ...)
> > >
> > > ; define a main function.
> > > define i32 @main() prefix i32 123 {
> > > ; obtain an i32 pointer to the main function.
> > > ; the prefix data is right before that pointer.
> > > %main = bitcast i32 ()* @main to i32*
> > >
> > > ; use the gep to compute the start of the prefix data.
> > > %prefix_ptr = getelementptr inbounds i32, i32* %main, i32 -1
> > > ; and load it.
> > > %prefix_val = load i32, i32* %prefix_ptr
> > >
> > > ; print that value.
> > > %ret = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str2, i32 0, i32 0), i32 %prefix_val)
> > >
> > > ; similarly let's do the same with the global x.
> > > %1 = alloca i32, align 4
> > > store i32 0, i32* %1, align 4
> > > %2 = load i32, i32* @x, align 4
> > > %3 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i32 0, i32 0), i32 %2)
> > > ret i32 0
> > > }
> > >
> > > gives the following result (expected)
> > >
> > > $ clang test.ll
> > > $ ./a.out
> > > p = 123
> > > x = 10
> > >
> > > however, with -dead_strip on macOS, we see the following:
> > >
> > > $ clang test.ll -dead_strip
> > > $ ./a.out
> > > p = 0
> > > x = 10
> > >
> > > Thus I believe we are incorrectly stripping prefix data when linking
> > > with -dead_strip on macOS.
> > >
> > > I do not have a bugzilla account, and hence cannot post this as a
> > > proper bug report.
> > >
> > > Cheers,
> > > Moritz
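The reasoning in the quoted messages above (Mach-O infers boundaries from symbols, and .alt_entry tells the linker not to treat a symbol as a boundary) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the types `Symbol` and `Atom` and the functions `splitIntoAtoms` and `liveBytes` are hypothetical names made up for the illustration, the model ignores cross-atom references, and it is not a description of how the real Mach-O linker is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy types; not the linker's real data structures.
struct Symbol {
  std::string name;
  unsigned offset;  // byte offset inside the section
  bool altEntry;    // true if the symbol is marked with .alt_entry
};

struct Atom {
  unsigned begin;
  unsigned end;                    // one past the last byte
  std::vector<std::string> names;  // symbols that fall inside this atom
};

// Split a section of `size` bytes into atoms. Every symbol that is not an
// alt_entry starts a new atom; alt_entry symbols stay in the current atom.
// Bytes before the first symbol form an anonymous atom.
std::vector<Atom> splitIntoAtoms(std::vector<Symbol> syms, unsigned size) {
  std::stable_sort(
      syms.begin(), syms.end(),
      [](const Symbol &a, const Symbol &b) { return a.offset < b.offset; });
  std::vector<Atom> atoms;
  atoms.push_back({0, size, {}});
  for (const Symbol &s : syms) {
    assert(s.offset < size && "symbol must lie inside the section");
    if (!s.altEntry && s.offset > atoms.back().begin) {
      atoms.back().end = s.offset;            // close the previous atom
      atoms.push_back({s.offset, size, {}});  // and open a new one
    }
    atoms.back().names.push_back(s.name);
  }
  return atoms;
}

// Count the bytes that survive dead stripping when only `root` is referenced.
// The toy model has no cross-atom references, so exactly the atom that
// contains `root` is kept.
unsigned liveBytes(const std::vector<Atom> &atoms, const std::string &root) {
  for (const Atom &a : atoms)
    if (std::find(a.names.begin(), a.names.end(), root) != a.names.end())
      return a.end - a.begin;
  return 0;
}
```

With the 7-byte section from test.s (4 bytes of prefix data followed by 3 bytes of code), a plain `_main` at offset 4 leaves only the 3 code bytes live in this model, whereas a symbol at offset 0 plus an alt_entry `_main` at offset 4 keeps all 7 bytes together.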
Moritz Angermann via llvm-dev
2017-Mar-07  02:44 UTC
[llvm-dev] [BUG Report] -dead_strip, strips prefix data unconditionally on macOS
Peter,

thanks again! Yes, we only need to refer to main, but we must ensure that
the prefix data is not stripped.

I’ll have a look at the AsmPrinter.

Another idea that came to mind is abusing the prologue data: simply inject
the prefix data into the prologue data, then add the *real* entry point as
an alt_entry after the prologue data.

Right now we have:

.- - - - -. <- main.dsp
| Prefix  |
|- - - - -| <- main
| Body    |
'- - - - -'

With prologue data, I believe we would get:

.- - - - - -. <- main
| Prologue  |
|- - - - - -| <- alt_entry main_alt
| Body      |
'- - - - - -'

This could probably work today. However, prologue data is documented as
needing a special format. Do you happen to know whether that format still
matters if we never go through main, but only through main_alt?

Cheers,
Moritz

> On Mar 7, 2017, at 10:33 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
> Firstly, do you need "main.dsp" defined as an external symbol, or can all
> external references go via "main"? If the answer is the latter, that will
> make the solution simpler.
>
> If you only need the latter, you will need to make a change to LLVM here:
> http://llvm-cs.pcc.me.uk/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#650
>
> Basically you would need to add a hook to the TargetLoweringObjectFile
> class that allows the object format to control how prefix data is emitted.
> For Mach-O you would emit a label for a dummy internal symbol, followed by
> the prefix data and then an alt_entry directive for the function symbol.
> All other object formats would just emit the prefix data.
>
> Peter
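Peter's suggestion quoted above (let the object format decide how prefix data is emitted) can be sketched outside of LLVM as a small standalone program that only builds the assembly text. Everything here is an assumption made for illustration: `ObjectFormat`, `emitFunctionHeader` and the `Ldummy` label name are hypothetical and are not LLVM APIs; a real change would go through AsmPrinter and the MC layer instead of string concatenation.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for "which object format are we emitting for".
enum class ObjectFormat { MachO, Other };

// Build the text that precedes a function body when the function has prefix
// data. `prefixDirective` is the already-lowered prefix data, for example
// ".long 123". `index` only serves to make the dummy label unique.
std::string emitFunctionHeader(ObjectFormat fmt, const std::string &fnSym,
                               const std::string &prefixDirective,
                               unsigned index) {
  assert(!fnSym.empty() && "function symbol must have a name");
  std::string out;
  if (fmt == ObjectFormat::MachO) {
    // Mach-O: dummy internal label, then the prefix data, then mark the
    // function symbol as an alternative entry so no boundary is placed
    // between the prefix data and the function body.
    out += "Ldummy" + std::to_string(index) + ":\n";
    out += "\t" + prefixDirective + "\n";
    out += "\t.alt_entry " + fnSym + "\n";
  } else {
    // All other object formats: just the prefix data.
    out += "\t" + prefixDirective + "\n";
  }
  out += fnSym + ":\n";
  return out;
}
```

Calling `emitFunctionHeader(ObjectFormat::MachO, "_main", ".long 123", 0)` yields the label, the `.long 123` line, the `.alt_entry _main` line and finally `_main:`, in that order; for `ObjectFormat::Other` only the data line and the function label are produced.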
Peter Collingbourne via llvm-dev
2017-Mar-07  02:54 UTC
[llvm-dev] [BUG Report] -dead_strip, strips prefix data unconditionally on macOS
I suspect that the format isn't important if you do that, but I wouldn't
recommend it, at least because inlining (and other inter-procedural
optimizations) are not expected to work correctly if you produce IR like
that.

Peter

On Mon, Mar 6, 2017 at 6:44 PM, Moritz Angermann <moritz.angermann at gmail.com> wrote:
> Peter,
>
> thanks again! Yes, we only need to refer to main, but we must ensure that
> the prefix data is not stripped.
>
> I’ll have a look at the AsmPrinter.
>
> Another idea that came to mind is abusing the prologue data. And simply
> injecting the prefix data into the prologue data. Then adding the *real*
> entry_point as an alt_entry after the prologue data.
>
> Right now we have.
>
> .- - - - -. <- main.dsp
> | Prefix  |
> |- - - - -| <- main
> | Body    |
> '- - - - -'
>
> with Prologue, I believe:
>
> .- - - - - -. <- main
> | Prologue  |
> |- - - - - -| <- alt_entry main_alt
> | Body      |
> '- - - - - -'
>
> This could probably work today. However prologue data says it needs a
> special format. Do you happen to know, if that format is important as
> well, if we would never ever go through main, but only through main_alt?
>
> Cheers,
> Moritz
Moritz Angermann via llvm-dev
2017-Mar-09  02:31 UTC
[llvm-dev] [BUG Report] -dead_strip, strips prefix data unconditionally on macOS
Hi Peter,

sorry for the delay.

With the following patch, which is based on Ben Gamari's commit here [1]:

> From 94058c133056f2cd372c1044e80359ccec5790ac Mon Sep 17 00:00:00 2001
> From: Moritz Angermann <moritz.angermann at gmail.com>
> Date: Thu, 9 Mar 2017 10:23:59 +0800
> Subject: [PATCH] Ensure that prefix data is preserved with
>  subsections-via-symbols
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> On MachO platforms that use subsections-via-symbols dead code stripping will
> drop prefix data. Unfortunately there is no great way to convey the relationship
> between a function and its prefix data to the linker. We are forced to use a bit
> of a hack: we give the prefix data its own symbol, and mark the actual function
> entry an .alt_entry.
> ---
>  lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> index 7099065a638..85512e45740 100644
> --- a/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> +++ b/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
> @@ -646,8 +646,23 @@ void AsmPrinter::EmitFunctionHeader() {
>    }
>
>    // Emit the prefix data.
> -  if (F->hasPrefixData())
> -    EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
> +  if (F->hasPrefixData()) {
> +    if (MAI->hasSubsectionsViaSymbols()) {
> +      // Preserving prefix data on platforms which use subsections-via-symbols
> +      // is a bit tricky. Here we introduce a symbol for the prefix data
> +      // and use the .alt_entry attribute to mark the function's real entry point
> +      // as an alternative entry point to the prefix-data symbol.
> +      MCSymbol *PrefixSym = createTempSymbol("prefix");
> +      OutStreamer->EmitLabel(PrefixSym);
> +
> +      EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
> +
> +      // Emit an .alt_entry directive for the actual function symbol.
> +      OutStreamer->EmitSymbolAttribute(CurrentFnSym, MCSA_AltEntry);
> +    } else {
> +      EmitGlobalConstant(F->getParent()->getDataLayout(), F->getPrefixData());
> +    }
> +  }
>
>    // Emit the CurrentFnSym.  This is a virtual function to allow targets to
>    // do their wild and crazy things as required.
> --
> 2.11.0

we obtain:

--------------------------------------------------------------------------------
        .section        __TEXT,__text,regular,pure_instructions
        .macosx_version_min 10, 12
        .globl  _main
        .p2align        4, 0x90
Lprefix0:                               ## @main
        .quad   1                       ## 0x1
        .quad   0                       ## 0x0
        .quad   -1                      ## 0xffffffffffffffff
        .alt_entry      _main
_main:
        .cfi_startproc
## BB#0:
        pushq   %rax
Lcfi0:
        .cfi_def_cfa_offset 16
        movq    _main-24(%rip), %rsi
        leaq    L_.str(%rip), %rdi
        xorl    %eax, %eax
        callq   _printf
        xorl    %eax, %eax
        popq    %rcx
        retq
        .cfi_endproc

        .p2align        4, 0x90
Lprefix1:                               ## @test
        .quad   321                     ## 0x141
        .quad   0                       ## 0x0
        .quad   -1                      ## 0xffffffffffffffff
        .alt_entry      l_test
l_test:
        .cfi_startproc
## BB#0:
        xorl    %eax, %eax
        retq
        .cfi_endproc

        .section        __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
        .asciz  "p = %d\n"


.subsections_via_symbols
--------------------------------------------------------------------------------

for the following input:

--------------------------------------------------------------------------------
@.str = private unnamed_addr constant [8 x i8] c"p = %d\0A\00", align 1

%struct.prefix = type <{i64, i64, i64}>
declare i32 @printf(i8*, ...)

define i32 @main() prefix %struct.prefix <{i64 1, i64 0, i64 -1}> {
  %main = bitcast i32 ()* @main to %struct.prefix*
  %prefix_ptr = getelementptr inbounds %struct.prefix, %struct.prefix* %main, i32 -1, i32 0
  %prefix_val = load i64, i64* %prefix_ptr
  %ret = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str, i32 0, i32 0), i64 %prefix_val)
  ret i32 0
}

define private i32 @test() prefix %struct.prefix <{ i64 321, i64 0, i64 -1 }> {
  ret i32 0
}
--------------------------------------------------------------------------------

This can be linked with -dead_strip without losing the prefix data.

How do we proceed from here to get this into LLVM?

Cheers,
 Moritz

--
[1]: https://github.com/bgamari/llvm/commit/38ae8b5ba5ce8a56c1fdc7df9324aa63957d543e
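The emission order the patch relies on (temporary label, then the prefix data, then .alt_entry, then the function label) can be modelled in isolation. The following is only a minimal sketch under stated assumptions: it does not use any LLVM API, and ToyFunction and emitFunctionHeader are hypothetical names invented for illustration. It prints 64-bit prefix words as .quad lines, matching the shape of the assembly shown in the message.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a function with optional prefix data.
// This is NOT an LLVM class; it only carries what the sketch needs.
struct ToyFunction {
  std::string symbol;                // assembler name, e.g. "_main"
  std::vector<std::int64_t> prefix;  // prefix data as 64-bit words
};

// Produce the textual function header. With subsections-via-symbols the
// order is: temporary label, prefix data, .alt_entry, function label.
// Without it, the prefix data is simply emitted before the function label.
std::string emitFunctionHeader(const ToyFunction &f,
                               bool subsectionsViaSymbols,
                               unsigned &nextPrefixId) {
  assert(!f.symbol.empty() && "a function needs a symbol name");
  std::ostringstream out;
  if (!f.prefix.empty()) {
    // The temporary label starts a new atom that owns the prefix data.
    if (subsectionsViaSymbols)
      out << "Lprefix" << nextPrefixId++ << ":\n";
    for (std::int64_t word : f.prefix)
      out << "\t.quad\t" << word << "\n";
    // Mark the function symbol so it does not start a new atom.
    if (subsectionsViaSymbols)
      out << "\t.alt_entry\t" << f.symbol << "\n";
  }
  out << f.symbol << ":\n";
  return out.str();
}
```

Calling emitFunctionHeader with a ToyFunction for "_main" carrying the words 1, 0, -1 and subsectionsViaSymbols set to true yields the Lprefix0 / .quad / .alt_entry / _main: sequence in that order. The three 8-byte words also account for the _main-24 displacement in the generated load.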
Peter Collingbourne via llvm-dev
2017-Mar-09  02:44 UTC
[llvm-dev] [BUG Report] -dead_strip, strips prefix data unconditionally on macOS
Please send your patch to llvm-commits. I would recommend uploading it to
Phabricator following the instructions at
http://llvm.org/docs/Phabricator.html and adding me (pcc) as a reviewer and
llvm-commits as a subscriber.

Peter