I noticed that LLD doesn’t preserve the symbol type for a defsym directive. For
example:
$ cat f.c
void f() {}
$ clang -c f.c
$ ld.lld -shared --defsym=g=f f.o
$ objdump -T a.out
DYNAMIC SYMBOL TABLE:
00000000000012a0 g    DF .text  0000000000000006 f
00000000000012a0 g    D  .text  0000000000000000 g
f is marked as a function symbol, but g is not.
I recognize this is hard to do in the general case, where you can have e.g.
arithmetic being performed in the defsym, but in this particular case, it would
seem desirable for the alias symbol to have the same type as the target. My
question is if this will end up making any difference in practice. The case
I'm concerned about in particular is ARM-Thumb interworking, where I believe
there might be some logic that's based on symbol types. Is there any
possibility that we'll have issues with that logic because of the alias not
being marked as a function symbol?
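For readers wondering where the `F` in the objdump flags column comes from: it reflects the type field packed into the low four bits of each ELF symbol's st_info byte, with the binding in the high four bits. The sketch below shows that encoding. The constants are the standard ELF values; the helper names (`sym_bind`, `sym_type`, `sym_info`, `sym_is_function`) are made up for illustration. That g ends up with type STT_NOTYPE is an assumption consistent with the missing `F`, not something the objdump output states directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard ELF symbol types and bindings (values from the ELF specification). */
enum { STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2 };
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };

/* Hypothetical helpers mirroring the usual ELF_ST_BIND / ELF_ST_TYPE /
 * ELF_ST_INFO macros: binding lives in the high nibble, type in the low one. */
static inline unsigned sym_bind(uint8_t info) { return info >> 4; }
static inline unsigned sym_type(uint8_t info) { return info & 0xF; }
static inline uint8_t sym_info(unsigned bind, unsigned type) {
    return (uint8_t)((bind << 4) | (type & 0xF));
}

/* True when a symbol would be shown with the function flag. */
static inline bool sym_is_function(uint8_t info) {
    return sym_type(info) == STT_FUNC;
}
```

Under these assumptions, f would carry `sym_info(STB_GLOBAL, STT_FUNC)` and the defsym alias g would carry `sym_info(STB_GLOBAL, STT_NOTYPE)`: same binding and address, different type.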
> I recognize this is hard to do in the general case, where you can have e.g.
> arithmetic being performed in the defsym, but in this particular case, it
> would seem desirable for the alias symbol to have the same type as the target.
> My question is if this will end up making any difference in practice. The
> case I'm concerned about in particular is ARM-Thumb interworking, where I
> believe there might be some logic that's based on symbol types.
> Is there any possibility that we'll have issues with that logic because of
> the alias not being marked as a function symbol?

Thanks for pointing that out. There can be a problem on Arm, as no
interworking will be performed for symbols that are not STT_FUNC. Given that
ld.bfd does preserve the symbol type for aliases, I think this is worth raising
a PR.

To extend your example:
$ cat h.c
extern void f();
extern void g();
void h() { f(); g(); }
$ clang --target=armv7a-none-eabi -c f.c
$ clang --target=armv7a-none-eabi -c h.c -mthumb
$ ld.lld f.o h.o --defsym g=f  # No --shared to prevent a PLT entry.
$ objdump -d a.out
000200e4 <f>:
   200e4:       e12fff1e        bx      lr
000200e8 <h>:
   200e8:       b580            push    {r7, lr}
   200ea:       466f            mov     r7, sp
   200ec:       f7ff effa       blx     200e4 <f>
   200f0:       f7ff fff8       bl      200e4 <f>
   200f4:       bd80            pop     {r7, pc}

The blx for the call to f() is correct, as a state change is required. The bl
for the call to g(), which resolves to the same address as f, will likely
crash the program.

ld.bfd correctly marks g as STT_FUNC, so it gets the state change correct for
both calls.
00008000 <f>:
    8000:       e12fff1e        bx      lr
00008004 <h>:
    8004:       b580            push    {r7, lr}
    8006:       466f            mov     r7, sp
    8008:       f7ff effa       blx     8000 <f>
    800c:       f7ff eff8       blx     8000 <f>
    8010:       bd80            pop     {r7, pc}

________________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Shoaib Meenai via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 03 August 2020 19:23
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] LLD symbol types for defsym

I noticed that LLD doesn’t preserve the symbol type for a defsym directive.
For example:
$ cat f.c
void f() {}
$ clang -c f.c
$ ld.lld -shared --defsym=g=f f.o
$ objdump -T a.out
DYNAMIC SYMBOL TABLE:
00000000000012a0 g    DF .text  0000000000000006 f
00000000000012a0 g    D  .text  0000000000000000 g
f is marked as a function symbol, but g is not.

I recognize this is hard to do in the general case, where you can have e.g.
arithmetic being performed in the defsym, but in this particular case, it would
seem desirable for the alias symbol to have the same type as the target. My
question is if this will end up making any difference in practice. The case
I'm concerned about in particular is ARM-Thumb interworking, where I believe
there might be some logic that's based on symbol types. Is there any
possibility that we'll have issues with that logic because of the alias not
being marked as a function symbol?
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
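The rule described in the reply (no interworking for symbols that are not STT_FUNC) can be modelled in a few lines. This is only a sketch of the decision, not LLD's or ld.bfd's actual code: `choose_call`, `IsaState`, and `CallInsn` are hypothetical names, and the model assumes the Arm ELF convention that bit 0 of an STT_FUNC symbol's value marks a Thumb entry point.

```c
#include <stdint.h>

/* Standard ELF symbol type values. */
enum { STT_NOTYPE = 0, STT_FUNC = 2 };

/* Hypothetical names used only for this sketch. */
typedef enum { STATE_ARM, STATE_THUMB } IsaState;
typedef enum { CALL_BL, CALL_BLX } CallInsn;

/* Decide which call instruction a linker would emit for a direct call.
 * Assumption: only STT_FUNC symbols carry usable state information, and
 * bit 0 of their value is set for Thumb code. */
CallInsn choose_call(IsaState caller, unsigned sym_type, uint32_t sym_value) {
    if (sym_type != STT_FUNC) {
        /* Untyped symbol: the callee's state is unknown, so the call is
         * left as a plain bl with no state change. */
        return CALL_BL;
    }
    IsaState callee = (sym_value & 1u) ? STATE_THUMB : STATE_ARM;
    /* A state change is needed only when caller and callee differ. */
    return (callee == caller) ? CALL_BL : CALL_BLX;
}
```

With the values from the example, a Thumb caller reaching f (STT_FUNC, Arm code at 0x200e4) gets a blx, while the same address reached through an untyped alias g keeps its bl, which matches the two call sites in the LLD output.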
Thanks! Filed https://llvm.org/PR46790
On 8/3/20, 11:57 AM, "Peter Smith" <Peter.Smith at arm.com> wrote:
    > I recognize this is hard to do in the general case, where you can have
    > e.g. arithmetic being performed in the defsym, but in this particular
    > case, it would seem desirable for the alias symbol to have the same type
    > as the target.
    > My question is if this will end up making any difference in practice.
    > The case I'm concerned about in particular is ARM-Thumb interworking,
    > where I believe there might be some logic that's based on symbol types.
    > Is there any possibility that we'll have issues with that logic
    > because of the alias not being marked as a function symbol?
    Thanks for pointing that out. There can be a problem on Arm, as no
    interworking will be performed for symbols that are not STT_FUNC. Given
    that ld.bfd does preserve the symbol type for aliases, I think this is
    worth raising a PR.
    To extend your example:
    $ cat h.c
    extern void f();
    extern void g();
    void h() { f(); g(); }
    $ clang --target=armv7a-none-eabi -c f.c
    $ clang --target=armv7a-none-eabi -c h.c -mthumb
    $ ld.lld f.o h.o --defsym g=f  # No --shared to prevent a PLT entry.
    $ objdump -d a.out
    000200e4 <f>:
       200e4:       e12fff1e        bx      lr
    000200e8 <h>:
       200e8:       b580            push    {r7, lr}
       200ea:       466f            mov     r7, sp
       200ec:       f7ff effa       blx     200e4 <f>
       200f0:       f7ff fff8       bl      200e4 <f>
       200f4:       bd80            pop     {r7, pc}
    The blx for the call to f() is correct, as a state change is required.
    The bl for the call to g(), which resolves to the same address as f, will
    likely crash the program.
    ld.bfd correctly marks g as STT_FUNC, so it gets the state change correct
    for both calls.
    00008000 <f>:
        8000:       e12fff1e        bx      lr
    00008004 <h>:
        8004:       b580            push    {r7, lr}
        8006:       466f            mov     r7, sp
        8008:       f7ff effa       blx     8000 <f>
        800c:       f7ff eff8       blx     8000 <f>
        8010:       bd80            pop     {r7, pc}
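The two listings differ only in the second halfword of the second call (fff8 versus eff8). A small decoder makes the difference explicit. It is a sketch assuming the Thumb-2 BL/BLX (immediate) encodings from the Arm architecture manual, where bit 12 of the second halfword selects BL (1) or BLX (0); the function names are hypothetical and the halfwords are taken in the order objdump prints them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the halfword pair encodes BLX (immediate) rather than BL.
 * Assumes the pair is a Thumb-2 BL/BLX (immediate) instruction. */
bool thumb_call_is_blx(uint16_t hw1, uint16_t hw2) {
    assert((hw1 & 0xF800u) == 0xF000u); /* 11110 S imm10 */
    assert((hw2 & 0xC000u) == 0xC000u); /* 11 J1 x J2 imm11 */
    return (hw2 & 0x1000u) == 0;        /* bit 12: 1 = BL, 0 = BLX */
}

/* Compute the branch target of a Thumb-2 BL/BLX at insn_addr. */
uint32_t thumb_call_target(uint32_t insn_addr, uint16_t hw1, uint16_t hw2) {
    bool blx = thumb_call_is_blx(hw1, hw2);
    uint32_t s = (hw1 >> 10) & 1u;
    uint32_t j1 = (hw2 >> 13) & 1u;
    uint32_t j2 = (hw2 >> 11) & 1u;
    uint32_t i1 = !(j1 ^ s);            /* I1 = NOT(J1 XOR S) */
    uint32_t i2 = !(j2 ^ s);            /* I2 = NOT(J2 XOR S) */
    uint32_t imm10 = hw1 & 0x3FFu;
    uint32_t imm11 = hw2 & 0x7FFu;
    if (blx)
        imm11 &= ~1u;                   /* BLX targets are word-aligned */
    uint32_t offset = (s << 24) | (i1 << 23) | (i2 << 22) |
                      (imm10 << 12) | (imm11 << 1);
    if (s)
        offset |= 0xFE000000u;          /* sign-extend the 25-bit offset */
    uint32_t pc = insn_addr + 4;        /* Thumb PC reads as insn + 4 */
    if (blx)
        pc &= ~3u;                      /* BLX uses Align(PC, 4) */
    return pc + offset;                 /* wraps modulo 2^32 */
}
```

Feeding in the bytes from the listings, (0xf7ff, 0xfff8) at 0x200f0 decodes as a bl to 0x200e4, and (0xf7ff, 0xeff8) at 0x800c decodes as a blx to 0x8000, so the linkers chose the same target but only ld.bfd requested the state change.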