Martin Richtarsky via llvm-dev
2017-Mar-15 16:29 UTC
[llvm-dev] [LLD] Linking static library does not resolve symbols as gold/ld
Hi all,

I'm currently trying out lld on a large project. We are currently using
gold (and used GNU ld before that).

I have come across a few minor issues but could workaround them:
- Missing support for --defsym=symbol1=symbol2,
  --warn-unknown-eh-frame-section, --exclude-libs

There are two other issues which are more critical, one of which is
currently blocking me, so I would like to find a solution for this one
first.

I have a static library that is linked into an executable. The binary
produced by lld crashes, while the gold version runs fine.

The difference is in the call instructions below. The original object file
from the archive has an address of zero in the call instruction:

0000000000013832 <func>:
     13832: 55                push   %rbp
     13833: 48 89 e5          mov    %rsp,%rbp
     13836: 53                push   %rbx
     13837: 48 83 ec 18       sub    $0x18,%rsp
     1383b: 48 89 7d e8       mov    %rdi,-0x18(%rbp)
     1383f: 48 8b 45 e8       mov    -0x18(%rbp),%rax
     13843: 48 89 c7          mov    %rax,%rdi
  -> 13846: e8 00 00 00 00    callq  1384b <func+0x19>
     1384b: 48 8b 45 e8       mov    -0x18(%rbp),%rax

gdb displays this as a jump to the next instruction:

   0x0000000000013832 <+0>:   push   %rbp
   0x0000000000013833 <+1>:   mov    %rsp,%rbp
   0x0000000000013836 <+4>:   push   %rbx
   0x0000000000013837 <+5>:   sub    $0x18,%rsp
   0x000000000001383b <+9>:   mov    %rdi,-0x18(%rbp)
   0x000000000001383f <+13>:  mov    -0x18(%rbp),%rax
   0x0000000000013843 <+17>:  mov    %rax,%rdi
   0x0000000000013846 <+20>:  callq  0x1384b <func()+25>
   0x000000000001384b <+25>:  mov    -0x18(%rbp),%rax

However, in the executable linked by gold, the calls are magically resolved:

   0x000000000018b44e <+0>:   push   %rbp
   0x000000000018b44f <+1>:   mov    %rsp,%rbp
   0x000000000018b452 <+4>:   push   %rbx
   0x000000000018b453 <+5>:   sub    $0x18,%rsp
   0x000000000018b457 <+9>:   mov    %rdi,-0x18(%rbp)
   0x000000000018b45b <+13>:  mov    -0x18(%rbp),%rax
   0x000000000018b45f <+17>:  mov    %rax,%rdi
   0x000000000018b462 <+20>:  callq  0x68568c <std::vector<record, std::allocator<record> >::vector()>
   0x000000000018b467 <+25>:  mov    -0x18(%rbp),%rax

Even more interesting, several such call instructions with argument 0 are
resolved to different functions. So somewhere there must be information
stored to what functions they resolve to.

lld produces this code:

   0x00005555559f304e <+0>:   push   %rbp
   0x00005555559f304f <+1>:   mov    %rsp,%rbp
   0x00005555559f3052 <+4>:   push   %rbx
   0x00005555559f3053 <+5>:   sub    $0x18,%rsp
   0x00005555559f3057 <+9>:   mov    %rdi,-0x18(%rbp)
   0x00005555559f305b <+13>:  mov    -0x18(%rbp),%rax
   0x00005555559f305f <+17>:  mov    %rax,%rdi
   0x00005555559f3062 <+20>:  callq  0x555555554000
   0x00005555559f3067 <+25>:  mov    -0x18(%rbp),%rax

0x555555554000 is the start of the mapped region of the executable, so it
seems lld just adds the argument 0 to that without doing any relocation
processing.

Is this a known limitation of lld?

Thanks and best regards,
Martin
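
For readers following the disassembly, here is a minimal sketch in Python of how a 5-byte call with a 32-bit relative operand (opcode e8) is decoded. It shows why a zero displacement in the unlinked object is displayed as a call to the next instruction. The helper names call_target and rel32_for are hypothetical, and the addresses are the ones from the listings in this message.

```python
import struct

def call_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the target of an x86-64 `call rel32` (opcode 0xE8)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    # The operand is a signed, little-endian 32-bit displacement.
    (disp,) = struct.unpack("<i", insn_bytes[1:])
    # The displacement is relative to the end of the 5-byte instruction.
    return insn_addr + 5 + disp

def rel32_for(insn_addr: int, target: int) -> int:
    """Displacement a call at insn_addr must encode to reach target."""
    return target - (insn_addr + 5)

# Unlinked object: the placeholder displacement 0 decodes to the next instruction.
print(hex(call_target(0x13846, bytes.fromhex("e800000000"))))  # 0x1384b

# Displacements implied by the two linked listings.
print(hex(rel32_for(0x18b462, 0x68568c)))              # gold: 0x4fa225
print(hex(rel32_for(0x5555559f3062, 0x555555554000)))  # lld:  -0x49f067
```

The displacement in the lld output is non-zero, so the operand bytes were rewritten at link time. One reading, which is an inference from the arithmetic and not something the thread confirms, is that the symbol was resolved to image-relative address 0.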
Rafael Avila de Espindola via llvm-dev
2017-Mar-20 14:06 UTC
[llvm-dev] [LLD] Linking static library does not resolve symbols as gold/ld
Martin Richtarsky via llvm-dev <llvm-dev at lists.llvm.org> writes:

> Hi all,
>
> I'm currently trying out lld on a large project. We are currently using
> gold (and used GNU ld before that).
>
> I have come across a few minor issues but could workaround them:
> - Missing support for --defsym=symbol1=symbol2,
> --warn-unknown-eh-frame-section, --exclude-libs
>
> There are two other issues which are more critical, one of which is
> currently blocking me, so I would like to find a solution for this one
> first.
>
> I have a static library that is linked into an executable. The binary
> produced by lld crashes, while the gold version runs fine.
>
> The difference is in the call instructions below. The original object file
> from the archive has an address of zero in the call instruction:
>
> 0000000000013832 <func>:
>      13832: 55                push   %rbp
>      13833: 48 89 e5          mov    %rsp,%rbp
>      13836: 53                push   %rbx
>      13837: 48 83 ec 18       sub    $0x18,%rsp
>      1383b: 48 89 7d e8       mov    %rdi,-0x18(%rbp)
>      1383f: 48 8b 45 e8       mov    -0x18(%rbp),%rax
>      13843: 48 89 c7          mov    %rax,%rdi
>   -> 13846: e8 00 00 00 00    callq  1384b <func+0x19>
>      1384b: 48 8b 45 e8       mov    -0x18(%rbp),%rax
>
> gdb displays this as a jump to the next instruction:
>
>    0x0000000000013832 <+0>:   push   %rbp
>    0x0000000000013833 <+1>:   mov    %rsp,%rbp
>    0x0000000000013836 <+4>:   push   %rbx
>    0x0000000000013837 <+5>:   sub    $0x18,%rsp
>    0x000000000001383b <+9>:   mov    %rdi,-0x18(%rbp)
>    0x000000000001383f <+13>:  mov    -0x18(%rbp),%rax
>    0x0000000000013843 <+17>:  mov    %rax,%rdi
>    0x0000000000013846 <+20>:  callq  0x1384b <func()+25>
>    0x000000000001384b <+25>:  mov    -0x18(%rbp),%rax
>
> However, in the executable linked by gold, the calls are magically resolved:
>
>    0x000000000018b44e <+0>:   push   %rbp
>    0x000000000018b44f <+1>:   mov    %rsp,%rbp
>    0x000000000018b452 <+4>:   push   %rbx
>    0x000000000018b453 <+5>:   sub    $0x18,%rsp
>    0x000000000018b457 <+9>:   mov    %rdi,-0x18(%rbp)
>    0x000000000018b45b <+13>:  mov    -0x18(%rbp),%rax
>    0x000000000018b45f <+17>:  mov    %rax,%rdi
>    0x000000000018b462 <+20>:  callq  0x68568c <std::vector<record,
> std::allocator<record> >::vector()>
>    0x000000000018b467 <+25>:  mov    -0x18(%rbp),%rax
>
> Even more interesting, several such call instructions with argument 0 are
> resolved to different functions. So somewhere there must be information
> stored to what functions they resolve to.
>
> lld produces this code:
>
>    0x00005555559f304e <+0>:   push   %rbp
>    0x00005555559f304f <+1>:   mov    %rsp,%rbp
>    0x00005555559f3052 <+4>:   push   %rbx
>    0x00005555559f3053 <+5>:   sub    $0x18,%rsp
>    0x00005555559f3057 <+9>:   mov    %rdi,-0x18(%rbp)
>    0x00005555559f305b <+13>:  mov    -0x18(%rbp),%rax
>    0x00005555559f305f <+17>:  mov    %rax,%rdi
>    0x00005555559f3062 <+20>:  callq  0x555555554000
>    0x00005555559f3067 <+25>:  mov    -0x18(%rbp),%rax
>
> 0x555555554000 is the start of the mapped region of the executable, so it
> seems lld just adds the argument 0 to that without doing any relocation
> processing.
>
> Is this a known limitation of lld?

It is hard to tell without more information. Can you share the result of
--reproduce repro.tar? If not, can you try reducing it?

Cheers,
Rafael
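
On the quoted question of where the call targets are recorded: in ELF objects that information lives in relocation entries, which can be listed with `readelf -r` or `objdump -dr` on the object file. Each entry names a symbol, an offset into the section, and an addend. Below is a minimal sketch in Python of applying a 32-bit PC-relative relocation. It assumes the call carries an R_X86_64_PC32 or R_X86_64_PLT32 relocation with addend -4; the thread does not show the relocation table, so both the type and the addend are assumptions. It also ignores PLT indirection. apply_pc32 is a hypothetical helper, not code from lld or gold.

```python
import struct

def apply_pc32(section: bytearray, section_addr: int, r_offset: int,
               sym_addr: int, addend: int) -> None:
    """Patch a 32-bit PC-relative field in place: value = S + A - P."""
    place = section_addr + r_offset        # P: address of the field being patched
    value = sym_addr + addend - place      # S + A - P
    struct.pack_into("<i", section, r_offset, value)

# The call from the object file, placed at the address gold gave it (0x18b462).
code = bytearray.fromhex("e800000000")
apply_pc32(code,
           section_addr=0x18b462,  # final address of the call instruction
           r_offset=1,             # operand starts one byte after opcode e8
           sym_addr=0x68568c,      # resolved address of the callee (from the gold listing)
           addend=-4)              # assumed addend; compensates for the 4-byte operand
print(code.hex())  # e825a24f00
```

Because every relocation entry names its own symbol, several calls that all contain a zero operand in the object file can resolve to different functions in the linked executable.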