Simon Atanasyan via llvm-dev
2016-Feb-08 15:55 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
Hi, It looks like I have found a bug in LLD. Suppose DSO defines a global variable 'data' and initializes it by the address of function 'set_data' defined in the same DSO. If an executable file (linked by LLD) gets address of the '&set_data' function and compares it with a value stored in the 'data' variable it gets different result. If the executable is linked by BFD or Gold linker it gets the same result. Right now I do not have a time to investigate this problem further. I will plan to do that later. But maybe the reason of this problem is obvious to somebody? The reproduction script: % cat so.c void set_data(void *v) {} void *data = &set_data; % cat main.c int printf(const char *, ...); extern void *data; void set_data(void *v); int main(void) { printf("%p = %p\n", &set_data, data); } % clang -fPIC -shared so.c -o libdump.so % clang -c main.c % clang main.o -Wl,-rpath -Wl,. -L. -ldump % ./a.out 0x400600 = 0x400600 # The same addresses % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \ -m elf_x86_64 --hash-style=both \ -dynamic-linker /lib64/ld-linux-x86-64.so.2 \ /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \ /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \ -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \ -L/usr/lib/x86_64-linux-gnu \ -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \ -L/usr/lib/x86_64-linux-gnu -L/usr/lib \ main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \ -lc -lgcc --as-needed -lgcc_s --no-as-needed \ /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \ /usr/lib/x86_64-linux-gnu/crtn.o % ./a.out 0x11250 = 0x7f02915bd6b0 # garbage in 'data' -- Simon Atanasyan
Sean Silva via llvm-dev
2016-Feb-08 22:52 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
Sounds like it is related to this: http://www.airs.com/blog/archives/42 """ The fact that C permits taking the address of a function introduces an interesting wrinkle. In C you are permitted to take the address of a function, and you are permitted to compare that address to another function address. The problem is that if you take the address of a function in a shared library, the natural result would be to get the address of the PLT entry. After all, that is address to which a call to the function will jump. However, each shared library has its own PLT, and thus the address of a particular function would differ in each shared library. That means that comparisons of function pointers generated in different shraed libraries may be different when they should be the same. This is not a purely hypothetical problem; when I did a port which got it wrong, before I fixed the bug I saw failures in the Tcl shared library when it compared function pointers. The fix for this bug on most processors is a special marking for a symbol which has a PLT entry but is not defined. Typically the symbol will be marked as undefined, but with a non-zero value–the value will be set to the address of the PLT entry. When the dynamic linker is searching for the value of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such a specially marked symbol, it will use the non-zero value. This will ensure that all references to the symbol which are not function calls will use the same value. To make this work, the compiler and assembler must make sure that any reference to a function which does not involve calling it will not carry a standard PLT reloc. This special handling of function addresses needs to be implemented in both the program linker and the dynamic linker. """ Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables from gold and lld, I see: --- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800 +++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800 Symbol { - Name: set_data@ (142) - Value: 0x400560 - Size: 0 + Name: set_data@ (46) + Value: 0x0 + Size: 10 Binding: Global (0x1) Type: Function (0x2) Other: 0 Section: Undefined (0x0) } You can also see this in LD_DEBUG=all when running the executables (to avoid extraneous diffs, both executables are called "./a.out.lld"; look at the diff header to know which is output from the gold executable vs lld executable): --- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800 +++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800 relocation processing: ./libdump.so (lazy) symbol=set_data; lookup in file=./a.out.lld [0] - binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol `set_data' + symbol=set_data; lookup in file=./libdump.so [0] + binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol `set_data' For gold, the symbol is bound to the one in a.out (the PLT entry), while for lld it is bound to the one in libdump.so. -- Sean Silva On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hi, > > It looks like I have found a bug in LLD. Suppose DSO defines a global > variable 'data' and initializes it by the address of function > 'set_data' defined in the same DSO. If an executable file (linked by > LLD) gets address of the '&set_data' function and compares it with a > value stored in the 'data' variable it gets different result. If the > executable is linked by BFD or Gold linker it gets the same result. > > Right now I do not have a time to investigate this problem further. I > will plan to do that later. But maybe the reason of this problem is > obvious to somebody? > > The reproduction script: > > % cat so.c > void set_data(void *v) {} > void *data = &set_data; > > % cat main.c > int printf(const char *, ...); > > extern void *data; > void set_data(void *v); > > int main(void) > { > printf("%p = %p\n", &set_data, data); > } > > % clang -fPIC -shared so.c -o libdump.so > % clang -c main.c > > % clang main.o -Wl,-rpath -Wl,. -L. -ldump > % ./a.out > 0x400600 = 0x400600 # The same addresses > > % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \ > -m elf_x86_64 --hash-style=both \ > -dynamic-linker /lib64/ld-linux-x86-64.so.2 \ > /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \ > /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \ > -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \ > -L/usr/lib/x86_64-linux-gnu \ > -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \ > -L/usr/lib/x86_64-linux-gnu -L/usr/lib \ > main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \ > -lc -lgcc --as-needed -lgcc_s --no-as-needed \ > /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \ > /usr/lib/x86_64-linux-gnu/crtn.o > % ./a.out > 0x11250 = 0x7f02915bd6b0 # garbage in 'data' > > -- > Simon Atanasyan > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160208/dcf25565/attachment.html>
Rafael Espíndola via llvm-dev
2016-Feb-08 23:14 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
Yes, I had just reduced it too. I am pretty sure this was implemented at some point. Taking a look. Cheers, Rafael On 8 February 2016 at 17:52, Sean Silva <chisophugis at gmail.com> wrote:> Sounds like it is related to this: > > http://www.airs.com/blog/archives/42 > """ > > The fact that C permits taking the address of a function introduces an > interesting wrinkle. In C you are permitted to take the address of a > function, and you are permitted to compare that address to another function > address. The problem is that if you take the address of a function in a > shared library, the natural result would be to get the address of the PLT > entry. After all, that is address to which a call to the function will jump. > However, each shared library has its own PLT, and thus the address of a > particular function would differ in each shared library. That means that > comparisons of function pointers generated in different shraed libraries may > be different when they should be the same. This is not a purely hypothetical > problem; when I did a port which got it wrong, before I fixed the bug I saw > failures in the Tcl shared library when it compared function pointers. > > The fix for this bug on most processors is a special marking for a symbol > which has a PLT entry but is not defined. Typically the symbol will be > marked as undefined, but with a non-zero value–the value will be set to the > address of the PLT entry. When the dynamic linker is searching for the value > of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such > a specially marked symbol, it will use the non-zero value. This will ensure > that all references to the symbol which are not function calls will use the > same value. To make this work, the compiler and assembler must make sure > that any reference to a function which does not involve calling it will not > carry a standard PLT reloc. This special handling of function addresses > needs to be implemented in both the program linker and the dynamic linker. > > """ > > > Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables > from gold and lld, I see: > > > --- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800 > +++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800 > > Symbol { > - Name: set_data@ (142) > - Value: 0x400560 > - Size: 0 > + Name: set_data@ (46) > + Value: 0x0 > + Size: 10 > Binding: Global (0x1) > Type: Function (0x2) > Other: 0 > Section: Undefined (0x0) > } > > You can also see this in LD_DEBUG=all when running the executables (to avoid > extraneous diffs, both executables are called "./a.out.lld"; look at the > diff header to know which is output from the gold executable vs lld > executable): > > --- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800 > +++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800 > relocation processing: ./libdump.so (lazy) > symbol=set_data; lookup in file=./a.out.lld [0] > - binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol > `set_data' > + symbol=set_data; lookup in file=./libdump.so [0] > + binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol > `set_data' > > For gold, the symbol is bound to the one in a.out (the PLT entry), while for > lld it is bound to the one in libdump.so. > > -- Sean Silva > > On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> Hi, >> >> It looks like I have found a bug in LLD. Suppose DSO defines a global >> variable 'data' and initializes it by the address of function >> 'set_data' defined in the same DSO. If an executable file (linked by >> LLD) gets address of the '&set_data' function and compares it with a >> value stored in the 'data' variable it gets different result. If the >> executable is linked by BFD or Gold linker it gets the same result. >> >> Right now I do not have a time to investigate this problem further. I >> will plan to do that later. But maybe the reason of this problem is >> obvious to somebody? >> >> The reproduction script: >> >> % cat so.c >> void set_data(void *v) {} >> void *data = &set_data; >> >> % cat main.c >> int printf(const char *, ...); >> >> extern void *data; >> void set_data(void *v); >> >> int main(void) >> { >> printf("%p = %p\n", &set_data, data); >> } >> >> % clang -fPIC -shared so.c -o libdump.so >> % clang -c main.c >> >> % clang main.o -Wl,-rpath -Wl,. -L. -ldump >> % ./a.out >> 0x400600 = 0x400600 # The same addresses >> >> % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \ >> -m elf_x86_64 --hash-style=both \ >> -dynamic-linker /lib64/ld-linux-x86-64.so.2 \ >> /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \ >> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \ >> -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \ >> -L/usr/lib/x86_64-linux-gnu \ >> -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \ >> -L/usr/lib/x86_64-linux-gnu -L/usr/lib \ >> main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \ >> -lc -lgcc --as-needed -lgcc_s --no-as-needed \ >> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \ >> /usr/lib/x86_64-linux-gnu/crtn.o >> % ./a.out >> 0x11250 = 0x7f02915bd6b0 # garbage in 'data' >> >> -- >> Simon Atanasyan >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >
Simon Atanasyan via llvm-dev
2016-Feb-09 07:53 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
On Tue, Feb 9, 2016 at 1:52 AM, Sean Silva <chisophugis at gmail.com> wrote:> Sounds like it is related to this: > > http://www.airs.com/blog/archives/42Thanks for the explanation. For some reasons I was sure that this functionality has been implemented in LLD2. Probably I mixed up LLD and LLD2. It is interesting that as usual MIPS requires some additional work. In particular we need to set up STO_MIPS_PLT bit in the st_other filed of the symbol which points to PLT. "Function addresses" chapter in the https://sourceware.org/ml/binutils/2008-07/txt00000.txt -- Simon Atanasyan