Rafael Espíndola via llvm-dev
2016-Feb-08 23:14 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
Yes, I had just reduced it too. I am pretty sure this was implemented at some point. Taking a look. Cheers, Rafael On 8 February 2016 at 17:52, Sean Silva <chisophugis at gmail.com> wrote:> Sounds like it is related to this: > > http://www.airs.com/blog/archives/42 > """ > > The fact that C permits taking the address of a function introduces an > interesting wrinkle. In C you are permitted to take the address of a > function, and you are permitted to compare that address to another function > address. The problem is that if you take the address of a function in a > shared library, the natural result would be to get the address of the PLT > entry. After all, that is address to which a call to the function will jump. > However, each shared library has its own PLT, and thus the address of a > particular function would differ in each shared library. That means that > comparisons of function pointers generated in different shraed libraries may > be different when they should be the same. This is not a purely hypothetical > problem; when I did a port which got it wrong, before I fixed the bug I saw > failures in the Tcl shared library when it compared function pointers. > > The fix for this bug on most processors is a special marking for a symbol > which has a PLT entry but is not defined. Typically the symbol will be > marked as undefined, but with a non-zero value–the value will be set to the > address of the PLT entry. When the dynamic linker is searching for the value > of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such > a specially marked symbol, it will use the non-zero value. This will ensure > that all references to the symbol which are not function calls will use the > same value. To make this work, the compiler and assembler must make sure > that any reference to a function which does not involve calling it will not > carry a standard PLT reloc. This special handling of function addresses > needs to be implemented in both the program linker and the dynamic linker. > > """ > > > Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables > from gold and lld, I see: > > > --- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800 > +++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800 > > Symbol { > - Name: set_data@ (142) > - Value: 0x400560 > - Size: 0 > + Name: set_data@ (46) > + Value: 0x0 > + Size: 10 > Binding: Global (0x1) > Type: Function (0x2) > Other: 0 > Section: Undefined (0x0) > } > > You can also see this in LD_DEBUG=all when running the executables (to avoid > extraneous diffs, both executables are called "./a.out.lld"; look at the > diff header to know which is output from the gold executable vs lld > executable): > > --- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800 > +++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800 > relocation processing: ./libdump.so (lazy) > symbol=set_data; lookup in file=./a.out.lld [0] > - binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol > `set_data' > + symbol=set_data; lookup in file=./libdump.so [0] > + binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol > `set_data' > > For gold, the symbol is bound to the one in a.out (the PLT entry), while for > lld it is bound to the one in libdump.so. > > -- Sean Silva > > On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev > <llvm-dev at lists.llvm.org> wrote: >> >> Hi, >> >> It looks like I have found a bug in LLD. Suppose DSO defines a global >> variable 'data' and initializes it by the address of function >> 'set_data' defined in the same DSO. If an executable file (linked by >> LLD) gets address of the '&set_data' function and compares it with a >> value stored in the 'data' variable it gets different result. If the >> executable is linked by BFD or Gold linker it gets the same result. >> >> Right now I do not have a time to investigate this problem further. I >> will plan to do that later. But maybe the reason of this problem is >> obvious to somebody? >> >> The reproduction script: >> >> % cat so.c >> void set_data(void *v) {} >> void *data = &set_data; >> >> % cat main.c >> int printf(const char *, ...); >> >> extern void *data; >> void set_data(void *v); >> >> int main(void) >> { >> printf("%p = %p\n", &set_data, data); >> } >> >> % clang -fPIC -shared so.c -o libdump.so >> % clang -c main.c >> >> % clang main.o -Wl,-rpath -Wl,. -L. -ldump >> % ./a.out >> 0x400600 = 0x400600 # The same addresses >> >> % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \ >> -m elf_x86_64 --hash-style=both \ >> -dynamic-linker /lib64/ld-linux-x86-64.so.2 \ >> /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \ >> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \ >> -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \ >> -L/usr/lib/x86_64-linux-gnu \ >> -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \ >> -L/usr/lib/x86_64-linux-gnu -L/usr/lib \ >> main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \ >> -lc -lgcc --as-needed -lgcc_s --no-as-needed \ >> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \ >> /usr/lib/x86_64-linux-gnu/crtn.o >> % ./a.out >> 0x11250 = 0x7f02915bd6b0 # garbage in 'data' >> >> -- >> Simon Atanasyan >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >
Rafael Espíndola via llvm-dev
2016-Feb-08 23:39 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
No, it was not implemented, I just added a note: // The remaining (unimplemented) problem is making sure pointer equality // still works. We need the help of the dynamic linker for that. We // let it know that we have a direct reference to a so symbol by creating // an undefined symbol with a non zero st_value. Seeing that, the // dynamic linker resolves the symbol to the value of the symbol we created. // This is true even for got entries, so pointer equality is maintained. // To avoid an infinite loop, the only entry that points to the // real function is a dedicated got entry used by the plt. That is // identified by special relocation types (R_X86_64_JUMP_SLOT, // R_386_JMP_SLOT, etc). Taking a look. On 8 February 2016 at 18:14, Rafael Espíndola <rafael.espindola at gmail.com> wrote:> Yes, I had just reduced it too. > > I am pretty sure this was implemented at some point. Taking a look. > > Cheers, > Rafael > > > On 8 February 2016 at 17:52, Sean Silva <chisophugis at gmail.com> wrote: >> Sounds like it is related to this: >> >> http://www.airs.com/blog/archives/42 >> """ >> >> The fact that C permits taking the address of a function introduces an >> interesting wrinkle. In C you are permitted to take the address of a >> function, and you are permitted to compare that address to another function >> address. The problem is that if you take the address of a function in a >> shared library, the natural result would be to get the address of the PLT >> entry. After all, that is address to which a call to the function will jump. >> However, each shared library has its own PLT, and thus the address of a >> particular function would differ in each shared library. That means that >> comparisons of function pointers generated in different shraed libraries may >> be different when they should be the same. This is not a purely hypothetical >> problem; when I did a port which got it wrong, before I fixed the bug I saw >> failures in the Tcl shared library when it compared function pointers. >> >> The fix for this bug on most processors is a special marking for a symbol >> which has a PLT entry but is not defined. Typically the symbol will be >> marked as undefined, but with a non-zero value–the value will be set to the >> address of the PLT entry. When the dynamic linker is searching for the value >> of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such >> a specially marked symbol, it will use the non-zero value. This will ensure >> that all references to the symbol which are not function calls will use the >> same value. To make this work, the compiler and assembler must make sure >> that any reference to a function which does not involve calling it will not >> carry a standard PLT reloc. This special handling of function addresses >> needs to be implemented in both the program linker and the dynamic linker. >> >> """ >> >> >> Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables >> from gold and lld, I see: >> >> >> --- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800 >> +++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800 >> >> Symbol { >> - Name: set_data@ (142) >> - Value: 0x400560 >> - Size: 0 >> + Name: set_data@ (46) >> + Value: 0x0 >> + Size: 10 >> Binding: Global (0x1) >> Type: Function (0x2) >> Other: 0 >> Section: Undefined (0x0) >> } >> >> You can also see this in LD_DEBUG=all when running the executables (to avoid >> extraneous diffs, both executables are called "./a.out.lld"; look at the >> diff header to know which is output from the gold executable vs lld >> executable): >> >> --- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800 >> +++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800 >> relocation processing: ./libdump.so (lazy) >> symbol=set_data; lookup in file=./a.out.lld [0] >> - binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol >> `set_data' >> + symbol=set_data; lookup in file=./libdump.so [0] >> + binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol >> `set_data' >> >> For gold, the symbol is bound to the one in a.out (the PLT entry), while for >> lld it is bound to the one in libdump.so. >> >> -- Sean Silva >> >> On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >>> >>> Hi, >>> >>> It looks like I have found a bug in LLD. Suppose DSO defines a global >>> variable 'data' and initializes it by the address of function >>> 'set_data' defined in the same DSO. If an executable file (linked by >>> LLD) gets address of the '&set_data' function and compares it with a >>> value stored in the 'data' variable it gets different result. If the >>> executable is linked by BFD or Gold linker it gets the same result. >>> >>> Right now I do not have a time to investigate this problem further. I >>> will plan to do that later. But maybe the reason of this problem is >>> obvious to somebody? >>> >>> The reproduction script: >>> >>> % cat so.c >>> void set_data(void *v) {} >>> void *data = &set_data; >>> >>> % cat main.c >>> int printf(const char *, ...); >>> >>> extern void *data; >>> void set_data(void *v); >>> >>> int main(void) >>> { >>> printf("%p = %p\n", &set_data, data); >>> } >>> >>> % clang -fPIC -shared so.c -o libdump.so >>> % clang -c main.c >>> >>> % clang main.o -Wl,-rpath -Wl,. -L. -ldump >>> % ./a.out >>> 0x400600 = 0x400600 # The same addresses >>> >>> % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \ >>> -m elf_x86_64 --hash-style=both \ >>> -dynamic-linker /lib64/ld-linux-x86-64.so.2 \ >>> /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \ >>> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \ >>> -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \ >>> -L/usr/lib/x86_64-linux-gnu \ >>> -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \ >>> -L/usr/lib/x86_64-linux-gnu -L/usr/lib \ >>> main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \ >>> -lc -lgcc --as-needed -lgcc_s --no-as-needed \ >>> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \ >>> /usr/lib/x86_64-linux-gnu/crtn.o >>> % ./a.out >>> 0x11250 = 0x7f02915bd6b0 # garbage in 'data' >>> >>> -- >>> Simon Atanasyan >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >>
Rafael Espíndola via llvm-dev
2016-Feb-09 00:22 UTC
[llvm-dev] [LLD] Incorrect comparision of pointers to function defined in DSO
http://reviews.llvm.org/D17007 Cheers, Rafael On 8 February 2016 at 18:39, Rafael Espíndola <rafael.espindola at gmail.com> wrote:> No, it was not implemented, I just added a note: > > // The remaining (unimplemented) problem is making sure pointer > equality > // still works. We need the help of the dynamic linker for that. > We > // let it know that we have a direct reference to a so symbol by > creating > // an undefined symbol with a non zero st_value. Seeing that, the > // dynamic linker resolves the symbol to the value of the symbol > we created. > // This is true even for got entries, so pointer equality is > maintained. > // To avoid an infinite loop, the only entry that points to the > // real function is a dedicated got entry used by the plt. That is > // identified by special relocation types (R_X86_64_JUMP_SLOT, > // R_386_JMP_SLOT, etc). > > Taking a look. > > > > On 8 February 2016 at 18:14, Rafael Espíndola > <rafael.espindola at gmail.com> wrote: >> Yes, I had just reduced it too. >> >> I am pretty sure this was implemented at some point. Taking a look. >> >> Cheers, >> Rafael >> >> >> On 8 February 2016 at 17:52, Sean Silva <chisophugis at gmail.com> wrote: >>> Sounds like it is related to this: >>> >>> http://www.airs.com/blog/archives/42 >>> """ >>> >>> The fact that C permits taking the address of a function introduces an >>> interesting wrinkle. In C you are permitted to take the address of a >>> function, and you are permitted to compare that address to another function >>> address. The problem is that if you take the address of a function in a >>> shared library, the natural result would be to get the address of the PLT >>> entry. After all, that is address to which a call to the function will jump. >>> However, each shared library has its own PLT, and thus the address of a >>> particular function would differ in each shared library. That means that >>> comparisons of function pointers generated in different shraed libraries may >>> be different when they should be the same. This is not a purely hypothetical >>> problem; when I did a port which got it wrong, before I fixed the bug I saw >>> failures in the Tcl shared library when it compared function pointers. >>> >>> The fix for this bug on most processors is a special marking for a symbol >>> which has a PLT entry but is not defined. Typically the symbol will be >>> marked as undefined, but with a non-zero value–the value will be set to the >>> address of the PLT entry. When the dynamic linker is searching for the value >>> of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such >>> a specially marked symbol, it will use the non-zero value. This will ensure >>> that all references to the symbol which are not function calls will use the >>> same value. To make this work, the compiler and assembler must make sure >>> that any reference to a function which does not involve calling it will not >>> carry a standard PLT reloc. This special handling of function addresses >>> needs to be implemented in both the program linker and the dynamic linker. >>> >>> """ >>> >>> >>> Indeed, comparing the `llvm-readobj -dyn-symbols` output on the executables >>> from gold and lld, I see: >>> >>> >>> --- a.out.gold.readobj 2016-02-08 14:08:52.678160575 -0800 >>> +++ a.out.lld.readobj 2016-02-08 14:08:52.678160575 -0800 >>> >>> Symbol { >>> - Name: set_data@ (142) >>> - Value: 0x400560 >>> - Size: 0 >>> + Name: set_data@ (46) >>> + Value: 0x0 >>> + Size: 10 >>> Binding: Global (0x1) >>> Type: Function (0x2) >>> Other: 0 >>> Section: Undefined (0x0) >>> } >>> >>> You can also see this in LD_DEBUG=all when running the executables (to avoid >>> extraneous diffs, both executables are called "./a.out.lld"; look at the >>> diff header to know which is output from the gold executable vs lld >>> executable): >>> >>> --- ld_debug-a.out.gold 2016-02-08 14:07:27.255734743 -0800 >>> +++ ld_debug-a.out.lld 2016-02-08 14:07:27.255734743 -0800 >>> relocation processing: ./libdump.so (lazy) >>> symbol=set_data; lookup in file=./a.out.lld [0] >>> - binding file ./libdump.so [0] to ./a.out.lld [0]: normal symbol >>> `set_data' >>> + symbol=set_data; lookup in file=./libdump.so [0] >>> + binding file ./libdump.so [0] to ./libdump.so [0]: normal symbol >>> `set_data' >>> >>> For gold, the symbol is bound to the one in a.out (the PLT entry), while for >>> lld it is bound to the one in libdump.so. >>> >>> -- Sean Silva >>> >>> On Mon, Feb 8, 2016 at 7:55 AM, Simon Atanasyan via llvm-dev >>> <llvm-dev at lists.llvm.org> wrote: >>>> >>>> Hi, >>>> >>>> It looks like I have found a bug in LLD. Suppose DSO defines a global >>>> variable 'data' and initializes it by the address of function >>>> 'set_data' defined in the same DSO. If an executable file (linked by >>>> LLD) gets address of the '&set_data' function and compares it with a >>>> value stored in the 'data' variable it gets different result. If the >>>> executable is linked by BFD or Gold linker it gets the same result. >>>> >>>> Right now I do not have a time to investigate this problem further. I >>>> will plan to do that later. But maybe the reason of this problem is >>>> obvious to somebody? >>>> >>>> The reproduction script: >>>> >>>> % cat so.c >>>> void set_data(void *v) {} >>>> void *data = &set_data; >>>> >>>> % cat main.c >>>> int printf(const char *, ...); >>>> >>>> extern void *data; >>>> void set_data(void *v); >>>> >>>> int main(void) >>>> { >>>> printf("%p = %p\n", &set_data, data); >>>> } >>>> >>>> % clang -fPIC -shared so.c -o libdump.so >>>> % clang -c main.c >>>> >>>> % clang main.o -Wl,-rpath -Wl,. -L. -ldump >>>> % ./a.out >>>> 0x400600 = 0x400600 # The same addresses >>>> >>>> % lld -flavor gnu --sysroot=/ --build-id --no-add-needed --eh-frame-hdr \ >>>> -m elf_x86_64 --hash-style=both \ >>>> -dynamic-linker /lib64/ld-linux-x86-64.so.2 \ >>>> /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o \ >>>> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbegin.o \ >>>> -L. -L/usr/lib/gcc/x86_64-linux-gnu/4.7 \ >>>> -L/usr/lib/x86_64-linux-gnu \ >>>> -L/usr/lib -L/lib/x86_64-linux-gnu -L/lib \ >>>> -L/usr/lib/x86_64-linux-gnu -L/usr/lib \ >>>> main.o -rpath . -ldump -lgcc --as-needed -lgcc_s --no-as-needed \ >>>> -lc -lgcc --as-needed -lgcc_s --no-as-needed \ >>>> /usr/lib/gcc/x86_64-linux-gnu/4.7/crtend.o \ >>>> /usr/lib/x86_64-linux-gnu/crtn.o >>>> % ./a.out >>>> 0x11250 = 0x7f02915bd6b0 # garbage in 'data' >>>> >>>> -- >>>> Simon Atanasyan >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> llvm-dev at lists.llvm.org >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>>