James Courtier-Dutton via llvm-dev
2020-Mar-22 14:08 UTC
[llvm-dev] Possible bug in CLANG/LLVM
Hi,

Taking a sample program:

/* A very simple function to test memory stores. */
static int mem1 = 0x123;
int *test99() {
    return &mem1; // Return a 64bit pointer to the heap.
}

clang-10 -c -o test99.o -O1 test99.c
objdump -drt test99.o

test99.o:     file format elf64-x86-64

SYMBOL TABLE:
0000000000000000 l    df *ABS*  0000000000000000 test99.c
0000000000000000 l     O .data  0000000000000004 mem1
0000000000000000 l    d  .text  0000000000000000 .text
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000000 g     F .text  0000000000000006 test99

Disassembly of section .text:

0000000000000000 <test99>:
   0:   b8 00 00 00 00          mov    $0x0,%eax    <---32 bit HERE
                        1: R_X86_64_32  .data
   5:   c3                      retq

In this case, %eax is supposed to be a 64bit pointer on x86-64
While mov $0x0,%eax zero extends to fill %rax. Can there be situation
where, when the program is linked and loaded as part of a larger
program or dynamically loaded as a .so lib, the address of .data might
be > 32bits ?

What is it that forces the .data segment to be loaded < 32bits address ?

I see that the linux elf loader, will fail to load the program if the
address is >32 bits, and also compiling with -fPIC gets round the
problem to an extent.

Kind Regards

James
James Courtier-Dutton via llvm-dev <llvm-dev at lists.llvm.org> writes:

> Disassembly of section .text:
> 0000000000000000 <test99>:
>    0:   b8 00 00 00 00          mov    $0x0,%eax    <---32 bit HERE
>                         1: R_X86_64_32  .data
>    5:   c3                      retq
>
> In this case, %eax is supposed to be a 64bit pointer on x86-64
> While mov $0x0,%eax zero extends to fill %rax. Can there be situation
> where, when the program is linked and loaded as part of a larger
> program or dynamically loaded as a .so lib, the address of .data might
> be > 32bits ?
>
> What is it that forces the .data segment to be loaded < 32bits address ?

Nothing. By default clang compiles for the x86_64 small memory model,
which on SysV systems means symbol addresses are assumed to fit in 32
bits (it's a bit more complicated than that, see the SysV ABI document
for details).

> I see that the linux elf loader, will fail to load the program if the
> address is >32 bits, and also compiling with -fPIC gets round the
> problem to an extent.

If you try to link the object and pointers exceed the small memory
model assumptions you will get messages about truncated relocations.
The executable will link but will probably not run correctly.

You can force the large memory model with -mcmodel=large:

clang -mcmodel=large -c -O1 test.c
objdump -d test.o

0000000000000000 <foo>:
   0:   48 b8 00 00 00 00 00    movabs $0x0,%rax
   7:   00 00 00
   a:   c3                      retq

The medium memory model is a compromise between the small and large
models. "small" objects are assumed to have addresses that fit in 32
bits while "large" objects can have addresses that need more than 32
bits.

The reason -fPIC works is that -fPIC makes the compiler use
position-independent addressing. A local symbol such as this static
variable is reached with a RIP-relative lea, as in the output below.
Symbols that may be preempted go through the GOT, whose entries are
full 64-bit addresses.

clang -fPIC -c -O1 test.c
objdump -d test.o

0000000000000000 <foo>:
   0:   48 8d 05 00 00 00 00    lea    0x0(%rip),%rax        # 7 <foo+0x7>
   7:   c3                      retq

There is also -fpic, which assumes the GOT itself doesn't exceed some
specified size. For x86_64 there is no such limit so the two options
are equivalent on that platform.

-David
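
As a footnote to the memory-model discussion above, here is a minimal C
sketch of the range checks that sit behind "truncated relocation"
messages. It assumes the SysV x86-64 ABI rule that an R_X86_64_32 field
must zero-extend back to the original 64-bit value and an R_X86_64_32S
field must sign-extend back to it. The function names are hypothetical
and are not part of any linker's API, and the addresses are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: would `value` survive an R_X86_64_32 relocation?
   The 32-bit field is zero-extended, as with `mov $imm32,%eax`. */
bool fits_r_x86_64_32(uint64_t value) {
    return value <= UINT32_MAX;
}

/* Hypothetical helper: would `value` survive an R_X86_64_32S relocation?
   The 32-bit field is sign-extended, so only the lowest 2 GiB and the
   highest 2 GiB of the 64-bit address space are representable. */
bool fits_r_x86_64_32s(uint64_t value) {
    return value <= 0x7fffffffULL || value >= 0xffffffff80000000ULL;
}

/* Illustrative addresses only, not taken from a real link. */
void check_examples(void) {
    assert(fits_r_x86_64_32(0x00404028ULL));    /* low address: fits */
    assert(!fits_r_x86_64_32(0x100000000ULL));  /* needs 33 bits: truncated */
    assert(fits_r_x86_64_32s(0x7fffffffULL));   /* top of the low 2 GiB */
    assert(!fits_r_x86_64_32s(0x80000000ULL));  /* would sign-extend wrongly */
}
```

Calling check_examples() from a main() exercises both helpers. Note that
0x80000000 passes the zero-extending check but fails the sign-extending
one, which is one reason the small model is usually described in terms
of the low 2 GiB rather than the full 4 GiB.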