So I just got lua to link, run, and work on x86-64 Linux with musl
and lld. It did require one change to hack around incorrect handling
of ELF weak aliases.

In musl __stdio_exit.c
<http://git.musl-libc.org/cgit/musl/tree/src/stdio/__stdio_exit.c> we
have:

    static FILE *const dummy_file = 0;
    weak_alias(dummy_file, __stdin_used);
    weak_alias(dummy_file, __stdout_used);
    weak_alias(dummy_file, __stderr_used);

weak_alias(old, new) is defined as:

    extern __typeof(old) new __attribute__((weak, alias(#old)))

This generates the following object file:

    mspencer@mspencer-vm:~/Projects/test$ objdump -st ../musl/src/stdio/__stdio_exit.o

    ../musl/src/stdio/__stdio_exit.o:     file format elf64-x86-64

    SYMBOL TABLE:
    0000000000000000 l    df *ABS*            0000000000000000 src/stdio/__stdio_exit.c
    0000000000000044 l     F .text            0000000000000049 close_file
    0000000000000000 l     O .rodata          0000000000000008 dummy_file
    0000000000000000 l    d  .text            0000000000000000 .text
    0000000000000000 l    d  .data            0000000000000000 .data
    0000000000000000 l    d  .bss             0000000000000000 .bss
    0000000000000000 l    d  .rodata          0000000000000000 .rodata
    0000000000000000 l    d  .note.GNU-stack  0000000000000000 .note.GNU-stack
    0000000000000000  w    O .rodata          0000000000000008 __stderr_used
    0000000000000000  w    O .rodata          0000000000000008 __stdin_used
    0000000000000000 g     F .text            0000000000000044 __stdio_exit
    0000000000000000  w    O .rodata          0000000000000008 __stdout_used
    0000000000000000         *UND*            0000000000000000 __libc
    0000000000000000         *UND*            0000000000000000 __lock
    0000000000000000         *UND*            0000000000000000 __lockfile

    Contents of section .text:
     0000 53833d00 00000000 740abf00 000000e8  S.=.....t.......
     0010 00000000 488b1d00 000000eb 0c4889df  ....H........H..
     0020 e81f0000 00488b5b 704885db 75ef488b  .....H.[pH..u.H.
     0030 3d000000 00e80a00 0000488b 3d000000  =.........H.=...
     0040 005beb00 534889fb 4885db74 3e83bb8c  .[..SH..H..t>...
     0050 00000000 78084889 dfe80000 0000488b  ....x.H.......H.
     0060 4328483b 4338760a 4889df31 f631d2ff  C(H;C8v.H..1.1..
     0070 5348488b 7308482b 7310730f 488b4350  SHH.s.H+s.s.H.CP
     0080 4889dfba 01000000 5bffe05b c3        H.......[..[.
    Contents of section .rodata:
     0000 00000000 00000000

Note that __stdout_used is the last symbol in the .rodata section.
This means that the reader assigns the data (8 bytes of 0) to
__stdout_used. Because dummy_file and the other __stdx_used symbols
come before it, they end up in the right place in the final file.

This works great until another object file provides a definition of
__stdout_used. The weak definition then gets removed entirely, and the
content backing the other __stdx_used symbols is removed along with it.

I fixed this by adding weak_alias(dummy_file,
__zinurfilestealinurdata); to __stdio_exit.c, which allocated the 8
bytes to __zinurfilestealinurdata.

Another way to fix this is to have the reader assign all the data to
the non-weak symbol (dummy_file in this case) when multiple symbols
share the same location. However, this fails to work if you have a
weak symbol pointing into the middle of a non-weak symbol's data. In
that case we actually need to move the data over to the non-weak
symbol (or create an anonymous local symbol to hold the data).
However, this only needs to happen in specific cases.

- Michael Spencer
How are you modeling weak aliases in Atoms?

Mach-O does not support weak aliases. My mental model of a weak alias is:

If foo is a weak alias for bar, then if nothing else defines bar, use foo in place of bar.

-Nick

On Jan 8, 2013, at 4:50 PM, Michael Spencer wrote:
> So I just got lua to link and run and work on x86-64 Linux with musl
> and lld. It did require one change to hack around incorrect handling
> of ELF weak aliases.
Hi Michael,

Does ELF support aliasing? How is the relationship captured in the ELF
symbol table, that one symbol is an alias of another symbol?

> Note that __stdout_used is the last symbol in the .rodata section.
> This means that the reader assigns the data (8 bytes of 0) to
> __stdout_used. Because dummy_file and the other __stdx_used symbols
> come before it, they end up in the right place in the final file.

Did you change the Reader too? The Reader does not allocate any space
for __stdout_used. The size of the current symbol = (value of next
symbol - value of current symbol). In this case it's zero.

>
> This works great until another object file provides a definition of
> __stdout_used. The weak definition of it gets totally removed, meaning
> so does the content for the other __stdx_used symbols.

When the other object provides a definition for __stdout_used, the atom
takes on the properties of the object that defines it, and the ordinal
as well, right? I couldn't follow how the others moved.

This is what I see with binutils/ld:

    $ cat 1.c
    #include "stdio_impl.h"

    static FILE *const dummy_file = 0;
    weak_alias(dummy_file, __stdin_used);
    weak_alias(dummy_file, __stdout_used);
    weak_alias(dummy_file, __stderr_used);

    $ cat 2.c
    int __stdout_used = 10;

    $ readelf -s 1.o | grep -E 'used|dummy_file'
         6: 0000000000000000     8 OBJECT  LOCAL  DEFAULT    4 dummy_file
         9: 0000000000000000     8 OBJECT  WEAK   DEFAULT    4 __stdin_used
        10: 0000000000000000     8 OBJECT  WEAK   DEFAULT    4 __stdout_used
        11: 0000000000000000     8 OBJECT  WEAK   DEFAULT    4 __stderr_used
    $ readelf -s 2.o | grep -E 'used|dummy_file'
         7: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    2 __stdout_used
    $ ld 1.o 2.o
    ld: warning: cannot find entry symbol _start; defaulting to 00000000004000e8
    $ readelf -s a.out | grep -E 'used|dummy_file'
         5: 00000000004000e8     8 OBJECT  LOCAL  DEFAULT    1 dummy_file
         7: 00000000006000f0     4 OBJECT  GLOBAL DEFAULT    2 __stdout_used
         8: 00000000004000e8     8 OBJECT  WEAK   DEFAULT    1 __stdin_used
        13: 00000000004000e8     8 OBJECT  WEAK   DEFAULT    1 __stderr_used

Thanks

Shankar Easwaran
On Tue, Jan 8, 2013 at 6:01 PM, Nick Kledzik <kledzik at apple.com> wrote:
> How are you modeling weak aliases in Atoms?
>
> mach-o does not support weak aliases. My mental model of a weak alias is:
>
> If foo is a weak alias for bar, then if nothing else defines bar, use foo in place of bar.
>
> -Nick

ELF doesn't have any specific concept of aliases. The compiler just
assigns multiple symbols to the same address. The reader just creates
a mergeAsWeak atom, with the last symbol in the symbol table at that
address getting the content.

- Michael Spencer
On Tue, Jan 8, 2013 at 8:56 PM, <shankare at codeaurora.org> wrote:
> Hi Michael,
>
> Does ELF support aliasing?
>
> How is the relationship captured in the ELF symbol table, that one symbol is
> an alias of another symbol?

It is not explicitly captured. It's an implicit relationship due to
the symbols having the same address.

>
>> Note that __stdout_used is the last symbol in the .rodata section.
>> This means that the reader assigns the data (8 bytes of 0) to
>> __stdout_used. Because dummy_file and the other __stdx_used symbols
>> come before it, they end up in the right place in the final file.
>
> Did you change the Reader too?

No. I just made another symbol to steal the actual content.

>
> The Reader does not allocate any space for __stdout_used. The size of the
> current symbol = (value of next symbol - value of current symbol). In this
> case it's zero.

__stdout_used is the last symbol at that address, so it gets the data.
The hack was to make __stdout_used not get the data.

>
>>
>> This works great until another object file provides a definition of
>> __stdout_used. The weak definition of it gets totally removed, meaning
>> so does the content for the other __stdx_used symbols.
>
> When the other object provides a definition for __stdout_used, the atom
> takes on the properties of the object that defines it, and the ordinal as
> well, right?
>
> I couldn't follow how the others moved.

I'm not quite sure what you mean here.

>
> This is what I see with binutils/ld:
>
> $ cat 1.c
> #include "stdio_impl.h"
>
> static FILE *const dummy_file = 0;
> weak_alias(dummy_file, __stdin_used);
> weak_alias(dummy_file, __stdout_used);
> weak_alias(dummy_file, __stderr_used);
>
> $ cat 2.c
> int __stdout_used = 10;
> $ readelf -s 1.o | grep -E 'used|dummy_file'
>      6: 0000000000000000     8 OBJECT  LOCAL  DEFAULT    4 dummy_file
>      9: 0000000000000000     8 OBJECT  WEAK   DEFAULT    4 __stdin_used
>     10: 0000000000000000     8 OBJECT  WEAK   DEFAULT    4 __stdout_used
>     11: 0000000000000000     8 OBJECT  WEAK   DEFAULT    4 __stderr_used
> $ readelf -s 2.o | grep -E 'used|dummy_file'
>      7: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    2 __stdout_used
> $ ld 1.o 2.o
> ld: warning: cannot find entry symbol _start; defaulting to 00000000004000e8
> $ readelf -s a.out | grep -E 'used|dummy_file'
>      5: 00000000004000e8     8 OBJECT  LOCAL  DEFAULT    1 dummy_file
>      7: 00000000006000f0     4 OBJECT  GLOBAL DEFAULT    2 __stdout_used
>      8: 00000000004000e8     8 OBJECT  WEAK   DEFAULT    1 __stdin_used
>     13: 00000000004000e8     8 OBJECT  WEAK   DEFAULT    1 __stderr_used
>
> Thanks
>
> Shankar Easwaran
>

Yes, which is what we want. Currently, with lld, dummy_file,
__stdin_used, and __stderr_used all come out with size 0.

- Michael Spencer