Oleg Ranevskyy via llvm-dev
2015-Sep-18 16:10 UTC
[llvm-dev] Fwd: Skipping names of temporary symbols increased size of ARM binaries.
CC llvm-dev

---------- Forwarded message ----------

Hello Duncan,

The size of ARM binaries created by clang has increased after r236642. Would you be able to find some time to look at my findings and share your thoughts about the problem, please?

r236642 stops temporary label names from being emitted into object files to save memory. This is fine in itself, since the label names do not appear in the resulting binaries. However, it creates a problem for the binutils linker, which analyzes symbol names from the input object files and can decide to skip some local compiler-generated labels. The linker no longer sees the label names and therefore puts all of these symbols into the final binary. I will demonstrate this with an example.

If we compile the attached main.cpp file for ARM

    clang++ -c -o main.o -O0 -g --target=armv7l-linux-gnueabihf main.cpp

and then look at the symbols

    readelf -s main.o

there will be a number of similar entries (showing only one entry here for conciseness):

    Num:    Value  Size Type    Bind   Vis      Ndx Name
      7: 00000062     0 NOTYPE  LOCAL  DEFAULT    9

These are the .Linfo_string<index> symbols whose names are skipped due to r236642.

If we now link it

    clang++ -o main.out --target=armv7l-linux-gnueabihf main.o

all the symbols get through to the final binary:

    readelf -s main.out

    Num:    Value  Size Type    Bind   Vis      Ndx Name
     73: 0000006e     0 NOTYPE  LOCAL  DEFAULT   32

Without the names, the linker cannot tell that these labels are local ones that should be left out. Its bfd_is_local_label_name function returns false and the labels are not skipped.

Before r236642 the names were inserted into object files:

    readelf -s main.o

    Num:    Value  Size Type    Bind   Vis      Ndx Name
     23: 0000007a     0 NOTYPE  LOCAL  DEFAULT    9 .Linfo_string7

bfd_is_local_label_name returns true if the name starts with ".L", and the symbol is skipped.

This is not critical for small projects but can create noticeable overhead for big ones.

Any help will be much appreciated.
Thank you.
Oleg

-------------- next part --------------

    int get_index(int max_value)
    {
        static unsigned long x = 123456789;
        x ^= x << 16;
        return x % max_value;
    }

    int main()
    {
        int values[10] = {7};
        values[get_index(10)] = 5;
        return 0;
    }
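The name test described in the message above can be sketched roughly as follows. This is only an illustration of the ".L" prefix rule as stated in the thread, not the actual binutils code: the real bfd_is_local_label_name is target-specific and may accept other prefixes. The function names looks_like_local_label and check_thread_examples are hypothetical.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the linker's name test described in the thread.
// Assumption: only the ".L" prefix rule is modeled here.
bool looks_like_local_label(const std::string& name) {
    return name.size() >= 2 && name[0] == '.' && name[1] == 'L';
}

// Illustrative check using the two cases from the thread.
void check_thread_examples() {
    // Name present, as before r236642: the symbol can be recognized and skipped.
    assert(looks_like_local_label(".Linfo_string7"));
    // Name omitted, as after r236642: the test fails and the symbol is kept.
    assert(!looks_like_local_label(""));
}
```

With an empty name there is nothing for a prefix test to match, which is consistent with the unnamed symbols ending up in main.out.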
Duncan P. N. Exon Smith via llvm-dev
2015-Sep-18 23:16 UTC
[llvm-dev] Skipping names of temporary symbols increased size of ARM binaries.
+rafael and pete, who worked on this with me, and a couple of debug info folks.

> On 2015-Sep-18, at 09:10, Oleg Ranevskyy <llvm.mail.list at gmail.com> wrote:
>
> Before r236642 the names were inserted into object files:
> readelf -s main.o
>
> Num: Value Size Type Bind Vis Ndx Name
> 23: 0000007a 0 NOTYPE LOCAL DEFAULT 9 .Linfo_string7
>
> bfd_is_local_label_name returns true if the name starts with ".L" and the symbol is skipped.
>
> This is not critical for small projects but can create noticeable overhead for the big ones.

What's the linker really doing here? Is this some form of GC, or is it trying to strip out debug info, or...?

I'm surprised it won't drop symbols that have no names if it's dropping local symbols. Could this be an oversight in the linker?
Rafael Espíndola via llvm-dev
2015-Sep-19 19:02 UTC
[llvm-dev] Fwd: Skipping names of temporary symbols increased size of ARM binaries.
Honestly this looks like a bug in the linker.

Do they still show up with --discard-locals and/or --discard-all?
Rafael Espíndola via llvm-dev
2015-Sep-19 19:06 UTC
[llvm-dev] Skipping names of temporary symbols increased size of ARM binaries.
> What's the linker really doing here? Is this some form of GC, or is it trying to strip out debug info, or...?

ELF files can have two symbol tables. The dynamic symbol table is the only one that is required for execution. There can also be a regular symbol table that includes more symbols as a convenience for users.

There are various options controlling which symbols are kept in the regular symbol table. I think the default is to keep all but .L symbols in SHF_MERGE sections.

Cheers,
Rafael
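The options mentioned above can be modeled with a small sketch. It assumes the default behaviour is as Rafael recalls it ("I think"), that --discard-locals drops temporary locals identified by the ".L" prefix, and that --discard-all drops every local symbol. The types DiscardMode and LocalSymbol and the function keeps_local_symbol are hypothetical names for illustration, not linker internals.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the linker modes discussed in the thread.
enum class DiscardMode { Default, DiscardLocals, DiscardAll };

struct LocalSymbol {
    std::string name;       // empty when the name was omitted from the object file
    bool in_merge_section;  // true if defined in an SHF_MERGE section
};

// Assumption: temporary symbols are recognized purely by the ".L" prefix.
bool has_temp_prefix(const std::string& name) {
    return name.size() >= 2 && name[0] == '.' && name[1] == 'L';
}

// Returns true if the local symbol stays in the regular symbol table.
bool keeps_local_symbol(const LocalSymbol& sym, DiscardMode mode) {
    switch (mode) {
    case DiscardMode::DiscardAll:
        return false;  // every local symbol is dropped
    case DiscardMode::DiscardLocals:
        return !has_temp_prefix(sym.name);  // temporary locals are dropped
    case DiscardMode::Default:
        // Assumed default: drop ".L" symbols that live in SHF_MERGE sections.
        return !(has_temp_prefix(sym.name) && sym.in_merge_section);
    }
    return true;  // not reached; keeps compilers quiet about missing returns
}
```

Under these assumptions a symbol with an empty name is kept in the default and --discard-locals modes, because both rely on the name, and is removed only by --discard-all.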
Oleg Ranevskyy via llvm-dev
2015-Sep-21 14:28 UTC
[llvm-dev] Fwd: Skipping names of temporary symbols increased size of ARM binaries.
Thank you for looking into this.

--discard-locals produces the same set of symbols, so the unnamed ones still show up.
--discard-all removes the unnamed .L symbols as well as other local symbols.
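To compare the effect of different linker flags, one could count the unnamed local entries in the text printed by readelf -s. The sketch below assumes the column layout shown earlier in the thread (Num, Value, Size, Type, Bind, Vis, Ndx, Name) and treats a row with only seven fields as having an empty name. The function names is_unnamed_local_row and count_unnamed_locals are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Returns true if one row of `readelf -s` output describes a LOCAL NOTYPE
// symbol with an empty Name column that is defined in a section (Ndx != UND).
bool is_unnamed_local_row(const std::string& row) {
    std::istringstream in(row);
    std::vector<std::string> fields;
    for (std::string f; in >> f;) fields.push_back(f);
    // Expected columns: Num: Value Size Type Bind Vis Ndx [Name]
    if (fields.size() != 7) return false;  // an eighth field would be the name
    if (fields[0].back() != ':') return false;  // first field must look like "7:"
    return fields[3] == "NOTYPE" && fields[4] == "LOCAL" && fields[6] != "UND";
}

// Counts such rows in the full text of a `readelf -s` dump.
int count_unnamed_locals(const std::string& readelf_output) {
    std::istringstream in(readelf_output);
    int count = 0;
    for (std::string line; std::getline(in, line);) {
        if (is_unnamed_local_row(line)) ++count;
    }
    return count;
}
```

Running the count on the dumps produced with and without --discard-all would give a rough measure of how many unnamed symbols each link keeps. The undefined null symbol at index 0 is excluded by the UND check.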
Oleg Ranevskyy via llvm-dev
2015-Sep-29 20:39 UTC
[llvm-dev] Fwd: Skipping names of temporary symbols increased size of ARM binaries.
Hi Rafael,

Could you advise what might be done next about this, please? Would it be reasonable to discuss the issue on the binutils mailing list?

Thank you.