Akira Hatanaka
2014-Aug-21  23:32 UTC
[LLVMdev] How to tell whether a GlobalValue is user-defined
Is there a way to distinguish between GlobalValues that are user-defined
and those that are compiler-defined? I am looking for a function that I can
use to tell if a GlobalValue is user-defined, something like
"GlobalValue::isUserDefined", which returns true for a user-defined
GlobalValue.
I'm trying to make changes to prevent llvm from placing user-defined
constant arrays in the mergeable constant sections. Currently, clang
places 16-byte constant arrays that are marked "unnamed_addr" into
__literal16 for Mach-O (see the following example).
$ cat test1.c
static const int s_dashArraysSize1[4] = {2, 2, 4, 6};
int foo1(int a) {
  return s_dashArraysSize1[a];
}
$ clang test1.c -S -O3 -o - | tail -n 10
.section __TEXT,__literal16,16byte_literals
.align 4                       ## @s_dashArraysSize1
_s_dashArraysSize1:
.long 2                       ## 0x2
.long 2                       ## 0x2
.long 4                       ## 0x4
.long 6                       ## 0x6
This is not desirable because the Mach-O linker wasn't originally designed
to handle user-defined symbols in those sections, and having to handle them
complicates the linker. Also, there is no benefit in doing so, since the
linker currently doesn't try to merge user-defined variables anyway.
Bob Wilson
2014-Aug-22  00:48 UTC
[LLVMdev] How to tell whether a GlobalValue is user-defined
> On Aug 21, 2014, at 4:32 PM, Akira Hatanaka <ahatanak at gmail.com> wrote:
>
> This is not desirable because macho linker wasn't originally designed to
> handle user-defined symbols in those sections and having to handle them
> complicates the linker. Also, there is no benefit in doing so, since the
> linker currently doesn't try to merge user-defined variables anyway.

I would also appreciate opinions on whether this issue is relevant for other platforms. In general, should we put user-defined symbols into literal sections?

In Akira’s example above, GlobalOpt checks that the variable does not have its address taken and marks it unnamed_addr. Even if that is a legal optimization, it may cause problems, e.g., for debugging, if the linker removes the symbol. I don’t know whether other linkers will keep all the symbols in literal sections or not. I think you could also make a reasonable argument that we don’t guarantee that the variable will remain visible when debugging optimized code.
Rafael Espíndola
2014-Aug-25  15:26 UTC
[LLVMdev] How to tell whether a GlobalValue is user-defined
On 21 August 2014 19:32, Akira Hatanaka <ahatanak at gmail.com> wrote:
> This is not desirable because macho linker wasn't originally designed to
> handle user-defined symbols in those sections and having to handle them
> complicates the linker. Also, there is no benefit in doing so, since the
> linker currently doesn't try to merge user-defined variables anyway.

What does "user-defined" mean here? Since the linker is involved, I assume it has something to do with the final symbol name.

At the linker level (symbol names, sections, atoms, relocations, etc.), what exactly is not supported?

Cheers,
Rafael
Nick Kledzik
2014-Aug-25  16:54 UTC
[LLVMdev] How to tell whether a GlobalValue is user-defined
On Aug 25, 2014, at 8:26 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> What does "user-defined" mean here? Since the linker is
> involved, I assume it has something to do with the final symbol name.
>
> At the linker level (symbol names, sections, atoms, relocations, etc.),
> what exactly is not supported?

The literalN sections were developed long ago to support coalescing of unnamed constants like 9.897 in source code for architectures that could not embed large constants in instructions. The linker knew how to break up the section (e.g. __literal8 is always 8-byte chunks) and coalesce copies by content.

~6 years ago we discovered that gcc would sometimes put user-named constants into the literal sections (e.g. const double foo = 9.897). This was an issue because C language rules say &a != &b, but if ‘a’ and ‘b’ contain the same literal value in different translation units, the linker could merge them to the same address. For whatever reason, we could not fix gcc, so we changed the linker to never coalesce items in literal sections if there was a (non-‘L’ and non-‘l’) symbol on it.

The current state of LLVM is that it is going out of its way to move “named” constants from the __const section to a __literalN section. The only possible advantage of doing that is the hope that the linker might coalesce them, but the linker won’t coalesce them because they are named.

So, is there a way to keep the named values in the __const section?

-Nick
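
To make the coalescing rule described above concrete, here is a minimal sketch in C. It is not the actual Mach-O linker code. The struct literal8 type and the functions has_user_name, can_coalesce and coalesce_literals are hypothetical names invented for this illustration, and the model assumes a __literal8-style section that has already been split into 8-byte chunks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one 8-byte chunk in a __literal8-style section. */
struct literal8 {
    const char *symbol;      /* NULL (or "") if no symbol is attached */
    unsigned char bytes[8];  /* raw contents of the chunk */
};

/* Assumption for this sketch: symbols starting with 'L' or 'l' are
   temporary labels and do not count as a user-visible name. */
static bool has_user_name(const struct literal8 *lit) {
    if (lit->symbol == NULL || lit->symbol[0] == '\0')
        return false;
    return lit->symbol[0] != 'L' && lit->symbol[0] != 'l';
}

/* Two chunks may share an address only if neither carries a user-visible
   name and their contents are byte-for-byte identical. */
static bool can_coalesce(const struct literal8 *a, const struct literal8 *b) {
    if (has_user_name(a) || has_user_name(b))
        return false;
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}

/* Coalesce in place, keeping the first chunk of each mergeable group.
   Returns the number of chunks that remain. */
static size_t coalesce_literals(struct literal8 *lits, size_t n) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        bool merged = false;
        for (size_t j = 0; j < kept; j++) {
            if (can_coalesce(&lits[j], &lits[i])) {
                merged = true;
                break;
            }
        }
        if (!merged)
            lits[kept++] = lits[i];
    }
    return kept;
}
```

Under this model a named constant such as _foo is always kept as its own chunk, even when an unnamed chunk holds the same bytes, so it gains nothing from living in the literal section.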