thr3ads.net - llvm dev - [llvm-dev] Weak symbol/alias semantics [Jan 2017]

If this information is useful, please help other people find it:
Share via:

Mehdi Amini via llvm-dev

2017-Jan-18 15:35 UTC

[llvm-dev] Weak symbol/alias semantics

> On Jan 18, 2017, at 7:11 AM, Teresa Johnson <tejohnson at google.com>
wrote:
> 
> 
> 
>> On Sat, Jan 14, 2017 at 12:15 PM, Mehdi Amini <mehdi.amini at
apple.com> wrote:
>> 
>>> On Jan 14, 2017, at 8:14 AM, Teresa Johnson <tejohnson at
google.com> wrote:
>>> 
>>> 
>>> 
>>> On Fri, Jan 13, 2017 at 9:54 PM, Mehdi Amini <mehdi.amini at
apple.com> wrote:
>>>> Hi,
>>>> 
>>>> I believe we should get it right (and simpler) if (when…) we
move to the representation we discussed last spring:
https://llvm.org/bugs/show_bug.cgi?id=27866
>>>> 
>>>> Our conclusion was that we should always have alias pointing to
private anonymous and nothing else, so when we currently have:
>>>> 
>>>> @a = global i32 0
>>>> @b = alias @a
>>>> 
>>>> It should always become:
>>>> 
>>>> @0 = private i32 0
>>>> @a = alias @0
>>>> @b = alias @0
>>> 
>>> Yes that has some nice properties. I think it makes at least one
case harder though.
>>> 
>>> Consider:
>>> 
>>> 1) Original: 
>>> define linkonce_odr void @a() { ... }
>>> @b = linkonce_odr alias void (), void ()* @a
>>> 
>>> 2) In the proposed canonical form this becomes:
>>> define private void @0() { ... }
>>> @a = linkonce_odr alias void (), void ()* @0
>>> @b = linkonce_odr alias void (), void ()* @0
>>> 
>>> And let's say @a and @b are both non-prevailing in this module.
I think we lose inlining ability for these functions when we start dropping
non-prevailing copies.
>>> 
>>> In 1) we could transform this (as described in my earlier email)
to:
>>> 
>>> define available_externally void @a() { ... }
>>> /* RAUW @b with @a */
>>> 
>>> and we can inline @a into all callsites (which includes calls
original via @b), before dropping it.
>>> 
>>> With 2), I think we would need to transform this to:
>>> 
>>> define private void @0() { ... }
>>> declare void @a()
>>> declare void @b()
>>> 
>>> and we no longer can inline calls to @a and @b. I suppose we could
intervene and special case this scenario so that it is transformed as for 1)
when there are non-prevailing odr aliases.
>> 
>> The complication is that a cloneFunction() step would be needed right?
> 
> I hadn't thought about that level of implementation detail, but in any
case a cloneFunction step will be needed with both the current alias format and
with your proposed new form (just in different cases though). My point was that
the proposed format doesn't always make the handling simpler.
> 
>> 
>> Alternatively I can imagine us allowing available_externally alias, I’m
not sure why they are forbidden today other than because we map the object-level
limitations in the IR, but I can miss something (we don’t have alias on Darwin
either…).
> 
> Allowing available_externally alias helps simplify the case where we have a
non-prevailing weak alias. We would still need to address the case where the
aliasee is non-prevailing, which I think is the trickier situation to handle.
I'm missing something, aren't we talking about the proposal where all
aliasee are always private?

Mehdi
>> 
>> 
>>> 
>>> I need to think through the various permutations of linkage types
and see how they get transformed in the current and proposed canonical forms to
see if there are other cases that need special handling.
>>> 
>>> Teresa
>>> 
>>> 
>>>> 
>>>> 
>>>> — 
>>>> Mehdi
>>>> 
>>>> 
>>>> 
>>>>> On Jan 13, 2017, at 3:21 PM, Peter Collingbourne <peter
at pcc.me.uk> wrote:
>>>>> 
>>>>> Hi Teresa,
>>>>> 
>>>>> I think that to answer your question correctly it is
helpful to consider what is going on at the object file level. For your test1.c
we conceptually have a .text section containing the body of f, and then three
symbols:
>>>>> 
>>>>> .weak f
>>>>> f = .text
>>>>> .globl strongalias
>>>>> strongalias = .text
>>>>> .weak weakalias
>>>>> weakalias = .text
>>>>> 
>>>>> Note that f, strongalias and weakalias are not related at
all, except that they happen to point to the same place. If f is overridden by a
symbol in another object file, it does not affect the symbols strongalias and
weakalias, so we still need to make them point to .text. I don't think it
would be right to make strongalias and weakalias into copies of f, as that would
be observable through function pointer equality. Most likely all you need to do
is to internalize f and keep strongalias and weakalias as aliases of f.
>>>>> 
>>>>> If we're resolving strongalias to f at -O2, that seems
like a bug to me. We can probably only resolve an alias to the symbol it
references if we are guaranteed that both symbols will have the same resolution,
i.e. we must check at least that both symbols have strong or internal linkage.
If we cared about symbol interposition, we might also want to check that both
symbols have non-default visibility, but I think that our support for that is
still a little fuzzy at the moment.
>>>>> 
>>>>> Thanks,
>>>>> Peter
>>>>> 
>>>>>> On Fri, Jan 13, 2017 at 2:33 PM, Teresa Johnson
<tejohnson at google.com> wrote:
>>>>>> Hi Mehdi, Peter and David (and anyone else who sees
this),
>>>>>> 
>>>>>> I've been playing with some examples to handle the
weak symbol cases we discussed in IRC earlier this week in the context of
D28523. I was going to implement the support for turning aliases into copies in
order to enable performing thinLTOResolveWeakForLinkerGUID on both aliases and
aliasees, as a first step to being able to drop non-prevailing weak symbols in
ThinLTO backends.
>>>>>> 
>>>>>> I was wondering though what happens if we have an
alias, which may or may not be weak itself, to a non-odr weak symbol that
isn't prevailing. In that case, do we eventually want references via the
alias to go to the prevailing copy (in another module), or to the original copy
in the alias's module? I looked at some examples without ThinLTO, and am a
little confused. Current (non-ThinLTO) behavior in some cases seems to depend on
opt level.
>>>>>> 
>>>>>> Example:
>>>>>> 
>>>>>> $ cat weak12main.c
>>>>>> extern void test2();
>>>>>> int main() {
>>>>>>   test2();
>>>>>> }
>>>>>> 
>>>>>> $ cat weak1.c
>>>>>> #include <stdio.h>
>>>>>> 
>>>>>> void weakalias() __attribute__((weak, alias
("f")));
>>>>>> void strongalias() __attribute__((alias
("f")));
>>>>>> 
>>>>>> void f () __attribute__ ((weak));
>>>>>> void f()
>>>>>> {
>>>>>>   printf("In weak1.c:f\n");
>>>>>> }
>>>>>> void test1() {
>>>>>>   printf("Call f() from weak1.c:\n");
>>>>>>   f();
>>>>>>   printf("Call weakalias() from weak1.c:\n");
>>>>>>   weakalias();
>>>>>>   printf("Call strongalias() from
weak1.c:\n");
>>>>>>   strongalias();
>>>>>> }
>>>>>> 
>>>>>> $ cat weak2.c
>>>>>> #include <stdio.h>
>>>>>> 
>>>>>> void f () __attribute__ ((weak));
>>>>>> void f()
>>>>>> {
>>>>>>   printf("In weak2.c:f\n");
>>>>>> }
>>>>>> extern void test1();
>>>>>> void test2()
>>>>>> {
>>>>>>   test1();
>>>>>>   printf("Call f() from weak2.c\n");
>>>>>>   f();
>>>>>> }
>>>>>> 
>>>>>> If I link weak1.c before weak2.c, nothing is surprising
(we always invoke weak1.c:f at both -O0 and -O2):
>>>>>> 
>>>>>> $ clang weak12main.c weak1.c weak2.c -O0
>>>>>> $ a.out
>>>>>> Call f() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call weakalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call strongalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call f() from weak2.c
>>>>>> In weak1.c:f
>>>>>> 
>>>>>> $ clang weak12main.c weak1.c weak2.c -O2
>>>>>> $ a.out
>>>>>> Call f() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call weakalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call strongalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call f() from weak2.c
>>>>>> In weak1.c:f
>>>>>> 
>>>>>> If I instead link weak2.c first, so it's copy of
f() is prevailing, I still get weak1.c:f for the call via weakalias() (both opt
levels), and for strongalias() when building at -O0. At -O2 the compiler
replaces the call to strongalias() with a call to f(), so it get's the weak2
copy in that case.
>>>>>> 
>>>>>> $ clang weak12main.c weak2.c weak1.c -O2
>>>>>> $ a.out
>>>>>> Call f() from weak1.c:
>>>>>> In weak2.c:f
>>>>>> Call weakalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call strongalias() from weak1.c:
>>>>>> In weak2.c:f
>>>>>> Call f() from weak2.c
>>>>>> In weak2.c:f
>>>>>> 
>>>>>> $ clang weak12main.c weak2.c weak1.c -O0
>>>>>> $ a.out
>>>>>> Call f() from weak1.c:
>>>>>> In weak2.c:f
>>>>>> Call weakalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call strongalias() from weak1.c:
>>>>>> In weak1.c:f
>>>>>> Call f() from weak2.c
>>>>>> In weak2.c:f
>>>>>> 
>>>>>> I'm wondering what the expected/correct behavior
is? Depending on what is correct, we need to handle this differently in ThinLTO
mode. Let's say weak1.c's copy of f() is not prevailing and I am going
to drop it (it needs to be removed completely, not turned into
available_externally to ensure it isn't inlined since weak isInterposable).
If we want the aliases in weak1.c to reference the original version, then
copying is correct (e.g. weakalias and strong alias would each become a copy of
weak1.c's f()). If we however want them to resolve to the prevailing copy of
f(), then we need to turn the aliases into declarations (external linkage in the
case of strongalias and external weak in the case of weakalias?).
>>>>>> 
>>>>>> I also tried the case where f() was in a comdat,
because I also need to handle that case in ThinLTO (when f() is not prevailing,
drop it from the comdat and remove the comdat from that module). Interestingly,
in this case when weak2.c is prevailing, I get the following warning when
linking and get a seg fault at runtime:
>>>>>> 
>>>>>> weak1.o:weak1.o:function test1: warning: relocation
refers to discarded section
>>>>>> 
>>>>>> Presumably the aliases still refer to the copy in
weak1.c, which is in the comdat that gets dropped by the linker. So is it not
legal to have an alias to a weak symbol in a comdat (i.e. alias from outside the
comdat)? We don't complain in the compiler.
>>>>>> 
>>>>>> Thanks,
>>>>>> Teresa
>>>>>> -- 
>>>>>> Teresa Johnson |	 Software Engineer |	 tejohnson at
google.com |	 408-460-2413
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> -- 
>>>>> Peter
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Teresa Johnson |	 Software Engineer |	 tejohnson at google.com |	
408-460-2413
>> 
> 
> 
> 
> -- 
> Teresa Johnson |	 Software Engineer |	 tejohnson at google.com |	
408-460-2413-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170118/51db12c3/attachment.html>

Teresa Johnson via llvm-dev

2017-Jan-18 15:40 UTC

head link

[llvm-dev] Weak symbol/alias semantics

On Wed, Jan 18, 2017 at 7:35 AM, Mehdi Amini <mehdi.amini at apple.com>
wrote:
>
>
> On Jan 18, 2017, at 7:11 AM, Teresa Johnson <tejohnson at google.com>
wrote:
>
>
>
> On Sat, Jan 14, 2017 at 12:15 PM, Mehdi Amini <mehdi.amini at
apple.com>
> wrote:
>
>>
>> On Jan 14, 2017, at 8:14 AM, Teresa Johnson <tejohnson at
google.com> wrote:
>>
>>
>>
>> On Fri, Jan 13, 2017 at 9:54 PM, Mehdi Amini <mehdi.amini at
apple.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I believe we should get it right (and simpler) if (when…) we move
to the
>>> representation we discussed last spring: https://llvm.org/bugs/
>>> show_bug.cgi?id=27866
>>>
>>> Our conclusion was that we should always have alias pointing to
private
>>> anonymous and nothing else, so when we currently have:
>>>
>>> @a = global i32 0
>>> @b = alias @a
>>>
>>> It should always become:
>>>
>>> @0 = private i32 0
>>> @a = alias @0
>>> @b = alias @0
>>>
>>
>> Yes that has some nice properties. I think it makes at least one case
>> harder though.
>>
>> Consider:
>>
>> 1) Original:
>> define linkonce_odr void @a() { ... }
>> @b = linkonce_odr alias void (), void ()* @a
>>
>> 2) In the proposed canonical form this becomes:
>> define private void @0() { ... }
>> @a = linkonce_odr alias void (), void ()* @0
>> @b = linkonce_odr alias void (), void ()* @0
>>
>> And let's say @a and @b are both non-prevailing in this module. I
think
>> we lose inlining ability for these functions when we start dropping
>> non-prevailing copies.
>>
>> In 1) we could transform this (as described in my earlier email) to:
>>
>> define available_externally void @a() { ... }
>> /* RAUW @b with @a */
>>
>> and we can inline @a into all callsites (which includes calls original
>> via @b), before dropping it.
>>
>> With 2), I think we would need to transform this to:
>>
>> define private void @0() { ... }
>> declare void @a()
>> declare void @b()
>>
>> and we no longer can inline calls to @a and @b. I suppose we could
>> intervene and special case this scenario so that it is transformed as
for
>> 1) when there are non-prevailing odr aliases.
>>
>>
>> The complication is that a cloneFunction() step would be needed right?
>>
>
> I hadn't thought about that level of implementation detail, but in any
> case a cloneFunction step will be needed with both the current alias format
> and with your proposed new form (just in different cases though). My point
> was that the proposed format doesn't always make the handling simpler.
>
>
>> Alternatively I can imagine us allowing available_externally alias, I’m
>> not sure why they are forbidden today other than because we map the
>> object-level limitations in the IR, but I can miss something (we don’t
have
>> alias on Darwin either…).
>>
>
> Allowing available_externally alias helps simplify the case where we have
> a non-prevailing weak alias. We would still need to address the case where
> the aliasee is non-prevailing, which I think is the trickier situation to
> handle.
>
>
> I'm missing something, aren't we talking about the proposal where
all
> aliasee are always private?
>
Oh I see. Yes, that is the part simplified by the proposed new alias
handling. I misunderstood what your "Alternatively" was referring to.

I'm a little wary of making fixing this issue for ThinLTO dependent on
changing the alias representation in the compiler. Maybe it wouldn't be
that difficult to do though.

Teresa

> Mehdi
>
>
>>
>>
>> I need to think through the various permutations of linkage types and
see
>> how they get transformed in the current and proposed canonical forms to
see
>> if there are other cases that need special handling.
>>
>> Teresa
>>
>>
>>
>>>
>>> —
>>> Mehdi
>>>
>>>
>>>
>>> On Jan 13, 2017, at 3:21 PM, Peter Collingbourne <peter at
pcc.me.uk>
>>> wrote:
>>>
>>> Hi Teresa,
>>>
>>> I think that to answer your question correctly it is helpful to
consider
>>> what is going on at the object file level. For your test1.c we
conceptually
>>> have a .text section containing the body of f, and then three
symbols:
>>>
>>> .weak f
>>> f = .text
>>> .globl strongalias
>>> strongalias = .text
>>> .weak weakalias
>>> weakalias = .text
>>>
>>> Note that f, strongalias and weakalias are not related at all,
except
>>> that they happen to point to the same place. If f is overridden by
a symbol
>>> in another object file, it does not affect the symbols strongalias
and
>>> weakalias, so we still need to make them point to .text. I
don't think it
>>> would be right to make strongalias and weakalias into copies of f,
as that
>>> would be observable through function pointer equality. Most likely
all you
>>> need to do is to internalize f and keep strongalias and weakalias
as
>>> aliases of f.
>>>
>>> If we're resolving strongalias to f at -O2, that seems like a
bug to me.
>>> We can probably only resolve an alias to the symbol it references
if we are
>>> guaranteed that both symbols will have the same resolution, i.e. we
must
>>> check at least that both symbols have strong or internal linkage.
If we
>>> cared about symbol interposition, we might also want to check that
both
>>> symbols have non-default visibility, but I think that our support
for that
>>> is still a little fuzzy at the moment.
>>>
>>> Thanks,
>>> Peter
>>>
>>> On Fri, Jan 13, 2017 at 2:33 PM, Teresa Johnson <tejohnson at
google.com>
>>> wrote:
>>>
>>>> Hi Mehdi, Peter and David (and anyone else who sees this),
>>>>
>>>> I've been playing with some examples to handle the weak
symbol cases we
>>>> discussed in IRC earlier this week in the context of D28523. I
was going to
>>>> implement the support for turning aliases into copies in order
to enable
>>>> performing thinLTOResolveWeakForLinkerGUID on both aliases and
>>>> aliasees, as a first step to being able to drop non-prevailing
weak symbols
>>>> in ThinLTO backends.
>>>>
>>>> I was wondering though what happens if we have an alias, which
may or
>>>> may not be weak itself, to a non-odr weak symbol that isn't
prevailing. In
>>>> that case, do we eventually want references via the alias to go
to the
>>>> prevailing copy (in another module), or to the original copy in
the alias's
>>>> module? I looked at some examples without ThinLTO, and am a
little
>>>> confused. Current (non-ThinLTO) behavior in some cases seems to
depend on
>>>> opt level.
>>>>
>>>> Example:
>>>>
>>>> $ cat weak12main.c
>>>> extern void test2();
>>>> int main() {
>>>>   test2();
>>>> }
>>>>
>>>> $ cat weak1.c
>>>> #include <stdio.h>
>>>>
>>>> void weakalias() __attribute__((weak, alias ("f")));
>>>> void strongalias() __attribute__((alias ("f")));
>>>>
>>>> void f () __attribute__ ((weak));
>>>> void f()
>>>> {
>>>>   printf("In weak1.c:f\n");
>>>> }
>>>> void test1() {
>>>>   printf("Call f() from weak1.c:\n");
>>>>   f();
>>>>   printf("Call weakalias() from weak1.c:\n");
>>>>   weakalias();
>>>>   printf("Call strongalias() from weak1.c:\n");
>>>>   strongalias();
>>>> }
>>>>
>>>> $ cat weak2.c
>>>> #include <stdio.h>
>>>>
>>>> void f () __attribute__ ((weak));
>>>> void f()
>>>> {
>>>>   printf("In weak2.c:f\n");
>>>> }
>>>> extern void test1();
>>>> void test2()
>>>> {
>>>>   test1();
>>>>   printf("Call f() from weak2.c\n");
>>>>   f();
>>>> }
>>>>
>>>> If I link weak1.c before weak2.c, nothing is surprising (we
always
>>>> invoke weak1.c:f at both -O0 and -O2):
>>>>
>>>> $ clang weak12main.c weak1.c weak2.c -O0
>>>> $ a.out
>>>> Call f() from weak1.c:
>>>> In weak1.c:f
>>>> Call weakalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call strongalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call f() from weak2.c
>>>> In weak1.c:f
>>>>
>>>> $ clang weak12main.c weak1.c weak2.c -O2
>>>> $ a.out
>>>> Call f() from weak1.c:
>>>> In weak1.c:f
>>>> Call weakalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call strongalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call f() from weak2.c
>>>> In weak1.c:f
>>>>
>>>> If I instead link weak2.c first, so it's copy of f() is
prevailing, I
>>>> still get weak1.c:f for the call via weakalias() (both opt
levels), and for
>>>> strongalias() when building at -O0. At -O2 the compiler
replaces the call
>>>> to strongalias() with a call to f(), so it get's the weak2
copy in that
>>>> case.
>>>>
>>>> $ clang weak12main.c weak2.c weak1.c -O2
>>>> $ a.out
>>>> Call f() from weak1.c:
>>>> In weak2.c:f
>>>> Call weakalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call strongalias() from weak1.c:
>>>> In weak2.c:f
>>>> Call f() from weak2.c
>>>> In weak2.c:f
>>>>
>>>> $ clang weak12main.c weak2.c weak1.c -O0
>>>> $ a.out
>>>> Call f() from weak1.c:
>>>> In weak2.c:f
>>>> Call weakalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call strongalias() from weak1.c:
>>>> In weak1.c:f
>>>> Call f() from weak2.c
>>>> In weak2.c:f
>>>>
>>>> I'm wondering what the expected/correct behavior is?
Depending on what
>>>> is correct, we need to handle this differently in ThinLTO mode.
Let's say
>>>> weak1.c's copy of f() is not prevailing and I am going to
drop it (it needs
>>>> to be removed completely, not turned into available_externally
to ensure it
>>>> isn't inlined since weak isInterposable). If we want the
aliases in weak1.c
>>>> to reference the original version, then copying is correct
(e.g. weakalias
>>>> and strong alias would each become a copy of weak1.c's
f()). If we however
>>>> want them to resolve to the prevailing copy of f(), then we
need to turn
>>>> the aliases into declarations (external linkage in the case of
strongalias
>>>> and external weak in the case of weakalias?).
>>>>
>>>> I also tried the case where f() was in a comdat, because I also
need to
>>>> handle that case in ThinLTO (when f() is not prevailing, drop
it from the
>>>> comdat and remove the comdat from that module). Interestingly,
in this case
>>>> when weak2.c is prevailing, I get the following warning when
linking and
>>>> get a seg fault at runtime:
>>>>
>>>> weak1.o:weak1.o:function test1: warning: relocation refers to
discarded
>>>> section
>>>>
>>>> Presumably the aliases still refer to the copy in weak1.c,
which is in
>>>> the comdat that gets dropped by the linker. So is it not legal
to have an
>>>> alias to a weak symbol in a comdat (i.e. alias from outside the
comdat)? We
>>>> don't complain in the compiler.
>>>>
>>>> Thanks,
>>>> Teresa
>>>> --
>>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com
|
>>>> 408-460-2413 <(408)%20460-2413>
>>>>
>>>
>>>
>>>
>>> --
>>> --
>>> Peter
>>>
>>>
>>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>>
>>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>
>

-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170118/00e0325e/attachment-0001.html>

Teresa Johnson via llvm-dev

2017-Jan-18 16:03 UTC

head link

[llvm-dev] Weak symbol/alias semantics

On Wed, Jan 18, 2017 at 7:40 AM, Teresa Johnson <tejohnson at google.com>
wrote:
>
>
> On Wed, Jan 18, 2017 at 7:35 AM, Mehdi Amini <mehdi.amini at
apple.com>
> wrote:
>
>>
>>
>> On Jan 18, 2017, at 7:11 AM, Teresa Johnson <tejohnson at
google.com> wrote:
>>
>>
>>
>> On Sat, Jan 14, 2017 at 12:15 PM, Mehdi Amini <mehdi.amini at
apple.com>
>> wrote:
>>
>>>
>>> On Jan 14, 2017, at 8:14 AM, Teresa Johnson <tejohnson at
google.com>
>>> wrote:
>>>
>>>
>>>
>>> On Fri, Jan 13, 2017 at 9:54 PM, Mehdi Amini <mehdi.amini at
apple.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I believe we should get it right (and simpler) if (when…) we
move to
>>>> the representation we discussed last spring:
https://llvm.org/bugs/
>>>> show_bug.cgi?id=27866
>>>>
>>>> Our conclusion was that we should always have alias pointing to
private
>>>> anonymous and nothing else, so when we currently have:
>>>>
>>>> @a = global i32 0
>>>> @b = alias @a
>>>>
>>>> It should always become:
>>>>
>>>> @0 = private i32 0
>>>> @a = alias @0
>>>> @b = alias @0
>>>>
>>>
>>> Yes that has some nice properties. I think it makes at least one
case
>>> harder though.
>>>
>>> Consider:
>>>
>>> 1) Original:
>>> define linkonce_odr void @a() { ... }
>>> @b = linkonce_odr alias void (), void ()* @a
>>>
>>> 2) In the proposed canonical form this becomes:
>>> define private void @0() { ... }
>>> @a = linkonce_odr alias void (), void ()* @0
>>> @b = linkonce_odr alias void (), void ()* @0
>>>
>>> And let's say @a and @b are both non-prevailing in this module.
I think
>>> we lose inlining ability for these functions when we start dropping
>>> non-prevailing copies.
>>>
>>> In 1) we could transform this (as described in my earlier email)
to:
>>>
>>> define available_externally void @a() { ... }
>>> /* RAUW @b with @a */
>>>
>>> and we can inline @a into all callsites (which includes calls
original
>>> via @b), before dropping it.
>>>
>>> With 2), I think we would need to transform this to:
>>>
>>> define private void @0() { ... }
>>> declare void @a()
>>> declare void @b()
>>>
>>> and we no longer can inline calls to @a and @b. I suppose we could
>>> intervene and special case this scenario so that it is transformed
as for
>>> 1) when there are non-prevailing odr aliases.
>>>
>>>
>>> The complication is that a cloneFunction() step would be needed
right?
>>>
>>
>> I hadn't thought about that level of implementation detail, but in
any
>> case a cloneFunction step will be needed with both the current alias
format
>> and with your proposed new form (just in different cases though). My
point
>> was that the proposed format doesn't always make the handling
simpler.
>>
>>
>>> Alternatively I can imagine us allowing available_externally alias,
I’m
>>> not sure why they are forbidden today other than because we map the
>>> object-level limitations in the IR, but I can miss something (we
don’t have
>>> alias on Darwin either…).
>>>
>>
>> Allowing available_externally alias helps simplify the case where we
have
>> a non-prevailing weak alias. We would still need to address the case
where
>> the aliasee is non-prevailing, which I think is the trickier situation
to
>> handle.
>>
>>
>> I'm missing something, aren't we talking about the proposal
where all
>> aliasee are always private?
>>
>
> Oh I see. Yes, that is the part simplified by the proposed new alias
> handling. I misunderstood what your "Alternatively" was referring
to.
>
> I'm a little wary of making fixing this issue for ThinLTO dependent on
> changing the alias representation in the compiler. Maybe it wouldn't be
> that difficult to do though.
>
I suppose I could do this by transforming aliases to the proposed canonical
form just before we do the weak resolution in the ThinLTO backends. Longer
term, how would such a change be implemented? I.e. would the current IR
continue to be accepted (and generated by clang), and the transformation
occur within LLVM early in the pipeline?

Teresa

> Teresa
>
>
>> Mehdi
>>
>>
>>>
>>>
>>> I need to think through the various permutations of linkage types
and
>>> see how they get transformed in the current and proposed canonical
forms to
>>> see if there are other cases that need special handling.
>>>
>>> Teresa
>>>
>>>
>>>
>>>>
>>>> —
>>>> Mehdi
>>>>
>>>>
>>>>
>>>> On Jan 13, 2017, at 3:21 PM, Peter Collingbourne <peter at
pcc.me.uk>
>>>> wrote:
>>>>
>>>> Hi Teresa,
>>>>
>>>> I think that to answer your question correctly it is helpful to
>>>> consider what is going on at the object file level. For your
test1.c we
>>>> conceptually have a .text section containing the body of f, and
then three
>>>> symbols:
>>>>
>>>> .weak f
>>>> f = .text
>>>> .globl strongalias
>>>> strongalias = .text
>>>> .weak weakalias
>>>> weakalias = .text
>>>>
>>>> Note that f, strongalias and weakalias are not related at all,
except
>>>> that they happen to point to the same place. If f is overridden
by a symbol
>>>> in another object file, it does not affect the symbols
strongalias and
>>>> weakalias, so we still need to make them point to .text. I
don't think it
>>>> would be right to make strongalias and weakalias into copies of
f, as that
>>>> would be observable through function pointer equality. Most
likely all you
>>>> need to do is to internalize f and keep strongalias and
weakalias as
>>>> aliases of f.
>>>>
>>>> If we're resolving strongalias to f at -O2, that seems like
a bug to
>>>> me. We can probably only resolve an alias to the symbol it
references if we
>>>> are guaranteed that both symbols will have the same resolution,
i.e. we
>>>> must check at least that both symbols have strong or internal
linkage. If
>>>> we cared about symbol interposition, we might also want to
check that both
>>>> symbols have non-default visibility, but I think that our
support for that
>>>> is still a little fuzzy at the moment.
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> On Fri, Jan 13, 2017 at 2:33 PM, Teresa Johnson <tejohnson
at google.com>
>>>> wrote:
>>>>
>>>>> Hi Mehdi, Peter and David (and anyone else who sees this),
>>>>>
>>>>> I've been playing with some examples to handle the weak
symbol cases
>>>>> we discussed in IRC earlier this week in the context of
D28523. I was going
>>>>> to implement the support for turning aliases into copies in
order to enable
>>>>> performing thinLTOResolveWeakForLinkerGUID on both aliases
and
>>>>> aliasees, as a first step to being able to drop
non-prevailing weak symbols
>>>>> in ThinLTO backends.
>>>>>
>>>>> I was wondering though what happens if we have an alias,
which may or
>>>>> may not be weak itself, to a non-odr weak symbol that
isn't prevailing. In
>>>>> that case, do we eventually want references via the alias
to go to the
>>>>> prevailing copy (in another module), or to the original
copy in the alias's
>>>>> module? I looked at some examples without ThinLTO, and am a
little
>>>>> confused. Current (non-ThinLTO) behavior in some cases
seems to depend on
>>>>> opt level.
>>>>>
>>>>> Example:
>>>>>
>>>>> $ cat weak12main.c
>>>>> extern void test2();
>>>>> int main() {
>>>>>   test2();
>>>>> }
>>>>>
>>>>> $ cat weak1.c
>>>>> #include <stdio.h>
>>>>>
>>>>> void weakalias() __attribute__((weak, alias
("f")));
>>>>> void strongalias() __attribute__((alias ("f")));
>>>>>
>>>>> void f () __attribute__ ((weak));
>>>>> void f()
>>>>> {
>>>>>   printf("In weak1.c:f\n");
>>>>> }
>>>>> void test1() {
>>>>>   printf("Call f() from weak1.c:\n");
>>>>>   f();
>>>>>   printf("Call weakalias() from weak1.c:\n");
>>>>>   weakalias();
>>>>>   printf("Call strongalias() from weak1.c:\n");
>>>>>   strongalias();
>>>>> }
>>>>>
>>>>> $ cat weak2.c
>>>>> #include <stdio.h>
>>>>>
>>>>> void f () __attribute__ ((weak));
>>>>> void f()
>>>>> {
>>>>>   printf("In weak2.c:f\n");
>>>>> }
>>>>> extern void test1();
>>>>> void test2()
>>>>> {
>>>>>   test1();
>>>>>   printf("Call f() from weak2.c\n");
>>>>>   f();
>>>>> }
>>>>>
>>>>> If I link weak1.c before weak2.c, nothing is surprising (we
always
>>>>> invoke weak1.c:f at both -O0 and -O2):
>>>>>
>>>>> $ clang weak12main.c weak1.c weak2.c -O0
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak1.c:f
>>>>>
>>>>> $ clang weak12main.c weak1.c weak2.c -O2
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak1.c:f
>>>>>
>>>>> If I instead link weak2.c first, so it's copy of f() is
prevailing, I
>>>>> still get weak1.c:f for the call via weakalias() (both opt
levels), and for
>>>>> strongalias() when building at -O0. At -O2 the compiler
replaces the call
>>>>> to strongalias() with a call to f(), so it get's the
weak2 copy in that
>>>>> case.
>>>>>
>>>>> $ clang weak12main.c weak2.c weak1.c -O2
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call f() from weak2.c
>>>>> In weak2.c:f
>>>>>
>>>>> $ clang weak12main.c weak2.c weak1.c -O0
>>>>> $ a.out
>>>>> Call f() from weak1.c:
>>>>> In weak2.c:f
>>>>> Call weakalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call strongalias() from weak1.c:
>>>>> In weak1.c:f
>>>>> Call f() from weak2.c
>>>>> In weak2.c:f
>>>>>
>>>>> I'm wondering what the expected/correct behavior is?
Depending on what
>>>>> is correct, we need to handle this differently in ThinLTO
mode. Let's say
>>>>> weak1.c's copy of f() is not prevailing and I am going
to drop it (it needs
>>>>> to be removed completely, not turned into
available_externally to ensure it
>>>>> isn't inlined since weak isInterposable). If we want
the aliases in weak1.c
>>>>> to reference the original version, then copying is correct
(e.g. weakalias
>>>>> and strong alias would each become a copy of weak1.c's
f()). If we however
>>>>> want them to resolve to the prevailing copy of f(), then we
need to turn
>>>>> the aliases into declarations (external linkage in the case
of strongalias
>>>>> and external weak in the case of weakalias?).
>>>>>
>>>>> I also tried the case where f() was in a comdat, because I
also need
>>>>> to handle that case in ThinLTO (when f() is not prevailing,
drop it from
>>>>> the comdat and remove the comdat from that module).
Interestingly, in this
>>>>> case when weak2.c is prevailing, I get the following
warning when linking
>>>>> and get a seg fault at runtime:
>>>>>
>>>>> weak1.o:weak1.o:function test1: warning: relocation refers
to
>>>>> discarded section
>>>>>
>>>>> Presumably the aliases still refer to the copy in weak1.c,
which is in
>>>>> the comdat that gets dropped by the linker. So is it not
legal to have an
>>>>> alias to a weak symbol in a comdat (i.e. alias from outside
the comdat)? We
>>>>> don't complain in the compiler.
>>>>>
>>>>> Thanks,
>>>>> Teresa
>>>>> --
>>>>> Teresa Johnson |  Software Engineer |  tejohnson at
google.com |
>>>>> 408-460-2413 <(408)%20460-2413>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --
>>>> Peter
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>>> 408-460-2413 <(408)%20460-2413>
>>>
>>>
>>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>>
>
>
> --
> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
> 408-460-2413 <(408)%20460-2413>
>


-- 
Teresa Johnson |  Software Engineer |  tejohnson at google.com |  408-460-2413
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170118/4aa96afd/attachment.html>

llvm dev - Jan 2017 - Weak symbol/alias semantics

[llvm-dev] Weak symbol/alias semantics

[llvm-dev] Weak symbol/alias semantics

[llvm-dev] Weak symbol/alias semantics