thr3ads.net - llvm dev - [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Duncan P. N. Exon Smith via llvm-dev

2016-Feb-25 03:34 UTC

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

> On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com>
wrote:
> 
> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
> >> If we do not inline @foo(), and instead re-link the call site in
@main
> >> to some non-optimized copy (or differently optimized copy) of
@foo,
> >> then it is possible for the program to have the behavior
{print("Y");
> >> print ("X")}, which was disallowed in the earlier
program.
> >>
> >> In other words, opt refined the semantics of @foo() (i.e. reduced
the
> >> set of behaviors it may have) in ways that would make later
> >> optimizations invalid if we de-refine the implementation of
@foo().
> >
> > I'm probably missing something obvious here.  How could the result
of
> > `%t0 != %t1` be different at optimization time in one file than from
> > runtime in the "real" implementation?  Doesn't this make
the CSE
> > invalid?
> 
> `%t0` and `%t1` are "allowed" to "always be the same",
i.e. an
> implementation of @foo that always feeds in the same
> value for `%t0` and `%t1` is a valid implementation (which is why the
> CSE was valid); but it is not the *only* valid implementation.  If I
> don't CSE the two load instructions (also a valid thing to do), and
> this is a second thread writing to `%par`, then the two values loaded
> can be different, and you could end up printing `"X"` in `@foo`.
> 
> Did that make sense?
Yes.  To be sure I understand the scope: this is only a problem for
atomics, correct?  (Because multi-threaded behaviour with other globals
is UB?)
> > Does linkonce_odr linkage have the same problem?
> > - If so, do you want to change it too?
> > - Else, why not?
> 
> Going by the specification in the LangRef, I'd say it depends on how
> you define "definitive".  If you're allowed to replace the
body of a
> function with a differently optimized body, then the above problem
> exists.
> 
> I believe that is the case, and I strongly believe the problem you outline
exists for linkonce_odr exactly as it does for available_externally.
> 
> Which is what makes this scary: every C++ inline function today can trigger
this.
Every C/C++ inline or template function.  But only the ones that use
atomics, right?

Not that I'm sure that will end up being a helpful distinction.
> >> The above example is clearly fabricated, but such cases can come
up
> >> even if everything is optimized to the same level.  E.g. one of
the
> >> atomic loads in the unrefined implementation of @foo() could have
been
> >> hidden behind a function call, whose body existed in only one
module.
> >> That module would then be able to refine @foo() to `ret void` but
> >> other modules won't.
> >>
> >> The only solution I can think of is to redefine
available_externally
> >> to mean "the only kind of IPO/IPA you can do over a call to
this
> >> function is to inline it".  Redefining available_externally
this way
> >> will also let us soundly use it to represent calls to functions
that
> >> have guard intrinsics, since a failed guard intrinsic basically
> >> replaces the function with a "very de-refined"
implementation (the
> >> interpreter).
> >>
> >> What do you think?  I don't think implementing the above above
will be
> >> very difficult, but needless to say, it will still be a fairly
> >> non-trivial semantic change (hence I'm not directly jumping to
> >> implementation).
> >
> > This linkage is used in three places (that I know of) by clang:
> >
> >   1. C-style `inline` functions.
> >   2. Functions defined in C++ template classes with external explicit
> >      instantiations, e.g. S::foo() in:
> >
> >          template <class T> struct S { void foo() {} };
> >          void bar() { S<int>().foo(); }
> >          extern template struct S<int>;
> >
> >   3. -flto=thin cross-module function importing.
> >
> > (No comment on (1); its exact semantics are a little fuzzy to me.)
> > For (2) and (3), the current behaviour seems correct, and I'd be
> > hesitant to lose optimizing power.  (2) is under the "ODR"
rule, and
> > I think we've been applying the same logic to (3).  Unless, are
you
> > saying ODR isn't enough?
> 
> By ODR, do you mean you only have one definition of the function in
> the whole link (i.e. across all modules you'll link together)?
> Then yes, ODR should be enough to avoid this.  But in any place where
> the linker sees two differently optimized definitions for a function
> and picks one as the definitive version all non-inlined calls link to,
> we have this problem.
> 
> No, different levels of optimization must be allowed within ODR. So this is
a problem within an ODR context.
> 
> (The term ODR applies to one *source* definition, not one optimized
definition)

Chandler Carruth via llvm-dev

2016-Feb-25 03:38 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:
>
> > On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at
google.com> wrote:
> >
> > On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> > On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
> > <dexonsmith at apple.com> wrote:
> > >> If we do not inline @foo(), and instead re-link the call site
in @main
> > >> to some non-optimized copy (or differently optimized copy) of
@foo,
> > >> then it is possible for the program to have the behavior
{print("Y");
> > >> print ("X")}, which was disallowed in the earlier
program.
> > >>
> > >> In other words, opt refined the semantics of @foo() (i.e.
reduced the
> > >> set of behaviors it may have) in ways that would make later
> > >> optimizations invalid if we de-refine the implementation of
@foo().
> > >
> > > I'm probably missing something obvious here.  How could the
result of
> > > `%t0 != %t1` be different at optimization time in one file than
from
> > > runtime in the "real" implementation?  Doesn't this
make the CSE
> > > invalid?
> >
> > `%t0` and `%t1` are "allowed" to "always be the
same", i.e. an
> > implementation of @foo that always feeds in the same
> > value for `%t0` and `%t1` is a valid implementation (which is why the
> > CSE was valid); but it is not the *only* valid implementation.  If I
> > don't CSE the two load instructions (also a valid thing to do),
and
> > this is a second thread writing to `%par`, then the two values loaded
> > can be different, and you could end up printing `"X"` in
`@foo`.
> >
> > Did that make sense?
>
> Yes.  To be sure I understand the scope: this is only a problem for
> atomics, correct?  (Because multi-threaded behaviour with other globals
> is UB?)
>
> > > Does linkonce_odr linkage have the same problem?
> > > - If so, do you want to change it too?
> > > - Else, why not?
> >
> > Going by the specification in the LangRef, I'd say it depends on
how
> > you define "definitive".  If you're allowed to replace
the body of a
> > function with a differently optimized body, then the above problem
> > exists.
> >
> > I believe that is the case, and I strongly believe the problem you
> outline exists for linkonce_odr exactly as it does for
available_externally.
> >
> > Which is what makes this scary: every C++ inline function today can
> trigger this.
>
> Every C/C++ inline or template function.  But only the ones that use
> atomics, right?
>
Well, with *this* example...

>
> Not that I'm sure that will end up being a helpful distinction.
>
Right. See Richard's comment. I think that sums up the real issue here. =/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/29164228/attachment.html>

Sanjoy Das via llvm-dev

2016-Feb-25 03:46 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth <chandlerc at
google.com> wrote:> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>>
>>
>> > On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at
google.com> wrote:
>> >
>> > On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev
>> > <llvm-dev at lists.llvm.org> wrote:
>> > On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
>> > <dexonsmith at apple.com> wrote:
>> > >> If we do not inline @foo(), and instead re-link the call
site in
>> > >> @main
>> > >> to some non-optimized copy (or differently optimized
copy) of @foo,
>> > >> then it is possible for the program to have the behavior
{print("Y");
>> > >> print ("X")}, which was disallowed in the
earlier program.
>> > >>
>> > >> In other words, opt refined the semantics of @foo() (i.e.
reduced the
>> > >> set of behaviors it may have) in ways that would make
later
>> > >> optimizations invalid if we de-refine the implementation
of @foo().
>> > >
>> > > I'm probably missing something obvious here.  How could
the result of
>> > > `%t0 != %t1` be different at optimization time in one file
than from
>> > > runtime in the "real" implementation?  Doesn't
this make the CSE
>> > > invalid?
>> >
>> > `%t0` and `%t1` are "allowed" to "always be the
same", i.e. an
>> > implementation of @foo that always feeds in the same
>> > value for `%t0` and `%t1` is a valid implementation (which is why
the
>> > CSE was valid); but it is not the *only* valid implementation.  If
I
>> > don't CSE the two load instructions (also a valid thing to
do), and
>> > this is a second thread writing to `%par`, then the two values
loaded
>> > can be different, and you could end up printing `"X"` in
`@foo`.
>> >
>> > Did that make sense?
>>
>> Yes.  To be sure I understand the scope: this is only a problem for
>> atomics, correct?  (Because multi-threaded behaviour with other globals
>> is UB?)
>>
>> > > Does linkonce_odr linkage have the same problem?
>> > > - If so, do you want to change it too?
>> > > - Else, why not?
>> >
>> > Going by the specification in the LangRef, I'd say it depends
on how
>> > you define "definitive".  If you're allowed to
replace the body of a
>> > function with a differently optimized body, then the above problem
>> > exists.
>> >
>> > I believe that is the case, and I strongly believe the problem you
>> > outline exists for linkonce_odr exactly as it does for
available_externally.
>> >
>> > Which is what makes this scary: every C++ inline function today
can
>> > trigger this.
>>
>> Every C/C++ inline or template function.  But only the ones that use
>> atomics, right?
>
>
> Well, with *this* example...
Atomic are one source of non-determinism that compilers can reason
about.  I don't know if the following snippet is well defined or not,
but you could have similar issues with


  void foo() {
    int *p = malloc(sizeof(int));
    if (*p < 10) print("X");
  }

or (again, I don't know if this is actually well defined)

  void foo() {
    int t;  // it is probably reasonable to fold compares with
ptrtoint(alloca) to undef
    if ((intptr_t)(&t) < 10) print("X");
  }

-- Sanjoy
>
>>
>>
>> Not that I'm sure that will end up being a helpful distinction.
>
>
> Right. See Richard's comment. I think that sums up the real issue here.
=/

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Feb 2016 - Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Possibly Parallel Threads