thr3ads.net - llvm dev - [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Sanjoy Das via llvm-dev

2016-Feb-25 03:09 UTC

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
<dexonsmith at apple.com> wrote:>> If we do not inline @foo(), and instead re-link the call site in @main
>> to some non-optimized copy (or differently optimized copy) of @foo,
>> then it is possible for the program to have the behavior
{print("Y");
>> print ("X")}, which was disallowed in the earlier program.
>>
>> In other words, opt refined the semantics of @foo() (i.e. reduced the
>> set of behaviors it may have) in ways that would make later
>> optimizations invalid if we de-refine the implementation of @foo().
>
> I'm probably missing something obvious here.  How could the result of
> `%t0 != %t1` be different at optimization time in one file than from
> runtime in the "real" implementation?  Doesn't this make the
CSE
> invalid?
`%t0` and `%t1` are "allowed" to "always be the same", i.e.
an
implementation of @foo that always feeds in the same
value for `%t0` and `%t1` is a valid implementation (which is why the
CSE was valid); but it is not the *only* valid implementation.  If I
don't CSE the two load instructions (also a valid thing to do), and
this is a second thread writing to `%par`, then the two values loaded
can be different, and you could end up printing `"X"` in `@foo`.

Did that make sense?
> Does linkonce_odr linkage have the same problem?
> - If so, do you want to change it too?
> - Else, why not?
Going by the specification in the LangRef, I'd say it depends on how
you define "definitive".  If you're allowed to replace the body of
a
function with a differently optimized body, then the above problem
exists.
>> The above example is clearly fabricated, but such cases can come up
>> even if everything is optimized to the same level.  E.g. one of the
>> atomic loads in the unrefined implementation of @foo() could have been
>> hidden behind a function call, whose body existed in only one module.
>> That module would then be able to refine @foo() to `ret void` but
>> other modules won't.
>>
>> The only solution I can think of is to redefine available_externally
>> to mean "the only kind of IPO/IPA you can do over a call to this
>> function is to inline it".  Redefining available_externally this
way
>> will also let us soundly use it to represent calls to functions that
>> have guard intrinsics, since a failed guard intrinsic basically
>> replaces the function with a "very de-refined" implementation
(the
>> interpreter).
>>
>> What do you think?  I don't think implementing the above above will
be
>> very difficult, but needless to say, it will still be a fairly
>> non-trivial semantic change (hence I'm not directly jumping to
>> implementation).
>
> This linkage is used in three places (that I know of) by clang:
>
>   1. C-style `inline` functions.
>   2. Functions defined in C++ template classes with external explicit
>      instantiations, e.g. S::foo() in:
>
>          template <class T> struct S { void foo() {} };
>          void bar() { S<int>().foo(); }
>          extern template struct S<int>;
>
>   3. -flto=thin cross-module function importing.
>
> (No comment on (1); its exact semantics are a little fuzzy to me.)
> For (2) and (3), the current behaviour seems correct, and I'd be
> hesitant to lose optimizing power.  (2) is under the "ODR" rule,
and
> I think we've been applying the same logic to (3).  Unless, are you
> saying ODR isn't enough?
By ODR, do you mean you only have one definition of the function in
the whole link (i.e. across all modules you'll link together)?
Then yes, ODR should be enough to avoid this.  But in any place where
the linker sees two differently optimized definitions for a function
and picks one as the definitive version all non-inlined calls link to,
we have this problem.
> Assuming you need this new definition (but under ODR, the semantics
> are correct), I would rather split the linkage than change it.  E.g.,
> use a name like available_externally_odr for (2) and (3).
If what I said above is correct (i.e. ODR == OD across everything
you're linking into your final executable) then splitting the linkage
/ adding a new one is probably the best alternative.

-- Sanjoy

Chandler Carruth via llvm-dev

2016-Feb-25 03:17 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
> >> If we do not inline @foo(), and instead re-link the call site in
@main
> >> to some non-optimized copy (or differently optimized copy) of
@foo,
> >> then it is possible for the program to have the behavior
{print("Y");
> >> print ("X")}, which was disallowed in the earlier
program.
> >>
> >> In other words, opt refined the semantics of @foo() (i.e. reduced
the
> >> set of behaviors it may have) in ways that would make later
> >> optimizations invalid if we de-refine the implementation of
@foo().
> >
> > I'm probably missing something obvious here.  How could the result
of
> > `%t0 != %t1` be different at optimization time in one file than from
> > runtime in the "real" implementation?  Doesn't this make
the CSE
> > invalid?
>
> `%t0` and `%t1` are "allowed" to "always be the same",
i.e. an
> implementation of @foo that always feeds in the same
> value for `%t0` and `%t1` is a valid implementation (which is why the
> CSE was valid); but it is not the *only* valid implementation.  If I
> don't CSE the two load instructions (also a valid thing to do), and
> this is a second thread writing to `%par`, then the two values loaded
> can be different, and you could end up printing `"X"` in `@foo`.
>
> Did that make sense?
>
> > Does linkonce_odr linkage have the same problem?
> > - If so, do you want to change it too?
> > - Else, why not?
>
> Going by the specification in the LangRef, I'd say it depends on how
> you define "definitive".  If you're allowed to replace the
body of a
> function with a differently optimized body, then the above problem
> exists.
>
I believe that is the case, and I strongly believe the problem you outline
exists for linkonce_odr exactly as it does for available_externally.

Which is what makes this scary: every C++ inline function today can trigger
this.

>
> >> The above example is clearly fabricated, but such cases can come
up
> >> even if everything is optimized to the same level.  E.g. one of
the
> >> atomic loads in the unrefined implementation of @foo() could have
been
> >> hidden behind a function call, whose body existed in only one
module.
> >> That module would then be able to refine @foo() to `ret void` but
> >> other modules won't.
> >>
> >> The only solution I can think of is to redefine
available_externally
> >> to mean "the only kind of IPO/IPA you can do over a call to
this
> >> function is to inline it".  Redefining available_externally
this way
> >> will also let us soundly use it to represent calls to functions
that
> >> have guard intrinsics, since a failed guard intrinsic basically
> >> replaces the function with a "very de-refined"
implementation (the
> >> interpreter).
> >>
> >> What do you think?  I don't think implementing the above above
will be
> >> very difficult, but needless to say, it will still be a fairly
> >> non-trivial semantic change (hence I'm not directly jumping to
> >> implementation).
> >
> > This linkage is used in three places (that I know of) by clang:
> >
> >   1. C-style `inline` functions.
> >   2. Functions defined in C++ template classes with external explicit
> >      instantiations, e.g. S::foo() in:
> >
> >          template <class T> struct S { void foo() {} };
> >          void bar() { S<int>().foo(); }
> >          extern template struct S<int>;
> >
> >   3. -flto=thin cross-module function importing.
> >
> > (No comment on (1); its exact semantics are a little fuzzy to me.)
> > For (2) and (3), the current behaviour seems correct, and I'd be
> > hesitant to lose optimizing power.  (2) is under the "ODR"
rule, and
> > I think we've been applying the same logic to (3).  Unless, are
you
> > saying ODR isn't enough?
>
> By ODR, do you mean you only have one definition of the function in
> the whole link (i.e. across all modules you'll link together)?
> Then yes, ODR should be enough to avoid this.  But in any place where
> the linker sees two differently optimized definitions for a function
> and picks one as the definitive version all non-inlined calls link to,
> we have this problem.
>
No, different levels of optimization must be allowed within ODR. So this is
a problem within an ODR context.

(The term ODR applies to one *source* definition, not one optimized
definition)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160225/6c4ed949/attachment.html>

Sanjoy Das via llvm-dev

2016-Feb-25 03:26 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 7:09 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:> CSE was valid); but it is not the *only* valid implementation.  If I
> don't CSE the two load instructions (also a valid thing to do), and
> this is a second thread writing to `%par`, then the two values loaded
Typoe'd a little bit here:  should be "and there is a second thread
writing to `%ptr`"

-- Sanjoy

Richard Smith via llvm-dev

2016-Feb-25 03:33 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 7:17 PM, Chandler Carruth via llvm-dev
<llvm-dev at lists.llvm.org> wrote:> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
>> <dexonsmith at apple.com> wrote:
>> >> If we do not inline @foo(), and instead re-link the call site
in @main
>> >> to some non-optimized copy (or differently optimized copy) of
@foo,
>> >> then it is possible for the program to have the behavior
{print("Y");
>> >> print ("X")}, which was disallowed in the earlier
program.
>> >>
>> >> In other words, opt refined the semantics of @foo() (i.e.
reduced the
>> >> set of behaviors it may have) in ways that would make later
>> >> optimizations invalid if we de-refine the implementation of
@foo().
>> >
>> > I'm probably missing something obvious here.  How could the
result of
>> > `%t0 != %t1` be different at optimization time in one file than
from
>> > runtime in the "real" implementation?  Doesn't this
make the CSE
>> > invalid?
>>
>> `%t0` and `%t1` are "allowed" to "always be the
same", i.e. an
>> implementation of @foo that always feeds in the same
>> value for `%t0` and `%t1` is a valid implementation (which is why the
>> CSE was valid); but it is not the *only* valid implementation.  If I
>> don't CSE the two load instructions (also a valid thing to do), and
>> this is a second thread writing to `%par`, then the two values loaded
>> can be different, and you could end up printing `"X"` in
`@foo`.
>>
>> Did that make sense?
>>
>> > Does linkonce_odr linkage have the same problem?
>> > - If so, do you want to change it too?
>> > - Else, why not?
>>
>> Going by the specification in the LangRef, I'd say it depends on
how
>> you define "definitive".  If you're allowed to replace
the body of a
>> function with a differently optimized body, then the above problem
>> exists.
>
>
> I believe that is the case, and I strongly believe the problem you outline
> exists for linkonce_odr exactly as it does for available_externally.
>
> Which is what makes this scary: every C++ inline function today can trigger
> this.
More generally, the same problem applies to all entities within
comdats (we can view available_externally and linkonce_odr as special
cases of single-symbol comdats). Similar problems show up with static
local variables in inline functions (where the comdats may be
equivalent across object files, but aren't necessarily identical -- in
particular, one can use constant initialization for a static local
variable where the other uses dynamic initialization, and inlining the
one with constant initialization then selecting the one with dynamic
initialization results in the variable not being initialized).

One approach that appears superficially correct would be to say that
we cannot move information across a comdat / available_externally /
linkonce_odr boundary unless doing so allows us to completely remove
that reference to the comdat / function / global. (In particular, you
can't use the attrs on a function in a comdat from outside that comdat
-- unless you somehow know they're present for all versions of that
comdat.) But that seems to have a pretty huge impact on optimizability
of comdats.
>> >> The above example is clearly fabricated, but such cases can
come up
>> >> even if everything is optimized to the same level.  E.g. one
of the
>> >> atomic loads in the unrefined implementation of @foo() could
have been
>> >> hidden behind a function call, whose body existed in only one
module.
>> >> That module would then be able to refine @foo() to `ret void`
but
>> >> other modules won't.
>> >>
>> >> The only solution I can think of is to redefine
available_externally
>> >> to mean "the only kind of IPO/IPA you can do over a call
to this
>> >> function is to inline it".  Redefining
available_externally this way
>> >> will also let us soundly use it to represent calls to
functions that
>> >> have guard intrinsics, since a failed guard intrinsic
basically
>> >> replaces the function with a "very de-refined"
implementation (the
>> >> interpreter).
>> >>
>> >> What do you think?  I don't think implementing the above
above will be
>> >> very difficult, but needless to say, it will still be a fairly
>> >> non-trivial semantic change (hence I'm not directly
jumping to
>> >> implementation).
>> >
>> > This linkage is used in three places (that I know of) by clang:
>> >
>> >   1. C-style `inline` functions.
>> >   2. Functions defined in C++ template classes with external
explicit
>> >      instantiations, e.g. S::foo() in:
>> >
>> >          template <class T> struct S { void foo() {} };
>> >          void bar() { S<int>().foo(); }
>> >          extern template struct S<int>;
>> >
>> >   3. -flto=thin cross-module function importing.
>> >
>> > (No comment on (1); its exact semantics are a little fuzzy to me.)
>> > For (2) and (3), the current behaviour seems correct, and I'd
be
>> > hesitant to lose optimizing power.  (2) is under the
"ODR" rule, and
>> > I think we've been applying the same logic to (3).  Unless,
are you
>> > saying ODR isn't enough?
>>
>> By ODR, do you mean you only have one definition of the function in
>> the whole link (i.e. across all modules you'll link together)?
>> Then yes, ODR should be enough to avoid this.  But in any place where
>> the linker sees two differently optimized definitions for a function
>> and picks one as the definitive version all non-inlined calls link to,
>> we have this problem.
>
>
> No, different levels of optimization must be allowed within ODR. So this is
> a problem within an ODR context.
>
> (The term ODR applies to one *source* definition, not one optimized
> definition)
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Duncan P. N. Exon Smith via llvm-dev

2016-Feb-25 03:34 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

> On 2016-Feb-24, at 19:17, Chandler Carruth <chandlerc at google.com>
wrote:
> 
> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
> >> If we do not inline @foo(), and instead re-link the call site in
@main
> >> to some non-optimized copy (or differently optimized copy) of
@foo,
> >> then it is possible for the program to have the behavior
{print("Y");
> >> print ("X")}, which was disallowed in the earlier
program.
> >>
> >> In other words, opt refined the semantics of @foo() (i.e. reduced
the
> >> set of behaviors it may have) in ways that would make later
> >> optimizations invalid if we de-refine the implementation of
@foo().
> >
> > I'm probably missing something obvious here.  How could the result
of
> > `%t0 != %t1` be different at optimization time in one file than from
> > runtime in the "real" implementation?  Doesn't this make
the CSE
> > invalid?
> 
> `%t0` and `%t1` are "allowed" to "always be the same",
i.e. an
> implementation of @foo that always feeds in the same
> value for `%t0` and `%t1` is a valid implementation (which is why the
> CSE was valid); but it is not the *only* valid implementation.  If I
> don't CSE the two load instructions (also a valid thing to do), and
> this is a second thread writing to `%par`, then the two values loaded
> can be different, and you could end up printing `"X"` in `@foo`.
> 
> Did that make sense?
Yes.  To be sure I understand the scope: this is only a problem for
atomics, correct?  (Because multi-threaded behaviour with other globals
is UB?)
> > Does linkonce_odr linkage have the same problem?
> > - If so, do you want to change it too?
> > - Else, why not?
> 
> Going by the specification in the LangRef, I'd say it depends on how
> you define "definitive".  If you're allowed to replace the
body of a
> function with a differently optimized body, then the above problem
> exists.
> 
> I believe that is the case, and I strongly believe the problem you outline
exists for linkonce_odr exactly as it does for available_externally.
> 
> Which is what makes this scary: every C++ inline function today can trigger
this.
Every C/C++ inline or template function.  But only the ones that use
atomics, right?

Not that I'm sure that will end up being a helpful distinction.
> >> The above example is clearly fabricated, but such cases can come
up
> >> even if everything is optimized to the same level.  E.g. one of
the
> >> atomic loads in the unrefined implementation of @foo() could have
been
> >> hidden behind a function call, whose body existed in only one
module.
> >> That module would then be able to refine @foo() to `ret void` but
> >> other modules won't.
> >>
> >> The only solution I can think of is to redefine
available_externally
> >> to mean "the only kind of IPO/IPA you can do over a call to
this
> >> function is to inline it".  Redefining available_externally
this way
> >> will also let us soundly use it to represent calls to functions
that
> >> have guard intrinsics, since a failed guard intrinsic basically
> >> replaces the function with a "very de-refined"
implementation (the
> >> interpreter).
> >>
> >> What do you think?  I don't think implementing the above above
will be
> >> very difficult, but needless to say, it will still be a fairly
> >> non-trivial semantic change (hence I'm not directly jumping to
> >> implementation).
> >
> > This linkage is used in three places (that I know of) by clang:
> >
> >   1. C-style `inline` functions.
> >   2. Functions defined in C++ template classes with external explicit
> >      instantiations, e.g. S::foo() in:
> >
> >          template <class T> struct S { void foo() {} };
> >          void bar() { S<int>().foo(); }
> >          extern template struct S<int>;
> >
> >   3. -flto=thin cross-module function importing.
> >
> > (No comment on (1); its exact semantics are a little fuzzy to me.)
> > For (2) and (3), the current behaviour seems correct, and I'd be
> > hesitant to lose optimizing power.  (2) is under the "ODR"
rule, and
> > I think we've been applying the same logic to (3).  Unless, are
you
> > saying ODR isn't enough?
> 
> By ODR, do you mean you only have one definition of the function in
> the whole link (i.e. across all modules you'll link together)?
> Then yes, ODR should be enough to avoid this.  But in any place where
> the linker sees two differently optimized definitions for a function
> and picks one as the definitive version all non-inlined calls link to,
> we have this problem.
> 
> No, different levels of optimization must be allowed within ODR. So this is
a problem within an ODR context.
> 
> (The term ODR applies to one *source* definition, not one optimized
definition)

Maybe Matching Threads

Search for more maybe matching threads

llvm dev - Feb 2016 - Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Maybe Matching Threads