thr3ads.net - llvm dev - [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") [Feb 2016]


Hal Finkel via llvm-dev

2016-Feb-25 06:00 UTC

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames"
> <listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> <dexonsmith at apple.com>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>
> Sent: Wednesday, February 24, 2016 11:41:59 PM
> Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> On Wed, Feb 24, 2016 at 9:35 PM Hal Finkel < hfinkel at anl.gov > wrote:
> > ----- Original Message -----
> > > From: "Chandler Carruth via llvm-dev" < llvm-dev at lists.llvm.org >
> > > To: "Philip Reames" < listmail at philipreames.com >, "Duncan P. N.
> > > Exon Smith" < dexonsmith at apple.com >, "Sanjoy Das"
> > > < sanjoy at playingwithpointers.com >
> > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org >
> > > Sent: Wednesday, February 24, 2016 10:29:23 PM
> > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > available_externally (split from "RFC: Add guard intrinsics")
>
> > > Yea, I'm pretty sad about all of this. I'm also not seeing a lot of
> > > awesome paths forward.
>
> > > Here is the least bad strategy I can come up with. Curious if folks
> > > think this is sufficient:
> 
> > I may not completely understand the problem, but this seems like
> > overkill. The underlying restriction is that, if the compiler makes
> > a non-determinism-collapsing choice when optimizing a function, it
> > must make the same choice for all definitions of that function
> > (undefined behavior excluded).
> 
> This isn't enough, because some definition in some other module may
> *not be optimized at all*, and yet may get selected at link time.
> Put another way, it must *prove* that the same choice will *always*
> be made for all definitions. This is akin to proving that the
> optimizer is run over all translation units for C++ linkonce_odr
> functions, which you can't do.
Sure; which is why I said that we should not perform those optimizations
(instead of saying that we just need to make sure that the same choice will be
made everywhere - as you say, LTO aside, we can't do that).
> The result would be failing to optimize the bodies of linkonce_odr
> functions in any way which was externally detectable such as this. I
> think that would be *much* worse than losing the ability to do
> function attribute deduction for such functions?
But it is not all optimizations that are the problem. Rather, it seems like a
select few (e.g. things involving collapsing allowed non-determinism in
atomics), and losing those optimizations seems better than generally losing
function-attribute deduction.
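
A minimal sketch of the kind of choice at issue, written in C++ as a model rather than as LLVM IR. `Loads`, `foo_two_loads`, `foo_after_cse`, and `demo_derefinement` are hypothetical names; `Loads` stands in for the two values that atomic loads of `%par` could observe while another thread writes to it. Both definitions are valid for the same source, yet only one of them can ever report "X", so a caller that reasons from the CSE'd body may be wrong about the copy the linker actually selects.

```cpp
#include <cassert>
#include <string>

// Model only (assumption): the two values a pair of atomic loads might
// observe when a second thread writes in between them.
struct Loads {
    int first;
    int second;
    int calls;
    Loads(int a, int b) : first(a), second(b), calls(0) {}
    int next() { return (calls++ == 0) ? first : second; }
};

// Unoptimized definition: two separate loads that may disagree.
std::string foo_two_loads(Loads& par) {
    int t0 = par.next();
    int t1 = par.next();
    return (t0 != t1) ? "X" : "";
}

// Definition after CSE: the second load is replaced by the first,
// so `t0 != t1` can never be true.
std::string foo_after_cse(Loads& par) {
    int t0 = par.next();
    int t1 = t0;
    return (t0 != t1) ? "X" : "";
}

// Same input, different observable behavior depending on the copy chosen.
inline void demo_derefinement() {
    Loads a(1, 2), b(1, 2);
    assert(foo_two_loads(a) == "X");
    assert(foo_after_cse(b).empty());
}
```

The CSE'd body is a legal refinement on its own; the trouble only appears when a fact derived from it ("never prints X") is applied at a call site that may bind to the other body.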

 -Hal
> > Thus, with an available_externally function, the CSE in Sanjoy's
> > original example should be forbidden. Richard's example again
> > demonstrates this principle, although in this case the
> > non-determinism is in the choice of a globally-visible
> > implementation technique rather than non-determinism from
> > memory-subsystem reordering.
>
> > There is a complication, which you imply in your proposal, that such
> > optimizations need to be forbidden not just in the
> > available_externally functions themselves, but in any local function
> > transitively called by one. This, however, we can take care of with
> > an (easily-deduced) attribute.
> 
> > In short, it is not clear to me that the number of problematic
> > optimizations is large (seems likely restricted to things involving
> > atomics in practice), and while I understand the auditing
> > difficulties here, we should just restrict these in appropriate
> > contexts instead of trying to restrict all information flow into or
> > out of comdats.
> 
> > -Hal
> 
> > > 1) Stop deducing function attributes within comdats by examining the
> > > bodies of the functions (so that we remain free to transform the
> > > bodies of functions).
>
> > > 2) Teach frontends to emit (even at O0!!!) trivially deduced function
> > > attributes for comdats so that we continue to catch easy cases.
>
> > > 3) Ensure and specify that we never hoist code *into* a comdat group
> > > in which it would not have been executed previously. I don't know of
> > > anything in LLVM that does this today, but it would become an
> > > important invariant.
>
> > > 4) Work a lot harder to do internalizing and removing of this
> > > restriction.
>
> > > Pretty horrible. But I think it is correct.
>
> > > As a slight modification to #1 and #2, we could have a very carefully
> > > crafted deduction rule where we only deduce function attributes for
> > > functions prior to any modification of their function bodies. Such
> > > attributes should be conservatively correct because we would never
> > > lift new code into the function bodies. This would at least allow us
> > > to do bottom-up deduction to catch interprocedural cases. But it
> > > would become incredibly subtle that this is only valid prior to
> > > *any* transformations of the comdat-containing functions.
>
> > > I'm starting to think this subtle rule might be worth it. But I'm
> > > frankly terrified by the implications.
> 
> > > On Wed, Feb 24, 2016 at 8:13 PM Philip Reames via llvm-dev <
> > > llvm-dev at lists.llvm.org > wrote:
>
> > > > On 02/24/2016 08:10 PM, Duncan P. N. Exon Smith via llvm-dev wrote:
> > > > >> On 2016-Feb-24, at 19:46, Sanjoy Das <
> > > > >> sanjoy at playingwithpointers.com > wrote:
> > > > >>
> > > > >> On Wed, Feb 24, 2016 at 7:38 PM, Chandler Carruth <
> > > > >> chandlerc at google.com > wrote:
> > > > >>> On Wed, Feb 24, 2016 at 7:34 PM Duncan P. N. Exon Smith
> > > > >>> < dexonsmith at apple.com > wrote:
> > > > >>>>
> > > > >>>>> On 2016-Feb-24, at 19:17, Chandler Carruth <
> > > > >>>>> chandlerc at google.com > wrote:
> > > > >>>>>
> > > > >>>>> On Wed, Feb 24, 2016 at 7:10 PM Sanjoy Das via llvm-dev
> > > > >>>>> < llvm-dev at lists.llvm.org > wrote:
> > > > >>>>> On Wed, Feb 24, 2016 at 6:51 PM, Duncan P. N. Exon Smith
> > > > >>>>> < dexonsmith at apple.com > wrote:
> > > > >>>>>>> If we do not inline @foo(), and instead re-link the call
> > > > >>>>>>> site in @main to some non-optimized copy (or differently
> > > > >>>>>>> optimized copy) of @foo, then it is possible for the
> > > > >>>>>>> program to have the behavior {print("Y"); print("X")},
> > > > >>>>>>> which was disallowed in the earlier program.
> > > > >>>>>>>
> > > > >>>>>>> In other words, opt refined the semantics of @foo() (i.e.
> > > > >>>>>>> reduced the set of behaviors it may have) in ways that
> > > > >>>>>>> would make later optimizations invalid if we de-refine the
> > > > >>>>>>> implementation of @foo().
> 
> > >
> 
> > > > >>>>>> I'm probably missing something obvious here. How could the
> > > > >>>>>> result of `%t0 != %t1` be different at optimization time in
> > > > >>>>>> one file than from runtime in the "real" implementation?
> > > > >>>>>> Doesn't this make the CSE invalid?
> 
> > >
> 
> > > > >>>>> `%t0` and `%t1` are "allowed" to "always be the same", i.e.
> > > > >>>>> an implementation of @foo that always feeds in the same
> > > > >>>>> value for `%t0` and `%t1` is a valid implementation (which
> > > > >>>>> is why the CSE was valid); but it is not the *only* valid
> > > > >>>>> implementation. If I don't CSE the two load instructions
> > > > >>>>> (also a valid thing to do), and there is a second thread
> > > > >>>>> writing to `%par`, then the two values loaded can be
> > > > >>>>> different, and you could end up printing `"X"` in `@foo`.
> > > > >>>>>
> > > > >>>>> Did that make sense?
> 
> > >
> 
> > > > >>>> Yes. To be sure I understand the scope: this is only a
> > > > >>>> problem for atomics, correct? (Because multi-threaded
> > > > >>>> behaviour with other globals is UB?)
> > > > >>>>
> > > > >>>>>> Does linkonce_odr linkage have the same problem?
> > > > >>>>>> - If so, do you want to change it too?
> > > > >>>>>> - Else, why not?
> > > > >>>>> Going by the specification in the LangRef, I'd say it
> > > > >>>>> depends on how you define "definitive". If you're allowed
> > > > >>>>> to replace the body of a function with a differently
> > > > >>>>> optimized body, then the above problem exists.
> > > > >>>>>
> > > > >>>>> I believe that is the case, and I strongly believe the
> > > > >>>>> problem you outline exists for linkonce_odr exactly as it
> > > > >>>>> does for available_externally.
> > > > >>>>>
> > > > >>>>> Which is what makes this scary: every C++ inline function
> > > > >>>>> today can trigger this.
> > > > >>>> Every C/C++ inline or template function. But only the ones
> > > > >>>> that use atomics, right?
> 
> > >
> 
> > > > >>>
> 
> > >
> 
> > > > >>> Well, with *this* example...
> 
> > >
> 
> > > > >> Atomics are one source of non-determinism that compilers can
> > > > >> reason about. I don't know if the following snippet is well
> > > > >> defined or not, but you could have similar issues with
> > > > >>
> > > > >> void foo() {
> > > > >>   int *p = malloc(sizeof(int));
> > > > >>   if (*p < 10) print("X");
> > > > >> }
> > > > >>
> > > > >> or (again, I don't know if this is actually well defined)
> > > > >>
> > > > >> void foo() {
> > > > >>   int t; // it is probably reasonable to fold compares with
> > > > >>          // ptrtoint(alloca) to undef
> > > > >>   if ((intptr_t)(&t) < 10) print("X");
> > > > >> }
> 
> > >
> 
> > > > >>
> 
> > >
> 
> > > > > The first one at least is UB, but as Richard pointed out the
> > > > > scope is certainly broader than atomics (it's not even just
> > > > > well-defined non-determinism).
> 
> > >
> 
> > > > >
> 
> > >
> 
> > > > > I'm kind of terrified by the implications.
> 
> > >
> 
> > > > Me too. :(
> 
> > >
> 
> > > > >
> 
> > >
> 
> > > > >> -- Sanjoy
> 
> > >
> 
> > > > >>
> 
> > >
> 
> > > > >>>>
> 
> > >
> 
> > > > >>>> Not that I'm sure that will end up being a helpful
> > > > >>>> distinction.
> > > > >>>
> > > > >>> Right. See Richard's comment. I think that sums up the real
> > > > >>> issue here. =/
> 
> > >
> 
> > > > > _______________________________________________
> 
> > >
> 
> > > > > LLVM Developers mailing list
> 
> > >
> 
> > > > > llvm-dev at lists.llvm.org
> 
> > >
> 
> > > > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> > >
> 
> 
> > --
> 
> > --
> 
> > Hal Finkel
> 
> > Assistant Computational Scientist
> 
> > Leadership Computing Facility
> 
> > Argonne National Laboratory
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Sanjoy Das via llvm-dev

2016-Feb-25 06:25 UTC


[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Hal Finkel wrote:

 > But it is not all optimizations that are the problem. Rather, it
 > seems like a select few (e.g. things involving collapsing allowed
 > non-determinism in atomics), and losing those optimizations seems
 > better than generally losing function-attribute deduction.

If we go by the langref, then optimizations that fold undef are also
problematic (though most C/C++ programs resulting in such IR would
have UB in practice).

I think there are non-(non-deterministic) problematic cases too.  The
following won't happen today since `readnone` does not imply
`safe_to_speculate`, but if we add a `safe_to_speculate` property some
day:

   int foo(bool C) available_externally {
     if (C)
       ((int *) null)++; // UB
     ret 42;
   }

   void bar() {
     if (<some cond>)
      foo(true);
   }

Now, normally you can just delete the `if (C)` branch in foo, and it
would become just a `ret 42`, and would look like it is speculatable
above the `<some cond>` check.  But if you then link with an -O0
version, you'll have introduced UB if `<some cond>` is always false at
runtime.

Today this won't happen since we don't speculate `readnone nounwind`
functions, but could become a problem in the future.
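
A rough, runnable model of the `foo`/`bar` example above, in C++. Real UB cannot be observed portably, so this sketch records in a flag whether the UB path would have executed. `Outcome`, `FooImpl`, `foo_O0`, `foo_opt`, `bar_original`, and `bar_speculated` are hypothetical names, and passing the implementation as a function pointer is only a stand-in for the linker choosing one copy of `foo`.

```cpp
#include <cassert>

// Model only (assumption): ub_reached marks that the `if (C)` branch with
// the null-pointer operation would have run.
struct Outcome {
    bool ub_reached;
    int value;
};

using FooImpl = Outcome (*)(bool);

// The -O0 copy keeps the branch, so calling it with C == true hits UB.
Outcome foo_O0(bool c) { return Outcome{c, 42}; }

// The optimized copy deleted the branch and is just `ret 42`.
Outcome foo_opt(bool) { return Outcome{false, 42}; }

// bar as written: foo(true) runs only when <some cond> holds.
bool bar_original(bool some_cond, FooImpl foo) {
    if (some_cond) return foo(true).ub_reached;
    return false;
}

// bar after hoisting the call above the check, which looks safe only when
// judged against foo_opt's body.
bool bar_speculated(bool some_cond, FooImpl foo) {
    Outcome early = foo(true);  // now executes unconditionally
    (void)some_cond;
    return early.ub_reached;
}

// With <some cond> always false, only the speculated bar linked against the
// -O0 copy reaches the UB path.
inline void demo_speculation() {
    assert(!bar_original(false, foo_O0));
    assert(!bar_speculated(false, foo_opt));
    assert(bar_speculated(false, foo_O0));
}
```

The hoist is harmless against the optimized copy and harmful against the `-O0` copy, which matches the claim that the speculation introduces UB only once a different body is linked in.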

-- Sanjoy

Chandler Carruth via llvm-dev

2016-Feb-25 06:40 UTC


[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Wed, Feb 24, 2016 at 10:25 PM Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
>
> Hal Finkel wrote:
>
>  > But it is not all optimizations that are the problem. Rather, it
>  > seems like a select few (e.g. things involving collapsing allowed
>  > non-determinism in atomics), and losing those optimizations seems
>  > better than generally losing function-attribute deduction.
>
> If we go by the langref, then optimizations that fold undef are also
> problematic (though most C/C++ programs resulting in such IR would
> have UB in practice).
>
I'm also concerned about undef. It isn't clear that we can reasonably
ensure that all things involving undef suitably stem from UB in the face of
speculation to preclude all "impossible" results coming from
interposition.

I think trying to enumerate the function-local optimizations which are
unsafe is *much* more challenging than identifying the interprocedural
optimizations which are unsafe, which is essentially what disabling
function attribute deduction is doing.
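
One way to picture the interprocedural restriction is as a single predicate on linkage that attribute deduction consults before looking at a body. The sketch below is a standalone C++ model under that assumption. It is not LLVM's API, and `Linkage`, `body_may_be_replaced`, `may_deduce_attrs_from_body`, and `demo_linkage_rule` are hypothetical names covering only the linkage kinds mentioned in this thread plus two ordinary ones for contrast.

```cpp
#include <cassert>

// Hypothetical model of a few linkage kinds; not LLVM's enum.
enum class Linkage { External, Internal, AvailableExternally, LinkOnceODR };

// True when the copy that finally runs may be a differently optimized (or
// unoptimized) body rather than the one visible in this module.
constexpr bool body_may_be_replaced(Linkage l) {
    return l == Linkage::AvailableExternally || l == Linkage::LinkOnceODR;
}

// Conservative rule: derive caller-visible facts (function attributes)
// only from a body that is guaranteed to be the one executed.
constexpr bool may_deduce_attrs_from_body(Linkage l) {
    return !body_may_be_replaced(l);
}

inline void demo_linkage_rule() {
    assert(may_deduce_attrs_from_body(Linkage::Internal));
    assert(!may_deduce_attrs_from_body(Linkage::LinkOnceODR));
}
```

Under this model, function-local transformations stay unrestricted for every linkage; only the step that exports facts about a replaceable body to its callers is gated.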

>
> I think there are non-(non-deterministic) problematic cases too.  The
> following won't happen today since `readnone` does not imply
> `safe_to_speculate`, but if we add a `safe_to_speculate` property some
> day:
>
>    int foo(bool C) available_externally {
>      if (C)
>        ((int *) null)++; // UB
>      ret 42;
>    }
>
>    void bar() {
>      if (<some cond>)
>       foo(true);
>    }
>
> Now, normally you can just delete the `if (C)` branch in foo, and it
> would become just a `ret 42`, and would look like it is speculatable
> above the `<some cond>` check.  But if you then link with an -O0
> version, you'll have introduced UB if `<some cond>` is always false at
> runtime.
>
> Today this won't happen since we don't speculate `readnone nounwind`
> functions, but could become a problem in the future.
>
All of this makes me think we're just seeing the tip of the iceberg of ways
in which local optimizations can combine with attributes to break this. =/
I really think the problem is, as Richard describes, the interprocedural
movement of information across a comdat-like boundary where replacements
may occur with differently transformed code.

Hal Finkel via llvm-dev

2016-Feb-25 06:56 UTC


[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----
> From: "Sanjoy Das" <sanjoy at playingwithpointers.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Chandler Carruth" <chandlerc at google.com>, "llvm-dev"
> <llvm-dev at lists.llvm.org>, "Philip Reames"
> <listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> <dexonsmith at apple.com>
> Sent: Thursday, February 25, 2016 12:25:54 AM
> Subject: Re: [llvm-dev] Possible soundness issue with available_externally
> (split from "RFC: Add guard intrinsics")
> 
> 
> Hal Finkel wrote:
> 
>  > But it is not all optimizations that are the problem. Rather, it
>  > seems like a select few (e.g. things involving collapsing allowed
>  > non-determinism in atomics), and losing those optimizations seems
>  > better than generally losing function-attribute deduction.
> 
> If we go by the langref, then optimizations that fold undef are also
> problematic (though most C/C++ programs resulting in such IR would
> have UB in practice).
If the undef is folded to some concrete value (instead of just being
propagated), then yes, I agree. We really should be propagating the undef,
however, right?
> 
> I think there are non-(non-deterministic) problematic cases too.  The
> following won't happen today since `readnone` does not imply
> `safe_to_speculate`, but if we add a `safe_to_speculate` property some
> day:
> 
>    int foo(bool C) available_externally {
>      if (C)
>        ((int *) null)++; // UB
>      ret 42;
>    }
> 
>    void bar() {
>      if (<some cond>)
>       foo(true);
>    }
> 
> Now, normally you can just delete the `if (C)` branch in foo, and it
> would become just a `ret 42`, and would look like it is speculatable
> above the `<some cond>` check.  But if you then link with an -O0
> version, you'll have introduced UB if `<some cond>` is always false at
> runtime.
So this is a good point, but I'm not sure how much to generalize this
example. When we add a safe_to_speculate attribute, we'll need to keep this
in mind (special care must be taken in such non-definitive-definition contexts).

 -Hal
> 
> Today this won't happen since we don't speculate `readnone nounwind`
> functions, but could become a problem in the future.
> 
> -- Sanjoy
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
