thr3ads.net - llvm dev - [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2016-Feb-27 03:26 UTC

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Sanjoy
Das" <sanjoy at playingwithpointers.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip
Reames" <listmail at philipreames.com>, "Duncan P. N. Exon
Smith"
> <dexonsmith at apple.com>, "Xinliang David Li"
<xinliangli at gmail.com>
> Sent: Friday, February 26, 2016 9:01:48 PM
> Subject: Re: [llvm-dev] Possible soundness issue with available_externally
(split from "RFC: Add guard intrinsics")
> 
> 
> I think this will have a much higher cost than my proposal to
> constrain how we deduce function attributes (which still fixes
> Sanjoy's latest example).
> 
> Specifically, I think this will force us to constrain far too many
> transformations for the sake of code size in functions that we won't
> inline. Even if we were never going to deduce function attributes
> for anything in the function (because its big and reads and writes
> everything), we'll still have to constrain our transformations just
> because we *might* later deduce a function attribute that triggers
> these kinds of bugs.
> 
> Essentially, you're proposing to limit intraprocedural optimization
> to when we can successfully to interprocedural optimization
> ("privatization"), where I'm suggesting we limit
interprocedural
> optimization to leave intraprocedural optimization unconstrained.
> Given the ratio of our optimizations (almost all are intra, very few
> are inter), I'm much more comfortable with the latter.
This is a good point; we can certainly (easily) delay the privatization decision
until we modify any IPA-level function information (at which point we can either
reject the attribute change (when optimizing for code size), or keep it locally
(when optimizing for speed). Ideally, you'd want to delay this even further
(until you knew the attribute information was used), but I'm not sure
that's practical.

Actually, what if instead of actually privatizing, we moved the function into a
different comdat section named after some hash of the function body? That way,
if all versions are actually optimized identically, we'll still only end up
with one copy in the final executable. If this is technically possible, it seems
like the best kind of solution.

 -Hal
 > 
> On Fri, Feb 26, 2016 at 6:10 PM Hal Finkel < hfinkel at anl.gov >
wrote:
> 
> 
> Hi Sanjoy,
> 
> These are both very interesting examples, and demonstrate that the
> problems extends beyond function attributes (encompassing
> dead-argument elimination, etc.).
> 
> I'm beginning to think that the best solution, at least when
> optimizing for speed, is the one that David Li suggested: we need to
> internalize functions that have been optimized in certain ways (e.g.
> instructions with potential side effects are removed). The trick
> here may be to be as intelligent about this as possible to minimize
> the effect on code size. Maybe this is as easy as checking whether
> isSafeToSpeculativelyExecute returns false on the deleted
> instruction? Perhaps when optimizing for size, we need to forbid
> such deletions.
> 
> Thanks again,
> Hal
> 
> ----- Original Message -----
> > From: "Sanjoy Das" < sanjoy at playingwithpointers.com
>
> > To: "Hal Finkel" < hfinkel at anl.gov >
> > Cc: "Chandler Carruth" < chandlerc at google.com >,
"llvm-dev" <
> > llvm-dev at lists.llvm.org >, "Philip Reames"
> > < listmail at philipreames.com >, "Duncan P. N. Exon
Smith" <
> > dexonsmith at apple.com >
> > Sent: Thursday, February 25, 2016 11:59:27 AM
> > Subject: Re: [llvm-dev] Possible soundness issue with
> > available_externally (split from "RFC: Add guard
intrinsics")
> > 
> > Couple of other examples:
> > 
> > void @foo(i32* %ptr) available_externally {
> > %discard = load i32, i32* %ptr
> > }
> > void bar() {
> > call @foo(i32* %x)
> > }
> > 
> > ==>
> > 
> > void @foo(i32* %ptr) available_externally {
> > }
> > void bar() {
> > call @foo(i32* %x)
> > }
> > 
> > ==>
> > 
> > void @foo(i32* %ptr) available_externally {
> > }
> > void bar() {
> > call @foo(i32* undef) ;; non optimized @foo will crash
> > }
> > 
> > ;; Similar example if @foo was dividing something by an integer
> > ;; argument
> > 
> > We've actually seen the above in our VM (though back then we
> > didn't realize that the problem was more general than the one
> > case above).
> > 
> > Another one involving `undef` (semantically same as "folding
> > undef",
> > but different enough to state separately):
> > 
> > void @foo(i32* %ptr) available_externally {
> > store i32 undef, i32* %ptr
> > }
> > void bar() {
> > %val = load i32, i32* %x
> > call @foo(i32* %x)
> > }
> > 
> > ==>
> > 
> > void @foo(i32* %ptr) readonly available_externally {
> > }
> > void bar() {
> > %val = load i32, i32* %x
> > call @foo(i32* %x)
> > }
> > 
> > ==>
> > 
> > void @foo(i32* %ptr) readonly available_externally {
> > }
> > void bar() {
> > call @foo(i32* %x)
> > %val = load i32, i32* %x
> > }
> > 
> > With a non-optimized @foo, %val can be garbage.
> > 
> > 
> > I'll also note we've not really had bug reports (that I'm
aware of)
> > around this issue. Given that, it is possible that this is a purely
> > theoretical problem.
> > 
> > -- Sanjoy
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Chandler Carruth via llvm-dev

2016-Feb-27 03:33 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Fri, Feb 26, 2016 at 7:26 PM Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Chandler Carruth" <chandlerc at google.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>, "Sanjoy
Das" <
> sanjoy at playingwithpointers.com>
> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>,
"Philip Reames" <
> listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> > <dexonsmith at apple.com>, "Xinliang David Li"
<xinliangli at gmail.com>
> > Sent: Friday, February 26, 2016 9:01:48 PM
> > Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> >
> >
> > I think this will have a much higher cost than my proposal to
> > constrain how we deduce function attributes (which still fixes
> > Sanjoy's latest example).
> >
> > Specifically, I think this will force us to constrain far too many
> > transformations for the sake of code size in functions that we
won't
> > inline. Even if we were never going to deduce function attributes
> > for anything in the function (because its big and reads and writes
> > everything), we'll still have to constrain our transformations
just
> > because we *might* later deduce a function attribute that triggers
> > these kinds of bugs.
> >
> > Essentially, you're proposing to limit intraprocedural
optimization
> > to when we can successfully to interprocedural optimization
> > ("privatization"), where I'm suggesting we limit
interprocedural
> > optimization to leave intraprocedural optimization unconstrained.
> > Given the ratio of our optimizations (almost all are intra, very few
> > are inter), I'm much more comfortable with the latter.
>
> This is a good point; we can certainly (easily) delay the privatization
> decision until we modify any IPA-level function information (at which point
> we can either reject the attribute change (when optimizing for code size),
> or keep it locally (when optimizing for speed). Ideally, you'd want to
> delay this even further (until you knew the attribute information was
> used), but I'm not sure that's practical.
>
> Actually, what if instead of actually privatizing, we moved the function
> into a different comdat section named after some hash of the function body?
> That way, if all versions are actually optimized identically, we'll
still
> only end up with one copy in the final executable. If this is technically
> possible, it seems like the best kind of solution.
>
This is how I want to do a revamped function merging anyways and it would
fall out naturally of that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160227/ccac77f5/attachment.html>

Hal Finkel via llvm-dev

2016-Feb-27 03:38 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip
Reames"
> <listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> <dexonsmith at apple.com>, "Xinliang David Li"
<xinliangli at gmail.com>,
> "Sanjoy Das" <sanjoy at playingwithpointers.com>
> Sent: Friday, February 26, 2016 9:33:55 PM
> Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> On Fri, Feb 26, 2016 at 7:26 PM Hal Finkel < hfinkel at anl.gov >
wrote:
> > ----- Original Message -----
> 
> > > From: "Chandler Carruth" < chandlerc at google.com
>
> 
> > > To: "Hal Finkel" < hfinkel at anl.gov >,
"Sanjoy Das" <
> > > sanjoy at playingwithpointers.com >
> 
> > > Cc: "llvm-dev" < llvm-dev at lists.llvm.org >,
"Philip Reames" <
> > > listmail at philipreames.com >, "Duncan P. N. Exon
Smith"
> 
> > > < dexonsmith at apple.com >, "Xinliang David Li"
<
> > > xinliangli at gmail.com >
> 
> > > Sent: Friday, February 26, 2016 9:01:48 PM
> 
> > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > available_externally (split from "RFC: Add guard
intrinsics")
> 
> > >
> 
> > >
> 
> > > I think this will have a much higher cost than my proposal to
> 
> > > constrain how we deduce function attributes (which still fixes
> 
> > > Sanjoy's latest example).
> 
> > >
> 
> > > Specifically, I think this will force us to constrain far too
> > > many
> 
> > > transformations for the sake of code size in functions that we
> > > won't
> 
> > > inline. Even if we were never going to deduce function attributes
> 
> > > for anything in the function (because its big and reads and
> > > writes
> 
> > > everything), we'll still have to constrain our
transformations
> > > just
> 
> > > because we *might* later deduce a function attribute that
> > > triggers
> 
> > > these kinds of bugs.
> 
> > >
> 
> > > Essentially, you're proposing to limit intraprocedural
> > > optimization
> 
> > > to when we can successfully to interprocedural optimization
> 
> > > ("privatization"), where I'm suggesting we limit
interprocedural
> 
> > > optimization to leave intraprocedural optimization unconstrained.
> 
> > > Given the ratio of our optimizations (almost all are intra, very
> > > few
> 
> > > are inter), I'm much more comfortable with the latter.
> 
> > This is a good point; we can certainly (easily) delay the
> > privatization decision until we modify any IPA-level function
> > information (at which point we can either reject the attribute
> > change (when optimizing for code size), or keep it locally (when
> > optimizing for speed). Ideally, you'd want to delay this even
> > further (until you knew the attribute information was used), but
> > I'm
> > not sure that's practical.
> 
> > Actually, what if instead of actually privatizing, we moved the
> > function into a different comdat section named after some hash of
> > the function body? That way, if all versions are actually optimized
> > identically, we'll still only end up with one copy in the final
> > executable. If this is technically possible, it seems like the best
> > kind of solution.
> 
> This is how I want to do a revamped function merging anyways and it
> would fall out naturally of that.Excellent, so let's fix this at the same time we implement this function
merging (so that we don't have performance regressions in an intermediate
state). This will also allow us to have uniform logic across different
optimization levels, which is obviously preferable.

Thanks again, 
Hal 

-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/f1841f42/attachment.html>

Xinliang David Li via llvm-dev

2016-Feb-27 03:59 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Fri, Feb 26, 2016 at 7:26 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Chandler Carruth" <chandlerc at google.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>, "Sanjoy
Das" <
> sanjoy at playingwithpointers.com>
> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>,
"Philip Reames" <
> listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> > <dexonsmith at apple.com>, "Xinliang David Li"
<xinliangli at gmail.com>
> > Sent: Friday, February 26, 2016 9:01:48 PM
> > Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> >
> >
> > I think this will have a much higher cost than my proposal to
> > constrain how we deduce function attributes (which still fixes
> > Sanjoy's latest example).
> >
> > Specifically, I think this will force us to constrain far too many
> > transformations for the sake of code size in functions that we
won't
> > inline. Even if we were never going to deduce function attributes
> > for anything in the function (because its big and reads and writes
> > everything), we'll still have to constrain our transformations
just
> > because we *might* later deduce a function attribute that triggers
> > these kinds of bugs.
> >
> > Essentially, you're proposing to limit intraprocedural
optimization
> > to when we can successfully to interprocedural optimization
> > ("privatization"), where I'm suggesting we limit
interprocedural
> > optimization to leave intraprocedural optimization unconstrained.
> > Given the ratio of our optimizations (almost all are intra, very few
> > are inter), I'm much more comfortable with the latter.
>
> This is a good point; we can certainly (easily) delay the privatization
> decision until we modify any IPA-level function information (at which point
> we can either reject the attribute change (when optimizing for code size),
> or keep it locally (when optimizing for speed). Ideally, you'd want to
> delay this even further (until you knew the attribute information was
> used), but I'm not sure that's practical.
>
> Actually, what if instead of actually privatizing, we moved the function
> into a different comdat section named after some hash of the function body?
> That way, if all versions are actually optimized identically, we'll
still
> only end up with one copy in the final executable. If this is technically
> possible, it seems like the best kind of solution.
>
Funny that you  mention this approach. We are thinking doing this kind of
trick in IR based PGO context -- in order to improve profile matching for
COMDAT functions (when pre-inlining is enabled, the comdat counters are not
guaranteed to match due to possible different inlining ).

David
>
>  -Hal
>
> >
> > On Fri, Feb 26, 2016 at 6:10 PM Hal Finkel < hfinkel at anl.gov
> wrote:
> >
> >
> > Hi Sanjoy,
> >
> > These are both very interesting examples, and demonstrate that the
> > problems extends beyond function attributes (encompassing
> > dead-argument elimination, etc.).
> >
> > I'm beginning to think that the best solution, at least when
> > optimizing for speed, is the one that David Li suggested: we need to
> > internalize functions that have been optimized in certain ways (e.g.
> > instructions with potential side effects are removed). The trick
> > here may be to be as intelligent about this as possible to minimize
> > the effect on code size. Maybe this is as easy as checking whether
> > isSafeToSpeculativelyExecute returns false on the deleted
> > instruction? Perhaps when optimizing for size, we need to forbid
> > such deletions.
> >
> > Thanks again,
> > Hal
> >
> > ----- Original Message -----
> > > From: "Sanjoy Das" < sanjoy at
playingwithpointers.com >
> > > To: "Hal Finkel" < hfinkel at anl.gov >
> > > Cc: "Chandler Carruth" < chandlerc at google.com
>, "llvm-dev" <
> > > llvm-dev at lists.llvm.org >, "Philip Reames"
> > > < listmail at philipreames.com >, "Duncan P. N. Exon
Smith" <
> > > dexonsmith at apple.com >
> > > Sent: Thursday, February 25, 2016 11:59:27 AM
> > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > available_externally (split from "RFC: Add guard
intrinsics")
> > >
> > > Couple of other examples:
> > >
> > > void @foo(i32* %ptr) available_externally {
> > > %discard = load i32, i32* %ptr
> > > }
> > > void bar() {
> > > call @foo(i32* %x)
> > > }
> > >
> > > ==>
> > >
> > > void @foo(i32* %ptr) available_externally {
> > > }
> > > void bar() {
> > > call @foo(i32* %x)
> > > }
> > >
> > > ==>
> > >
> > > void @foo(i32* %ptr) available_externally {
> > > }
> > > void bar() {
> > > call @foo(i32* undef) ;; non optimized @foo will crash
> > > }
> > >
> > > ;; Similar example if @foo was dividing something by an integer
> > > ;; argument
> > >
> > > We've actually seen the above in our VM (though back then we
> > > didn't realize that the problem was more general than the one
> > > case above).
> > >
> > > Another one involving `undef` (semantically same as "folding
> > > undef",
> > > but different enough to state separately):
> > >
> > > void @foo(i32* %ptr) available_externally {
> > > store i32 undef, i32* %ptr
> > > }
> > > void bar() {
> > > %val = load i32, i32* %x
> > > call @foo(i32* %x)
> > > }
> > >
> > > ==>
> > >
> > > void @foo(i32* %ptr) readonly available_externally {
> > > }
> > > void bar() {
> > > %val = load i32, i32* %x
> > > call @foo(i32* %x)
> > > }
> > >
> > > ==>
> > >
> > > void @foo(i32* %ptr) readonly available_externally {
> > > }
> > > void bar() {
> > > call @foo(i32* %x)
> > > %val = load i32, i32* %x
> > > }
> > >
> > > With a non-optimized @foo, %val can be garbage.
> > >
> > >
> > > I'll also note we've not really had bug reports (that
I'm aware of)
> > > around this issue. Given that, it is possible that this is a
purely
> > > theoretical problem.
> > >
> > > -- Sanjoy
> > >
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/faa4723a/attachment.html>

Seemingly Similar Threads

Search for more seemingly similar threads

llvm dev - Feb 2016 - Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Seemingly Similar Threads