Hal Finkel via llvm-dev
2016-Feb-27 05:35 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Xinliang David Li" <xinliangli at gmail.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip Reames" <listmail at philipreames.com>, "Duncan P. N. Exon Smith" <dexonsmith at apple.com>, "Sanjoy Das" <sanjoy at playingwithpointers.com>
> Sent: Friday, February 26, 2016 10:43:30 PM
> Subject: Re: [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
>
> On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <xinliangli at gmail.com> wrote:
>> On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <chandlerc at google.com> wrote:
>>> On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <hfinkel at anl.gov> wrote:
>>>>
>>>> Can you explain your concerns about reliability? My idea is that we have a kind of delayed-name comdat section attribute that says that the name of the comdat section into which the function will be placed will be based on a hash of its contents (probably the pre-isel IR-level contents, plus perhaps any other state that might affect backend optimization). Maybe just using the MI-level function contents will work too. It seems like this should be "perfect": I don't see the circumstances under which function definitions optimized under identical optimization settings should differ in this sense.
>>>
>>> Ok, let's consider Sanjoy's latest post.
>>>
>>> I think we could end up with this routine in N different modules with slightly different amounts of information in each. Each of these in turn could cause us to deduce a slightly different set of attributes, and thus compute a different hash.
>>>
>>> Now, certainly, there are a small number of variants here. Two, maybe three. But this could in the worst case cause us to have two copies of a *huge* number of functions.
>>>
>>> My concern here is not that this is a *theoretically* hard problem to solve. I think there are good techniques to handle this. My concern is that it will be a substantial amount of work and require a reasonable amount of testing and experimentation.
>>
>> Such data won't be too hard to find out. It can be conservatively estimated by looking at the final binary code for each COMDAT to see how many 'clones' are needed due to different codegen -- the result is an upper bound, since different codegen does not imply different deduced function attributes.
>>
>> With PGO, the decision process will also become much simpler.
>
> Sure. And if you want to work on privatization in LLVM's IPO, that would be awesome.
>
> But I think it is completely unreasonable to gate checking in code to test for the invalid transformation until this new optimization technique is available. We should be fixing the bug now, and then figuring out how to do privatization.

In any case, please don't misunderstand. I'm not trying to gate bug fixes on significant new optimization features. I was (and am) suggesting that we not rush to just turning things off until we've given ourselves some time (at least a few days) to understand the extent of the problem and the options available for fixing it, and have a plan for doing so without introducing (or at least soon recovering from) significant regressions, if at all possible.

Turning things off is always an easy (and available) solution, and one of the strengths of this project is that we're generally hesitant to use it. This is a pretty fundamental design problem that was uncovered "by inspection"; whether it gets fixed tomorrow or two weeks from now is not a significant distinction. Every time I've pushed back like this, the resulting discussion has been very productive (IMHO), and I'm glad I did. And, as I've stated, given that we do have solutions on the table that can fix the problem without overt performance regressions (i.e., aggressive privatization), we should prefer them unless they prove impractically complicated to implement in the short term.

And I think this issue is worth a significant amount of discussion. Every C++ inline method, constructor, etc. ends up in a comdat section. To pick one example: almost all of these take a "this" pointer, and we very commonly prove that "this" is not captured. I suspect that giving up that alone is going to be a significant problem.

 -Hal

> -Chandler

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
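For concreteness, the content-hash scheme Hal describes might look roughly like the sketch below at the IR level. This is only an illustration: the helper name `assignContentHashedComdat` is invented here, and it simply hashes the textual IR of the function, whereas a real implementation would want a canonical, rename-insensitive encoding and would run only after all IR-level optimization, as Hal notes.

```cpp
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Comdat.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Hypothetical sketch: place F into a comdat whose name is derived from a
// hash of its (textual) IR, so that two modules fold their copies only when
// the bodies they actually emit are identical.
static void assignContentHashedComdat(Module &M, Function &F) {
  if (F.isDeclaration())
    return;

  // Print the function and hash the text. A production version would use a
  // canonical encoding that is insensitive to, e.g., value naming.
  std::string Body;
  raw_string_ostream OS(Body);
  F.print(OS);
  OS.flush();

  std::string ComdatName =
      F.getName().str() + "." + utostr((uint64_t)hash_value(StringRef(Body)));

  Comdat *C = M.getOrInsertComdat(ComdatName);
  C->setSelectionKind(Comdat::Any);
  F.setComdat(C);
}
```

Note that this only shows how a content-derived comdat name could be computed and attached; the harder questions raised in the thread -- how callers bind to the copy whose deduced attributes they relied on, and how many near-identical copies end up in the final image -- are untouched by it.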
Sean Silva via llvm-dev
2016-Feb-27 05:43 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Fri, Feb 26, 2016 at 9:35 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> [earlier quoted thread snipped; see Hal Finkel's message above]
>
> I was (and am) suggesting that we not rush to just turning things off until we've given ourselves some time (at least a few days) to understand the extent of the problem and the options available for fixing it, and have a plan for doing so without introducing (or at least soon recovering from) significant regressions (if at all possible).

Yeah. If we have survived for years with this issue present, I think we can take the time to discuss a bit more, considering the potential implications.

I think we probably want some quantitative information from at least one practical codebase before diving into anything that could cause substantial performance regressions. E.g., if we found out tomorrow that, due to a typo, we have actually had fast-math on for everything for years now, we would obviously want to fix it, but it would still be worth pausing to ask how we survived so long without it coming to our attention through a bug report. David and Andy clearly do have experience seeing bugs from this in practice. Why haven't we? (Or have we, but never connected the dots?)

-- Sean Silva
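To make the kind of miscompile under discussion concrete, here is a hypothetical source-level shape it can take (all names below are invented for illustration). Both translation units see the same source for the inline function, yet their optimized copies can legitimately differ, and attribute deduction performed on one copy can end up applied to calls that bind to the other.

```cpp
// widget.h -- hypothetical example
struct Widget;
void registerWidget(Widget *w);  // defined in exactly one translation unit

struct Widget {
  int value = 0;
  // Inline method: every TU that uses it emits its own linkonce_odr copy
  // into a comdat, and the linker keeps exactly one of them.
  void init() { registerWidget(this); }
};

// tu_a.cpp:
//   #include "widget.h"
//   void registerWidget(Widget *) {}   // visible here, and does nothing
//
// After inlining the empty registerWidget(), tu_a's copy of Widget::init()
// no longer lets 'this' escape, so intra-module analysis may mark 'this'
// nocapture on that copy and optimize tu_a's callers of init() accordingly.
//
// tu_b.cpp:
//   #include "widget.h"
//   // only the declaration of registerWidget() is visible here
//
// tu_b's copy of Widget::init() really does pass 'this' to an opaque
// function. The linker is free to keep tu_b's copy; callers in tu_a were
// then optimized against a property (nocapture 'this') that the surviving
// definition does not have.
```

Whether a given build actually misbehaves then depends on the inliner's decisions, which copy the linker keeps, and whether any caller materially relied on the deduced attribute -- which may be part of the answer to Sean's question about why such bugs are so rarely reported.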
Sanjoy Das via llvm-dev
2016-Feb-27 05:53 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Fri, Feb 26, 2016 at 9:43 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> [earlier quoted thread snipped; see the messages above]
>
> David and Andy clearly do have experience seeing bugs for this in practice. Why haven't we? (or have we but just never connected the dots?)

We have, e.g. http://reviews.llvm.org/rL192302 (the only one I can find).

-- Sanjoy
Mehdi Amini via llvm-dev
2016-Feb-29 21:30 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
> On Feb 26, 2016, at 9:35 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> [earlier quoted thread snipped; see Hal Finkel's message above]
>
> And I think this issue is worth a significant amount of discussion. Every C++ inline method, constructor, etc. ends up in a comdat section. To pick one example: almost all of these take a "this" pointer, and we very commonly prove that "this" is not captured. I suspect that giving up that alone is going to be a significant problem.

I'm coming back to this email, since you are concerned about the performance impact of this (and I think it is an important aspect).

It is worth mentioning that if you perform an LTO build (which you will do if you're serious about performance anyway), then *most* ODR functions will be internalized and you won't have any performance problem at all.

-- Mehdi
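As a rough illustration of the effect Mehdi is describing, LTO-time internalization conceptually does something like the following to ODR functions that are not exported from the final image. The helper and the `IsExported` predicate are invented for this sketch; the real logic lives in LLVM's LTO/Internalize support and is more involved.

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Hypothetical sketch: once LTO can see every use of a linkonce_odr /
// weak_odr function, any copy that is not exported from the final image can
// simply be given internal linkage. After that, the definition the
// optimizer deduces attributes from is the one that actually links, so the
// soundness problem discussed in this thread disappears for that function.
static void internalizeODRFunctions(
    Module &M, function_ref<bool(const Function &)> IsExported) {
  for (Function &F : M) {
    if (F.isDeclaration() || IsExported(F))
      continue;
    if (F.hasLinkOnceODRLinkage() || F.hasWeakODRLinkage()) {
      F.setVisibility(GlobalValue::DefaultVisibility);
      F.setLinkage(GlobalValue::InternalLinkage);
    }
  }
}
```

A non-LTO build, of course, still has to answer the question for the copies that remain visible across object files.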