thr3ads.net - llvm dev - [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Xinliang David Li via llvm-dev

2016-Feb-27 04:59 UTC

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Fri, Feb 26, 2016 at 8:57 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>
> ------------------------------
>
> *From: *"Xinliang David Li" <xinliangli at gmail.com>
> *To: *"Chandler Carruth" <chandlerc at google.com>
> *Cc: *"Hal Finkel" <hfinkel at anl.gov>,
"llvm-dev" <llvm-dev at lists.llvm.org>,
> "Philip Reames" <listmail at philipreames.com>,
"Duncan P. N. Exon Smith" <
> dexonsmith at apple.com>, "Sanjoy Das" <sanjoy at
playingwithpointers.com>
> *Sent: *Friday, February 26, 2016 10:48:13 PM
> *Subject: *Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
>
>
>
> On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <chandlerc at
google.com>
> wrote:
>
>> On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <xinliangli at
gmail.com>
>> wrote:
>>
>>> On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <chandlerc at
google.com>
>>> wrote:
>>>
>>>> On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <hfinkel at
anl.gov> wrote:
>>>>
>>>>> Can you explain your concerns about reliability? My idea is
that we
>>>>> have a kind of delayed-name comdat section attribute that
says that the
>>>>> name of the comdat section into which the function will be
placed will be
>>>>> based on the hash of the contents (which should probably be
the pre-isel
>>>>> IR-level contexts plus perhaps the other state that might
affect backend
>>>>> optimization). Maybe just using the MI-level function
contents will work
>>>>> too. It seems like this should be "perfect". I
don't see the circumstances
>>>>> why function definitions optimized under identical
optimization settings
>>>>> should differ in this sense.
>>>>>
>>>>
>>>> Ok, let's consider Sanjoy's latest post.
>>>>
>>>> I think we could end up with this routine in N different
modules with
>>>> slightly different amounts of information in each. Each of
these in turn
>>>> could cause us to deduce a slightly different set of
attributes, and thus
>>>> compute a different hash.
>>>>
>>>> Now, certainly, there are a small number of variants here. Two,
maybe
>>>> three. But this could in the worst case cause us to have two
copies of a
>>>> *huge* number of functions.
>>>>
>>>> My concern here is not that this is a *theoretically* hard
problem to
>>>> solve. I think there are good techniques to handle this. My
concern is that
>>>> it will be a substantial amount of work and require a
reasonable amount of
>>>> testing and experimentation.
>>>>
>>>
>>> Such data won't be too hard to find out. It can be
conservatively
>>> estimated by looking at the final binary code for each COMDAT to
see how
>>> many 'clones's are needed due to different code gen -- the
result is the
>>> upper bound as different code gen does not mean different deduced
function
>>> attributes.
>>>
>>> With PGO, the decision process will also become much simpler.
>>>
>>
>> Sure. And if you want to work on privatization in LLVM's IPO, that
would
>> be awesome.
>>
>> But I think it is completely unreasonable to gate checking in code to
>> test for the invalid transformation until this new optimization
technique
>> is available. We should be fixing the bug now, and then figuring out
how to
>> do privatization.
>>
>
> Isn't privatization one way to get correctness? Other techniques such
as
> hash based de-dup or use PGO etc for tuning are orthognal methods to reduce
> the overall cost of the former.
>
>
> It is, and I agree. However, the code-size growth seems like it could
> easily be unacceptable even for speed-optimized builds without the kind of
> mitigation options mentioned.
>
Except that there might be icache/itlb impact so performance is not totally
immune to size increase -- otherwise we may as well just force inlining
those :)

David

>
>  -Hal
>
>
> David
>
>>
>> -Chandler
>>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/aa0be4c8/attachment.html>

Hal Finkel via llvm-dev

2016-Feb-27 05:07 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----
> From: "Xinliang David Li" <xinliangli at gmail.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>, "Philip
Reames"
> <listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> <dexonsmith at apple.com>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>, "Chandler Carruth"
> <chandlerc at google.com>
> Sent: Friday, February 26, 2016 10:59:40 PM
> Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> On Fri, Feb 26, 2016 at 8:57 PM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> > > From: "Xinliang David Li" < xinliangli at gmail.com
>
> > 
> 
> > > To: "Chandler Carruth" < chandlerc at google.com
>
> > 
> 
> > > Cc: "Hal Finkel" < hfinkel at anl.gov >,
"llvm-dev" <
> > > llvm-dev at lists.llvm.org >, "Philip Reames" <
> > > listmail at philipreames.com >, "Duncan P. N. Exon
Smith" <
> > > dexonsmith at apple.com >, "Sanjoy Das" <
> > > sanjoy at playingwithpointers.com >
> > 
> 
> > > Sent: Friday, February 26, 2016 10:48:13 PM
> > 
> 
> > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > available_externally (split from "RFC: Add guard
intrinsics")
> > 
> 
> > > On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <
> > > chandlerc at google.com > wrote:
> > 
> 
> > > > On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <
> > > > xinliangli at gmail.com > wrote:
> > > 
> > 
> 
> > > > > On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <
> > > > > chandlerc at google.com > wrote:
> > > > 
> > > 
> > 
> 
> > > > > > On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <
> > > > > > hfinkel at anl.gov
> > > > > > >
> > > > > > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Can you explain your concerns about
reliability? My idea
> > > > > > > is
> > > > > > > that
> > > > > > > we
> > > > > > > have a kind of delayed-name comdat section
attribute that
> > > > > > > says
> > > > > > > that
> > > > > > > the name of the comdat section into which the
function
> > > > > > > will
> > > > > > > be
> > > > > > > placed will be based on the hash of the
contents (which
> > > > > > > should
> > > > > > > probably be the pre-isel IR-level contexts
plus perhaps
> > > > > > > the
> > > > > > > other
> > > > > > > state that might affect backend
optimization). Maybe just
> > > > > > > using
> > > > > > > the
> > > > > > > MI-level function contents will work too. It
seems like
> > > > > > > this
> > > > > > > should
> > > > > > > be "perfect". I don't see the
circumstances why function
> > > > > > > definitions
> > > > > > > optimized under identical optimization
settings should
> > > > > > > differ
> > > > > > > in
> > > > > > > this sense.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Ok, let's consider Sanjoy's latest post.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > I think we could end up with this routine in N
different
> > > > > > modules
> > > > > > with
> > > > > > slightly different amounts of information in each.
Each of
> > > > > > these
> > > > > > in
> > > > > > turn could cause us to deduce a slightly different
set of
> > > > > > attributes, and thus compute a different hash.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Now, certainly, there are a small number of
variants here.
> > > > > > Two,
> > > > > > maybe
> > > > > > three. But this could in the worst case cause us
to have
> > > > > > two
> > > > > > copies
> > > > > > of a *huge* number of functions.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > My concern here is not that this is a
*theoretically* hard
> > > > > > problem
> > > > > > to
> > > > > > solve. I think there are good techniques to handle
this. My
> > > > > > concern
> > > > > > is that it will be a substantial amount of work
and require
> > > > > > a
> > > > > > reasonable amount of testing and experimentation.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > Such data won't be too hard to find out. It can be
> > > > > conservatively
> > > > > estimated by looking at the final binary code for each
COMDAT
> > > > > to
> > > > > see
> > > > > how many 'clones's are needed due to different
code gen --
> > > > > the
> > > > > result is the upper bound as different code gen does
not mean
> > > > > different deduced function attributes.
> > > > 
> > > 
> > 
> 
> > > > > With PGO, the decision process will also become much
simpler.
> > > > 
> > > 
> > 
> 
> > > > Sure. And if you want to work on privatization in LLVM's
IPO,
> > > > that
> > > > would be awesome.
> > > 
> > 
> 
> > > > But I think it is completely unreasonable to gate checking
in
> > > > code
> > > > to
> > > > test for the invalid transformation until this new
optimization
> > > > technique is available. We should be fixing the bug now, and
> > > > then
> > > > figuring out how to do privatization.
> > > 
> > 
> 
> > > Isn't privatization one way to get correctness? Other
techniques
> > > such
> > > as hash based de-dup or use PGO etc for tuning are orthognal
> > > methods
> > > to reduce the overall cost of the former.
> > 
> 
> > It is, and I agree. However, the code-size growth seems like it
> > could
> > easily be unacceptable even for speed-optimized builds without the
> > kind of mitigation options mentioned.
> 
> Except that there might be icache/itlb impact so performance is not
> totally immune to size increase -- otherwise we may as well just
> force inlining those :)This is why I am hoping we can aggressively apply that kind of privatization.
However, it might turn out not to be the case.

-Hal 
> David
> > -Hal
> 
> > > David
> > 
> 
> > > > -Chandler
> > > 
> > 
> 
> > --
> 
> > Hal Finkel
> 
> > Assistant Computational Scientist
> 
> > Leadership Computing Facility
> 
> > Argonne National Laboratory
> 
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/f802b3c3/attachment-0001.html>

Hal Finkel via llvm-dev

2016-Feb-27 05:17 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----
> From: "Hal Finkel via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "Xinliang David Li" <xinliangli at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Friday, February 26, 2016 11:07:35 PM
> Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> ----- Original Message -----
> > From: "Xinliang David Li" <xinliangli at gmail.com>
> 
> > To: "Hal Finkel" <hfinkel at anl.gov>
> 
> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org>,
"Philip Reames"
> > <listmail at philipreames.com>, "Duncan P. N. Exon
Smith"
> > <dexonsmith at apple.com>, "Sanjoy Das"
> > <sanjoy at playingwithpointers.com>, "Chandler
Carruth"
> > <chandlerc at google.com>
> 
> > Sent: Friday, February 26, 2016 10:59:40 PM
> 
> > Subject: Re: [llvm-dev] Possible soundness issue with
> > available_externally (split from "RFC: Add guard
intrinsics")
> 
> > On Fri, Feb 26, 2016 at 8:57 PM, Hal Finkel < hfinkel at anl.gov
>
> > wrote:
> 
> > > > From: "Xinliang David Li" < xinliangli at
gmail.com >
> > > 
> > 
> 
> > > > To: "Chandler Carruth" < chandlerc at
google.com >
> > > 
> > 
> 
> > > > Cc: "Hal Finkel" < hfinkel at anl.gov >,
"llvm-dev" <
> > > > llvm-dev at lists.llvm.org >, "Philip Reames"
<
> > > > listmail at philipreames.com >, "Duncan P. N. Exon
Smith" <
> > > > dexonsmith at apple.com >, "Sanjoy Das" <
> > > > sanjoy at playingwithpointers.com >
> > > 
> > 
> 
> > > > Sent: Friday, February 26, 2016 10:48:13 PM
> > > 
> > 
> 
> > > > Subject: Re: [llvm-dev] Possible soundness issue with
> > > > available_externally (split from "RFC: Add guard
intrinsics")
> > > 
> > 
> 
> > > > On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <
> > > > chandlerc at google.com > wrote:
> > > 
> > 
> 
> > > > > On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <
> > > > > xinliangli at gmail.com > wrote:
> > > > 
> > > 
> > 
> 
> > > > > > On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth
<
> > > > > > chandlerc at google.com > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel
<
> > > > > > > hfinkel at anl.gov
> > > > > > > >
> > > > > > > wrote:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Can you explain your concerns about
reliability? My
> > > > > > > > idea
> > > > > > > > is
> > > > > > > > that
> > > > > > > > we
> > > > > > > > have a kind of delayed-name comdat
section attribute
> > > > > > > > that
> > > > > > > > says
> > > > > > > > that
> > > > > > > > the name of the comdat section into
which the function
> > > > > > > > will
> > > > > > > > be
> > > > > > > > placed will be based on the hash of the
contents (which
> > > > > > > > should
> > > > > > > > probably be the pre-isel IR-level
contexts plus perhaps
> > > > > > > > the
> > > > > > > > other
> > > > > > > > state that might affect backend
optimization). Maybe
> > > > > > > > just
> > > > > > > > using
> > > > > > > > the
> > > > > > > > MI-level function contents will work
too. It seems like
> > > > > > > > this
> > > > > > > > should
> > > > > > > > be "perfect". I don't see
the circumstances why
> > > > > > > > function
> > > > > > > > definitions
> > > > > > > > optimized under identical optimization
settings should
> > > > > > > > differ
> > > > > > > > in
> > > > > > > > this sense.
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Ok, let's consider Sanjoy's latest
post.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I think we could end up with this routine in
N different
> > > > > > > modules
> > > > > > > with
> > > > > > > slightly different amounts of information in
each. Each
> > > > > > > of
> > > > > > > these
> > > > > > > in
> > > > > > > turn could cause us to deduce a slightly
different set of
> > > > > > > attributes, and thus compute a different
hash.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Now, certainly, there are a small number of
variants
> > > > > > > here.
> > > > > > > Two,
> > > > > > > maybe
> > > > > > > three. But this could in the worst case cause
us to have
> > > > > > > two
> > > > > > > copies
> > > > > > > of a *huge* number of functions.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > My concern here is not that this is a
*theoretically*
> > > > > > > hard
> > > > > > > problem
> > > > > > > to
> > > > > > > solve. I think there are good techniques to
handle this.
> > > > > > > My
> > > > > > > concern
> > > > > > > is that it will be a substantial amount of
work and
> > > > > > > require
> > > > > > > a
> > > > > > > reasonable amount of testing and
experimentation.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > Such data won't be too hard to find out. It
can be
> > > > > > conservatively
> > > > > > estimated by looking at the final binary code for
each
> > > > > > COMDAT
> > > > > > to
> > > > > > see
> > > > > > how many 'clones's are needed due to
different code gen --
> > > > > > the
> > > > > > result is the upper bound as different code gen
does not
> > > > > > mean
> > > > > > different deduced function attributes.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > With PGO, the decision process will also become
much
> > > > > > simpler.
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > Sure. And if you want to work on privatization in
LLVM's IPO,
> > > > > that
> > > > > would be awesome.
> > > > 
> > > 
> > 
> 
> > > > > But I think it is completely unreasonable to gate
checking in
> > > > > code
> > > > > to
> > > > > test for the invalid transformation until this new
> > > > > optimization
> > > > > technique is available. We should be fixing the bug
now, and
> > > > > then
> > > > > figuring out how to do privatization.
> > > > 
> > > 
> > 
> 
> > > > Isn't privatization one way to get correctness? Other
> > > > techniques
> > > > such
> > > > as hash based de-dup or use PGO etc for tuning are orthognal
> > > > methods
> > > > to reduce the overall cost of the former.
> > > 
> > 
> 
> > > It is, and I agree. However, the code-size growth seems like it
> > > could
> > > easily be unacceptable even for speed-optimized builds without
> > > the
> > > kind of mitigation options mentioned.
> > 
> 
> > Except that there might be icache/itlb impact so performance is not
> > totally immune to size increase -- otherwise we may as well just
> > force inlining those :)
> 
> This is why I am hoping we can aggressively apply that kind of
> privatization. However, it might turn out not to be the case.
But to be clear, my point is that (at least when optimizing for speed), we
should try this when fixing the bug. It might turn out we can't afford the
size increase, but we need to make a quantitative decision on that. If we can
deal with the increase, then it seems like we can then work on recovering the
code-size losses using the mechanisms discussed. These might require tuning to
be acceptable, or the tuning might itself be an extra nicety to further recover
code size regressions.

I'm assuming here that privatization can be accomplished simply by switching
the linkage on the current function (although do we need to do something special
for function pointers?), and so the implementation cost is not high.

-Hal 
> -Hal
> > David
> 
> > > -Hal
> > 
> 
> > > > David
> > > 
> > 
> 
> > > > > -Chandler
> > > > 
> > > 
> > 
> 
> > > --
> > 
> 
> > > Hal Finkel
> > 
> 
> > > Assistant Computational Scientist
> > 
> 
> > > Leadership Computing Facility
> > 
> 
> > > Argonne National Laboratory
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/7dcfc3b1/attachment.html>

Reasonably Related Threads

Search for more possibly parallel threads

llvm dev - Feb 2016 - Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Reasonably Related Threads