thr3ads.net - llvm dev - [llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics") [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Xinliang David Li via llvm-dev

2016-Feb-27 04:48 UTC

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <chandlerc at
google.com>
wrote:
> On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <xinliangli at
gmail.com>
> wrote:
>
>> On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <chandlerc at
google.com>
>> wrote:
>>
>>> On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <hfinkel at
anl.gov> wrote:
>>>
>>>> Can you explain your concerns about reliability? My idea is
that we
>>>> have a kind of delayed-name comdat section attribute that says
that the
>>>> name of the comdat section into which the function will be
placed will be
>>>> based on the hash of the contents (which should probably be the
pre-isel
>>>> IR-level contexts plus perhaps the other state that might
affect backend
>>>> optimization). Maybe just using the MI-level function contents
will work
>>>> too. It seems like this should be "perfect". I
don't see the circumstances
>>>> why function definitions optimized under identical optimization
settings
>>>> should differ in this sense.
>>>>
>>>
>>> Ok, let's consider Sanjoy's latest post.
>>>
>>> I think we could end up with this routine in N different modules
with
>>> slightly different amounts of information in each. Each of these in
turn
>>> could cause us to deduce a slightly different set of attributes,
and thus
>>> compute a different hash.
>>>
>>> Now, certainly, there are a small number of variants here. Two,
maybe
>>> three. But this could in the worst case cause us to have two copies
of a
>>> *huge* number of functions.
>>>
>>> My concern here is not that this is a *theoretically* hard problem
to
>>> solve. I think there are good techniques to handle this. My concern
is that
>>> it will be a substantial amount of work and require a reasonable
amount of
>>> testing and experimentation.
>>>
>>
>> Such data won't be too hard to find out. It can be conservatively
>> estimated by looking at the final binary code for each COMDAT to see
how
>> many 'clones's are needed due to different code gen -- the
result is the
>> upper bound as different code gen does not mean different deduced
function
>> attributes.
>>
>> With PGO, the decision process will also become much simpler.
>>
>
> Sure. And if you want to work on privatization in LLVM's IPO, that
would
> be awesome.
>
> But I think it is completely unreasonable to gate checking in code to test
> for the invalid transformation until this new optimization technique is
> available. We should be fixing the bug now, and then figuring out how to do
> privatization.
>
Isn't privatization one way to get correctness? Other techniques such as
hash based de-dup or use PGO etc for tuning are orthognal methods to reduce
the overall cost of the former.

David
>
> -Chandler
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/fd96d0dc/attachment.html>

Hal Finkel via llvm-dev

2016-Feb-27 04:57 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

----- Original Message -----
> From: "Xinliang David Li" <xinliangli at gmail.com>
> To: "Chandler Carruth" <chandlerc at google.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev"
> <llvm-dev at lists.llvm.org>, "Philip Reames"
> <listmail at philipreames.com>, "Duncan P. N. Exon Smith"
> <dexonsmith at apple.com>, "Sanjoy Das"
> <sanjoy at playingwithpointers.com>
> Sent: Friday, February 26, 2016 10:48:13 PM
> Subject: Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
> On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <
> chandlerc at google.com > wrote:
> > On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <
> > xinliangli at gmail.com > wrote:
> 
> > > On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <
> > > chandlerc at google.com > wrote:
> > 
> 
> > > > On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel < hfinkel at
anl.gov >
> > > > wrote:
> > > 
> > 
> 
> > > > > Can you explain your concerns about reliability? My
idea is
> > > > > that
> > > > > we
> > > > > have a kind of delayed-name comdat section attribute
that
> > > > > says
> > > > > that
> > > > > the name of the comdat section into which the function
will
> > > > > be
> > > > > placed will be based on the hash of the contents (which
> > > > > should
> > > > > probably be the pre-isel IR-level contexts plus perhaps
the
> > > > > other
> > > > > state that might affect backend optimization). Maybe
just
> > > > > using
> > > > > the
> > > > > MI-level function contents will work too. It seems like
this
> > > > > should
> > > > > be "perfect". I don't see the
circumstances why function
> > > > > definitions
> > > > > optimized under identical optimization settings should
differ
> > > > > in
> > > > > this sense.
> > > > 
> > > 
> > 
> 
> > > > Ok, let's consider Sanjoy's latest post.
> > > 
> > 
> 
> > > > I think we could end up with this routine in N different
> > > > modules
> > > > with
> > > > slightly different amounts of information in each. Each of
> > > > these
> > > > in
> > > > turn could cause us to deduce a slightly different set of
> > > > attributes, and thus compute a different hash.
> > > 
> > 
> 
> > > > Now, certainly, there are a small number of variants here.
Two,
> > > > maybe
> > > > three. But this could in the worst case cause us to have two
> > > > copies
> > > > of a *huge* number of functions.
> > > 
> > 
> 
> > > > My concern here is not that this is a *theoretically* hard
> > > > problem
> > > > to
> > > > solve. I think there are good techniques to handle this. My
> > > > concern
> > > > is that it will be a substantial amount of work and require
a
> > > > reasonable amount of testing and experimentation.
> > > 
> > 
> 
> > > Such data won't be too hard to find out. It can be
conservatively
> > > estimated by looking at the final binary code for each COMDAT to
> > > see
> > > how many 'clones's are needed due to different code gen
-- the
> > > result is the upper bound as different code gen does not mean
> > > different deduced function attributes.
> > 
> 
> > > With PGO, the decision process will also become much simpler.
> > 
> 
> > Sure. And if you want to work on privatization in LLVM's IPO, that
> > would be awesome.
> 
> > But I think it is completely unreasonable to gate checking in code
> > to
> > test for the invalid transformation until this new optimization
> > technique is available. We should be fixing the bug now, and then
> > figuring out how to do privatization.
> 
> Isn't privatization one way to get correctness? Other techniques such
> as hash based de-dup or use PGO etc for tuning are orthognal methods
> to reduce the overall cost of the former.It is, and I agree. However, the code-size growth seems like it could easily be
unacceptable even for speed-optimized builds without the kind of mitigation
options mentioned.

-Hal 
> David
> > -Chandler
> -- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/3eb36468/attachment.html>

Xinliang David Li via llvm-dev

2016-Feb-27 04:59 UTC

head link

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

On Fri, Feb 26, 2016 at 8:57 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>
> ------------------------------
>
> *From: *"Xinliang David Li" <xinliangli at gmail.com>
> *To: *"Chandler Carruth" <chandlerc at google.com>
> *Cc: *"Hal Finkel" <hfinkel at anl.gov>,
"llvm-dev" <llvm-dev at lists.llvm.org>,
> "Philip Reames" <listmail at philipreames.com>,
"Duncan P. N. Exon Smith" <
> dexonsmith at apple.com>, "Sanjoy Das" <sanjoy at
playingwithpointers.com>
> *Sent: *Friday, February 26, 2016 10:48:13 PM
> *Subject: *Re: [llvm-dev] Possible soundness issue with
> available_externally (split from "RFC: Add guard intrinsics")
>
>
>
> On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <chandlerc at
google.com>
> wrote:
>
>> On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <xinliangli at
gmail.com>
>> wrote:
>>
>>> On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <chandlerc at
google.com>
>>> wrote:
>>>
>>>> On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <hfinkel at
anl.gov> wrote:
>>>>
>>>>> Can you explain your concerns about reliability? My idea is
that we
>>>>> have a kind of delayed-name comdat section attribute that
says that the
>>>>> name of the comdat section into which the function will be
placed will be
>>>>> based on the hash of the contents (which should probably be
the pre-isel
>>>>> IR-level contexts plus perhaps the other state that might
affect backend
>>>>> optimization). Maybe just using the MI-level function
contents will work
>>>>> too. It seems like this should be "perfect". I
don't see the circumstances
>>>>> why function definitions optimized under identical
optimization settings
>>>>> should differ in this sense.
>>>>>
>>>>
>>>> Ok, let's consider Sanjoy's latest post.
>>>>
>>>> I think we could end up with this routine in N different
modules with
>>>> slightly different amounts of information in each. Each of
these in turn
>>>> could cause us to deduce a slightly different set of
attributes, and thus
>>>> compute a different hash.
>>>>
>>>> Now, certainly, there are a small number of variants here. Two,
maybe
>>>> three. But this could in the worst case cause us to have two
copies of a
>>>> *huge* number of functions.
>>>>
>>>> My concern here is not that this is a *theoretically* hard
problem to
>>>> solve. I think there are good techniques to handle this. My
concern is that
>>>> it will be a substantial amount of work and require a
reasonable amount of
>>>> testing and experimentation.
>>>>
>>>
>>> Such data won't be too hard to find out. It can be
conservatively
>>> estimated by looking at the final binary code for each COMDAT to
see how
>>> many 'clones's are needed due to different code gen -- the
result is the
>>> upper bound as different code gen does not mean different deduced
function
>>> attributes.
>>>
>>> With PGO, the decision process will also become much simpler.
>>>
>>
>> Sure. And if you want to work on privatization in LLVM's IPO, that
would
>> be awesome.
>>
>> But I think it is completely unreasonable to gate checking in code to
>> test for the invalid transformation until this new optimization
technique
>> is available. We should be fixing the bug now, and then figuring out
how to
>> do privatization.
>>
>
> Isn't privatization one way to get correctness? Other techniques such
as
> hash based de-dup or use PGO etc for tuning are orthognal methods to reduce
> the overall cost of the former.
>
>
> It is, and I agree. However, the code-size growth seems like it could
> easily be unacceptable even for speed-optimized builds without the kind of
> mitigation options mentioned.
>
Except that there might be icache/itlb impact so performance is not totally
immune to size increase -- otherwise we may as well just force inlining
those :)

David

>
>  -Hal
>
>
> David
>
>>
>> -Chandler
>>
>
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/aa0be4c8/attachment.html>

Apparently Analagous Threads

Search for more maybe matching threads

llvm dev - Feb 2016 - Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")

Apparently Analagous Threads