Xinliang David Li via llvm-dev
2016-Feb-27 04:48 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <chandlerc at google.com> wrote:> On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <xinliangli at gmail.com> > wrote: > >> On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <chandlerc at google.com> >> wrote: >> >>> On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <hfinkel at anl.gov> wrote: >>> >>>> Can you explain your concerns about reliability? My idea is that we >>>> have a kind of delayed-name comdat section attribute that says that the >>>> name of the comdat section into which the function will be placed will be >>>> based on the hash of the contents (which should probably be the pre-isel >>>> IR-level contexts plus perhaps the other state that might affect backend >>>> optimization). Maybe just using the MI-level function contents will work >>>> too. It seems like this should be "perfect". I don't see the circumstances >>>> why function definitions optimized under identical optimization settings >>>> should differ in this sense. >>>> >>> >>> Ok, let's consider Sanjoy's latest post. >>> >>> I think we could end up with this routine in N different modules with >>> slightly different amounts of information in each. Each of these in turn >>> could cause us to deduce a slightly different set of attributes, and thus >>> compute a different hash. >>> >>> Now, certainly, there are a small number of variants here. Two, maybe >>> three. But this could in the worst case cause us to have two copies of a >>> *huge* number of functions. >>> >>> My concern here is not that this is a *theoretically* hard problem to >>> solve. I think there are good techniques to handle this. My concern is that >>> it will be a substantial amount of work and require a reasonable amount of >>> testing and experimentation. >>> >> >> Such data won't be too hard to find out. It can be conservatively >> estimated by looking at the final binary code for each COMDAT to see how >> many 'clones's are needed due to different code gen -- the result is the >> upper bound as different code gen does not mean different deduced function >> attributes. >> >> With PGO, the decision process will also become much simpler. >> > > Sure. And if you want to work on privatization in LLVM's IPO, that would > be awesome. > > But I think it is completely unreasonable to gate checking in code to test > for the invalid transformation until this new optimization technique is > available. We should be fixing the bug now, and then figuring out how to do > privatization. >Isn't privatization one way to get correctness? Other techniques such as hash based de-dup or use PGO etc for tuning are orthognal methods to reduce the overall cost of the former. David> > -Chandler >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/fd96d0dc/attachment.html>
Hal Finkel via llvm-dev
2016-Feb-27 04:57 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
----- Original Message -----> From: "Xinliang David Li" <xinliangli at gmail.com> > To: "Chandler Carruth" <chandlerc at google.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" > <llvm-dev at lists.llvm.org>, "Philip Reames" > <listmail at philipreames.com>, "Duncan P. N. Exon Smith" > <dexonsmith at apple.com>, "Sanjoy Das" > <sanjoy at playingwithpointers.com> > Sent: Friday, February 26, 2016 10:48:13 PM > Subject: Re: [llvm-dev] Possible soundness issue with > available_externally (split from "RFC: Add guard intrinsics")> On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth < > chandlerc at google.com > wrote:> > On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li < > > xinliangli at gmail.com > wrote: >> > > On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth < > > > chandlerc at google.com > wrote: > > >> > > > On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel < hfinkel at anl.gov > > > > > wrote: > > > > > > > > > > > Can you explain your concerns about reliability? My idea is > > > > > that > > > > > we > > > > > have a kind of delayed-name comdat section attribute that > > > > > says > > > > > that > > > > > the name of the comdat section into which the function will > > > > > be > > > > > placed will be based on the hash of the contents (which > > > > > should > > > > > probably be the pre-isel IR-level contexts plus perhaps the > > > > > other > > > > > state that might affect backend optimization). Maybe just > > > > > using > > > > > the > > > > > MI-level function contents will work too. It seems like this > > > > > should > > > > > be "perfect". I don't see the circumstances why function > > > > > definitions > > > > > optimized under identical optimization settings should differ > > > > > in > > > > > this sense. > > > > > > > > > >> > > > Ok, let's consider Sanjoy's latest post. > > > > > >> > > > I think we could end up with this routine in N different > > > > modules > > > > with > > > > slightly different amounts of information in each. Each of > > > > these > > > > in > > > > turn could cause us to deduce a slightly different set of > > > > attributes, and thus compute a different hash. > > > > > >> > > > Now, certainly, there are a small number of variants here. Two, > > > > maybe > > > > three. But this could in the worst case cause us to have two > > > > copies > > > > of a *huge* number of functions. > > > > > >> > > > My concern here is not that this is a *theoretically* hard > > > > problem > > > > to > > > > solve. I think there are good techniques to handle this. My > > > > concern > > > > is that it will be a substantial amount of work and require a > > > > reasonable amount of testing and experimentation. > > > > > > > > > Such data won't be too hard to find out. It can be conservatively > > > estimated by looking at the final binary code for each COMDAT to > > > see > > > how many 'clones's are needed due to different code gen -- the > > > result is the upper bound as different code gen does not mean > > > different deduced function attributes. > > >> > > With PGO, the decision process will also become much simpler. > > > > > Sure. And if you want to work on privatization in LLVM's IPO, that > > would be awesome. >> > But I think it is completely unreasonable to gate checking in code > > to > > test for the invalid transformation until this new optimization > > technique is available. We should be fixing the bug now, and then > > figuring out how to do privatization. > > Isn't privatization one way to get correctness? Other techniques such > as hash based de-dup or use PGO etc for tuning are orthognal methods > to reduce the overall cost of the former.It is, and I agree. However, the code-size growth seems like it could easily be unacceptable even for speed-optimized builds without the kind of mitigation options mentioned. -Hal> David > > -Chandler >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/3eb36468/attachment.html>
Xinliang David Li via llvm-dev
2016-Feb-27 04:59 UTC
[llvm-dev] Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
On Fri, Feb 26, 2016 at 8:57 PM, Hal Finkel <hfinkel at anl.gov> wrote:> > > ------------------------------ > > *From: *"Xinliang David Li" <xinliangli at gmail.com> > *To: *"Chandler Carruth" <chandlerc at google.com> > *Cc: *"Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" <llvm-dev at lists.llvm.org>, > "Philip Reames" <listmail at philipreames.com>, "Duncan P. N. Exon Smith" < > dexonsmith at apple.com>, "Sanjoy Das" <sanjoy at playingwithpointers.com> > *Sent: *Friday, February 26, 2016 10:48:13 PM > *Subject: *Re: [llvm-dev] Possible soundness issue with > available_externally (split from "RFC: Add guard intrinsics") > > > > On Fri, Feb 26, 2016 at 8:43 PM, Chandler Carruth <chandlerc at google.com> > wrote: > >> On Fri, Feb 26, 2016 at 8:40 PM Xinliang David Li <xinliangli at gmail.com> >> wrote: >> >>> On Fri, Feb 26, 2016 at 8:30 PM, Chandler Carruth <chandlerc at google.com> >>> wrote: >>> >>>> On Fri, Feb 26, 2016 at 8:15 PM Hal Finkel <hfinkel at anl.gov> wrote: >>>> >>>>> Can you explain your concerns about reliability? My idea is that we >>>>> have a kind of delayed-name comdat section attribute that says that the >>>>> name of the comdat section into which the function will be placed will be >>>>> based on the hash of the contents (which should probably be the pre-isel >>>>> IR-level contexts plus perhaps the other state that might affect backend >>>>> optimization). Maybe just using the MI-level function contents will work >>>>> too. It seems like this should be "perfect". I don't see the circumstances >>>>> why function definitions optimized under identical optimization settings >>>>> should differ in this sense. >>>>> >>>> >>>> Ok, let's consider Sanjoy's latest post. >>>> >>>> I think we could end up with this routine in N different modules with >>>> slightly different amounts of information in each. Each of these in turn >>>> could cause us to deduce a slightly different set of attributes, and thus >>>> compute a different hash. >>>> >>>> Now, certainly, there are a small number of variants here. Two, maybe >>>> three. But this could in the worst case cause us to have two copies of a >>>> *huge* number of functions. >>>> >>>> My concern here is not that this is a *theoretically* hard problem to >>>> solve. I think there are good techniques to handle this. My concern is that >>>> it will be a substantial amount of work and require a reasonable amount of >>>> testing and experimentation. >>>> >>> >>> Such data won't be too hard to find out. It can be conservatively >>> estimated by looking at the final binary code for each COMDAT to see how >>> many 'clones's are needed due to different code gen -- the result is the >>> upper bound as different code gen does not mean different deduced function >>> attributes. >>> >>> With PGO, the decision process will also become much simpler. >>> >> >> Sure. And if you want to work on privatization in LLVM's IPO, that would >> be awesome. >> >> But I think it is completely unreasonable to gate checking in code to >> test for the invalid transformation until this new optimization technique >> is available. We should be fixing the bug now, and then figuring out how to >> do privatization. >> > > Isn't privatization one way to get correctness? Other techniques such as > hash based de-dup or use PGO etc for tuning are orthognal methods to reduce > the overall cost of the former. > > > It is, and I agree. However, the code-size growth seems like it could > easily be unacceptable even for speed-optimized builds without the kind of > mitigation options mentioned. >Except that there might be icache/itlb impact so performance is not totally immune to size increase -- otherwise we may as well just force inlining those :) David> > -Hal > > > David > >> >> -Chandler >> > > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160226/aa0be4c8/attachment.html>
Possibly Parallel Threads
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")
- Possible soundness issue with available_externally (split from "RFC: Add guard intrinsics")