Hal Finkel
2015-Jul-14 05:34 UTC
[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
----- Original Message -----> From: "Chris Lattner" <clattner at apple.com> > To: "Chandler Carruth" <chandlerc at gmail.com> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Hal Finkel" <hfinkel at anl.gov>, "Justin Bogner" > <mail at justinbogner.com>, "Duncan Exon Smith" <dexonsmith at apple.com>, "Rafael Espíndola" <rafael.espindola at gmail.com> > Sent: Tuesday, July 14, 2015 12:18:36 AM > Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken > > > > On Jul 13, 2015, at 8:19 PM, Chandler Carruth <chandlerc at gmail.com> > > wrote: > > > > Ok folks, > > > > I wrote up the general high-level thoughts I have about stateful AA > > in a separate thread. But we need to sort out the completely and > > horribly broken aspects of GlobalsModRef today, and the practical > > steps forward. This email is totally about the practical stuff. > > > > Now, as to why I emailed this group of people and with this > > subject, the only pass pipeline that includes GlobalsModRef, is > > the LTO pipeline. > > Ah, so it is just an LTO enabled benchmark hack then. > > > It is also relying on a precomputed set of global variables whose > > address is never used by an instruction other than some very small > > set (gep, bitcast) as "non-address-taken". It then runs > > GetUnderlyingObject on the two pointers in alias queries, and if > > that finds one of these "non-address-taken" globals for one of the > > memory locations but not the other, it concludes no-alias! This is > > broken for a number of reasons. > > > > a) If the two locations merely have a different *depth* of > > instruction stack, because GetUnderlyingObject has a recursion > > cap, one side can fail while the other succeeds, and we > > erroneously produce no-alias. > > Interesting. I’m sure it is no consolation, but GlobalsModRef > probably predated the recursion cap :-) > > > b) If instcombine or any other pass for any reason introduces on > > one path an instruction that GetUnderlyingObject can't look > > through (select, phi, load, ....), we incorrectly conclude > > no-alias. This is what addEscapingUse was intended to solve, but > > we would literally have to call it from every pass because we > > can't rely on analysis invalidation! > > > > c) If any pass actually escapes a pointer from one function into > > another, we invalidate the underlying assumption of > > 'non-address-taken' that it relies upon. > > Yep, all of these are pretty nasty. > > > Now, as I argued in my general AA thread, I think we might be able > > to assume that (c) doesn't happen today. But both (a) and (b) seem > > like active nightmares to try to fix. I can see hacky ways to > > avoid (a) where we detect *why* GetUnderlyingObject fails, but I > > don't see how to fix both (a) and (b) (or to fix (a) well) without > > just disabling this specific aspect of GloblasModRef. > > Ok, but we need performance information to make sure this doesn’t > cause a regression in practice for LTO builds. For example, Spec 2K > and 2K6 are a reasonable place to start. > > > 1) Fix obvious issues with GloblasModRef and switch it to > > ValueHandles > > 2) Mail out a patch to disable this part of GlobalsModRef. I can > > put it behind a flag or however folks would like it to work. > > 3) Remove addEscapingUse() update API, which without #2 may regress > > some LTO test case I don't have (because I don't have any other > > than bootstrap) > > Sounds great if we can do this without causing a regression in > practice.Personally, I'm comfortable with moving to a contractual obligation for the escaping uses situation: No pass may capture the address of a (previously-uncaptured) global (even locally) without notifying the AA infrastructure. I can't think of any in-tree pass that does this now, although we might certainly have some in the future. Can you think of any we have now? I'd really like to get the AA pass that Sam Parker has been working on (http://reviews.llvm.org/D10059) in, but it will add measurable compile-time overhead if it can't cache, and efficient caching seems to depend on the same property. -Hal> Are you aware of any miscompiles that might be attributed > to this, or are these “theoretical" concerns? > > -Chris > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Chandler Carruth
2015-Jul-14 05:55 UTC
[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
On Mon, Jul 13, 2015 at 10:38 PM Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Chris Lattner" <clattner at apple.com> > > To: "Chandler Carruth" <chandlerc at gmail.com> > > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Hal Finkel" < > hfinkel at anl.gov>, "Justin Bogner" > > <mail at justinbogner.com>, "Duncan Exon Smith" <dexonsmith at apple.com>, > "Rafael Espíndola" <rafael.espindola at gmail.com> > > Sent: Tuesday, July 14, 2015 12:18:36 AM > > Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken > > > > > > > On Jul 13, 2015, at 8:19 PM, Chandler Carruth <chandlerc at gmail.com> > > > wrote: > > > > > > Ok folks, > > > > > > I wrote up the general high-level thoughts I have about stateful AA > > > in a separate thread. But we need to sort out the completely and > > > horribly broken aspects of GlobalsModRef today, and the practical > > > steps forward. This email is totally about the practical stuff. > > > > > > Now, as to why I emailed this group of people and with this > > > subject, the only pass pipeline that includes GlobalsModRef, is > > > the LTO pipeline. > > > > Ah, so it is just an LTO enabled benchmark hack then. > > > > > It is also relying on a precomputed set of global variables whose > > > address is never used by an instruction other than some very small > > > set (gep, bitcast) as "non-address-taken". It then runs > > > GetUnderlyingObject on the two pointers in alias queries, and if > > > that finds one of these "non-address-taken" globals for one of the > > > memory locations but not the other, it concludes no-alias! This is > > > broken for a number of reasons. > > > > > > a) If the two locations merely have a different *depth* of > > > instruction stack, because GetUnderlyingObject has a recursion > > > cap, one side can fail while the other succeeds, and we > > > erroneously produce no-alias. > > > > Interesting. I’m sure it is no consolation, but GlobalsModRef > > probably predated the recursion cap :-) > > > > > b) If instcombine or any other pass for any reason introduces on > > > one path an instruction that GetUnderlyingObject can't look > > > through (select, phi, load, ....), we incorrectly conclude > > > no-alias. This is what addEscapingUse was intended to solve, but > > > we would literally have to call it from every pass because we > > > can't rely on analysis invalidation! > > > > > > c) If any pass actually escapes a pointer from one function into > > > another, we invalidate the underlying assumption of > > > 'non-address-taken' that it relies upon. > > > > Yep, all of these are pretty nasty. > > > > > Now, as I argued in my general AA thread, I think we might be able > > > to assume that (c) doesn't happen today. But both (a) and (b) seem > > > like active nightmares to try to fix. I can see hacky ways to > > > avoid (a) where we detect *why* GetUnderlyingObject fails, but I > > > don't see how to fix both (a) and (b) (or to fix (a) well) without > > > just disabling this specific aspect of GloblasModRef. > > > > Ok, but we need performance information to make sure this doesn’t > > cause a regression in practice for LTO builds. For example, Spec 2K > > and 2K6 are a reasonable place to start. > > > > > 1) Fix obvious issues with GloblasModRef and switch it to > > > ValueHandles > > > 2) Mail out a patch to disable this part of GlobalsModRef. I can > > > put it behind a flag or however folks would like it to work. > > > 3) Remove addEscapingUse() update API, which without #2 may regress > > > some LTO test case I don't have (because I don't have any other > > > than bootstrap) > > > > Sounds great if we can do this without causing a regression in > > practice. > > Personally, I'm comfortable with moving to a contractual obligation for > the escaping uses situation: No pass may capture the address of a > (previously-uncaptured) global (even locally) without notifying the AA > infrastructure.Note that I said escape, and you say capture here. I've not quite thought about it enough to generalize this to capture. Anyways, I don't generally disagree (see my lengthier mail about how I'm envisioning AA updates). However, this is *not sufficient* to fix GloblasModRef. You also have to somehow prevent asymmetric evaluation of GetUnderlyingObject on the two pointers queried. To fix this would require a substantially different implementation of GlobalsModRef which I think is possible, but which I don't realistically have the time to implement right now.> I can't think of any in-tree pass that does this now, although we might > certainly have some in the future. Can you think of any we have now? >If you LTO the ASan runtime along with the instrumentation, then the instrumentation pass violates this today. Most instrumentation passes violate this if the optimizer can see the runtime. I'm not aware of any non-instrumentation passes that violate it today.> > I'd really like to get the AA pass that Sam Parker has been working on ( > http://reviews.llvm.org/D10059) in, but it will add measurable > compile-time overhead if it can't cache, and efficient caching seems to > depend on the same property. >Sorry I haven't chimed in there. I'm worried that there is *no* sound way to do this in the current pass manager, but I'll read that patch and chime in with specific suggestions. -Chandler -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150714/87e80e11/attachment.html>
Hal Finkel
2015-Jul-14 06:08 UTC
[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
----- Original Message -----> From: "Chandler Carruth" <chandlerc at google.com> > To: "Hal Finkel" <hfinkel at anl.gov>, "Chris Lattner" <clattner at apple.com> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Tuesday, July 14, 2015 12:55:37 AM > Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken > > > > > On Mon, Jul 13, 2015 at 10:38 PM Hal Finkel < hfinkel at anl.gov > > wrote: > > > ----- Original Message ----- > > From: "Chris Lattner" < clattner at apple.com > > > To: "Chandler Carruth" < chandlerc at gmail.com > > > Cc: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu >, "Hal > > Finkel" < hfinkel at anl.gov >, "Justin Bogner" > > < mail at justinbogner.com >, "Duncan Exon Smith" < > > dexonsmith at apple.com >, "Rafael Espíndola" < > > rafael.espindola at gmail.com > > > Sent: Tuesday, July 14, 2015 12:18:36 AM > > Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely > > broken > > > > > > > On Jul 13, 2015, at 8:19 PM, Chandler Carruth < > > > chandlerc at gmail.com > > > > wrote: > > > > > > Ok folks, > > > > > > I wrote up the general high-level thoughts I have about stateful > > > AA > > > in a separate thread. But we need to sort out the completely and > > > horribly broken aspects of GlobalsModRef today, and the practical > > > steps forward. This email is totally about the practical stuff. > > > > > > Now, as to why I emailed this group of people and with this > > > subject, the only pass pipeline that includes GlobalsModRef, is > > > the LTO pipeline. > > > > Ah, so it is just an LTO enabled benchmark hack then. > > > > > It is also relying on a precomputed set of global variables whose > > > address is never used by an instruction other than some very > > > small > > > set (gep, bitcast) as "non-address-taken". It then runs > > > GetUnderlyingObject on the two pointers in alias queries, and if > > > that finds one of these "non-address-taken" globals for one of > > > the > > > memory locations but not the other, it concludes no-alias! This > > > is > > > broken for a number of reasons. > > > > > > a) If the two locations merely have a different *depth* of > > > instruction stack, because GetUnderlyingObject has a recursion > > > cap, one side can fail while the other succeeds, and we > > > erroneously produce no-alias. > > > > Interesting. I’m sure it is no consolation, but GlobalsModRef > > probably predated the recursion cap :-) > > > > > b) If instcombine or any other pass for any reason introduces on > > > one path an instruction that GetUnderlyingObject can't look > > > through (select, phi, load, ....), we incorrectly conclude > > > no-alias. This is what addEscapingUse was intended to solve, but > > > we would literally have to call it from every pass because we > > > can't rely on analysis invalidation! > > > > > > c) If any pass actually escapes a pointer from one function into > > > another, we invalidate the underlying assumption of > > > 'non-address-taken' that it relies upon. > > > > Yep, all of these are pretty nasty. > > > > > Now, as I argued in my general AA thread, I think we might be > > > able > > > to assume that (c) doesn't happen today. But both (a) and (b) > > > seem > > > like active nightmares to try to fix. I can see hacky ways to > > > avoid (a) where we detect *why* GetUnderlyingObject fails, but I > > > don't see how to fix both (a) and (b) (or to fix (a) well) > > > without > > > just disabling this specific aspect of GloblasModRef. > > > > Ok, but we need performance information to make sure this doesn’t > > cause a regression in practice for LTO builds. For example, Spec 2K > > and 2K6 are a reasonable place to start. > > > > > 1) Fix obvious issues with GloblasModRef and switch it to > > > ValueHandles > > > 2) Mail out a patch to disable this part of GlobalsModRef. I can > > > put it behind a flag or however folks would like it to work. > > > 3) Remove addEscapingUse() update API, which without #2 may > > > regress > > > some LTO test case I don't have (because I don't have any other > > > than bootstrap) > > > > Sounds great if we can do this without causing a regression in > > practice. > > Personally, I'm comfortable with moving to a contractual obligation > for the escaping uses situation: No pass may capture the address of > a (previously-uncaptured) global (even locally) without notifying > the AA infrastructure. > > > Note that I said escape, and you say capture here.That was intentional. The problem is that I think you need to pick up all captures for a reasonably-general case, because otherwise you need separate logic to handle local queries. If you think that some local variable can't alias some global because you *know* that the global's address has not been captured, then you need to know if you generate a new local capture.> I've not quite > thought about it enough to generalize this to capture. > > > Anyways, I don't generally disagree (see my lengthier mail about how > I'm envisioning AA updates). However, this is *not sufficient* to > fix GloblasModRef. You also have to somehow prevent asymmetric > evaluation of GetUnderlyingObject on the two pointers queried. To > fix this would require a substantially different implementation of > GlobalsModRef which I think is possible, but which I don't > realistically have the time to implement right now. >You can turn off the depth limit on GetUnderlyingObject.> > I can't think of any in-tree pass that does this now, although we > might certainly have some in the future. Can you think of any we > have now? > > > > If you LTO the ASan runtime along with the instrumentation, then the > instrumentation pass violates this today. Most instrumentation > passes violate this if the optimizer can see the runtime. > > > I'm not aware of any non-instrumentation passes that violate it > today. >Okay, fair enough. Updating those passes to invalidate something in this regard seems manageable.> > > I'd really like to get the AA pass that Sam Parker has been working > on ( http://reviews.llvm.org/D10059 ) in, but it will add measurable > compile-time overhead if it can't cache, and efficient caching seems > to depend on the same property. > > > > Sorry I haven't chimed in there. I'm worried that there is *no* sound > way to do this in the current pass manager, but I'll read that patch > and chime in with specific suggestions.Thanks! -Hal> > > -Chandler-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Pete Cooper
2015-Jul-14 17:10 UTC
[LLVMdev] GlobalsModRef (and thus LTO) is completely broken
> On Jul 13, 2015, at 10:34 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Chris Lattner" <clattner at apple.com> >> To: "Chandler Carruth" <chandlerc at gmail.com> >> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Hal Finkel" <hfinkel at anl.gov>, "Justin Bogner" >> <mail at justinbogner.com>, "Duncan Exon Smith" <dexonsmith at apple.com>, "Rafael Espíndola" <rafael.espindola at gmail.com> >> Sent: Tuesday, July 14, 2015 12:18:36 AM >> Subject: Re: [LLVMdev] GlobalsModRef (and thus LTO) is completely broken >> >> >>> On Jul 13, 2015, at 8:19 PM, Chandler Carruth <chandlerc at gmail.com> >>> wrote: >>> >>> Ok folks, >>> >>> I wrote up the general high-level thoughts I have about stateful AA >>> in a separate thread. But we need to sort out the completely and >>> horribly broken aspects of GlobalsModRef today, and the practical >>> steps forward. This email is totally about the practical stuff. >>> >>> Now, as to why I emailed this group of people and with this >>> subject, the only pass pipeline that includes GlobalsModRef, is >>> the LTO pipeline. >> >> Ah, so it is just an LTO enabled benchmark hack then. >> >>> It is also relying on a precomputed set of global variables whose >>> address is never used by an instruction other than some very small >>> set (gep, bitcast) as "non-address-taken". It then runs >>> GetUnderlyingObject on the two pointers in alias queries, and if >>> that finds one of these "non-address-taken" globals for one of the >>> memory locations but not the other, it concludes no-alias! This is >>> broken for a number of reasons. >>> >>> a) If the two locations merely have a different *depth* of >>> instruction stack, because GetUnderlyingObject has a recursion >>> cap, one side can fail while the other succeeds, and we >>> erroneously produce no-alias. >> >> Interesting. I’m sure it is no consolation, but GlobalsModRef >> probably predated the recursion cap :-) >> >>> b) If instcombine or any other pass for any reason introduces on >>> one path an instruction that GetUnderlyingObject can't look >>> through (select, phi, load, ....), we incorrectly conclude >>> no-alias. This is what addEscapingUse was intended to solve, but >>> we would literally have to call it from every pass because we >>> can't rely on analysis invalidation! >>> >>> c) If any pass actually escapes a pointer from one function into >>> another, we invalidate the underlying assumption of >>> 'non-address-taken' that it relies upon. >> >> Yep, all of these are pretty nasty. >> >>> Now, as I argued in my general AA thread, I think we might be able >>> to assume that (c) doesn't happen today. But both (a) and (b) seem >>> like active nightmares to try to fix. I can see hacky ways to >>> avoid (a) where we detect *why* GetUnderlyingObject fails, but I >>> don't see how to fix both (a) and (b) (or to fix (a) well) without >>> just disabling this specific aspect of GloblasModRef. >> >> Ok, but we need performance information to make sure this doesn’t >> cause a regression in practice for LTO builds. For example, Spec 2K >> and 2K6 are a reasonable place to start. >> >>> 1) Fix obvious issues with GloblasModRef and switch it to >>> ValueHandles >>> 2) Mail out a patch to disable this part of GlobalsModRef. I can >>> put it behind a flag or however folks would like it to work. >>> 3) Remove addEscapingUse() update API, which without #2 may regress >>> some LTO test case I don't have (because I don't have any other >>> than bootstrap) >> >> Sounds great if we can do this without causing a regression in >> practice. > > Personally, I'm comfortable with moving to a contractual obligation for the escaping uses situation: No pass may capture the address of a (previously-uncaptured) global (even locally) without notifying the AA infrastructure. I can't think of any in-tree pass that does this now, although we might certainly have some in the future. Can you think of any we have now?MergeGlobals creates new globals which could escape. This is probably going to be treated differently from updating AA as there’s nothing tracking the new global yet, so nothing to update. But still best to make sure this is ok. Pete> > I'd really like to get the AA pass that Sam Parker has been working on (http://reviews.llvm.org/D10059) in, but it will add measurable compile-time overhead if it can't cache, and efficient caching seems to depend on the same property. > > -Hal > >> Are you aware of any miscompiles that might be attributed >> to this, or are these “theoretical" concerns? >> >> -Chris >> >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev