Hi folks,

this summer I will work with Richard Smith on Clang devirtualization.
Check out our proposal:

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

and the modified LangRef:

http://reviews.llvm.org/D11399

You can also check out the previous discussion that was started before our
proposal was ready:

http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

Regards,
Piotr Padlewski

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150722/a69bda5e/attachment.html>
Hi Piotr,

You may be interested in a recent patch I posted:
http://reviews.llvm.org/D11043

This patch addresses a de-virtualization case that I’m not sure would be
handled by your current proposal, namely that of a virtual call where the
‘this’ object is a global variable. For example:

    struct A {
      A();
      virtual void foo();
    };

    void g(A *a) {
      a->foo();
    }

    A a;

    int main() {
      g(&a);
    }

It might be worth considering whether your approach could be extended to
handle this case.

--
Geoff Berry
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Piotr Padlewski
Sent: Wednesday, July 22, 2015 5:56 PM
To: cfe-dev at cs.uiuc.edu Developers; llvmdev at cs.uiuc.edu
Subject: [LLVMdev] Clang devirtualization proposal

Hi folks,

this summer I will work with Richard Smith on clang devirtualization.
Check out our proposal:

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

And modified LangRef

http://reviews.llvm.org/D11399

You can also check out previous discussion that was started before our
proposal was ready -
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

Regards
Piotr Padlewski
Hi,

Yep, our proposal doesn't cover it, because the load; icmp; assume sequence
will land in the global initializer function, and main will not see it. At
least if foo were called multiple times, we would only have one load from
the vtable, but unfortunately we will not be able to inline it or make a
direct call to it with this approach.

I think this case is rare enough that we do not need to solve it right now.

Piotr

On Thu, Jul 23, 2015 at 10:30 AM, Geoff Berry <gberry at codeaurora.org> wrote:
> Hi Piotr,
>
> You may be interested in a recent patch I posted:
> http://reviews.llvm.org/D11043
>
> This patch addresses a de-virtualization case that I’m not sure would be
> handled by your current proposal, namely that of a virtual call where the
> ‘this’ object is a global variable.
>
> For example:
>
> struct A {
>   A();
>   virtual void foo();
> };
>
> void g(A * a) {
>   a->foo();
> }
>
> A a;
>
> int main() {
>   g(&a);
> }
>
> It might be worth considering if your approach could be extended to handle
> this case.
Hi Piotr,

Thanks for posting this! First, a question. When you say, regarding
i8* @llvm.invariant.group.barrier(i8*):

"Required to handle destructors, placement new and std::launder. Call of
this function will be put on the end of each of this functions"

I completely understand placement new and std::launder. I don't understand
destructors; could you explain?

Also, am I correct in saying that we could handle the case of 'final'
classes I highlighted in my initial e-mail by inserting these assumptions
whenever a pointer/reference to a class of such a type comes into scope?

    struct A {
      virtual void foo() = 0;
    };

    struct B final : public A {
      void foo();
    };

    void entry(B *b) {
      // emit assumptions about vtbl of 'b' here?
    }

Thanks again,
Hal

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
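To make the 'final' question concrete, here is a minimal C++ sketch. The function bodies, the return value, the virtual destructor, and the name entry_devirtualized are assumptions added for illustration; the e-mail above only declares the types. Because B is final, nothing can derive from it, so a B* that points at a live object can only dispatch to B::foo. That is the fact an assumption about the vtable of 'b' would hand to the optimizer.

```cpp
#include <cassert>

struct A {
  virtual int foo() = 0;
  virtual ~A() = default;
};

struct B final : public A {
  int foo() override { return 42; }  // illustrative body, not from the thread
};

// Ordinary virtual call through a pointer to a final class.
int entry(B *b) { return b->foo(); }

// Hypothetical hand-devirtualized form: the qualified call bypasses the vtable.
int entry_devirtualized(B *b) { return b->B::foo(); }
```

Both functions must return the same value for any live B object, so replacing the first form with the second is a valid transformation whenever the pointer is known to refer to one.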
On Sat, Jul 25, 2015 at 12:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Piotr,
>
> Thanks for posting this! First, a question. When you say, regarding i8*
> @llvm.invariant.group.barrier(i8*):
>
> "Required to handle destructors, placement new and std::launder. Call of
> this function will be put on the end of each of this functions"
>
> I completely understand placement new and std::launder. I don't understand
> destructors, could you explain?

When a derived class destructor invokes a base class destructor, the dynamic
type of the object changes (as does the vptr), so we need an invariant
barrier to prevent the derived class's vptr being used for virtual calls in
an inlined base class destructor.

> Also, am I correct in saying that we could handle the case of 'final'
> classes I highlighted in initial e-mail by inserting these assumptions
> whenever a pointer/reference to a class of such a type came into scope?

Yes, it would be correct to insert these assumptions anywhere where the
language standard guarantees that there exists an object of a known
(most-derived) dynamic type (and in particular, we can do this whenever we
know there exists an object of a known final type). CodeGen already invokes
EmitTypeCheck in many of the places where it's guaranteed that an object of
a known type exists; we could experiment with adding an assumption from it
any time that type is final.

> struct A {
>   virtual void foo() = 0;
> };
>
> struct B final : public A {
>   void foo();
> };
>
> void entry(B *b) {
>   // emit assumptions about vtbl of 'b' here?

This case is tricky. We don't currently have a way of saying "assume that a
load of %b would load %B.vtbl" without also saying "assume that %b is
dereferenceable". We've seen other cases where that would be beneficial, so
perhaps that's something we should consider adding.

> }
>
> Thanks again,
> Hal
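The point about destructors can be observed from plain C++. The sketch below is illustrative only: the types Base and Derived, the log member, and destruction_trace are hypothetical names that do not come from the proposal. Each destructor makes a virtual call and records which override ran. Inside the base class destructor the object's dynamic type is already Base, so the call no longer reaches the derived override, which is why a vptr value cached from before the destructor chain must not be reused there.

```cpp
#include <cassert>
#include <string>

struct Base {
  std::string *log;  // records which override each destructor observed
  explicit Base(std::string *l) : log(l) {}
  // Virtual call inside the base destructor: resolves to Base::name.
  virtual ~Base() { *log += name(); }
  virtual const char *name() const { return "Base"; }
};

struct Derived : Base {
  explicit Derived(std::string *l) : Base(l) {}
  // Virtual call inside the derived destructor: resolves to Derived::name.
  ~Derived() override {
    *log += name();
    *log += ",";
  }
  const char *name() const override { return "Derived"; }
};

// Destroys one Derived object and returns what the two destructors saw.
std::string destruction_trace() {
  std::string log;
  {
    Derived d(&log);
  }  // ~Derived runs first, then ~Base
  return log;
}
```

The same call expression, name(), dispatches to two different functions on the same storage within one destructor chain, so the vptr is not invariant across that boundary.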
Having read through the proposal, I feel like I'm missing some of the
background needed to understand the problem you're trying to solve.

My mental model is that construction of an object creates a new abstract
location in an infinite heap with each object infinitely far apart.
Destruction of the object destroys the abstract location. As a result,
destructing one object and constructing another produce unique, incomparable
abstract locations. The fact that the two abstract locations might happen to
share a physical address is irrelevant.

If I'm understanding the proposal correctly, this model works for most code.
The key optimization you appear to want to perform is to recognize the fact
that these two abstract locations occupy the same memory. In particular, you
want to be able to return mustalias for alias(loc1, loc2). Another way of
saying this is that you want to reason about abstract locations as defined
by allocation/deallocation events rather than construction/destruction
events. Is that a fair summary?

What I'm not clear on is *why* recognizing that the two abstract locations
share a physical address is important. Given that the contents of the
abstract location before construction or after destruction are undefined
(right?), what optimization does recognizing the mustalias relation enable?

Philip
On Tue, Jul 28, 2015 at 10:58 AM, Philip Reames <listmail at philipreames.com> wrote:
> Having read through the proposal, I feel like I'm missing some of the
> background to understand the problem you're trying to solve.
>
> My mental model is that construction of an object creates a new abstract
> location in an infinite heap with each object infinitely far apart.
> Destruction of the object destroys the abstract location. As a result,
> destructing one object and constructing another produce unique incomparable
> abstract locations. The fact the two abstract locations might happen to
> share a physical address is irrelevant.
>
> If I'm understanding the proposal correctly, this model works for most
> code. The key optimization you appear to want to perform is to recognize
> the fact that these two abstract locations occupy the same memory. In
> particular, you want to be able to return mustalias for alias(loc1, loc2).
> Another way of saying this is that you want to reason about abstract
> locations as defined by allocation/deallocation events rather than
> construction/destruction events. Is that a fair summary?
>
> What I'm not clear on is *why* recognizing the two abstract locations
> share a physical address is important. Given that the contents of the
> abstract location before construction or after destruction are undefined
> (right?), what optimization does recognizing the mustalias relation enable?

I think this is incorrect. LLVM's model is closer to the second model, and
we need something like the first model to prevent erroneous
devirtualization.

The corner case for C++ is when the optimizer observes that two abstract
objects share the same physical memory location. In practice, this could
happen if the memory allocator becomes visible to the optimizer through
inlining. For illustration, do placement new into the stack memory of
another object. This is illustrated in example 2 of the proposal:

    struct MyClass {
      virtual void foo();
    };
    struct MyOtherClass : MyClass {
      virtual void foo();
    };
    int main() {
      MyClass c;
      c.foo();
      // Reuse the storage temporarily. UB to access the object through ‘c’
      c.~MyClass();
      auto c2 = new (&c) MyOtherClass();
      c2->foo(); // fine, we have new pointer
      // c.foo() // UB, the type has changed

      // The storage has to contain a ‘MyClass’ when it goes out of scope.
      c2->~MyOtherClass();
      new (&c) MyClass(); // we have to get back to previous type because
                          // calling destructor using c would be UB
    }

Without @llvm.invariant.group.barrier, LLVM will probably replace %c2 with
%c here, since they are trivially the same. With
@llvm.invariant.group.barrier, the result of placement new will be a
distinct SSA value that LLVM can't reason about, and we won't accidentally
devirtualize c2->foo() to MyClass::foo.

There is, however, a small problem with this model. If the code happened to
do this:

    ...
    auto c2 = new (&c) MyOtherClass();
    assert(c2 == &c);
    ...

LLVM might once again replace %c2 with %c, causing bad devirtualization.
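A runnable variant of the storage-reuse scenario may help. It is a sketch under stated assumptions, not the proposal's example: it uses a raw aligned buffer instead of a named MyClass object (to sidestep the question of what the named object refers to), adds a virtual destructor, and gives foo illustrative bodies that return 1 and 2. The function name reuse_storage is hypothetical.

```cpp
#include <cassert>
#include <new>

struct MyClass {
  virtual int foo() { return 1; }
  virtual ~MyClass() = default;
};

struct MyOtherClass : MyClass {
  int foo() override { return 2; }
};

// Constructs two objects of different dynamic type in the same storage and
// returns first_result * 10 + second_result.
int reuse_storage() {
  // Buffer sized and aligned for the larger (derived) type.
  alignas(MyOtherClass) unsigned char storage[sizeof(MyOtherClass)];

  MyClass *c = new (storage) MyClass();
  int first = c->foo();  // dispatches to MyClass::foo
  c->~MyClass();         // the first object's lifetime ends here

  // Same address, new object: calls must go through the pointer returned by
  // placement new, which is the value the barrier keeps distinct in IR.
  MyClass *c2 = new (storage) MyOtherClass();
  int second = c2->foo();  // dispatches to MyOtherClass::foo
  c2->~MyClass();          // virtual destructor destroys the MyOtherClass

  return first * 10 + second;
}
```

If an optimizer treated the vptr loaded through c as still valid for c2 because both point at the same bytes, the second call would be wrongly devirtualized to MyClass::foo.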
Ok, replying anew now that I understand why reasoning about abstract
locations for each object doesn't work.

The general idea of describing a set of loads and stores which belong to a
particular invariant group seems reasonable. I've got some
questions/comments on the specifics, but the overall direction seems
entirely workable for the specific problem you're trying to solve.

Quoting from the google doc: "If we don’t know definition of some function,
we assume that it will not call @llvm.invariant.group.barrier()."
This part really, really bugs me. We generally try to assume minimal
knowledge of external functions (i.e. they can do anything) and this
assumption would invert that. Is there a way we can rephrase the proposal
which avoids the need for this? I'm not quite clear what this assumption
buys us.

Is there a particular reason why a load or store must belong to a single
invariant group rather than be a member of several? I don't have an
immediate use case in mind, but it seems potentially useful.

"i8* @llvm.invariant.group.barrier(i8*): Given a pointer, produces another
pointer that aliases the first but which is considered different for the
purposes of load !invariant.group metadata."
The definition here bugs me slightly. This might just be a matter of
wordsmithing, but it seems strange to me that this can't be defined in terms
of the assumptions allowed about the memory through the two pointers. If I'm
looking at this instruction in memory dependence analysis, what am I allowed
to assume or not assume? The current definition doesn't make this obvious.
One option: "produces another pointer which aliases the first, but where any
invariant assumptions introduced by invariant.group metadata have been
stripped away."

The notion of using the assume seems to make sense. I could see an argument
for extending the invariant.group metadata with a way to express the assumed
value, but we could also make that extension at a later time if needed.

I'm wondering if there's a problematic interaction with CSE here. Consider
this example in pseudo LLVM IR:

    v1 = load i64, %p1, !invariant.group !Type1
    ; I called destructor/placement new for the same type, but that
    ; optimized entirely away
    p2 = invariant.group.barrier(p1)
    if (p1 != p2) return.
    store i64 0, %p2, !invariant.group !Type1
    v2 = load i64, %p2, !invariant.group !Type1
    ret i64 v1 - v2

(Assume that !Type1 is used to describe a write-once integer field within
some class. Not all instances have the same integer value.)

Having CSE turn this into:

    v1 = load i64, %p1, !invariant.group !Type1
    p2 = invariant.group.barrier(p1)
    if (p1 != p2) return.
    store i64 0, %p1, !invariant.group !Type1
    v2 = load i64, %p1, !invariant.group !Type1
    ret i64 v1 - v2

And then GVN turn this into:

    v1 = load i64, %p1, !invariant.group !Type1
    p2 = invariant.group.barrier(p1)
    if (p1 != p2) return.
    ret i64 v1 - v1 (-> 0)

This doesn't seem like the result I'd expect. Is there something about my
initial IR which is wrong/invalid in some way? Is the invariant.group
required to be specific to a single bit pattern across all usages within a
function/module/context? That would be reasonable, but I don't think it is
explicitly stated right now. It also makes !invariant.group effectively
useless for describing fields which are constant per instance rather than
per class.

Philip
On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <listmail at philipreames.com> wrote:
> Quoting from the google doc: "If we don’t know definition of some
> function, we assume that it will not call @llvm.invariant.group.barrier()."
> This part really really bugs me. We generally try to assume minimal
> knowledge of external functions (i.e. they can do anything) and this
> assumption would invert that. Is there a way we can rephrase the proposal
> which avoids the need for this? I'm not quite clear what this assumption
> buys us.

This is because without it the optimization would be useless. For example:

    A* a = new A;
    a->foo(); // out-of-line virtual
    a->foo();

If we assume that foo calls @llvm.invariant.group.barrier, then we will not
be able to optimize the second call.

Piotr
On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <listmail at philipreames.com> wrote:
> I'm wondering if there's a problematic interaction with CSE here.
> Consider this example in pseudo LLVM IR:
> v1 = load i64, %p1, !invariant.group !Type1
> ; I called destructor/placement new for the same type, but that optimized
> ; entirely away
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p2, !invariant.group !Type1
> v2 = load i64, %p2, !invariant.group !Type1
> ret i64 v1 - v2
>
> (Assume that !Type1 is used to describe a write once integer field within
> some class. Not all instances have the same integer value.)
>
> Having CSE turn this into:
> v1 = load i64, %p1, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> store i64 0, %p1, !invariant.group !Type1
> v2 = load i64, %p1, !invariant.group !Type1
> ret i64 v1 - v2
>
> And then GVN turn this into:
> v1 = load i64, %p1, !invariant.group !Type1
> p2 = invariant.group.barrier(p1)
> if (p1 != p2) return.
> ret i64 v1 - v1 (-> 0)
>
> This doesn't seem like the result I'd expect. Is there something about my
> initial IR which is wrong/invalid in some way? Is the invariant.group
> required to be specific to a single bitpattern across all usages within a
> function/module/context? That would be reasonable, but I don't think is
> explicit said right now. It also makes !invariant.group effectively
> useless for describing constant fields which are constant per instance
> rather than per-class.

Yes, this family of examples scares me. :) It seems we've discovered a new
device for testing IR soundness. We used it to build a test case that shows
that 'readonly' on arguments without 'nocapture' doesn't let you forward
stores across such a call.

Consider this pseudo-IR and some possible transforms that I would expect to
be semantics preserving:

    void f(i32* readonly %a, i32* %b) {
      llvm.assume(%a == %b)
      store i32 42, i32* %b
    }
    ...
    %p = alloca i32
    store i32 13, i32* %p
    call f(i32* readonly %p, i32* %p)
    %r = load i32, i32* %p

    ; Propagate llvm.assume info
    void f(i32* readonly %a, i32* %b) {
      store i32 42, i32* %a
    }
    ...
    %p = alloca i32
    store i32 13, i32* %p
    call f(i32* readonly %p, i32* %p)
    %r = load i32, i32* %p

    ; Delete dead args
    void f(i32* readonly %a) {
      store i32 42, i32* %a
    }
    ...
    %p = alloca i32
    store i32 13, i32* %p
    call f(i32* readonly %p)
    %r = load i32, i32* %p

    ; Forward store %p to load %p, since the only use of %p is readonly
    void f(i32* readonly %a) {
      store i32 42, i32* %a
    }
    ...
    %p = alloca i32
    call f(i32* readonly %p)
    %r = i32 13

Today LLVM will not do the final transform because it requires readonly on
the entire function, or nocapture on the argument. nocapture cannot be
inferred due to the assume comparison.
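The starting point of that chain can be mirrored in C++ as a rough analogy. This is only a sketch: a pointer-to-const parameter is not the same thing as LLVM's readonly attribute, the runtime comparison stands in for llvm.assume, and the name caller is hypothetical. It shows the behavior every transform in the chain has to preserve: the caller reads back 42, not the 13 it stored before the call.

```cpp
#include <cassert>

// 'a' is never written through, loosely like a readonly argument.
// 'b' may alias it, and the comparison makes that aliasing visible.
void f(const int *a, int *b) {
  if (a == b) {  // stands in for llvm.assume(%a == %b)
    *b = 42;     // the write reaches the memory that 'a' points to
  }
}

// Mirrors the caller in the pseudo-IR: store 13, call f, load the value back.
int caller() {
  int p = 13;
  f(&p, &p);
  return p;  // forwarding the earlier store of 13 here would be wrong
}
```

Any sequence of rewrites that ends with the caller returning 13 has broken the program, which is why the final store-forwarding step must stay illegal for this input.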