Hans Wennborg via llvm-dev
2015-Sep-24 21:15 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <aaron at aaronballman.com> wrote:> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> wrote: >> I was wondering why LLVM cannot optimize this code (which GCC does optimize): >> >> int f(int *p) { int x; return p == &x; } >> >> it would seem that this must always return 0. (This occurs as a >> self-assignment check in the code I was looking at; I was hoping we >> could fold that check away.) > > This is different than a self-assignment check, is it not? > > blah& operator=(const blah &b) { > if (&b == this) {} > // ... > } > > (Because it gets the pointer from the parameter and compares against a > "local" pointer?) > > I just want to make sure that you're not suggesting we should optimize > away self-assignment checks in the general case.Right, I'm not suggesting that :-) The code I looked at went something like this: struct S { S& operator=(const S& other) { if (&other != this) val = other.val; return *this; } void foo(); int val; }; void S::foo() { S tmp; tmp.val = 42; *this = tmp; // operator= gets inlined; we should know(?) that &tmp != this } This is of course a silly example, but with GCC we get: movl $42, (%rdi) ret whereas Clang generates: movl $42, -8(%rsp) leaq -8(%rsp), %rax cmpq %rdi, %rax je .LBB0_2 movl $42, (%rdi) .LBB0_2: retq which made me sad.>> I'd be interested to hear what those with a stronger understanding of >> the standard than myself think about this, and also if there is any >> example of something that could break because of this optimization. If >> not, I'd like us to optimize it :-) >> >> >> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> wrote: >>> Author: djg >>> Date: Thu Jan 31 18:49:06 2013 >>> New Revision: 174131 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev >>> Log: >>> Add a comment explaining an unavailable optimization. >>> >>> Modified: >>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>> >>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff >>> =============================================================================>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) >>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 18:49:06 2013 >>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition >>> return 0; >>> } >>> >>> +// A significant optimization not implemented here is assuming that alloca >>> +// addresses are not equal to incoming argument values. They don't *alias*, >>> +// as we say, but that doesn't mean they aren't equal, so we take a >>> +// conservative approach. >>> +// >>> +// This is inspired in part by C++11 5.10p1: >>> +// "Two pointers of the same type compare equal if and only if they are both >>> +// null, both point to the same function, or both represent the same >>> +// address." >>> +// >>> +// This is pretty permissive. >> >> Indeed :-/ >> >>> +// It's also partly due to C11 6.5.9p6: >>> +// "Two pointers compare equal if and only if both are null pointers, both are >>> +// pointers to the same object (including a pointer to an object and a >>> +// subobject at its beginning) or function, both are pointers to one past the >>> +// last element of the same array object, or one is a pointer to one past the >>> +// end of one array object and the other is a pointer to the start of a >>> +// different array object that happens to immediately follow the ï¬ rst array >>> +// object in the address space.) >>> +// >>> +// C11's version is more restrictive, however there's no reason why an argument >>> +// couldn't be a one-past-the-end value for a stack object in the caller and be >>> +// equal to the beginning of a stack object in the callee. >> >> This is interesting. >> >> For the one-past-the-end pointer to point into the callee, the stack >> would have to be growing upwards. So this won't happen on X86. Can we >> turn this optimization on for downward-growing-stack targets? >> >> Second, if the stack grows upward, and the function argument does >> point into the callee stack frame, "p" and "&x" could have the same >> contents. So per the "represent the same address" part above, they >> should compare equal? But they're noalias? Are we allowed to write >> through p? It wasn't a pointer to a valid object when we made the >> call, but it became valid in the callee? This is all terrifying. >> >> I suppose one could store the value of &x though, and then use it >> again later, i.e.: >> >> int *global; >> int f(int *p) { >> int x; >> global = &x; >> return p == &x; >> } >> int g() { >> f(0); >> return f(global); >> } >> >> Is g() guaranteed to return 1 here? Maybe we could claim it's >> implementation dependent? GCC does not seem fold p==&x to 0 here. I >> suppose we could make sure to check whether &x escapes the function? >> >> - Hans
Aaron Ballman via llvm-dev
2015-Sep-24 21:34 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org> wrote:> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <aaron at aaronballman.com> wrote: >> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> wrote: >>> I was wondering why LLVM cannot optimize this code (which GCC does optimize): >>> >>> int f(int *p) { int x; return p == &x; } >>> >>> it would seem that this must always return 0. (This occurs as a >>> self-assignment check in the code I was looking at; I was hoping we >>> could fold that check away.) >> >> This is different than a self-assignment check, is it not? >> >> blah& operator=(const blah &b) { >> if (&b == this) {} >> // ... >> } >> >> (Because it gets the pointer from the parameter and compares against a >> "local" pointer?) >> >> I just want to make sure that you're not suggesting we should optimize >> away self-assignment checks in the general case. > > Right, I'm not suggesting that :-) > > The code I looked at went something like this: > > struct S { > S& operator=(const S& other) { > if (&other != this) > val = other.val; > return *this; > } > void foo(); > int val; > }; > void S::foo() { > S tmp; > tmp.val = 42; > *this = tmp; // operator= gets inlined; we should know(?) that &tmp != this > } > > This is of course a silly example, but with GCC we get: > > movl $42, (%rdi) > ret > > whereas Clang generates: > > movl $42, -8(%rsp) > leaq -8(%rsp), %rax > cmpq %rdi, %rax > je .LBB0_2 > movl $42, (%rdi) > .LBB0_2: > retq > > which made me sad.Ah, yes, this makes perfect sense to me. Thank you for the explanation! ~Aaron> > >>> I'd be interested to hear what those with a stronger understanding of >>> the standard than myself think about this, and also if there is any >>> example of something that could break because of this optimization. If >>> not, I'd like us to optimize it :-) >>> >>> >>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> wrote: >>>> Author: djg >>>> Date: Thu Jan 31 18:49:06 2013 >>>> New Revision: 174131 >>>> >>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev >>>> Log: >>>> Add a comment explaining an unavailable optimization. >>>> >>>> Modified: >>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>> >>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff >>>> =============================================================================>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) >>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 18:49:06 2013 >>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition >>>> return 0; >>>> } >>>> >>>> +// A significant optimization not implemented here is assuming that alloca >>>> +// addresses are not equal to incoming argument values. They don't *alias*, >>>> +// as we say, but that doesn't mean they aren't equal, so we take a >>>> +// conservative approach. >>>> +// >>>> +// This is inspired in part by C++11 5.10p1: >>>> +// "Two pointers of the same type compare equal if and only if they are both >>>> +// null, both point to the same function, or both represent the same >>>> +// address." >>>> +// >>>> +// This is pretty permissive. >>> >>> Indeed :-/ >>> >>>> +// It's also partly due to C11 6.5.9p6: >>>> +// "Two pointers compare equal if and only if both are null pointers, both are >>>> +// pointers to the same object (including a pointer to an object and a >>>> +// subobject at its beginning) or function, both are pointers to one past the >>>> +// last element of the same array object, or one is a pointer to one past the >>>> +// end of one array object and the other is a pointer to the start of a >>>> +// different array object that happens to immediately follow the ï¬ rst array >>>> +// object in the address space.) >>>> +// >>>> +// C11's version is more restrictive, however there's no reason why an argument >>>> +// couldn't be a one-past-the-end value for a stack object in the caller and be >>>> +// equal to the beginning of a stack object in the callee. >>> >>> This is interesting. >>> >>> For the one-past-the-end pointer to point into the callee, the stack >>> would have to be growing upwards. So this won't happen on X86. Can we >>> turn this optimization on for downward-growing-stack targets? >>> >>> Second, if the stack grows upward, and the function argument does >>> point into the callee stack frame, "p" and "&x" could have the same >>> contents. So per the "represent the same address" part above, they >>> should compare equal? But they're noalias? Are we allowed to write >>> through p? It wasn't a pointer to a valid object when we made the >>> call, but it became valid in the callee? This is all terrifying. >>> >>> I suppose one could store the value of &x though, and then use it >>> again later, i.e.: >>> >>> int *global; >>> int f(int *p) { >>> int x; >>> global = &x; >>> return p == &x; >>> } >>> int g() { >>> f(0); >>> return f(global); >>> } >>> >>> Is g() guaranteed to return 1 here? Maybe we could claim it's >>> implementation dependent? GCC does not seem fold p==&x to 0 here. I >>> suppose we could make sure to check whether &x escapes the function? >>> >>> - Hans
Philip Reames via llvm-dev
2015-Sep-24 21:55 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
I threw together a patch which implements this (attached.) If we decide that this is actually a legal transform, I'm happy to post this for review. In addition to the version proposed here, I also implemented a case where a trivially escaped alloca's address is not equal to any other value. I believe both are valid, but we should confirm. Philip On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote:> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org> wrote: >> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <aaron at aaronballman.com> wrote: >>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> wrote: >>>> I was wondering why LLVM cannot optimize this code (which GCC does optimize): >>>> >>>> int f(int *p) { int x; return p == &x; } >>>> >>>> it would seem that this must always return 0. (This occurs as a >>>> self-assignment check in the code I was looking at; I was hoping we >>>> could fold that check away.) >>> This is different than a self-assignment check, is it not? >>> >>> blah& operator=(const blah &b) { >>> if (&b == this) {} >>> // ... >>> } >>> >>> (Because it gets the pointer from the parameter and compares against a >>> "local" pointer?) >>> >>> I just want to make sure that you're not suggesting we should optimize >>> away self-assignment checks in the general case. >> Right, I'm not suggesting that :-) >> >> The code I looked at went something like this: >> >> struct S { >> S& operator=(const S& other) { >> if (&other != this) >> val = other.val; >> return *this; >> } >> void foo(); >> int val; >> }; >> void S::foo() { >> S tmp; >> tmp.val = 42; >> *this = tmp; // operator= gets inlined; we should know(?) that &tmp != this >> } >> >> This is of course a silly example, but with GCC we get: >> >> movl $42, (%rdi) >> ret >> >> whereas Clang generates: >> >> movl $42, -8(%rsp) >> leaq -8(%rsp), %rax >> cmpq %rdi, %rax >> je .LBB0_2 >> movl $42, (%rdi) >> .LBB0_2: >> retq >> >> which made me sad. > Ah, yes, this makes perfect sense to me. Thank you for the explanation! > > ~Aaron > >> >>>> I'd be interested to hear what those with a stronger understanding of >>>> the standard than myself think about this, and also if there is any >>>> example of something that could break because of this optimization. If >>>> not, I'd like us to optimize it :-) >>>> >>>> >>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> wrote: >>>>> Author: djg >>>>> Date: Thu Jan 31 18:49:06 2013 >>>>> New Revision: 174131 >>>>> >>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev >>>>> Log: >>>>> Add a comment explaining an unavailable optimization. >>>>> >>>>> Modified: >>>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>> >>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff >>>>> =============================================================================>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) >>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 18:49:06 2013 >>>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition >>>>> return 0; >>>>> } >>>>> >>>>> +// A significant optimization not implemented here is assuming that alloca >>>>> +// addresses are not equal to incoming argument values. They don't *alias*, >>>>> +// as we say, but that doesn't mean they aren't equal, so we take a >>>>> +// conservative approach. >>>>> +// >>>>> +// This is inspired in part by C++11 5.10p1: >>>>> +// "Two pointers of the same type compare equal if and only if they are both >>>>> +// null, both point to the same function, or both represent the same >>>>> +// address." >>>>> +// >>>>> +// This is pretty permissive. >>>> Indeed :-/ >>>> >>>>> +// It's also partly due to C11 6.5.9p6: >>>>> +// "Two pointers compare equal if and only if both are null pointers, both are >>>>> +// pointers to the same object (including a pointer to an object and a >>>>> +// subobject at its beginning) or function, both are pointers to one past the >>>>> +// last element of the same array object, or one is a pointer to one past the >>>>> +// end of one array object and the other is a pointer to the start of a >>>>> +// different array object that happens to immediately follow the ï¬ rst array >>>>> +// object in the address space.) >>>>> +// >>>>> +// C11's version is more restrictive, however there's no reason why an argument >>>>> +// couldn't be a one-past-the-end value for a stack object in the caller and be >>>>> +// equal to the beginning of a stack object in the callee. >>>> This is interesting. >>>> >>>> For the one-past-the-end pointer to point into the callee, the stack >>>> would have to be growing upwards. So this won't happen on X86. Can we >>>> turn this optimization on for downward-growing-stack targets? >>>> >>>> Second, if the stack grows upward, and the function argument does >>>> point into the callee stack frame, "p" and "&x" could have the same >>>> contents. So per the "represent the same address" part above, they >>>> should compare equal? But they're noalias? Are we allowed to write >>>> through p? It wasn't a pointer to a valid object when we made the >>>> call, but it became valid in the callee? This is all terrifying. >>>> >>>> I suppose one could store the value of &x though, and then use it >>>> again later, i.e.: >>>> >>>> int *global; >>>> int f(int *p) { >>>> int x; >>>> global = &x; >>>> return p == &x; >>>> } >>>> int g() { >>>> f(0); >>>> return f(global); >>>> } >>>> >>>> Is g() guaranteed to return 1 here? Maybe we could claim it's >>>> implementation dependent? GCC does not seem fold p==&x to 0 here. I >>>> suppose we could make sure to check whether &x escapes the function? >>>> >>>> - Hans > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- A non-text attachment was scrubbed... Name: AllocaEscape.diff Type: text/x-diff Size: 1933 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150924/6f97cc16/attachment.diff>
Possibly Parallel Threads
- Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
- Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
- Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
- Hitting assertion failure related to vectorization + instcombine
- Hitting assertion failure related to vectorization + instcombine