Philip Reames via llvm-dev
2015-Sep-24 21:55 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
I threw together a patch which implements this (attached.) If we decide that this is actually a legal transform, I'm happy to post this for review. In addition to the version proposed here, I also implemented a case where a trivially escaped alloca's address is not equal to any other value. I believe both are valid, but we should confirm. Philip On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote:> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org> wrote: >> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <aaron at aaronballman.com> wrote: >>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> wrote: >>>> I was wondering why LLVM cannot optimize this code (which GCC does optimize): >>>> >>>> int f(int *p) { int x; return p == &x; } >>>> >>>> it would seem that this must always return 0. (This occurs as a >>>> self-assignment check in the code I was looking at; I was hoping we >>>> could fold that check away.) >>> This is different than a self-assignment check, is it not? >>> >>> blah& operator=(const blah &b) { >>> if (&b == this) {} >>> // ... >>> } >>> >>> (Because it gets the pointer from the parameter and compares against a >>> "local" pointer?) >>> >>> I just want to make sure that you're not suggesting we should optimize >>> away self-assignment checks in the general case. >> Right, I'm not suggesting that :-) >> >> The code I looked at went something like this: >> >> struct S { >> S& operator=(const S& other) { >> if (&other != this) >> val = other.val; >> return *this; >> } >> void foo(); >> int val; >> }; >> void S::foo() { >> S tmp; >> tmp.val = 42; >> *this = tmp; // operator= gets inlined; we should know(?) that &tmp != this >> } >> >> This is of course a silly example, but with GCC we get: >> >> movl $42, (%rdi) >> ret >> >> whereas Clang generates: >> >> movl $42, -8(%rsp) >> leaq -8(%rsp), %rax >> cmpq %rdi, %rax >> je .LBB0_2 >> movl $42, (%rdi) >> .LBB0_2: >> retq >> >> which made me sad. > Ah, yes, this makes perfect sense to me. Thank you for the explanation! > > ~Aaron > >> >>>> I'd be interested to hear what those with a stronger understanding of >>>> the standard than myself think about this, and also if there is any >>>> example of something that could break because of this optimization. If >>>> not, I'd like us to optimize it :-) >>>> >>>> >>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> wrote: >>>>> Author: djg >>>>> Date: Thu Jan 31 18:49:06 2013 >>>>> New Revision: 174131 >>>>> >>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev >>>>> Log: >>>>> Add a comment explaining an unavailable optimization. >>>>> >>>>> Modified: >>>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>> >>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff >>>>> =============================================================================>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) >>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 18:49:06 2013 >>>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition >>>>> return 0; >>>>> } >>>>> >>>>> +// A significant optimization not implemented here is assuming that alloca >>>>> +// addresses are not equal to incoming argument values. They don't *alias*, >>>>> +// as we say, but that doesn't mean they aren't equal, so we take a >>>>> +// conservative approach. >>>>> +// >>>>> +// This is inspired in part by C++11 5.10p1: >>>>> +// "Two pointers of the same type compare equal if and only if they are both >>>>> +// null, both point to the same function, or both represent the same >>>>> +// address." >>>>> +// >>>>> +// This is pretty permissive. >>>> Indeed :-/ >>>> >>>>> +// It's also partly due to C11 6.5.9p6: >>>>> +// "Two pointers compare equal if and only if both are null pointers, both are >>>>> +// pointers to the same object (including a pointer to an object and a >>>>> +// subobject at its beginning) or function, both are pointers to one past the >>>>> +// last element of the same array object, or one is a pointer to one past the >>>>> +// end of one array object and the other is a pointer to the start of a >>>>> +// different array object that happens to immediately follow the ï¬ rst array >>>>> +// object in the address space.) >>>>> +// >>>>> +// C11's version is more restrictive, however there's no reason why an argument >>>>> +// couldn't be a one-past-the-end value for a stack object in the caller and be >>>>> +// equal to the beginning of a stack object in the callee. >>>> This is interesting. >>>> >>>> For the one-past-the-end pointer to point into the callee, the stack >>>> would have to be growing upwards. So this won't happen on X86. Can we >>>> turn this optimization on for downward-growing-stack targets? >>>> >>>> Second, if the stack grows upward, and the function argument does >>>> point into the callee stack frame, "p" and "&x" could have the same >>>> contents. So per the "represent the same address" part above, they >>>> should compare equal? But they're noalias? Are we allowed to write >>>> through p? It wasn't a pointer to a valid object when we made the >>>> call, but it became valid in the callee? This is all terrifying. >>>> >>>> I suppose one could store the value of &x though, and then use it >>>> again later, i.e.: >>>> >>>> int *global; >>>> int f(int *p) { >>>> int x; >>>> global = &x; >>>> return p == &x; >>>> } >>>> int g() { >>>> f(0); >>>> return f(global); >>>> } >>>> >>>> Is g() guaranteed to return 1 here? Maybe we could claim it's >>>> implementation dependent? GCC does not seem fold p==&x to 0 here. I >>>> suppose we could make sure to check whether &x escapes the function? >>>> >>>> - Hans > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- A non-text attachment was scrubbed... Name: AllocaEscape.diff Type: text/x-diff Size: 1933 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150924/6f97cc16/attachment.diff>
Hans Wennborg via llvm-dev
2015-Sep-25 01:04 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
I tried your patch on a Clang build to see if it would fire. It reduced the size of a bootstrap with 8500 bytes. Not huge, but it seems like a nice improvement. And maybe it could be made more powerful: not just checking if the address is a param or alloca, but an address based on such values. On Thu, Sep 24, 2015 at 2:55 PM, Philip Reames <listmail at philipreames.com> wrote:> I threw together a patch which implements this (attached.) If we decide > that this is actually a legal transform, I'm happy to post this for review. > > In addition to the version proposed here, I also implemented a case where a > trivially escaped alloca's address is not equal to any other value. I > believe both are valid, but we should confirm. > > Philip > > > On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote: >> >> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org> wrote: >>> >>> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <aaron at aaronballman.com> >>> wrote: >>>> >>>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> >>>> wrote: >>>>> >>>>> I was wondering why LLVM cannot optimize this code (which GCC does >>>>> optimize): >>>>> >>>>> int f(int *p) { int x; return p == &x; } >>>>> >>>>> it would seem that this must always return 0. (This occurs as a >>>>> self-assignment check in the code I was looking at; I was hoping we >>>>> could fold that check away.) >>>> >>>> This is different than a self-assignment check, is it not? >>>> >>>> blah& operator=(const blah &b) { >>>> if (&b == this) {} >>>> // ... >>>> } >>>> >>>> (Because it gets the pointer from the parameter and compares against a >>>> "local" pointer?) >>>> >>>> I just want to make sure that you're not suggesting we should optimize >>>> away self-assignment checks in the general case. >>> >>> Right, I'm not suggesting that :-) >>> >>> The code I looked at went something like this: >>> >>> struct S { >>> S& operator=(const S& other) { >>> if (&other != this) >>> val = other.val; >>> return *this; >>> } >>> void foo(); >>> int val; >>> }; >>> void S::foo() { >>> S tmp; >>> tmp.val = 42; >>> *this = tmp; // operator= gets inlined; we should know(?) that &tmp >>> != this >>> } >>> >>> This is of course a silly example, but with GCC we get: >>> >>> movl $42, (%rdi) >>> ret >>> >>> whereas Clang generates: >>> >>> movl $42, -8(%rsp) >>> leaq -8(%rsp), %rax >>> cmpq %rdi, %rax >>> je .LBB0_2 >>> movl $42, (%rdi) >>> .LBB0_2: >>> retq >>> >>> which made me sad. >> >> Ah, yes, this makes perfect sense to me. Thank you for the explanation! >> >> ~Aaron >> >>> >>>>> I'd be interested to hear what those with a stronger understanding of >>>>> the standard than myself think about this, and also if there is any >>>>> example of something that could break because of this optimization. If >>>>> not, I'd like us to optimize it :-) >>>>> >>>>> >>>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> >>>>> wrote: >>>>>> >>>>>> Author: djg >>>>>> Date: Thu Jan 31 18:49:06 2013 >>>>>> New Revision: 174131 >>>>>> >>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev >>>>>> Log: >>>>>> Add a comment explaining an unavailable optimization. >>>>>> >>>>>> Modified: >>>>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>>> >>>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>>> URL: >>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff >>>>>> >>>>>> =============================================================================>>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) >>>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 >>>>>> 18:49:06 2013 >>>>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition >>>>>> return 0; >>>>>> } >>>>>> >>>>>> +// A significant optimization not implemented here is assuming that >>>>>> alloca >>>>>> +// addresses are not equal to incoming argument values. They don't >>>>>> *alias*, >>>>>> +// as we say, but that doesn't mean they aren't equal, so we take a >>>>>> +// conservative approach. >>>>>> +// >>>>>> +// This is inspired in part by C++11 5.10p1: >>>>>> +// "Two pointers of the same type compare equal if and only if they >>>>>> are both >>>>>> +// null, both point to the same function, or both represent the >>>>>> same >>>>>> +// address." >>>>>> +// >>>>>> +// This is pretty permissive. >>>>> >>>>> Indeed :-/ >>>>> >>>>>> +// It's also partly due to C11 6.5.9p6: >>>>>> +// "Two pointers compare equal if and only if both are null >>>>>> pointers, both are >>>>>> +// pointers to the same object (including a pointer to an object >>>>>> and a >>>>>> +// subobject at its beginning) or function, both are pointers to >>>>>> one past the >>>>>> +// last element of the same array object, or one is a pointer to >>>>>> one past the >>>>>> +// end of one array object and the other is a pointer to the start >>>>>> of a >>>>>> +// different array object that happens to immediately follow the >>>>>> ï¬ rst array >>>>>> +// object in the address space.) >>>>>> +// >>>>>> +// C11's version is more restrictive, however there's no reason why >>>>>> an argument >>>>>> +// couldn't be a one-past-the-end value for a stack object in the >>>>>> caller and be >>>>>> +// equal to the beginning of a stack object in the callee. >>>>> >>>>> This is interesting. >>>>> >>>>> For the one-past-the-end pointer to point into the callee, the stack >>>>> would have to be growing upwards. So this won't happen on X86. Can we >>>>> turn this optimization on for downward-growing-stack targets? >>>>> >>>>> Second, if the stack grows upward, and the function argument does >>>>> point into the callee stack frame, "p" and "&x" could have the same >>>>> contents. So per the "represent the same address" part above, they >>>>> should compare equal? But they're noalias? Are we allowed to write >>>>> through p? It wasn't a pointer to a valid object when we made the >>>>> call, but it became valid in the callee? This is all terrifying. >>>>> >>>>> I suppose one could store the value of &x though, and then use it >>>>> again later, i.e.: >>>>> >>>>> int *global; >>>>> int f(int *p) { >>>>> int x; >>>>> global = &x; >>>>> return p == &x; >>>>> } >>>>> int g() { >>>>> f(0); >>>>> return f(global); >>>>> } >>>>> >>>>> Is g() guaranteed to return 1 here? Maybe we could claim it's >>>>> implementation dependent? GCC does not seem fold p==&x to 0 here. I >>>>> suppose we could make sure to check whether &x escapes the function? >>>>> >>>>> - Hans >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >
David Majnemer via llvm-dev
2015-Sep-25 15:22 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
FWIW, consider the following: $ cat t.c extern "C" int printf(const char *, ...); struct S { int x[]; }; void f(S a) { S b; printf("%p %p\n", &a, &b); } int main() { S s; f(s); } $ clang -O2 t.cpp -o t && ./t 0x7fff5e75a8e8 0x7fff5e75a8e8 Perhaps Dan was concerned about zero sized objects? On Thu, Sep 24, 2015 at 6:04 PM, Hans Wennborg via llvm-dev < llvm-dev at lists.llvm.org> wrote:> I tried your patch on a Clang build to see if it would fire. It > reduced the size of a bootstrap with 8500 bytes. Not huge, but it > seems like a nice improvement. And maybe it could be made more > powerful: not just checking if the address is a param or alloca, but > an address based on such values. > > On Thu, Sep 24, 2015 at 2:55 PM, Philip Reames > <listmail at philipreames.com> wrote: > > I threw together a patch which implements this (attached.) If we decide > > that this is actually a legal transform, I'm happy to post this for > review. > > > > In addition to the version proposed here, I also implemented a case > where a > > trivially escaped alloca's address is not equal to any other value. I > > believe both are valid, but we should confirm. > > > > Philip > > > > > > On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote: > >> > >> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org> > wrote: > >>> > >>> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman < > aaron at aaronballman.com> > >>> wrote: > >>>> > >>>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> > >>>> wrote: > >>>>> > >>>>> I was wondering why LLVM cannot optimize this code (which GCC does > >>>>> optimize): > >>>>> > >>>>> int f(int *p) { int x; return p == &x; } > >>>>> > >>>>> it would seem that this must always return 0. (This occurs as a > >>>>> self-assignment check in the code I was looking at; I was hoping we > >>>>> could fold that check away.) > >>>> > >>>> This is different than a self-assignment check, is it not? > >>>> > >>>> blah& operator=(const blah &b) { > >>>> if (&b == this) {} > >>>> // ... > >>>> } > >>>> > >>>> (Because it gets the pointer from the parameter and compares against a > >>>> "local" pointer?) > >>>> > >>>> I just want to make sure that you're not suggesting we should optimize > >>>> away self-assignment checks in the general case. > >>> > >>> Right, I'm not suggesting that :-) > >>> > >>> The code I looked at went something like this: > >>> > >>> struct S { > >>> S& operator=(const S& other) { > >>> if (&other != this) > >>> val = other.val; > >>> return *this; > >>> } > >>> void foo(); > >>> int val; > >>> }; > >>> void S::foo() { > >>> S tmp; > >>> tmp.val = 42; > >>> *this = tmp; // operator= gets inlined; we should know(?) that > &tmp > >>> != this > >>> } > >>> > >>> This is of course a silly example, but with GCC we get: > >>> > >>> movl $42, (%rdi) > >>> ret > >>> > >>> whereas Clang generates: > >>> > >>> movl $42, -8(%rsp) > >>> leaq -8(%rsp), %rax > >>> cmpq %rdi, %rax > >>> je .LBB0_2 > >>> movl $42, (%rdi) > >>> .LBB0_2: > >>> retq > >>> > >>> which made me sad. > >> > >> Ah, yes, this makes perfect sense to me. Thank you for the explanation! > >> > >> ~Aaron > >> > >>> > >>>>> I'd be interested to hear what those with a stronger understanding of > >>>>> the standard than myself think about this, and also if there is any > >>>>> example of something that could break because of this optimization. > If > >>>>> not, I'd like us to optimize it :-) > >>>>> > >>>>> > >>>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> > >>>>> wrote: > >>>>>> > >>>>>> Author: djg > >>>>>> Date: Thu Jan 31 18:49:06 2013 > >>>>>> New Revision: 174131 > >>>>>> > >>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev > >>>>>> Log: > >>>>>> Add a comment explaining an unavailable optimization. > >>>>>> > >>>>>> Modified: > >>>>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp > >>>>>> > >>>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp > >>>>>> URL: > >>>>>> > http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff > >>>>>> > >>>>>> > =============================================================================> >>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) > >>>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 > >>>>>> 18:49:06 2013 > >>>>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition > >>>>>> return 0; > >>>>>> } > >>>>>> > >>>>>> +// A significant optimization not implemented here is assuming that > >>>>>> alloca > >>>>>> +// addresses are not equal to incoming argument values. They don't > >>>>>> *alias*, > >>>>>> +// as we say, but that doesn't mean they aren't equal, so we take a > >>>>>> +// conservative approach. > >>>>>> +// > >>>>>> +// This is inspired in part by C++11 5.10p1: > >>>>>> +// "Two pointers of the same type compare equal if and only if > they > >>>>>> are both > >>>>>> +// null, both point to the same function, or both represent the > >>>>>> same > >>>>>> +// address." > >>>>>> +// > >>>>>> +// This is pretty permissive. > >>>>> > >>>>> Indeed :-/ > >>>>> > >>>>>> +// It's also partly due to C11 6.5.9p6: > >>>>>> +// "Two pointers compare equal if and only if both are null > >>>>>> pointers, both are > >>>>>> +// pointers to the same object (including a pointer to an object > >>>>>> and a > >>>>>> +// subobject at its beginning) or function, both are pointers to > >>>>>> one past the > >>>>>> +// last element of the same array object, or one is a pointer to > >>>>>> one past the > >>>>>> +// end of one array object and the other is a pointer to the > start > >>>>>> of a > >>>>>> +// different array object that happens to immediately follow the > >>>>>> ï¬ rst array > >>>>>> +// object in the address space.) > >>>>>> +// > >>>>>> +// C11's version is more restrictive, however there's no reason why > >>>>>> an argument > >>>>>> +// couldn't be a one-past-the-end value for a stack object in the > >>>>>> caller and be > >>>>>> +// equal to the beginning of a stack object in the callee. > >>>>> > >>>>> This is interesting. > >>>>> > >>>>> For the one-past-the-end pointer to point into the callee, the stack > >>>>> would have to be growing upwards. So this won't happen on X86. Can we > >>>>> turn this optimization on for downward-growing-stack targets? > >>>>> > >>>>> Second, if the stack grows upward, and the function argument does > >>>>> point into the callee stack frame, "p" and "&x" could have the same > >>>>> contents. So per the "represent the same address" part above, they > >>>>> should compare equal? But they're noalias? Are we allowed to write > >>>>> through p? It wasn't a pointer to a valid object when we made the > >>>>> call, but it became valid in the callee? This is all terrifying. > >>>>> > >>>>> I suppose one could store the value of &x though, and then use it > >>>>> again later, i.e.: > >>>>> > >>>>> int *global; > >>>>> int f(int *p) { > >>>>> int x; > >>>>> global = &x; > >>>>> return p == &x; > >>>>> } > >>>>> int g() { > >>>>> f(0); > >>>>> return f(global); > >>>>> } > >>>>> > >>>>> Is g() guaranteed to return 1 here? Maybe we could claim it's > >>>>> implementation dependent? GCC does not seem fold p==&x to 0 here. I > >>>>> suppose we could make sure to check whether &x escapes the function? > >>>>> > >>>>> - Hans > >> > >> _______________________________________________ > >> LLVM Developers mailing list > >> llvm-dev at lists.llvm.org > >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150925/6fe0c38f/attachment.html>
Philip Reames via llvm-dev
2015-Sep-25 20:26 UTC
[llvm-dev] Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
On 09/24/2015 06:04 PM, Hans Wennborg wrote:> I tried your patch on a Clang build to see if it would fire. It > reduced the size of a bootstrap with 8500 bytes. Not huge, but it > seems like a nice improvement. And maybe it could be made more > powerful: not just checking if the address is a param or alloca, but > an address based on such values.Yeah, I realized after posting that this should be integrated with the existing noalias handling in the pointer icmp routine. However, let's defer that until we decide if this is actually correct to implement. :)> > On Thu, Sep 24, 2015 at 2:55 PM, Philip Reames > <listmail at philipreames.com> wrote: >> I threw together a patch which implements this (attached.) If we decide >> that this is actually a legal transform, I'm happy to post this for review. >> >> In addition to the version proposed here, I also implemented a case where a >> trivially escaped alloca's address is not equal to any other value. I >> believe both are valid, but we should confirm. >> >> Philip >> >> >> On 09/24/2015 02:34 PM, Aaron Ballman via llvm-dev wrote: >>> On Thu, Sep 24, 2015 at 5:15 PM, Hans Wennborg <hans at chromium.org> wrote: >>>> On Thu, Sep 24, 2015 at 12:06 PM, Aaron Ballman <aaron at aaronballman.com> >>>> wrote: >>>>> On Thu, Sep 24, 2015 at 2:42 PM, Hans Wennborg <hans at chromium.org> >>>>> wrote: >>>>>> I was wondering why LLVM cannot optimize this code (which GCC does >>>>>> optimize): >>>>>> >>>>>> int f(int *p) { int x; return p == &x; } >>>>>> >>>>>> it would seem that this must always return 0. (This occurs as a >>>>>> self-assignment check in the code I was looking at; I was hoping we >>>>>> could fold that check away.) >>>>> This is different than a self-assignment check, is it not? >>>>> >>>>> blah& operator=(const blah &b) { >>>>> if (&b == this) {} >>>>> // ... >>>>> } >>>>> >>>>> (Because it gets the pointer from the parameter and compares against a >>>>> "local" pointer?) >>>>> >>>>> I just want to make sure that you're not suggesting we should optimize >>>>> away self-assignment checks in the general case. >>>> Right, I'm not suggesting that :-) >>>> >>>> The code I looked at went something like this: >>>> >>>> struct S { >>>> S& operator=(const S& other) { >>>> if (&other != this) >>>> val = other.val; >>>> return *this; >>>> } >>>> void foo(); >>>> int val; >>>> }; >>>> void S::foo() { >>>> S tmp; >>>> tmp.val = 42; >>>> *this = tmp; // operator= gets inlined; we should know(?) that &tmp >>>> != this >>>> } >>>> >>>> This is of course a silly example, but with GCC we get: >>>> >>>> movl $42, (%rdi) >>>> ret >>>> >>>> whereas Clang generates: >>>> >>>> movl $42, -8(%rsp) >>>> leaq -8(%rsp), %rax >>>> cmpq %rdi, %rax >>>> je .LBB0_2 >>>> movl $42, (%rdi) >>>> .LBB0_2: >>>> retq >>>> >>>> which made me sad. >>> Ah, yes, this makes perfect sense to me. Thank you for the explanation! >>> >>> ~Aaron >>> >>>>>> I'd be interested to hear what those with a stronger understanding of >>>>>> the standard than myself think about this, and also if there is any >>>>>> example of something that could break because of this optimization. If >>>>>> not, I'd like us to optimize it :-) >>>>>> >>>>>> >>>>>> On Thu, Jan 31, 2013 at 4:49 PM, Dan Gohman <dan433584 at gmail.com> >>>>>> wrote: >>>>>>> Author: djg >>>>>>> Date: Thu Jan 31 18:49:06 2013 >>>>>>> New Revision: 174131 >>>>>>> >>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=174131&view=rev >>>>>>> Log: >>>>>>> Add a comment explaining an unavailable optimization. >>>>>>> >>>>>>> Modified: >>>>>>> llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>>>> >>>>>>> Modified: llvm/trunk/lib/Analysis/InstructionSimplify.cpp >>>>>>> URL: >>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?rev=174131&r1=174130&r2=174131&view=diff >>>>>>> >>>>>>> =============================================================================>>>>>>> --- llvm/trunk/lib/Analysis/InstructionSimplify.cpp (original) >>>>>>> +++ llvm/trunk/lib/Analysis/InstructionSimplify.cpp Thu Jan 31 >>>>>>> 18:49:06 2013 >>>>>>> @@ -1688,6 +1688,34 @@ static Value *ExtractEquivalentCondition >>>>>>> return 0; >>>>>>> } >>>>>>> >>>>>>> +// A significant optimization not implemented here is assuming that >>>>>>> alloca >>>>>>> +// addresses are not equal to incoming argument values. They don't >>>>>>> *alias*, >>>>>>> +// as we say, but that doesn't mean they aren't equal, so we take a >>>>>>> +// conservative approach. >>>>>>> +// >>>>>>> +// This is inspired in part by C++11 5.10p1: >>>>>>> +// "Two pointers of the same type compare equal if and only if they >>>>>>> are both >>>>>>> +// null, both point to the same function, or both represent the >>>>>>> same >>>>>>> +// address." >>>>>>> +// >>>>>>> +// This is pretty permissive. >>>>>> Indeed :-/ >>>>>> >>>>>>> +// It's also partly due to C11 6.5.9p6: >>>>>>> +// "Two pointers compare equal if and only if both are null >>>>>>> pointers, both are >>>>>>> +// pointers to the same object (including a pointer to an object >>>>>>> and a >>>>>>> +// subobject at its beginning) or function, both are pointers to >>>>>>> one past the >>>>>>> +// last element of the same array object, or one is a pointer to >>>>>>> one past the >>>>>>> +// end of one array object and the other is a pointer to the start >>>>>>> of a >>>>>>> +// different array object that happens to immediately follow the >>>>>>> ï¬ rst array >>>>>>> +// object in the address space.) >>>>>>> +// >>>>>>> +// C11's version is more restrictive, however there's no reason why >>>>>>> an argument >>>>>>> +// couldn't be a one-past-the-end value for a stack object in the >>>>>>> caller and be >>>>>>> +// equal to the beginning of a stack object in the callee. >>>>>> This is interesting. >>>>>> >>>>>> For the one-past-the-end pointer to point into the callee, the stack >>>>>> would have to be growing upwards. So this won't happen on X86. Can we >>>>>> turn this optimization on for downward-growing-stack targets? >>>>>> >>>>>> Second, if the stack grows upward, and the function argument does >>>>>> point into the callee stack frame, "p" and "&x" could have the same >>>>>> contents. So per the "represent the same address" part above, they >>>>>> should compare equal? But they're noalias? Are we allowed to write >>>>>> through p? It wasn't a pointer to a valid object when we made the >>>>>> call, but it became valid in the callee? This is all terrifying. >>>>>> >>>>>> I suppose one could store the value of &x though, and then use it >>>>>> again later, i.e.: >>>>>> >>>>>> int *global; >>>>>> int f(int *p) { >>>>>> int x; >>>>>> global = &x; >>>>>> return p == &x; >>>>>> } >>>>>> int g() { >>>>>> f(0); >>>>>> return f(global); >>>>>> } >>>>>> >>>>>> Is g() guaranteed to return 1 here? Maybe we could claim it's >>>>>> implementation dependent? GCC does not seem fold p==&x to 0 here. I >>>>>> suppose we could make sure to check whether &x escapes the function? >>>>>> >>>>>> - Hans >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>
Reasonably Related Threads
- Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
- Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
- Comparing stack addresses and function args (Was: [llvm] r174131 - Add a comment explaining an unavailable optimization)
- Hitting assertion failure related to vectorization + instcombine
- Hitting assertion failure related to vectorization + instcombine