Chris Lattner
2008-May-01 04:26 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Apr 30, 2008, at 8:51 PM, David Vandevoorde wrote:>> This isn't safe in general unless you can (tightly) bound "n". You >> don't want to overflow the stack. > > Ah yes, of course. Does LLVM do this for known & small constant n?We don't do this currently, primarily because I haven't seen a case where it is a win yet: it would be very easy to do for some cases. The trick with this is that you have to prove a couple of interesting properties of the program. For example, you effectively have to prove that the malloc can only be executed once for each invocation of the function. You don't want to turn something like this: for () malloc(12) // perhaps a linked list. Into unbounded stack allocation (even though it would work as long as you don't run out of stack). There are interesting cases that could be caught like: for () { p = malloc(42) use(p); free(p); } etc, which could also be done. In this case, the alloca could even be a fixed alloca coded into the prolog of the function, not a dynamic alloca. Personally to me, I have a bigger axe to grind with C++ operator new. AFAIK, the standard doesn't give leeway to do a number of interesting optimizations for new/delete because the user is explicitly allowed to override them and the std doesn't require them to behave "as expected". Very interesting properties to me would be: 1) Safety to remove "delete (new int);" and friends. 2) Aliasing guarantees about the result of new. There are a huge number of code pessimizations that happen because the optimizer has to assume that 'new' can return a pointer that already exists in the program. 3) Lifetime guarantees. It would be really nice to be able to delete the store to X in: " double *X = ...; *X = 4.0; delete X;" which is safe with 'free'. etc. A lot of nice guarantees that we have with malloc/free aren't available with new/delete. Also, since new/delete can be overridden at any time (as late as runtime with LD_PRELOAD and friends), there is really no way the compiler can assume anything spiffy about new/delete except with some magic "standards violation is ok" compiler option, which is gross. Any thoughts on how to improve this situation? -Chris
David Vandevoorde
2008-May-01 14:54 UTC
[LLVMdev] optimization assumes malloc return is non-null
On May 1, 2008, at 12:26 AM, Chris Lattner wrote:> On Apr 30, 2008, at 8:51 PM, David Vandevoorde wrote: >>> This isn't safe in general unless you can (tightly) bound "n". You >>> don't want to overflow the stack. >> >> Ah yes, of course. Does LLVM do this for known & small constant n? > > We don't do this currently, primarily because I haven't seen a case > where it is a win yet: it would be very easy to do for some cases. > The trick with this is that you have to prove a couple of interesting > properties of the program. For example, you effectively have to prove > that the malloc can only be executed once for each invocation of the > function. You don't want to turn something like this: > > for () > malloc(12) // perhaps a linked list. > > Into unbounded stack allocation (even though it would work as long as > you don't run out of stack). > > There are interesting cases that could be caught like: > > for () { > p = malloc(42) > use(p); > free(p); > } > > etc, which could also be done.Yes, that's what I was thinking. In particular, I had hoped this could happen as the result of inlining certain constructor/destructor pairs; e.g.: ... { std::string s... } (You'd also have to be able to limit the string length, but I think there are quite a few cases where that is doable.) Unfortunately, I'd forgotten that the default allocator must use storage returned by ::operator new(std::size_t).> In this case, the alloca could even be > a fixed alloca coded into the prolog of the function, not a dynamic > alloca.Right.> Personally to me, I have a bigger axe to grind with C++ operator new. > AFAIK, the standard doesn't give leeway to do a number of interesting > optimizations for new/delete because the user is explicitly allowed to > override them and the std doesn't require them to behave "as > expected". Very interesting properties to me would be: > > 1) Safety to remove "delete (new int);" and friends.That is indeed a real problem: Replacement "new" can have observable side-effects (such as writing to the console), and currently there is no latitude to elide those.> 2) Aliasing guarantees about the result of new. There are a huge > number of code pessimizations that happen because the optimizer has to > assume that 'new' can return a pointer that already exists in the > program.Since TC1 (i.e., the 2003 standard) the following constraint exists (3.7.3.1/3): If the request succeeds, the value returned shall be a non-null pointer value (4.10) different from any previously returned value p1, unless that value p1 was since passed to an operator delete. That's clearly not enough, but maybe one could argue that the missing bit is an oversight? Have you talked to Howard Hinnant about this by any chance? (Howard is Apple's J16 rep, and he is the library subgroup chair; this constraint was added at the request of the library subgroup)> > 3) Lifetime guarantees. It would be really nice to be able to delete > the store to X in: " double *X = ...; *X = 4.0; delete X;" which is > safe with 'free'.The object lifetime rules (3.8 in the 2003 standard) enable that one I think.> etc. A lot of nice guarantees that we have with malloc/free aren't > available with new/delete. Also, since new/delete can be overridden > at any time (as late as runtime with LD_PRELOAD and friends), there is > really no way the compiler can assume anything spiffy about new/delete > except with some magic "standards violation is ok" compiler option, > which is gross. > > Any thoughts on how to improve this situation?Well, the spec has to change ;-) If we had talked about this two years ago, we'd be in much better shape to get that done for C++0x. Now, the committee is in wrap-up mode, and new proposals aren't very welcome. The aliasing guarantees issue above (2) could conceivably be smuggled in as a defect report resolution ("it's not a new feature; it's a fix for a wording bug"). Item (1) (and (3), if needed) could gain some traction if a strong case could be made that this is a weakness vs. C in practice. I.e., you'd want to show a realistic C example and the corresponding C++ example, and demonstrate that an existing "typical" optimizer (LLVM presumably) working on both versions delivers significantly better performance on the C version. I think I've seen you post to the WG21+J16 reflector before? If you have access to it, you could put out a feeler for this issue, including asking whether the lifetime optimization example (3 above) is valid in C++03 and/or C++0x (I believe there have been some changes in this area since C++03). Daveed
On Apr 30, 2008, at 9:26 PM, Chris Lattner wrote:> Personally to me, I have a bigger axe to grind with C++ operator new. > AFAIK, the standard doesn't give leeway to do a number of interesting > optimizations for new/delete because the user is explicitly allowed to > override them and the std doesn't require them to behave "as > expected".Yes it does: Replaceable: a C++ program may define a function with this function signature that displaces the default version defined by the C++ Standard library. Required behavior: Return a non-null pointer to suitably aligned storage (_basic.stc.dynamic_), or else throw a bad_alloc exception. This requirement is binding on a replacement version of this function. These _are_ the defined semantics. You cannot imagine any other semantic and substitute it. Any program that does, is outside the scope of the standard and has _no_ constraint upon it. Meaning, the compiler free to do anything, technically. The latitude exists to do exactly what you want to do as in intelligent optimization person. You can see that this is what the standard mandates by imagining what would happen if you had the allocator return the same pointer to the same memory all the time that was maximally aligned. What does cout << "Hi" << "there" do, or some other complex program that underneath wanted to use the allocator? Either, they _have_ to work, which is impossible and nonsensical, or there has to be enough latitude in the standard to allow the code to not work. The requirements on the replacement versions of the allocators is the constraint that provides the mechanism to decide wether your program is going to work. If you behaves exactly as if you allocated suitably aligned storage and return it or throw bad_alloc, then, the execution of the program is as constrained by the abstract semantics. If not then _all_ requirements of the standard are vacated. In particular, printf("Hi") in the allocator means the standard places no requirements on the program. You can print Hi, not print it, print it twice, whatever you want.> Very interesting properties to me would be: > > 1) Safety to remove "delete (new int);" and friends.You are free to do that.> 2) Aliasing guarantees about the result of new.The replacements have their behavior required (you can also read this as constrained). They are required to allocate new memory, and aliasing falls out from that.> There are a huge number of code pessimizations that happen because > the optimizer has to > assume that 'new' can return a pointer that already exists in the > program.This one is safe. It has to behaves as if it allocates storage. If you can detect it didn't, there are no requirements placed upon the semantics of the code.> 3) Lifetime guarantees. It would be really nice to be able to delete > the store to X in: " double *X = ...; *X = 4.0; delete X;" which is > safe with 'free'.Again, you can't inspect the value as there is no way to have those semantics without also then violating the required semantics of the deallocator and if you did that, then that standard places no requirements on the program, so net result you are free to omit the store.> A lot of nice guarantees that we have with malloc/free aren't > available with new/delete. Also, since new/delete can be overridden > at any time (as late as runtime with LD_PRELOAD and friends),3 The program's definitions are used instead of the default versions supplied by the implementation (_dcl.fct.def_). Such replacement occurs prior to program startup (_basic.def.odr_, _basic.start_). So, the replacement is done before start, if later, there are no requirements. And, the replacement has known semantics.
David Vandevoorde
2008-May-01 22:39 UTC
[LLVMdev] optimization assumes malloc return is non-null
(Hi Mike!) On May 1, 2008, at 6:11 PM, Mike Stump wrote:> On Apr 30, 2008, at 9:26 PM, Chris Lattner wrote: >> Personally to me, I have a bigger axe to grind with C++ operator new. >> AFAIK, the standard doesn't give leeway to do a number of interesting >> optimizations for new/delete because the user is explicitly allowed >> to >> override them and the std doesn't require them to behave "as >> expected". > > Yes it does: > > Replaceable: > a C++ program may define a function with this function > signature > that displaces the default version defined by the C++ > Standard > library. > Required behavior: > Return a non-null pointer to suitably aligned > storage > (_basic.stc.dynamic_), or else throw a bad_alloc exception. > This > requirement is binding on a replacement version of this function. > > These _are_ the defined semantics. You cannot imagine any other > semantic and substitute it. Any program that does, is outside the > scope of the standard and has _no_ constraint upon it. Meaning, the > compiler free to do anything, technically.Not quite. Although there is a requirement there (and more precise ones in Clause 3), there is no prohibition against doing additional, observable stuff (e.g., log the calls) and hence allocations cannot be elided. (I'll repeat some of my notes to Chris earlier today.)> The latitude exists to do exactly what you want to do as in > intelligent optimization person. > > You can see that this is what the standard mandates by imagining what > would happen if you had the allocator return the same pointer to the > same memory all the time that was maximally aligned. What does cout > << "Hi" << "there" do, or some other complex program that underneath > wanted to use the allocator? Either, they _have_ to work, which is > impossible and nonsensical, or there has to be enough latitude in the > standard to allow the code to not work. The requirements on the > replacement versions of the allocators is the constraint that provides > the mechanism to decide wether your program is going to work. If you > behaves exactly as if you allocated suitably aligned storage and > return it or throw bad_alloc, then, the execution of the program is as > constrained by the abstract semantics. If not then _all_ requirements > of the standard are vacated. In particular, printf("Hi") in the > allocator means the standard places no requirements on the program. > You can print Hi, not print it, print it twice, whatever you want. > >> Very interesting properties to me would be: >> >> 1) Safety to remove "delete (new int);" and friends. > > You are free to do that.No, because both "delete" and "new" may have side effects. The standard does not currenty prohibit those, and does not give latitude (as e.g. it does for call to construct copies of temporaries) to elide those effects.>> 2) Aliasing guarantees about the result of new. > > The replacements have their behavior required (you can also read this > as constrained). They are required to allocate new memory, and > aliasing falls out from that.It's not equivalent. Here are the words: "If the request succeeds, the value returned shall be a non-null pointer value (4.10) different from any previously returned value p1, unless that value p1 was since passed to an operator delete." The problem is that with a user-written operator new, there may be other valid ways (besides operator new) to get at the storage returned. The object lifetime rules don't always give such programs undefined behavior. (I personally consider this a defect in the standard, but I don't think that's unarguable.)> >> There are a huge number of code pessimizations that happen because >> the optimizer has to >> assume that 'new' can return a pointer that already exists in the >> program. > > This one is safe. It has to behaves as if it allocates storage. If > you can detect it didn't, there are no requirements placed upon the > semantics of the code.Which part of the standard would I be missing that implies so?>> 3) Lifetime guarantees. It would be really nice to be able to delete >> the store to X in: " double *X = ...; *X = 4.0; delete X;" which is >> safe with 'free'. > > Again, you can't inspect the value as there is no way to have those > semantics without also then violating the required semantics of the > deallocator and if you did that, then that standard places no > requirements on the program, so net result you are free to omit the > store.I don't think it's the deallocator that makes this work; it may be the object lifetime rules. I'm pretty sure it would work if instead of "double", a type with a destructor was used. Worth investigating?>> A lot of nice guarantees that we have with malloc/free aren't >> available with new/delete. Also, since new/delete can be overridden >> at any time (as late as runtime with LD_PRELOAD and friends), > > 3 The program's definitions are used instead of the default > versions > supplied by the implementation (_dcl.fct.def_). Such > replacement > occurs prior to program startup (_basic.def.odr_, _basic.start_). > > So, the replacement is done before start, if later, there are no > requirements. And, the replacement has known semantics.But isn't that still too late? The optimizer often must make decisions way before that. (Plus, even though the standard currently fails to address dynamic libraries, in practice implementations must keep things working right.) Daveed
Reasonably Related Threads
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null