Jonathan S. Shapiro
2008-Apr-30 21:20 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Wed, 2008-04-30 at 15:25 -0400, David Vandevoorde wrote:> On Apr 30, 2008, at 2:47 PM, Jonathan S. Shapiro wrote: > > Daveed: > > > > Perhaps I am looking at the wrong version of the specification. > > Section > > 5.1.2.3 appears to refer to objects having volatile-qualified type. > > The > > type of malloc() is not volatile qualified in the standard library > > definition. > > ...malloc() is not specified to access a volatile > object, modify an object, or modifying a file (directly or > indirectly); i.e., it has no side effect from the language point of > view.Daveed: Good to know that I was looking at the correct section. I do not agree that your interpretation follows the as-if rule, because I do not agree with your interpretation of the C library specification of malloc(). The standard library specification of malloc() clearly requires that it allocates storage, and that such allocation is contingent on storage availability. Storage availability is, in turn, a function (in part) of previous calls to malloc() and free(). Even if free() is not called, the possibility of realloc() implies a need to retain per-malloc() state. In either case, it follows immediately that malloc() is stateful, and therefore that any conforming implementation of malloc() must modify at least one object in the sense of the standard. If I understand your position correctly, your justification for the optimization is that the C library standard does not say in so many words that malloc() modifies an object. I do not believe that any such overt statement is required in order for it to be clear that malloc() is stateful. The functional description of malloc() and free() clearly cannot be satisfied under the C abstract machine without mutation of at least one object. Also, I do not read 5.1.2.3 in the way that you do. Paragraph 2 defines "side effect", but it does not imply any requirement that side effects be explicitly annotated. What Paragraph 3 gives you is leeway to optimize standard functions when you proactively know their behavior. A standard library procedure is not side-effect free for optimization purposes by virtue of the absence of annotation. It can only be treated as side-effect free by virtue of proactive knowledge of the implementation of the procedure. In this case, we clearly have knowledge of the implementation of malloc, and that knowledge clearly precludes any possibility that malloc is simultaneously side-effect free and conforming. So it seems clear that this optimization is wrong. By my reading, not only does the standard fail to justify it under 6.1.2.3 paragraph 3, it *prohibits* this optimization under 5.1.2.3 under Paragraph 1 because there is no conforming implementation that is side-effect free. Exception: there are rare cases where, under whole-program optimization, it is possible to observe that free() is not called, that there is an upper bound on the number of possible calls to malloc() and also an upper bound on the total amount of storage allocated. In this very unusual case, the compiler can perform a hypothetical inlining of the known implementation of malloc and then do partial evaluation to determine that no heap size tracking is required. If so, it can then legally perform the optimization that is currently being done. But I don't think that the current compiler is actually doing that analysis in this case...> > In general, calls to procedures that are outside the current unit of > > compilation are presumed to involve side effects performed in the body > > of the external procedure (at least in the absence of annotation). > > > That may often be done in practice, but it's not a language > requirement. In particular, for standard library functions (like > malloc) an optimizer can exploit the known behavior of the function.I disagree. In the malloc() case, the known behavior is side effecting. In the general case, the compiler cannot assume side-effect freedom unless it can prove it, and in the absence of implementation knowledge the standard requires conservatism.> The same concept exists in C++, and we often refer to it as the "as > if" rule; i.e., implementations can do all kinds of things, as long as > the effect is "as if" specified by the abstract machine.Yes. But the C++ version of this is quite different, because in any situation where new would return NULL it would instead be raising an out of memory exception. In consequence, the optimization is correct for operator new whether or not operator new is side effecting. Setting the matter of the standard entirely aside, the currently implemented behavior deviates so blatantly from common sense and universal convention that it really must be viewed as a bug. Finally, I strongly suspect that LLVM will fail the standard conformance suites so long as this optimization is retained. shap
Török Edwin
2008-Apr-30 21:58 UTC
[LLVMdev] optimization assumes malloc return is non-null
Jonathan S. Shapiro wrote:> On Wed, 2008-04-30 at 15:25 -0400, David Vandevoorde wrote: > >> On Apr 30, 2008, at 2:47 PM, Jonathan S. Shapiro wrote: >> >>> Daveed: >>> >>> Perhaps I am looking at the wrong version of the specification. >>> Section >>> 5.1.2.3 appears to refer to objects having volatile-qualified type. >>> The >>> type of malloc() is not volatile qualified in the standard library >>> definition. >>> >> ...malloc() is not specified to access a volatile >> object, modify an object, or modifying a file (directly or >> indirectly); i.e., it has no side effect from the language point of >> view. >> > > Daveed: > > Good to know that I was looking at the correct section. I do not agree > that your interpretation follows the as-if rule, because I do not agree > with your interpretation of the C library specification of malloc(). > > The standard library specification of malloc() clearly requires that it > allocates storage, and that such allocation is contingent on storage > availability. Storage availability is, in turn, a function (in part) of > previous calls to malloc() and free(). Even if free() is not called, the > possibility of realloc() implies a need to retain per-malloc() state. In > either case, it follows immediately that malloc() is stateful, and > therefore that any conforming implementation of malloc() must modify at > least one object in the sense of the standard. > > If I understand your position correctly, your justification for the > optimization is that the C library standard does not say in so many > words that malloc() modifies an object. I do not believe that any such > overt statement is required in order for it to be clear that malloc() is > stateful. The functional description of malloc() and free() clearly > cannot be satisfied under the C abstract machine without mutation of at > least one object. > > Also, I do not read 5.1.2.3 in the way that you do. Paragraph 2 defines > "side effect", but it does not imply any requirement that side effects > be explicitly annotated. What Paragraph 3 gives you is leeway to > optimize standard functions when you proactively know their behavior. A > standard library procedure is not side-effect free for optimization > purposes by virtue of the absence of annotation. It can only be treated > as side-effect free by virtue of proactive knowledge of the > implementation of the procedure. In this case, we clearly have knowledge > of the implementation of malloc, and that knowledge clearly precludes > any possibility that malloc is simultaneously side-effect free and > conforming. > > So it seems clear that this optimization is wrong. By my reading, not > only does the standard fail to justify it under 6.1.2.3 paragraph 3, it > *prohibits* this optimization under 5.1.2.3 under Paragraph 1 because > there is no conforming implementation that is side-effect free. > > Exception: there are rare cases where, under whole-program optimization, > it is possible to observe that free() is not called, that there is an > upper bound on the number of possible calls to malloc() and also an > upper bound on the total amount of storage allocated. In this very > unusual case, the compiler can perform a hypothetical inlining of the > known implementation of malloc and then do partial evaluation to > determine that no heap size tracking is required. If so, it can then > legally perform the optimization that is currently being done. > > But I don't think that the current compiler is actually doing that > analysis in this case... > > >>> In general, calls to procedures that are outside the current unit of >>> compilation are presumed to involve side effects performed in the body >>> of the external procedure (at least in the absence of annotation). >>> >> That may often be done in practice, but it's not a language >> requirement. In particular, for standard library functions (like >> malloc) an optimizer can exploit the known behavior of the function. >> > > I disagree. In the malloc() case, the known behavior is side effecting. > In the general case, the compiler cannot assume side-effect freedom > unless it can prove it, and in the absence of implementation knowledge > the standard requires conservatism.Although the ISO standard doesn't say anything about malloc setting errno, POSIX does: "Otherwise, it shall return a null pointer and set /errno/ to indicate the error". IMHO setting errno can be considered a side-effect. Best regards, --Edwin
David Vandevoorde
2008-Apr-30 22:26 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Apr 30, 2008, at 5:20 PM, Jonathan S. Shapiro wrote:> On Wed, 2008-04-30 at 15:25 -0400, David Vandevoorde wrote: >> On Apr 30, 2008, at 2:47 PM, Jonathan S. Shapiro wrote: >>> Daveed: >>> >>> Perhaps I am looking at the wrong version of the specification. >>> Section >>> 5.1.2.3 appears to refer to objects having volatile-qualified type. >>> The >>> type of malloc() is not volatile qualified in the standard library >>> definition. >> >> ...malloc() is not specified to access a volatile >> object, modify an object, or modifying a file (directly or >> indirectly); i.e., it has no side effect from the language point of >> view. > > Daveed: > > Good to know that I was looking at the correct section. I do not agree > that your interpretation follows the as-if rule, because I do not > agree > with your interpretation of the C library specification of malloc().Before I go on, let me state that this is not a contentious issue among WG14: There is no doubt that the intent of the standard is that this be a valid optimization. So at most we're debating whether the wording implements the intent in this case (which it does IMO).> The standard library specification of malloc() clearly requires that > it > allocates storage, and that such allocation is contingent on storage > availability.(I think that's a contradiction: A perfectly standard-conforming implementation of malloc is: void *malloc(size_t size) { return 0; } Admittedly a moot point here.)> Storage availability is, in turn, a function (in part) of > previous calls to malloc() and free().While that's certainly true of every reasonable implementation, it's not a requirement (see above).> Even if free() is not called, the > possibility of realloc() implies a need to retain per-malloc() state.Realloc is permitted to fail on every call.> In > either case, it follows immediately that malloc() is stateful, and > therefore that any conforming implementation of malloc() must modify > at > least one object in the sense of the standard.For every useful implementation malloc will indeed be stateful (but again, that's not a language requirement). So in practice malloc will indeed have side effects. However even in the practical cases, 5.1.2.3/3 still allows the optimization on the example shown: <begin quote> 3 In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object). <end quote> (I should have been more careful in my earlier response and clarify that modifying a nonvolatile object is a side-effect that can be optimized away in many cases.)> If I understand your position correctly, your justification for the > optimization is that the C library standard does not say in so many > words that malloc() modifies an object. I do not believe that any such > overt statement is required in order for it to be clear that > malloc() is > stateful. The functional description of malloc() and free() clearly > cannot be satisfied under the C abstract machine without mutation of > at > least one object.I shown above, that isn't actually correct.> > Also, I do not read 5.1.2.3 in the way that you do. Paragraph 2 > defines > "side effect", but it does not imply any requirement that side effects > be explicitly annotated. What Paragraph 3 gives you is leeway to > optimize standard functions when you proactively know their > behavior. A > standard library procedure is not side-effect free for optimization > purposes by virtue of the absence of annotation. It can only be > treated > as side-effect free by virtue of proactive knowledge of the > implementation of the procedure.I agree with that, and in this case a compiler can know that malloc won't change a volatile object or write to a non-temporary file. (Alternatively, it can just "use" a strange malloc implementation for this particular case -- a platonic exercise since the optimization that allows then makes the implementation unneeded.) (It might also be worth noting that standard library functions can be "magical macros". That also opens the door to all kinds of optimizations; perhaps> In this case, we clearly have knowledge > of the implementation of malloc, and that knowledge clearly precludes > any possibility that malloc is simultaneously side-effect free and > conforming. > > So it seems clear that this optimization is wrong. By my reading, not > only does the standard fail to justify it under 6.1.2.3 paragraph 3, > it > *prohibits* this optimization under 5.1.2.3 under Paragraph 1 because > there is no conforming implementation that is side-effect free.I'm afraid I don't understand how 5.1.2.3/1 applies there, nor why 5.1.2.3/3 (which I assume was the intended reference) would not apply. Even if malloc changes some state, that side effect isn't a "needed side effect" in the sense of 5.1.2.3/3.> Exception: there are rare cases where, under whole-program > optimization, > it is possible to observe that free() is not called, that there is an > upper bound on the number of possible calls to malloc() and also an > upper bound on the total amount of storage allocated. In this very > unusual case, the compiler can perform a hypothetical inlining of the > known implementation of malloc and then do partial evaluation to > determine that no heap size tracking is required. If so, it can then > legally perform the optimization that is currently being done.Right. And this is such a case. However, the optimization is allowed even without that: Just by noting that the return value of malloc isn't used (nor leaked); I suspect that's what the compiler does in this case (though I don't know).> But I don't think that the current compiler is actually doing that > analysis in this case...I cannot speak to that.>>> In general, calls to procedures that are outside the current unit of >>> compilation are presumed to involve side effects performed in the >>> body >>> of the external procedure (at least in the absence of annotation). >> >> >> That may often be done in practice, but it's not a language >> requirement. In particular, for standard library functions (like >> malloc) an optimizer can exploit the known behavior of the function. > > I disagree. In the malloc() case, the known behavior is side > effecting.See above.> > In the general case, the compiler cannot assume side-effect freedom > unless it can prove it, and in the absence of implementation knowledge > the standard requires conservatism.Except for the leeway offered by 5.1.2.3/3. This is no different from not requiring that an actual store operation happen in something like: void f(int x) { x = 3; }>> The same concept exists in C++, and we often refer to it as the "as >> if" rule; i.e., implementations can do all kinds of things, as long >> as >> the effect is "as if" specified by the abstract machine. > > Yes. But the C++ version of this is quite different, because in any > situation where new would return NULL it would instead be raising an > out > of memory exception. In consequence, the optimization is correct for > operator new whether or not operator new is side effecting.I wasn't talking about malloc vs. new: C++ also has malloc. I was only talking about C++ because I actively work on that standard, so I'm more familiar with the "jargon" that applies to it.> Setting the matter of the standard entirely aside, the currently > implemented behavior deviates so blatantly from common sense and > universal convention that it really must be viewed as a bug.That's a different topic, and perhaps a matter of opinion. I personally think it's a perfectly decent optimization, though this particular instance of it isn't very useful as far as I can tell.> Finally, I strongly suspect that LLVM will fail the standard > conformance > suites so long as this optimization is retained.I'm quite familiar with the significant suites out there, and I can assure you that this won't be an issue. (Though I don't believe the gcc front end is fully conforming either -- for other reasons.) Daveed
David A. Greene
2008-Apr-30 23:17 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Wednesday 30 April 2008 17:26, David Vandevoorde wrote:> >> ...malloc() is not specified to access a volatile > >> object, modify an object, or modifying a file (directly or > >> indirectly); i.e., it has no side effect from the language point of > >> view. > > > > Daveed: > > > > Good to know that I was looking at the correct section. I do not agree > > that your interpretation follows the as-if rule, because I do not > > agree > > with your interpretation of the C library specification of malloc(). > > Before I go on, let me state that this is not a contentious issue > among WG14: There is no doubt that the intent of the standard is that > this be a valid optimization.Maybe I missed something, but aren't we all talking about the wrong thing here? It seems to me that this isn't about side effects, it's about the return value of malloc. Why can LLVM assume malloc will always return non-zero? -Dave
Jonathan S. Shapiro
2008-May-01 01:35 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Wed, 2008-04-30 at 18:26 -0400, David Vandevoorde wrote:> > Daveed: > > > > Good to know that I was looking at the correct section. I do not agree > > that your interpretation follows the as-if rule, because I do not > > agree > > with your interpretation of the C library specification of malloc(). > > > Before I go on, let me state that this is not a contentious issue > among WG14: There is no doubt that the intent of the standard is that > this be a valid optimization.And of course, ANSI C working groups are infallible. Consider "noalias". :-) But I think I am starting to understand your argument, and if so I concur that the optimization is correct in this case. Let me see ask some further questions just to confirm. Let me attempt to replay what I believe is happening here: 1. There is an observably linear chain of calls to malloc() in the example, and because of the structure of the example, the compiler can see that there are no *unobserved* calls to malloc() in the program. 2. After all inlining of malloc() has proceeded, the compiler is free to observe that all calls succeed by partial evaluation (thus the non-zero return value) and that after this is done there is no witness to the internal heap limit variable. 3. Progressive application of dead variable elimination will then delete the entire chain of updates to the heap limit variable, eliminating the side effects that would normally be induced by malloc(). They will also delete the uncaptured pointer result from malloc(). 4. Since the result pointer is not captured, and the heap limit variable is not consulted, the entire result of the call to malloc() is the return of a non-NULL pointer. 5. Constant folding then reduces the IF condition (malloc() == NULL) to false. 6. Dead code elimination then reduces the if/then to simply the then body. 7. Optimizer now reduces the whole thing to "return 1". Is this more or less the chain of events here (or if not this, then something of a similar nature)? If so, I understand and concur that this optimization is valid in this example, primarily due to perverse and non-obvious consequences of the particular bit of sample code. Just to cement my understanding, I suspect that *either* of the following two changes to the sample code would make this optimization invalid: 1. Rename main() to f() 2. Introduce a call to some external procedure anywhere in main(). In either case, the compiler can no longer witness all reachable calls to malloc(), and must assume conservatively that the internally produced side effect results of malloc() may be sourced by other units of compilation, and therefore cannot be elided. Is this correct? Hmm. No. I can weasel out of this in case (2), because the standard does not require that calls to malloc() be processed in order, so in some cases calls to malloc() appearing in main() can still be eliminated by logically rearranging them to occur before any of the external calls to malloc(). I bet LLVM doesn't know about this. :-)> (I should have been more careful in my earlier response and clarify > that modifying a nonvolatile object is a side-effect that can be > optimized away in many cases.)If I understand the standard correctly, this can be done exactly if there is no witness to the side effect. Furthermore, chains of intermediate unwitnessed side effects can be collapsed if the optimizer can prove that the intermediate results are unwitnessed. Yes?> ...However, the optimization is allowed even without that: Just by noting > that the return value of malloc isn't used (nor leaked); I suspect > that's what the compiler does in this case (though I don't know).I think that is not quite strong enough. If there is any possibility of preceding or succeeding calls to malloc() in compilation units that are not visible at compile time, then the compiler cannot determine that the current call to malloc() must succeed or that it's side effects are unwitnessed. I suppose that the compiler can do special case things in main on the theory that these calls to malloc() may be executed out of order by a conforming implementation, but I suspect that goes well beyond the cleverness of the current LLVM implementation. It's the sort of thing that the VAX/VMS optimizer or the IBM PL.8 research optimizer might actually have done. And certainly the Bulldog optimizer.> > Setting the matter of the standard entirely aside, the currently > > implemented behavior deviates so blatantly from common sense and > > universal convention that it really must be viewed as a bug.Just to be explicit: I now see that my statement above is wrong in the present example. Daveed: Thanks for being patient on this. I appreciate your willingness to help me see this one. shap
On Apr 30, 2008, at 2:20 PM, Jonathan S. Shapiro wrote:> The standard library specification of malloc() clearly requires that > it > allocates storageNo, your interpretation is wrong. When the standard says: void *malloc(size_t size); Description [#2] The malloc function allocates space for an object they don't mean that the malloc function allocates space for an object. What they mean is that you cannot write a program that can notice that it didn't allocate space for an object. The difference is subtle, but important. As an example, compare: #include <stdlib.h> void *vp; int main(int argc, char** argv){ if((vp=malloc(sizeof(int))) == NULL){ return 0; } else{ return 1; } } with -fwhole-program the call to malloc goes away. Without it, it stays. If we change it to: #include <stdlib.h> void * volatile vp; int foo(int argc, char** argv){ if((vp=malloc(sizeof(int))) == NULL){ return 0; } else{ return 1; } } it stays. If change it to: #include <stdlib.h> void * volatile vp; int main(int argc, char** argv){ if((vp=malloc(sizeof(int))) == NULL){ return 0; } else{ return 1; } } it then stays. This is because you're allowed to observe malloc through vp as it is volatile. Observing the program with writes and reads to volatile variables and the I/O behavior of the program are only ways in which you get to observe the program.> I do not believe that any such overt statement is required in order > for it to be clear that malloc() is stateful.You can observe the behavior with I/O and volatiles, all that you observe meets the requirements for the semantics according the to standard. malloc need preserve no state or even change state. The standard has no mechanism to require that. It only says the program is required to behave as if describe by the standard.> The functional description of malloc() and free() clearly cannot be > satisfied under the C abstract machine without mutation of at least > one object.Wrong. In int main() { malloc (4); return 0; } there are 0 writes to volatile storage. There are 0 input routines called, there are 0 output routines called. The abstract semantics are return 0 to the environment with no input and no output. Any program that does that, is a valid compilation of the code. Another equally valid translation would be: int d[100]; int foo() { return 100; } main() { for (i = 0; i<foo(); ++i) d[i] = i; return 0; } thats because the abstract semantics of that program are exactly identical to the required abstract semantics of the first.> Setting the matter of the standard entirely aside, the currently> implemented behavior deviates so blatantly from common senseMost people have no clue just what as if means. People that write compilers are expected to know what it means. If one doesn't, they should learn.> and universal convention that it really must be viewed as a bug.You can state the required abstract semantics of the program and you can state the actual semantics of the translation and we can string compare them, if they are identical, we get to say, that is a valid compilation of the code.
Possibly Parallel Threads
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null
- [LLVMdev] optimization assumes malloc return is non-null