Jonathan S. Shapiro
2008-Apr-30 18:47 UTC
[LLVMdev] optimization assumes malloc return is non-null
Daveed:

Perhaps I am looking at the wrong version of the specification. Section 5.1.2.3 appears to refer to objects having volatile-qualified type. The type of malloc() is not volatile qualified in the standard library definition.

In general, calls to procedures that are outside the current unit of compilation are presumed to involve side effects performed in the body of the external procedure (at least in the absence of annotation).

Can you say what version of the standard you are referencing, and (just so I know) why section 5.1.2.3 makes a call to malloc() different from any other procedure call with respect to side effects?

Thanks

Jonathan

On Wed, 2008-04-30 at 14:29 -0400, David Vandevoorde wrote:
> On Apr 30, 2008, at 2:10 PM, Ryan M. Lefever wrote:
> > Consider the following c code:
> >
> > #include <stdlib.h>
> >
> > int main(int argc, char** argv){
> >   if(malloc(sizeof(int)) == NULL){ return 0; }
> >   else{ return 1; }
> > }
> >
> > When I compile it with -O3, it produces the following bytecode:
> >
> > define i32 @main(i32 %argc, i8** %argv) {
> > entry:
> >   ret i32 1
> > }
> >
> > Is this an error? It should be possible for malloc to return NULL, if
> > it can not allocate more space. In fact, some programs should be able
> > to gracefully handle such situations.
>
> It's an allowable program transformation because a call to malloc is
> not in itself a side effect. See e.g. 5.1.2.3 in the C standard.
>
> Daveed
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
David Vandevoorde
2008-Apr-30 19:25 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Apr 30, 2008, at 2:47 PM, Jonathan S. Shapiro wrote:
> Daveed:
>
> Perhaps I am looking at the wrong version of the specification. Section
> 5.1.2.3 appears to refer to objects having volatile-qualified type. The
> type of malloc() is not volatile qualified in the standard library
> definition.

More importantly, malloc() is not specified to access a volatile object, modify an object, or modify a file (directly or indirectly); i.e., it has no side effect from the language point of view.

> In general, calls to procedures that are outside the current unit of
> compilation are presumed to involve side effects performed in the body
> of the external procedure (at least in the absence of annotation).

That may often be done in practice, but it's not a language requirement. In particular, for standard library functions (like malloc) an optimizer can exploit the known behavior of the function.

> Can you say what version of the standard you are referencing, and (just
> so I know) why section 5.1.2.3 makes a call to malloc() different from
> any other procedure call with respect to side effects?

I'm looking at ISO/IEC 9899:1999.

<begin quote>
1 The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.

2 Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, [footnote about floating-point effects] which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)

3 In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
<end quote>

The same concept exists in C++, and we often refer to it as the "as if" rule; i.e., implementations can do all kinds of things, as long as the effect is "as if" specified by the abstract machine.

Daveed
Jonathan S. Shapiro
2008-Apr-30 21:20 UTC
[LLVMdev] optimization assumes malloc return is non-null
On Wed, 2008-04-30 at 15:25 -0400, David Vandevoorde wrote:
> On Apr 30, 2008, at 2:47 PM, Jonathan S. Shapiro wrote:
> > Daveed:
> >
> > Perhaps I am looking at the wrong version of the specification. Section
> > 5.1.2.3 appears to refer to objects having volatile-qualified type. The
> > type of malloc() is not volatile qualified in the standard library
> > definition.
>
> ...malloc() is not specified to access a volatile
> object, modify an object, or modifying a file (directly or
> indirectly); i.e., it has no side effect from the language point of
> view.

Daveed:

Good to know that I was looking at the correct section. I do not agree that your interpretation follows the as-if rule, because I do not agree with your interpretation of the C library specification of malloc().

The standard library specification of malloc() clearly requires that it allocates storage, and that such allocation is contingent on storage availability. Storage availability is, in turn, a function (in part) of previous calls to malloc() and free(). Even if free() is not called, the possibility of realloc() implies a need to retain per-malloc() state. In either case, it follows immediately that malloc() is stateful, and therefore that any conforming implementation of malloc() must modify at least one object in the sense of the standard.

If I understand your position correctly, your justification for the optimization is that the C library standard does not say in so many words that malloc() modifies an object. I do not believe that any such overt statement is required in order for it to be clear that malloc() is stateful. The functional description of malloc() and free() clearly cannot be satisfied under the C abstract machine without mutation of at least one object.

Also, I do not read 5.1.2.3 in the way that you do. Paragraph 2 defines "side effect", but it does not imply any requirement that side effects be explicitly annotated. What Paragraph 3 gives you is leeway to optimize standard functions when you proactively know their behavior. A standard library procedure is not side-effect free for optimization purposes by virtue of the absence of annotation. It can only be treated as side-effect free by virtue of proactive knowledge of the implementation of the procedure. In this case, we clearly have knowledge of the implementation of malloc, and that knowledge clearly precludes any possibility that malloc is simultaneously side-effect free and conforming.

So it seems clear that this optimization is wrong. By my reading, not only does the standard fail to justify it under 5.1.2.3 paragraph 3, it *prohibits* this optimization under 5.1.2.3 paragraph 1, because there is no conforming implementation that is side-effect free.

Exception: there are rare cases where, under whole-program optimization, it is possible to observe that free() is not called, that there is an upper bound on the number of possible calls to malloc(), and also an upper bound on the total amount of storage allocated. In this very unusual case, the compiler can perform a hypothetical inlining of the known implementation of malloc and then do partial evaluation to determine that no heap size tracking is required. If so, it can then legally perform the optimization that is currently being done. But I don't think that the current compiler is actually doing that analysis in this case...

> > In general, calls to procedures that are outside the current unit of
> > compilation are presumed to involve side effects performed in the body
> > of the external procedure (at least in the absence of annotation).
>
> That may often be done in practice, but it's not a language
> requirement. In particular, for standard library functions (like
> malloc) an optimizer can exploit the known behavior of the function.

I disagree. In the malloc() case, the known behavior is side-effecting. In the general case, the compiler cannot assume side-effect freedom unless it can prove it, and in the absence of implementation knowledge the standard requires conservatism.

> The same concept exists in C++, and we often refer to it as the "as
> if" rule; i.e., implementations can do all kinds of things, as long as
> the effect is "as if" specified by the abstract machine.

Yes. But the C++ version of this is quite different, because in any situation where new would return NULL it would instead be raising an out-of-memory exception. In consequence, the optimization is correct for operator new whether or not operator new is side-effecting.

Setting the matter of the standard entirely aside, the currently implemented behavior deviates so blatantly from common sense and universal convention that it really must be viewed as a bug.

Finally, I strongly suspect that LLVM will fail the standard conformance suites so long as this optimization is retained.

shap