Hal Finkel
2015-Jun-30 17:21 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
----- Original Message -----
> From: "Sean Silva" <chisophugis at gmail.com>
> To: "Peter Sewell" <Peter.Sewell at cl.cam.ac.uk>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Friday, June 26, 2015 4:53:30 PM
> Subject: Re: [LLVMdev] C as used/implemented in practice: analysis of responses
>
> All of these seem to fall into the pattern of "The compiler is required to do what you expect, as long as it can't prove X about your program". That is, the only reasonable compilation in the absence of inferring some extra piece of information about your program, is the one you expect. For example, the only way to codegen a comparison between two random pointers has the meaning you expect (on common computer architectures); but if the compiler can figure something out that tells it that comparing those two pointers is undefined by the language standard, then, well, technically it can do whatever it wants.
>
> Many people interpret this as the compiler being somewhat malevolent, but there's another interpretation in some cases.
>
> I have not looked in depth at the history in all the undefined behaviors mentioned in the survey, but some of the undefined behaviors are there because at some point in time the underlying system diversity made it difficult or impossible to assign a meaning. So long as the diversity that led to the desire to leave something undefined still exists, programs that use those constructs with certain expectations *will* fail to behave as "expected" on those targets (on a system where pointers are represented differently, your program *may* actually format your hard disk if you do so-and-so!).
> To put it another way, what is "expected" is actually dependent on the C programmer's knowledge of the underlying system (computer architecture, system architecture, etc.), and there will always be tension so long as the programmer is not thinking about what the C language guarantees, but rather (roughly speaking) how *they* would translate their code to assembly language for the system or systems that they happen to know they're targeting. An x86 programmer doesn't expect unaligned loads to invoke nasal demons, but a SPARC programmer does.
>
> So if you unravel the thread of logic back through the undefined behaviors made undefined for this reason, many of these cases of exploiting undefined behavior are really an extension, on the compiler's part, of the logic "there are some systems for which your code would invoke nasal demons, so I might as well assume that it will invoke nasal demons on this system (since the language standard doesn't say anything about specific systems)". Or to put it another way, the compiler is effectively assuming that your code is written to target all the systems taken into account by the C standard, and if it would invoke nasal demons on any one of them then the compiler is allowed to invoke nasal demons on all of them.

Honestly, I don't think this argument buys us all that much in the eyes of most programmers. Maybe we're saving them from themselves, maybe not, but while certainly a side effect of exploiting undefined behavior, this is not really why we exploit undefined behavior. We exploit undefined behavior because it helps to optimize real programs and shrink the size of generated code.

Here's a simple example:

  int caching_disabled;

  static void do_something(struct cache *c) {
    ...

    if (!caching_disabled) {
      ...
      ... c->something ...
      ...
    }

    ...
  }

  /* This is only called when caching is disabled */
  void foo() {
    ...
    do_something(NULL);
    ...
  }

A compiler might inline the call to do_something, or specialize it, such that we know that the argument is NULL. The code in do_something does not check the pointer directly; instead, it checks some other mutable state (which must be related to the pointer for the program to have defined behavior). The code that is inlined into foo(), however, does not need to check '!caching_disabled': because c is unconditionally dereferenced within the block guarded by that check, because that would necessarily be a dereference of a NULL pointer, and because that is undefined behavior, the compiler can mark that block as unreachable and eliminate the check and a potentially-large block of code when inlining.

This kind of application of undefined behavior turns out to be quite helpful in practice. The rules for integer overflow are another great example (loop optimizations don't work nearly as well without them in many cases).

> This is obviously sort of a twisted logic, and I think that a lot of the "malevolence" attributed to compilers is due to this. It certainly removes many target-dependent checks from the mid-level optimizer though.

This point is correct and important. Exploiting undefined behavior also simplifies the task of creating target-independent optimizers.

 -Hal

> -- Sean Silva
>
> On Fri, Jun 26, 2015 at 1:42 AM, Peter Sewell < Peter.Sewell at cl.cam.ac.uk > wrote:
>
> As part of a project to clarify what behaviour of C implementations is actually relied upon in modern practice, and what behaviour is guaranteed by current mainstream implementations, we recently distributed a survey of 15 questions about C, https://goo.gl/AZXH3S .
>
> We were asking what C is in current mainstream practice: the behaviour that programmers assume they can rely on, the behaviour provided by mainstream compilers, and the idioms used in existing code, especially systems code.
We were *not* asking what the ISO C standard permits, > which is often more restrictive, or about obsolete or obscure > hardware > or compilers. We focussed on the behaviour of memory and pointers. > > We've had around 300 responses, including many compiler and OS > developers, and the results are summarised below, or on the web at > http://www.cl.cam.ac.uk/~pes20/cerberus (which also has more > details). > For many questions the outcome seems clear, but for some, especially > 1, 2, 9, 10, and 11, major open questions about current compiler > behaviour remain; we'd greatly appreciate informed comments on those > from the relevant compiler developers (or other experts). > > If you can answer these, please reply either below or by mailing the > Cerberus mailing list: > > cl-cerberus at lists.cam.ac.uk > > https://lists.cam.ac.uk/mailman/listinfo/cl-cerberus > > many thanks, > Kayvan Memarian and Peter Sewell (University of Cambridge) > > > What is C in practice? (Cerberus survey): Conclusions > =====================================================================> > ## Kayvan Memarian and Peter Sewell > ### University of Cambridge, 2015-06-21 > > ---------------------------------------------------------------------- > > In April-June 2015 we distributed a web survey to investigate what C > is, in current mainstream practice: the behaviour that programmers > assume they can rely on, the behaviour provided by mainstream > compilers, and the idioms used in existing code, especially systems > code. We were *not* asking what the ISO C standard permits, which is > often more restrictive, or about obsolete or obscure hardware or > compilers. We focussed on the behaviour of memory and pointers. > This is a step towards an unambiguous and mathematically precise > definition > of the consensus *de facto* standard (to the extent that such > exists): the C that is actually used by systems programmers and > implemented by mainstream compilers. 
> > This note summarises our conclusions. For many questions the outcome > seems clear, but for some, especially 1, 2, 9, 10, and 11, major open > questions about current compiler behaviour remain; we'd greatly > appreciate further comments from the relevant compiler developers or > other experts. > > More details of the results are available from > [ > http://www.cl.cam.ac.uk/~pes20/cerberus](http://www.cl.cam.ac.uk/~pes20/cerberus) > . > > ### Responses > > Aiming for a modest-scale but technically expert audience, we > distributed the survey at the University of Cambridge systems > research > group, at EuroLLVM 2015, via John Regehr's blog, and via various > mailing lists: gcc, llvmdev, cfe-dev, libc-alpha, xorg, a FreeBSD > list, xen-devel, a Google C users list, and a Google C compilers > list. > In all there were around 300 responses, including around 100 printed > pages of textual comments. Most report expertise in C systems > programming and significant numbers report expertise in compiler > internals and in the C standard. > > > MAIN QUESTION RESPONSES > ----------------------- > > ###[1/15] How predictable are reads from padding bytes? > > **If you zero all bytes of a struct and then write some of its > members, do reads of the padding return zero? (e.g. for a bytewise > CAS or hash of the struct, or to know that no security-relevant data > has leaked into them.)** > > It remains unclear what behaviour compilers currently provide (or > should provide) for this. We see four main alternatives: > > a) Structure copies might copy padding, but structure member writes > never touch padding. > > b) Structure member writes might write zeros over subsequent padding. > > c) Structure member writes might write arbitrary values over > subsequent > padding, with reads seeing stable results. 
> > d) Padding bytes are regarded as always holding unspecified values, > irrespective of any byte writes to them, and so reads of them might > return arbitrary and unstable values (in which case the compiler > could > arbitrarily write to padding at any point, as that would be masked by > this). > > On the one hand: > > - In some circumstances it seems important to provide systems > programmers with a mechanism to ensure that no information is > leaked via padding. Rewriting structure definitions to make all > padding into explicit fields may not be practicable, especially if > one wants to do so in a platform-independent way, and so option (d) > is not compatible with this. Option (c) makes it possible but > awkward to prevent leakage, as padding must then be re-zero'd > after member writes. > > - In some circumstances programmers may rely on predictable padding > values, at least in the absence of structure member writes, > e.g. for memcmp, hashing, or compare-and-swap of struct values. > Again (d) is not compatible with this, and (a) or (b) are > preferable. But it's not clear whether any of those usages are > common or essential. > > - For Clang, one respondent suggests that by the time the > optimisation > passes operate, padding has been replaced by explicit fields, so > neither over-wide writes nor permanently-undefined-value behaviour > will occur. > > - For MSVC, one respondent suggests the compiler provides (a). > > On the other hand, others suggest plausible optimisations, e.g., > "to apply SRA (scalar replacement of aggregates), replacing the > memset with a sequence of member assignments (discarding assignments > to padding) in order to do so". This could require (d) to make the > existing compiler behaviour admissible, but it's unclear > to us whether it actually does at present. 
> > Note also that for structs stored to malloc'd regions, option (d) is > at odds with the idea that malloc'd regions can be reused, as a write > of a new value (of a new type) to a malloc'd region might start with > writes to what were padding bytes, so perhaps we could only really > have this semantics for the other storage-duration kinds. > > Conclusion: > > For each compiler (GCC, Clang, MSVC, ICC, ...), we ask which of these > it provides on mainstream platforms. If the answer is not (a), to > what extent is it feasible to provide compiler flags that force the > behaviour to be stronger? > > Our "mainstream C" semantics should provide switchable options for > each of (a), (b), (c), and (d). > > > > ### [2/15] Uninitialised values > > **Is reading an uninitialised variable or struct member (with a > current mainstream compiler):** > > **(This might either be due to a bug or be intentional, e.g. when > copying a partially initialised struct, or to output, hash, or set > some bits of a value that may have been partially initialised.)** > > a) undefined behaviour (meaning that the compiler is free to > arbitrarily miscompile the program, with or without a warning), > > b) going to make the result of any expression involving that value > unpredictable, > > c) going to give an arbitrary and unstable value (maybe with a > different value if you read again), or > > d) going to give an arbitrary but stable value (with the same value > if > you read again). > > Here also it remains unclear what compilers currently provide and what > they should provide. The survey responses are dominated by the > (a) "undefined behaviour" and (d) "arbitrary but stable" options. > > It's not clear whether people are actually depending on the latter, > beyond the case of copying a partially initialised struct, which it > seems must be supported, and comparing against a partially > initialised > struct, which it seems is done sometimes. 
Many respondents mention > historical uses to attempt to get entropy, but that seems now widely > regarded as crazy. There is a legitimate general argument that the > more determinacy that can be provided the better, for debugging. > > But it seems clear from the text responses that GCC, Clang, and MSVC > do not at present exploit the licence the ISO standard gives (in > defining this to be undefined behaviour) to arbitrarily miscompile > code, option (a). Clang seems to be the most aggressive, propagating > undef in many cases, though one respondent said "LLVM is moving > towards treating this as UB in the cases where the standards allow it > to do so". But there are special cases where LLVM is a bit stronger > (cf the undef docs); it's unclear why they think those are useful. > For GCC, one respondent said > > "Going to give arbitrary, unstable values (that is, the variable > assigned from the uninitialised variable itself acts as > uninitialised and having no consistent value). (Quite possibly > subsequent transformations will have the effect of undefined > behavior.) Inconsistency of observed values is an inevitable > consequence of transformations PHI (undefined, X) -> X (useful in > practice for programs that don't actually use uninitialised > variables, but where the compiler can't see that)." > > For MSVC, one respondent said: > > "I am aware of a significant divergence between the LLVM community > and MSVC here; in general LLVM uses "undefined behaviour" to mean > "we can miscompile the program and get better benchmarks", whereas > MSVC regards "undefined behaviour" as "we might have a security > vulnerability so this is a compile error / build break". First, > there is reading an uninitialized variable (i.e. something which > does not necessarily have a memory location); that should always be > a compile error. Period. Second, there is reading a partially > initialised struct (i.e. reading some memory whose contents are only > partly defined). 
That should give a compile error/warning or static > analysis warning if detectable. If not detectable it should give > the actual contents of the memory (be stable). I am strongly with > the MSVC folks on this one - if the compiler can tell at compile > time that anything is undefined then it should error out. Security > problems are a real problem for the whole industry and should not be > included deliberately by compilers." > > > For each compiler we ask which of these four semantics it provides. > > It looks as if several compiler writers are saying (b), while a > significant number of programmers are relying on (d). > > Our "mainstream C" semantics should support both (b) and (d). > > > > ### [3/15] Can one use pointer arithmetic between separately > allocated > C objects? > > **If you calculate an offset between two separately allocated C > memory > objects (e.g. malloc'd regions or global or local variables) by > pointer subtraction, can you make a usable pointer to the second by > adding the offset to the address of the first?** > > Most respondents expect this to work, and a significant number know > of > real code that relies on it, with many concrete examples. So far we > see no solid reason to disallow it for "mainstream" C > implementations. > The main reason to disallow it seems to have been segmented > architectures, especially 8086. There are still some embedded > architectures with distinct address spaces but it seems that > "mainstream" C is not concerned with this, and those cases could be > identified as a language dialect or implementation-defined choice. > > For GCC, one respondent writes the following, but doesn't give a > reason: > > - This is not safe in practice even if the alignment is sufficient > (and if the alignment of the type is less than its size, obviously > such a subtraction can't possibly work even with a naive compiler). 
> > It looks reasonable to identify two language dialects, one (our > "mainstream C") in which pointer arithmetic between separately > allocated C objects is permitted and one (principally for segmented > architectures) in which it is not. > > > > ### [4/15] Is pointer equality sensitive to their original allocation > sites? > > **For two pointers derived from the addresses of two separate > allocations, will equality testing (with ==) of them just compare > their runtime values, or might it take their original allocations > into > account and assume that they do not alias, even if they happen to > have > the same runtime value? (for current mainstream compilers)** > > The responses are roughly bimodal: many believe "it will just compare > the runtime values", while a similar number believe that the > comparison might take the allocation provenance into account. > Of the former, 41 "know of real code that relies on it". > > In practice we see that GCC does sometimes take allocation provenance > into account, with the result of a comparison (in an n+1 case, > comparing &p+1 and &q) sometimes varying depending on whether the > compiler can see the provenance, e.g. on whether it's done in the > same > compilation unit as the allocation. We don't see any reason to forbid > that, especially as this n+1 case seems unlikely to arise in > practice, > though it does complicate the semantics, effectively requiring a > nondeterministic choice at each comparison of whether to take > provenance into account. But for comparisons between pointers formed > by more radical pointer arithmetic from pointers originally from > different allocations, as in Q3, it's not so clear. > > The best "mainstream C" semantics here seems to be to make a > nondeterministic choice at each comparison of whether to take > provenance into account or just compare the runtime pointer value. In > the vast majority of cases the two will coincide. > > > > ### [5/15] Can pointer values be copied indirectly? 
> > **Can you make a usable copy of a pointer by copying its > representation bytes with code that indirectly computes the identity > function on them, e.g. writing the pointer value to a file and then > reading it back, and using compression or encryption on the way?** > > This seems unequivocally "yes", both in usage and for current > compilers, at least in simple cases with direct data-flow from > original to computed pointer; both GCC and Clang support this. But > for computation via control-flow, it's not so clear. > > It looks as if a reasonable "mainstream C" semantics should allow > indirect pointer copying at least whenever there's a data-flow > provenance path, perhaps not when there's only a control-flow > provenance path. It should allow pointers to be marshalled to and > read back in, and the simplest way of doing that is to allow any > pointer value to be read in, with the compiler making no > aliasing/provenance assumptions on such, and with the semantics > checking that the numeric pointer value points to a suitable live object > only when and if it is dereferenced. > > > > ### [6/15] Pointer comparison at different types > > **Can one do == comparison between pointers to objects of different > types (e.g. pointers to int, float, and different struct types)?** > > The question should have been clearer about whether the pointers are > first cast to void* or char*. With those casts, the responses seem > clear that it should be allowed, modulo now-unusual architectures > with > segmented memory or where the pointer representations are different. > > Then there's a question, which we would hope applies only in the case > without those casts, about whether -fstrict-aliasing will treat > comparisons with type mismatches as nonequal, e.g. > > - "Depends on strict-aliasing flags? I think LLVM TBAA might optimise > this sort of check away?" > > - "There are a lot of examples of this, in particular in libc, or > possibly implementations of vtables." 
> > > > > ### [7/15] Pointer comparison across different allocations > > **Can one do < comparison between pointers to separately allocated > objects?** > > This seems to be widely used for lock ordering and collections. > As for Q3, there's a potential issue for segmented memory systems > (where the implementation might only compare the offset) which seems > not to be relevant for current "mainstream" C. > > Apart from that, there doesn't seem to be any reason from compiler > implementation to forbid it. > > > > ###[8/15] Pointer values after lifetime end > > **Can you inspect (e.g. by comparing with ==) the value of a pointer > to an object after the object itself has been free'd or its scope has > ended?** > > The responses mostly say that this will work (modulo debugging > environments), with various use cases, and there is no clear reason > why compilers do not support it. > > Can we either establish that current mainstream compilers will > support > this or identify more specifically where and how they will fail to do > so? > > > ### [9/15] Pointer arithmetic > > **Can you (transiently) construct an out-of-bounds pointer value > (e.g. > before the beginning of an array, or more than one-past its end) by > pointer arithmetic, so long as later arithmetic makes it in-bounds > before it is used to access memory?** > > It seems clear that this is often assumed to work. But on the other > hand, compilers may sometimes assume otherwise: > > - "This is not safe; compilers may optimise based on pointers being > within bounds. In some cases, it's possible such code might not > even link, depending on the offsets allowed in any relocations that > get used in the object files." > > - "The situation has not gotten friendlier to old-school pointer > manipulations since https://lwn.net/Articles/278137/ was written in > 2008. 
[This is a case where GCC optimised away a comparison > involving an out-of-bounds pointer] The pattern could still be found > in code exposed to malicious interlocutors in 2013: > https://access.redhat.com/security/cve/CVE-2013-5607 " > > - "Pretty sure this one I've seen buggy code optimised away by real > compilers." > > Here the prevalence of transiently out-of-bounds pointer values in > real code suggests it's worth seriously asking the cost of disabling > whatever compiler optimisation is done based on this, to provide a > simple predictable semantics. > > > > ### [10/15] Pointer casts > > **Given two structure types that have the same initial members, can > you use a pointer of one type to access the initial members of a value > of the other?** > > It's clear that this is used very commonly. On the other hand, with > strict aliasing: > > - "This is something that GCC tends to actually kill in practice (if > strict aliasing is on); I've had to fix bugs that were caused by > it." > > and w.r.t. GCC: > > - "This is not safe in practice (unless a union is visibly > used as described in 6.5.2.3#6)." > > though the latter doesn't say why, or whether that's specific to > strict-aliasing. > > At least with -no-strict-aliasing, it seems this should be guaranteed > to work. > > > > ### [11/15] Using unsigned char arrays > > **Can an unsigned character array be used (in the same way as a > malloc’d region) to hold values of other types?** > > Here again it's clear that it's very often relied on for statically > allocated (non-malloc'd) character arrays, and it should work, with > due care about alignment. But the ISO standard disallows it, and we > also see: > > - [wrt GCC] "No, this is not safe (if it's visible to the compiler > that the memory in question has unsigned char as its declared > type)." > > though the latter doesn't say why. It is a violation of the > strict-aliasing text of the ISO standard. 
> > With -no-strict-aliasing it seems clear that it should be allowed. > > > > ### [12/15] Null pointers from non-constant expressions > > **Can you make a null pointer by casting from an expression that > isn't > a constant but that evaluates to 0?** > > This is very often assumed to work. The only exception seems to be > some (unidentified) embedded systems. > > Our "mainstream C" semantics should permit it. > > > > ### [13/15] Null pointer representations > > **Can null pointers be assumed to be represented with 0?** > > Basically an unequivocal "yes" for mainstream systems. Again there is a > potential exception for segmented memory, but not one relevant for > "mainstream" current practice. > > > > ### [14/15] Overlarge representation reads > > **Can one read the byte representation of a struct as aligned words > without regard for the fact that its extent might not include all of > the last word?** > > This is sometimes used in practice and believed to work, modulo > alignment, page-boundary alignment, and valgrind/MSan/etc. > Our "mainstream C" semantics could either forbid this entirely > (slightly limiting the scope of the semantics) or could allow it, for > sufficiently aligned cases, if some switch is set. > > For the moment we choose the former. > > ### [15/15] Union type punning > > **When is type punning - writing one union member and then reading it > as a different member, thereby reinterpreting its representation > bytes > - guaranteed to work (without confusing the compiler analysis and > optimisation passes)?** > > There's widespread doubt, disagreement and confusion here, e.g.: > > - "always" > - "Never" > - "According to the standard never; in practice always." > > Here the minimal thing it seems we should support is, broadly > following the GCC documentation, type punning via a union whose > definition is in scope and which is accessed via l-values that > manifestly involve the union type. 
> > Or, in the -no-strict-aliasing case, one could allow it everywhere. > > POSTAMBLE RESPONSES > =================== > > ### Other differences > > **If you know of other areas where the C used in practice differs > from that of the ISO standard, or where compiler optimisation limits > the behaviour you can rely on, please list them.** > > There were many comments here which are hard to summarise. Many > mention integer overflow and the behaviour of shift operators. > > > > ### Acknowledgements > > We would like to thank all those who responded to the survey, those > who distributed it, and especially those members of the Cambridge > Systems Research Group who helped us tune earlier versions. This work > is funded by the EPSRC REMS (Rigorous Engineering for Mainstream > Systems) Programme Grant, EP/K008528/1. > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
Peter Sewell
2015-Jun-30 21:37 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On 30 June 2015 at 18:21, Hal Finkel <hfinkel at anl.gov> wrote:
> [...]
>
> This kind of application of undefined behavior turns out to be quite helpful in practice. The rules for integer overflow are another great example (loop optimizations don't work nearly as well without them in many cases).

Yes - but there seems also to be a significant downside of aggressive exploitation of some of these UBs, at least as reported by OS and other systems people. The trade-offs between allowing optimisation and having a language which can be understood (and which supports common-but-non-ISO idioms) are difficult to manage, but somehow we (collectively) need to do better than previous decades of C evolution have.

thanks,
Peter

> [...]
Most report expertise in C systems >> programming and significant numbers report expertise in compiler >> internals and in the C standard. >> >> >> MAIN QUESTION RESPONSES >> ----------------------- >> >> ###[1/15] How predictable are reads from padding bytes? >> >> **If you zero all bytes of a struct and then write some of its >> members, do reads of the padding return zero? (e.g. for a bytewise >> CAS or hash of the struct, or to know that no security-relevant data >> has leaked into them.)** >> >> It remains unclear what behaviour compilers currently provide (or >> should provide) for this. We see four main alternatives: >> >> a) Structure copies might copy padding, but structure member writes >> never touch padding. >> >> b) Structure member writes might write zeros over subsequent padding. >> >> c) Structure member writes might write arbitrary values over >> subsequent >> padding, with reads seeing stable results. >> >> d) Padding bytes are regarded as always holding unspecified values, >> irrespective of any byte writes to them, and so reads of them might >> return arbitrary and unstable values (in which case the compiler >> could >> arbitrarily write to padding at any point, as that would be masked by >> this). >> >> On the one hand: >> >> - In some circumstances it seems important to provide systems >> programmers with a mechanism to ensure that no information is >> leaked via padding. Rewriting structure definitions to make all >> padding into explicit fields may not be practicable, especially if >> one wants to do so in a platform-independent way, and so option (d) >> is not compatible with this. Option (c) makes it possible but >> awkward to prevent leakage, as there padding must be re-zero'd >> after member writes. >> >> - In some circumstances programmers may rely on predictable padding >> values, at least in the absence of structure member writes, >> e.g. for memcmp, hashing, or compare-and-swap of struct values. 
>> Again (d) is not compatible with this, and (a) or (b) are >> preferable. But it's not clear whether any of those usages are >> common or essential. >> >> - For Clang, one respondent suggests that by the time the >> optimisation >> passes operate, padding has been replaced by explicit fields, so >> neither over-wide writes nor permanently-undefined-value behaviour >> will occur. >> >> - For MSVC, one respondent suggests the compiler provides (a). >> >> On the other hand, others suggest plausible optimisations, e.g., >> "to apply SRA (scalar replacement of aggregates), replacing the >> memset with a sequence of member assignments (discarding assignments >> to padding) in order to do so". This could require (d) to make the >> existing compiler behaviour admissible, but it's unclear >> to us whether it actually does at present. >> >> Note also that for structs stored to malloc'd regions, option (d) is >> at odds with the idea that malloc'd regions can be reused, as a write >> of a new value (of a new type) to a malloc'd region might start with >> writes to what were padding bytes, so perhaps we could only really >> have this semantics for the other storage-duration kinds. >> >> Conclusion: >> >> For each compiler (GCC, Clang, MSVC, ICC, ...), we ask which of these >> it provides on mainstream platforms. If the answer is not (a), to >> what extent is it feasible to provide compiler flags that force the >> behaviour to be stronger? >> >> Our "mainstream C" semantics should provide switchable options for >> each of (a), (b), (c), and (d). >> >> >> >> ### [2/15] Uninitialised values >> >> **Is reading an uninitialised variable or struct member (with a >> current mainstream compiler):** >> >> **(This might either be due to a bug or be intentional, e.g.
when >> copying a partially initialised struct, or to output, hash, or set >> some bits of a value that may have been partially initialised.)** >> >> a) undefined behaviour (meaning that the compiler is free to >> arbitrarily miscompile the program, with or without a warning), >> >> b) going to make the result of any expression involving that value >> unpredictable, >> >> c) going to give an arbitrary and unstable value (maybe with a >> different value if you read again), or >> >> d) going to give an arbitrary but stable value (with the same value >> if >> you read again). >> >> Here also it remains unclear what compilers currently provide and what >> they should provide. The survey responses are dominated by the >> (a) "undefined behaviour" and (d) "arbitrary but stable" options. >> >> It's not clear whether people are actually depending on the latter, >> beyond the case of copying a partially initialised struct, which it >> seems must be supported, and comparing against a partially >> initialised >> struct, which it seems is done sometimes. Many respondents mention >> historical uses to attempt to get entropy, but that seems now widely >> regarded as crazy. There is a legitimate general argument that the >> more determinacy that can be provided the better, for debugging. >> >> But it seems clear from the text responses that GCC, Clang, and MSVC >> do not at present exploit the licence the ISO standard gives (in >> defining this to be undefined behaviour) to arbitrarily miscompile >> code, option (a). Clang seems to be the most aggressive, propagating >> undef in many cases, though one respondent said "LLVM is moving >> towards treating this as UB in the cases where the standards allow it >> to do so". But there are special cases where LLVM is a bit stronger >> (cf the undef docs); it's unclear why they think those are useful.
>> For GCC, one respondent said >> >> "Going to give arbitrary, unstable values (that is, the variable >> assigned from the uninitialised variable itself acts as >> uninitialised and having no consistent value). (Quite possibly >> subsequent transformations will have the effect of undefined >> behavior.) Inconsistency of observed values is an inevitable >> consequence of transformations PHI (undefined, X) -> X (useful in >> practice for programs that don't actually use uninitialised >> variables, but where the compiler can't see that)." >> >> For MSVC, one respondent said: >> >> "I am aware of a significant divergence between the LLVM community >> and MSVC here; in general LLVM uses "undefined behaviour" to mean >> "we can miscompile the program and get better benchmarks", whereas >> MSVC regards "undefined behaviour" as "we might have a security >> vulnerability so this is a compile error / build break". First, >> there is reading an uninitialized variable (i.e. something which >> does not necessarily have a memory location); that should always be >> a compile error. Period. Second, there is reading a partially >> initialised struct (i.e. reading some memory whose contents are only >> partly defined). That should give a compile error/warning or static >> analysis warning if detectable. If not detectable it should give >> the actual contents of the memory (be stable). I am strongly with >> the MSVC folks on this one - if the compiler can tell at compile >> time that anything is undefined then it should error out. Security >> problems are a real problem for the whole industry and should not be >> included deliberately by compilers." >> >> >> For each compiler we ask which of these four semantics it provides. >> >> It looks as if several compiler writers are saying (b), while a >> significant number of programmers are relying on (d). >> >> Our "mainstream C" semantics should support both (b) and (d). 
>> >> >> >> ### [3/15] Can one use pointer arithmetic between separately >> allocated >> C objects? >> >> **If you calculate an offset between two separately allocated C >> memory >> objects (e.g. malloc'd regions or global or local variables) by >> pointer subtraction, can you make a usable pointer to the second by >> adding the offset to the address of the first?** >> >> Most respondents expect this to work, and a significant number know >> of >> real code that relies on it, with many concrete examples. So far we >> see no solid reason to disallow it for "mainstream" C >> implementations. >> The main reason to disallow it seems to have been segmented >> architectures, especially 8086. There are still some embedded >> architectures with distinct address spaces but it seems that >> "mainstream" C is not concerned with this, and those cases could be >> identified as a language dialect or implementation-defined choice. >> >> For GCC, one respondent writes the following, but doesn't give a >> reason: >> >> - This is not safe in practice even if the alignment is sufficient >> (and if the alignment of the type is less than its size, obviously >> such a subtraction can't possibly work even with a naive compiler). >> >> It looks reasonable to identify two language dialects, one (our >> "mainstream C") in which pointer arithmetic between separately >> allocated C objects is permitted and one (principally for segmented >> architectures) in which it is not. >> >> >> >> ### [4/15] Is pointer equality sensitive to their original allocation >> sites? >> >> **For two pointers derived from the addresses of two separate >> allocations, will equality testing (with ==) of them just compare >> their runtime values, or might it take their original allocations >> into >> account and assume that they do not alias, even if they happen to >> have >> the same runtime value? 
(for current mainstream compilers)** >> >> The responses are roughly bimodal: many believe "it will just compare >> the runtime values", while a similar number believe that the >> comparison might take the allocation provenance into account. >> Of the former, 41 "know of real code that relies on it". >> >> In practice we see that GCC does sometimes take allocation provenance >> into account, with the result of a comparison (in an n+1 case, >> comparing &p+1 and &q) sometimes varying depending on whether the >> compiler can see the provenance, e.g. on whether it's done in the >> same >> compilation unit as the allocation. We don't see any reason to forbid >> that, especially as this n+1 case seems unlikely to arise in >> practice, >> though it does complicate the semantics, effectively requiring a >> nondeterministic choice at each comparison of whether to take >> provenance into account. But for comparisons between pointers formed >> by more radical pointer arithmetic from pointers originally from >> different allocations, as in Q3, it's not so clear. >> >> The best "mainstream C" semantics here seems to be to make a >> nondeterministic choice at each comparison of whether to take >> provenance into account or just compare the runtime pointer value. In >> the vast majority of cases the two will coincide. >> >> >> >> ### [5/15] Can pointer values be copied indirectly? >> >> **Can you make a usable copy of a pointer by copying its >> representation bytes with code that indirectly computes the identity >> function on them, e.g. writing the pointer value to a file and then >> reading it back, and using compression or encryption on the way?** >> >> This seems unequivocally "yes", both in usage and for current >> compilers, at least in simple cases with direct data-flow from >> original to computed pointer; both GCC and Clang support this. But >> for computation via control-flow, it's not so clear.
>> >> It looks as if a reasonable "mainstream C" semantics should allow >> indirect pointer copying at least whenever there's a data-flow >> provenance path, perhaps not when there's only a control-flow >> provenance path. It should allow pointers to be marshalled to and >> read back in, and the simplest way of doing that is to allow any >> pointer value to be read in, with the compiler making no >> aliasing/provenance assumptions on such, and with the semantics >> checking that the numeric pointer value points to a suitable live object >> only when and if it is dereferenced. >> >> >> >> ###[6/15] Pointer comparison at different types >> >> **Can one do == comparison between pointers to objects of different >> types (e.g. pointers to int, float, and different struct types)?** >> >> The question should have been clearer about whether the pointers are >> first cast to void* or char*. With those casts, the responses seem >> clear that it should be allowed, modulo now-unusual architectures >> with >> segmented memory or where the pointer representations are different. >> >> Then there's a question, which we would hope applies only in the case >> without those casts, about whether -fstrict-aliasing will treat >> comparisons with type mismatches as nonequal, e.g. >> >> - "Depends on strict-aliasing flags? I think LLVM TBAA might optimise >> this sort of check away?" >> >> - "There are a lot of examples of this, in particular in libc, or >> possibly implementations of vtables." >> >> >> >> >> ### [7/15] Pointer comparison across different allocations >> >> **Can one do < comparison between pointers to separately allocated >> objects?** >> >> This seems to be widely used for lock ordering and collections. >> As for Q3, there's a potential issue for segmented memory systems >> (where the implementation might only compare the offset) which seems >> not to be relevant for current "mainstream" C.
>> >> Apart from that, there doesn't seem to be any reason from compiler >> implementation to forbid it. >> >> >> >> ###[8/15] Pointer values after lifetime end >> >> **Can you inspect (e.g. by comparing with ==) the value of a pointer >> to an object after the object itself has been free'd or its scope has >> ended?** >> >> The responses mostly say that this will work (modulo debugging >> environments), with various use cases, and there is no clear reason >> why compilers do not support it. >> >> Can we either establish that current mainstream compilers will >> support >> this or identify more specifically where and how they will fail to do >> so? >> >> >> ### [9/15] Pointer arithmetic >> >> **Can you (transiently) construct an out-of-bounds pointer value >> (e.g. >> before the beginning of an array, or more than one-past its end) by >> pointer arithmetic, so long as later arithmetic makes it in-bounds >> before it is used to access memory?** >> >> It seems clear that this is often assumed to work. But on the other >> hand, compilers may sometimes assume otherwise: >> >> - "This is not safe; compilers may optimise based on pointers being >> within bounds. In some cases, it's possible such code might not >> even link, depending on the offsets allowed in any relocations that >> get used in the object files." >> >> - "The situation has not gotten friendlier to old-school pointer >> manipulations since https://lwn.net/Articles/278137/ was written in >> 2008. [This is a case where GCC optimised away a comparison >> involving an out-of-bounds pointer] The pattern could still be found >> in code exposed to malicious interlocutors in 2013: >> https://access.redhat.com/security/cve/CVE-2013-5607 " >> >> - "Pretty sure this one I've seen buggy code optimised away by real >> compilers." 
>> >> Here the prevalence of transiently out-of-bounds pointer values in >> real code suggests it's worth seriously asking the cost of disabling >> whatever compiler optimisation is done based on this, to provide a >> simple predictable semantics. >> >> >> >> ### [10/15] Pointer casts >> >> **Given two structure types that have the same initial members, can >> you use a pointer of one type to access the initial members of a value >> of the other?** >> >> It's clear that this is used very commonly. On the other hand, with >> strict aliasing: >> >> - "This is something that GCC tends to actually kill in practice (if >> strict aliasing is on); I've had to fix bugs that were caused by >> it." >> >> and w.r.t. GCC: >> >> - "This is not safe in practice (unless a union is visibly >> used as described in 6.5.2.3#6)." >> >> though the latter doesn't say why, or whether that's specific to >> strict-aliasing. >> >> At least with -no-strict-aliasing, it seems this should be guaranteed >> to work. >> >> >> >> ### [11/15] Using unsigned char arrays >> >> **Can an unsigned character array be used (in the same way as a >> malloc’d region) to hold values of other types?** >> >> Here again it's clear that it's very often relied on for statically >> allocated (non-malloc'd) character arrays, and it should work, with >> due care about alignment. But the ISO standard disallows it, and we >> also see: >> >> - [wrt GCC] "No, this is not safe (if it's visible to the compiler >> that the memory in question has unsigned char as its declared >> type)." >> >> though the latter doesn't say why. It is a violation of the >> strict-aliasing text of the ISO standard. >> >> With -no-strict-aliasing it seems clear that it should be allowed. >> >> >> >> ### [12/15] Null pointers from non-constant expressions >> >> **Can you make a null pointer by casting from an expression that >> isn't >> a constant but that evaluates to 0?** >> >> This is very often assumed to work.
The only exception seems to be >> some (unidentified) embedded systems. >> >> Our "mainstream C" semantics should permit it. >> >> >> >> ### [13/15] Null pointer representations >> >> **Can null pointers be assumed to be represented with 0?** >> >> Basically an unequivocal "yes" for mainstream systems. Again there is a >> potential exception for segmented memory, but not one relevant for >> "mainstream" current practice. >> >> >> >> ### [14/15] Overlarge representation reads >> >> **Can one read the byte representation of a struct as aligned words >> without regard for the fact that its extent might not include all of >> the last word?** >> >> This is sometimes used in practice and believed to work, modulo >> alignment, page-boundary alignment, and valgrind/MSan/etc. >> Our "mainstream C" semantics could either forbid this entirely >> (slightly limiting the scope of the semantics) or could allow it, for >> sufficiently aligned cases, if some switch is set. >> >> For the moment we choose the former. >> >> ### [15/15] Union type punning >> >> **When is type punning - writing one union member and then reading it >> as a different member, thereby reinterpreting its representation >> bytes >> - guaranteed to work (without confusing the compiler analysis and >> optimisation passes)?** >> >> There's widespread doubt, disagreement and confusion here, e.g.: >> >> - "always" >> - "Never" >> - "According to the standard never; in practice always." >> >> Here the minimal thing it seems we should support is, broadly >> following the GCC documentation, type punning via a union whose >> definition is in scope and which is accessed via l-values that >> manifestly involve the union type. >> >> Or, in the -no-strict-aliasing case, one could allow it everywhere.
>> >> POSTAMBLE RESPONSES >> ==================>> >> ### Other differences >> >> ** If you know of other areas where the C used in practice differs >> from that of the ISO standard, or where compiler optimisation limits >> the behaviour you can rely on, please list them. ** >> >> There were many comments here which are hard to summarise. Many >> mention integer overflow and the behaviour of shift operators. >> >> >> >> >> ### Acknowledgements >> >> We would like to thank all those who responded to the survey, those >> who distributed it, and especially those members of the Cambridge >> Systems Research Group who helped us tune earlier versions. This work >> is funded by the EPSRC REMS (Rigorous Engineering for Mainstream >> Systems) Programme Grant, EP/K008528/1. >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory
Sean Silva
2015-Jul-01 02:44 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Tue, Jun 30, 2015 at 10:21 AM, Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Sean Silva" <chisophugis at gmail.com> > > To: "Peter Sewell" <Peter.Sewell at cl.cam.ac.uk> > > Cc: llvmdev at cs.uiuc.edu > > Sent: Friday, June 26, 2015 4:53:30 PM > > Subject: Re: [LLVMdev] C as used/implemented in practice: analysis of > responses > > > > > > > > All of these seem to fall into the pattern of "The compiler is > > required to do what you expect, as long as it can't prove X about > > your program". That is, the only reasonable compilation in the > > absence of inferring some extra piece of information about your > > program, is the one you expect. For example, the only way to codegen > > a comparison between two random pointers has the meaning you expect > > (on common computer architectures); but if the compiler can figure > > something out that tells it that comparing those two pointers is > > undefined by the language standard, then, well, technically it can > > do whatever it wants. > > > > > > Many people interpret this as the compiler being somewhat malevolent, > > but there's another interpretation in some cases. > > > > > > > > I have not looked in depth at the history in all the undefined > > behaviors mentioned in the survey, but some of the undefined > > behaviors are there because at some point in time the underlying > > system diversity made it difficult or impossible to assign a > > meaning. So long as the diversity that led to the desire to leave > > something undefined still exists, programs that use those constructs > > with certain expectations *will* fail to behave as "expected" on > > those targets (on a system where pointers are represented > > differently, your program *may* actually format your hard disk if > > you do so-and-so!). 
> > > > > > To put it another way, what is "expected" is actually dependent on > > the C programmer's knowledge of the underlying system (computer > > architecture, system architecture, etc.), and there will always be > > tension so long as the programmer is not thinking about what the C > > language guarantees, but rather (roughly speaking) how *they* would > > translate their code to assembly language for the system or systems > > that they happen to know they're targeting. An x86 programmer > > doesn't expect unaligned loads to invoke nasal demons, but a SPARC > > programmer does. > > > > > > So if you unravel the thread of logic back through the undefined > > behaviors made undefined for this reason, many of these cases of > > exploiting undefined behavior are really an extension, on the > > compiler's part, of the logic "there are some systems for which your > > code would invoke nasal demons, so I might as well assume that it > > will invoke nasal demons on this system (since the language standard > > doesn't say anything about specific systems)". Or to put it another > > way, the compiler is effectively assuming that your code is written > > to target all the systems taken into account by the C standard, and > > if it would invoke nasal demons on any one of them then the compiler > > is allowed to invoke nasal demons on all of them. > > Honestly, I don't think this argument buys us all that much in the eyes of > most programmers. Maybe we're saving them from themselves, maybe not, but > while certainly a side effect of exploiting undefined behavior, this is not > really why we exploit undefined behavior.I think you read "all" in some of the places I said "many" or "some". The example you give below isn't one of the ones I was referring to. 
Actually, probably all of the UB that deals with "what happens when you read/write from <some dubious address>" specifically doesn't fall into the category I was talking about -- those are true nasal demons and in those situations the compiler's justification to the programmer is "as far as the language is concerned, there's really no way to predict what could happen in these cases, so I have to assume that you don't care about what happens in this case", which seems pretty reasonable.> We exploit undefined behavior because it helps to optimize real programs > and shrink the size of generated code. > > Here's a simple example: > > int caching_disabled; > > static void do_something(struct cache *c) { > ... > > if (!caching_disabled) { > ... > ... c->something ... > ... > } > > ... > } > > /* This is only called when caching is disabled */ > void foo() { > ... > do_something(NULL); > ... > } > > a compiler might inline the call to do_something, or specialize it, such > that we know that the argument is NULL. But the code in do_something does > not check the pointer directly, but instead, checks some other mutable > state (that must be related to the pointer for the program to have defined > behavior). The code that is inlined into foo(), however, does not need to > check '!caching_disabled', but rather because c is unconditionally > dereferenced within the block guarded by that check, and because that would > necessarily be a dereference of a NULL pointer, and because that is > undefined behavior, the compiler can mark that block as unreachable, and > can eliminate the check and a potentially-large block of code when inlining.> This kind of application of undefined behavior turns out to be quite > helpful in practice. The rules for integer overflow are another great > example (loop optimizations don't work nearly as well without them in many > cases). 
>Integer overflow is one of the ones that IIRC is specifically there for historical reasons, due to 2's complement not being the unequivocal standard at the time (that is why overflow UB only applies to signed numbers). I was actually going to mention how it is a quite happy coincidence that we got undefined behavior for signed overflow. -- Sean Silva
> > For many questions the outcome seems clear, but for some, especially > > 1, 2, 9, 10, and 11, major open questions about current compiler > > behaviour remain; we'd greatly appreciate informed comments on those > > from the relevant compiler developers (or other experts). > > > > If you can answer these, please reply either below or by mailing the > > Cerberus mailing list: > > > > cl-cerberus at lists.cam.ac.uk > > > > https://lists.cam.ac.uk/mailman/listinfo/cl-cerberus > > > > many thanks, > > Kayvan Memarian and Peter Sewell (University of Cambridge) > > > > > > What is C in practice? (Cerberus survey): Conclusions > > =====================================================================> > > > ## Kayvan Memarian and Peter Sewell > > ### University of Cambridge, 2015-06-21 > > > > ---------------------------------------------------------------------- > > > > In April-June 2015 we distributed a web survey to investigate what C > > is, in current mainstream practice: the behaviour that programmers > > assume they can rely on, the behaviour provided by mainstream > > compilers, and the idioms used in existing code, especially systems > > code. We were *not* asking what the ISO C standard permits, which is > > often more restrictive, or about obsolete or obscure hardware or > > compilers. We focussed on the behaviour of memory and pointers. > > This is a step towards an unambiguous and mathematically precise > > definition > > of the consensus *de facto* standard (to the extent that such > > exists): the C that is actually used by systems programmers and > > implemented by mainstream compilers. > > > > This note summarises our conclusions. For many questions the outcome > > seems clear, but for some, especially 1, 2, 9, 10, and 11, major open > > questions about current compiler behaviour remain; we'd greatly > > appreciate further comments from the relevant compiler developers or > > other experts. 
> > > > More details of the results are available from > > [ > > > http://www.cl.cam.ac.uk/~pes20/cerberus](http://www.cl.cam.ac.uk/~pes20/cerberus) > > . > > > > ### Responses > > > > Aiming for a modest-scale but technically expert audience, we > > distributed the survey at the University of Cambridge systems > > research > > group, at EuroLLVM 2015, via John Regehr's blog, and via various > > mailing lists: gcc, llvmdev, cfe-dev, libc-alpha, xorg, a FreeBSD > > list, xen-devel, a Google C users list, and a Google C compilers > > list. > > In all there were around 300 responses, including around 100 printed > > pages of textual comments. Most report expertise in C systems > > programming and significant numbers report expertise in compiler > > internals and in the C standard. > > > > > > MAIN QUESTION RESPONSES > > ----------------------- > > > > ### [1/15] How predictable are reads from padding bytes? > > > > **If you zero all bytes of a struct and then write some of its > > members, do reads of the padding return zero? (e.g. for a bytewise > > CAS or hash of the struct, or to know that no security-relevant data > > has leaked into them.)** > > > > It remains unclear what behaviour compilers currently provide (or > > should provide) for this. We see four main alternatives: > > > > a) Structure copies might copy padding, but structure member writes > > never touch padding. > > > > b) Structure member writes might write zeros over subsequent padding. > > > > c) Structure member writes might write arbitrary values over > > subsequent > > padding, with reads seeing stable results. > > > > d) Padding bytes are regarded as always holding unspecified values, > > irrespective of any byte writes to them, and so reads of them might > > return arbitrary and unstable values (in which case the compiler > > could > > arbitrarily write to padding at any point, as that would be masked by > > this).
> > > > On the one hand: > > > > - In some circumstances it seems important to provide systems > > programmers with a mechanism to ensure that no information is > > leaked via padding. Rewriting structure definitions to make all > > padding into explicit fields may not be practicable, especially if > > one wants to do so in a platform-independent way, and so option (d) > > is not compatible with this. Option (c) makes it possible but > > awkward to prevent leakage, as the padding must then be re-zero'd > > after member writes. > > > > - In some circumstances programmers may rely on predictable padding > > values, at least in the absence of structure member writes, > > e.g. for memcmp, hashing, or compare-and-swap of struct values. > > Again (d) is not compatible with this, and (a) or (b) are > > preferable. But it's not clear whether any of those usages are > > common or essential. > > > > - For Clang, one respondent suggests that by the time the > > optimisation > > passes operate, padding has been replaced by explicit fields, so > > neither over-wide writes nor permanently-undefined-value behaviour > > will occur. > > > > - For MSVC, one respondent suggests the compiler provides (a). > > > > On the other hand, others suggest plausible optimisations, e.g., > > "to apply SRA (scalar replacement of aggregates), replacing the > > memset with a sequence of member assignments (discarding assignments > > to padding) in order to do so". This could require (d) to make the > > existing compiler behaviour admissible, but it's unclear > > to us whether it actually does at present. > > > > Note also that for structs stored to malloc'd regions, option (d) is > > at odds with the idea that malloc'd regions can be reused, as a write > > of a new value (of a new type) to a malloc'd region might start with > > writes to what were padding bytes, so perhaps we could only really > > have this semantics for the other storage-duration kinds.
> > > > Conclusion: > > > > For each compiler (GCC, Clang, MSVC, ICC, ...), we ask which of these > > it provides on mainstream platforms. If the answer is not (a), to > > what extent is it feasible to provide compiler flags that force the > > behaviour to be stronger? > > > > Our "mainstream C" semantics should provide switchable options for > > each of (a), (b), (c), and (d). > > > > > > > > ### [2/15] Uninitialised values > > > > **Is reading an uninitialised variable or struct member (with a > > current mainstream compiler):** > > > > **(This might either be due to a bug or be intentional, e.g. when > > copying a partially initialised struct, or to output, hash, or set > > some bits of a value that may have been partially initialised.)** > > > > a) undefined behaviour (meaning that the compiler is free to > > arbitrarily miscompile the program, with or without a warning), > > > > b) going to make the result of any expression involving that value > > unpredictable, > > > > c) going to give an arbitrary and unstable value (maybe with a > > different value if you read again), or > > > > d) going to give an arbitrary but stable value (with the same value > > if > > you read again). > > > > Here also it remains unclear what compilers currently provide and what > > they should provide. The survey responses are dominated by the > > (a) "undefined behaviour" and (d) "arbitrary but stable" options. > > > > It's not clear whether people are actually depending on the latter, > > beyond the case of copying a partially initialised struct, which it > > seems must be supported, and comparing against a partially > > initialised > > struct, which it seems is done sometimes. Many respondents mention > > historical uses to attempt to get entropy, but that seems now widely > > regarded as crazy. There is a legitimate general argument that the > > more determinacy that can be provided the better, for debugging.
> > > > But it seems clear from the text responses that GCC, Clang, and MSVC > > do not at present exploit the licence the ISO standard gives (in > > defining this to be undefined behaviour) to arbitrarily miscompile > > code, option (a). Clang seems to be the most aggressive, propagating > > undef in many cases, though one respondent said "LLVM is moving > > towards treating this as UB in the cases where the standards allow it > > to do so". But there are special cases where LLVM is a bit stronger > > (cf the undef docs); it's unclear why they think those are useful. > > For GCC, one respondent said > > > > "Going to give arbitrary, unstable values (that is, the variable > > assigned from the uninitialised variable itself acts as > > uninitialised and having no consistent value). (Quite possibly > > subsequent transformations will have the effect of undefined > > behavior.) Inconsistency of observed values is an inevitable > > consequence of transformations PHI (undefined, X) -> X (useful in > > practice for programs that don't actually use uninitialised > > variables, but where the compiler can't see that)." > > > > For MSVC, one respondent said: > > > > "I am aware of a significant divergence between the LLVM community > > and MSVC here; in general LLVM uses "undefined behaviour" to mean > > "we can miscompile the program and get better benchmarks", whereas > > MSVC regards "undefined behaviour" as "we might have a security > > vulnerability so this is a compile error / build break". First, > > there is reading an uninitialized variable (i.e. something which > > does not necessarily have a memory location); that should always be > > a compile error. Period. Second, there is reading a partially > > initialised struct (i.e. reading some memory whose contents are only > > partly defined). That should give a compile error/warning or static > > analysis warning if detectable. If not detectable it should give > > the actual contents of the memory (be stable). 
I am strongly with > > the MSVC folks on this one - if the compiler can tell at compile > > time that anything is undefined then it should error out. Security > > problems are a real problem for the whole industry and should not be > > included deliberately by compilers." > > > > > > For each compiler we ask which of these four semantics it provides. > > > > It looks as if several compiler writers are saying (b), while a > > significant number of programmers are relying on (d). > > > > Our "mainstream C" semantics should support both (b) and (d). > > > > > > > > ### [3/15] Can one use pointer arithmetic between separately > > allocated > > C objects? > > > > **If you calculate an offset between two separately allocated C > > memory > > objects (e.g. malloc'd regions or global or local variables) by > > pointer subtraction, can you make a usable pointer to the second by > > adding the offset to the address of the first?** > > > > Most respondents expect this to work, and a significant number know > > of > > real code that relies on it, with many concrete examples. So far we > > see no solid reason to disallow it for "mainstream" C > > implementations. > > The main reason to disallow it seems to have been segmented > > architectures, especially 8086. There are still some embedded > > architectures with distinct address spaces but it seems that > > "mainstream" C is not concerned with this, and those cases could be > > identified as a language dialect or implementation-defined choice. > > > > For GCC, one respondent writes the following, but doesn't give a > > reason: > > > > - This is not safe in practice even if the alignment is sufficient > > (and if the alignment of the type is less than its size, obviously > > such a subtraction can't possibly work even with a naive compiler). 
> > > > It looks reasonable to identify two language dialects, one (our > > "mainstream C") in which pointer arithmetic between separately > > allocated C objects is permitted and one (principally for segmented > > architectures) in which it is not. > > > > > > > > ### [4/15] Is pointer equality sensitive to their original allocation > > sites? > > > > **For two pointers derived from the addresses of two separate > > allocations, will equality testing (with ==) of them just compare > > their runtime values, or might it take their original allocations > > into > > account and assume that they do not alias, even if they happen to > > have > > the same runtime value? (for current mainstream compilers)** > > > > The responses are roughly bimodal: many believe "it will just compare > > the runtime values", while a similar number believe that the > > comparison might take the allocation provenance into account. > > Of the former, 41 "know of real code that relies on it". > > > > In practice we see that GCC does sometimes take allocation provenance > > into account, with the result of a comparison (in an n+1 case, > > comparing &p+1 and &q) sometimes varying depending on whether the > > compiler can see the provenance, e.g. on whether it's done in the > > same > > compilation unit as the allocation. We don't see any reason to forbid > > that, especially as this n+1 case seems unlikely to arise in > > practice, > > though it does complicate the semantics, effectively requiring a > > nondeterministic choice at each comparison of whether to take > > provenance into account. But for comparisons between pointers formed > > by more radical pointer arithmetic from pointers originally from > > different allocations, as in Q3, it's not so clear. > > > > The best "mainstream C" semantics here seems to be to make a > > nondeterministic choice at each comparison of whether to take > > provenance into account or just compare the runtime pointer value. 
In > > the vast majority of cases the two will coincide. > > > > > > > > ### [5/15] Can pointer values be copied indirectly? > > > > **Can you make a usable copy of a pointer by copying its > > representation bytes with code that indirectly computes the identity > > function on them, e.g. writing the pointer value to a file and then > > reading it back, and using compression or encryption on the way?** > > > > This seems unequivocally "yes", both in usage and for current > > compilers, at least in simple cases, with direct data-flow from > > original to computed pointer; both GCC and Clang support this. But > > for computation via control-flow, it's not so clear. > > > > It looks as if a reasonable "mainstream C" semantics should allow > > indirect pointer copying at least whenever there's a data-flow > > provenance path, perhaps not when there's only a control-flow > > provenance path. It should allow pointers to be marshalled to and > > read back in, and the simplest way of doing that is to allow any > > pointer value to be read in, with the compiler making no > > aliasing/provenance assumptions on such, and with the semantics > > checking that the numeric pointer value points to a suitable live object > > only when and if it is dereferenced. > > > > > > > > ### [6/15] Pointer comparison at different types > > > > **Can one do == comparison between pointers to objects of different > > types (e.g. pointers to int, float, and different struct types)?** > > > > The question should have been clearer about whether the pointers are > > first cast to void* or char*. With those casts, the responses seem > > clear that it should be allowed, modulo now-unusual architectures > > with > > segmented memory or where the pointer representations are different. > > > > Then there's a question, which we would hope applies only in the case > > without those casts, about whether -fstrict-aliasing will treat > > comparisons with type mismatches as nonequal, e.g.
> > > > - "Depends on strict-aliasing flags? I think LLVM TBAA might optimise > > this sort of check away?" > > > > - "There are a lot of examples of this, in particular in libc, or > > possibly implementations of vtables." > > > > > > > > > > ### [7/15] Pointer comparison across different allocations > > > > **Can one do < comparison between pointers to separately allocated > > objects?** > > > > This seems to be widely used for lock ordering and collections. > > As for Q3, there's a potential issue for segmented memory systems > > (where the implementation might only compare the offset) which seems > > not to be relevant for current "mainstream" C. > > > > Apart from that, there doesn't seem to be any reason from compiler > > implementation to forbid it. > > > > > > > > ### [8/15] Pointer values after lifetime end > > > > **Can you inspect (e.g. by comparing with ==) the value of a pointer > > to an object after the object itself has been free'd or its scope has > > ended?** > > > > The responses mostly say that this will work (modulo debugging > > environments), with various use cases, and there is no clear reason > > why compilers do not support it. > > > > Can we either establish that current mainstream compilers will > > support > > this or identify more specifically where and how they will fail to do > > so? > > > > > > ### [9/15] Pointer arithmetic > > > > **Can you (transiently) construct an out-of-bounds pointer value > > (e.g. > > before the beginning of an array, or more than one-past its end) by > > pointer arithmetic, so long as later arithmetic makes it in-bounds > > before it is used to access memory?** > > > > It seems clear that this is often assumed to work. But on the other > > hand, compilers may sometimes assume otherwise: > > > > - "This is not safe; compilers may optimise based on pointers being > > within bounds.
In some cases, it's possible such code might not > > even link, depending on the offsets allowed in any relocations that > > get used in the object files." > > > > - "The situation has not gotten friendlier to old-school pointer > > manipulations since https://lwn.net/Articles/278137/ was written in > > 2008. [This is a case where GCC optimised away a comparison > > involving an out-of-bounds pointer] The pattern could still be found > > in code exposed to malicious interlocutors in 2013: > > https://access.redhat.com/security/cve/CVE-2013-5607 " > > > > - "Pretty sure this one I've seen buggy code optimised away by real > > compilers." > > > > Here the prevalence of transiently out-of-bounds pointer values in > > real code suggests it's worth seriously asking the cost of disabling > > whatever compiler optimisation is done based on this, to provide a > > simple predictable semantics. > > > > > > > > ### [10/15] Pointer casts > > > > **Given two structure types that have the same initial members, can > > you use a pointer of one type to access the initial members of a value > > of the other?** > > > > It's clear that this is used very commonly. On the other hand, with > > strict aliasing: > > > > - "This is something that GCC tends to actually kill in practice (if > > strict aliasing is on); I've had to fix bugs that were caused by > > it." > > > > and w.r.t. GCC: > > > > - "This is not safe in practice (unless a union is visibly > > used as described in 6.5.2.3#6)." > > > > though the latter doesn't say why, or whether that's specific to > > strict-aliasing. > > > > At least with -fno-strict-aliasing, it seems this should be guaranteed > > to work.
> > > > > > > > ### [11/15] Using unsigned char arrays > > > > **Can an unsigned character array be used (in the same way as a > > malloc’d region) to hold values of other types?** > > > > Here again it's clear that it's very often relied on for statically > > allocated (non-malloc'd) character arrays, and it should work, with > > due care about alignment. But the ISO standard disallows it, and we > > also see: > > > > - [wrt GCC] "No, this is not safe (if it's visible to the compiler > > that the memory in question has unsigned char as its declared > > type)." > > > > though the latter doesn't say why. It is a violation of the > > strict-aliasing text of the ISO standard. > > > > With -fno-strict-aliasing it seems clear that it should be allowed. > > > > > > > > ### [12/15] Null pointers from non-constant expressions > > > > **Can you make a null pointer by casting from an expression that > > isn't > > a constant but that evaluates to 0?** > > > > This is very often assumed to work. The only exception seems to be > > some (unidentified) embedded systems. > > > > Our "mainstream C" semantics should permit it. > > > > > > > > ### [13/15] Null pointer representations > > > > **Can null pointers be assumed to be represented with 0?** > > > > Basically an unequivocal "yes" for mainstream systems. Again there is a > > potential exception for segmented memory, but not one relevant for > > "mainstream" current practice. > > > > > > > > ### [14/15] Overlarge representation reads > > > > **Can one read the byte representation of a struct as aligned words > > without regard for the fact that its extent might not include all of > > the last word?** > > > > This is sometimes used in practice and believed to work, modulo > > alignment, page-boundary alignment, and valgrind/MSan/etc. > > Our "mainstream C" semantics could either forbid this entirely > > (slightly limiting the scope of the semantics) or could allow it, for > > sufficiently aligned cases, if some switch is set.
> > > > For the moment we choose the former. > > > > ### [15/15] Union type punning > > > > **When is type punning - writing one union member and then reading it > > as a different member, thereby reinterpreting its representation > > bytes > > - guaranteed to work (without confusing the compiler analysis and > > optimisation passes)?** > > > > There's widespread doubt, disagreement and confusion here, e.g.: > > > > - "always" > > - "Never" > > - "According to the standard never; in practice always." > > > > Here the minimal thing it seems we should support is, broadly > > following the GCC documentation, type punning via a union whose > > definition is in scope and which is accessed via l-values that > > manifestly involve the union type. > > > > Or, in the -no-strict-aliasing case, one could allow it everywhere. > > > > > > POSTAMBLE RESPONSES > > ==================> > > > ### Other differences > > > > ** If you know of other areas where the C used in practice differs > > from that of the ISO standard, or where compiler optimisation limits > > the behaviour you can rely on, please list them. ** > > > > There were many comments here which are hard to summarise. Many > > mention integer overflow and the behaviour of shift operators. > > > > > > > > > > ### Acknowledgements > > > > We would like to thank all those who responded to the survey, those > > who distributed it, and especially those members of the Cambridge > > Systems Research Group who helped us tune earlier versions. This work > > is funded by the EPSRC REMS (Rigorous Engineering for Mainstream > > Systems) Programme Grant, EP/K008528/1. 
> > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >
Sean Silva
2015-Jul-01 02:53 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Tue, Jun 30, 2015 at 2:37 PM, Peter Sewell <Peter.Sewell at cl.cam.ac.uk> wrote:> On 30 June 2015 at 18:21, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- > >> From: "Sean Silva" <chisophugis at gmail.com> > >> To: "Peter Sewell" <Peter.Sewell at cl.cam.ac.uk> > >> Cc: llvmdev at cs.uiuc.edu > >> Sent: Friday, June 26, 2015 4:53:30 PM > >> Subject: Re: [LLVMdev] C as used/implemented in practice: analysis of > responses > >> > >> > >> > >> All of these seem to fall into the pattern of "The compiler is > >> required to do what you expect, as long as it can't prove X about > >> your program". That is, the only reasonable compilation in the > >> absence of inferring some extra piece of information about your > >> program, is the one you expect. For example, the only way to codegen > >> a comparison between two random pointers has the meaning you expect > >> (on common computer architectures); but if the compiler can figure > >> something out that tells it that comparing those two pointers is > >> undefined by the language standard, then, well, technically it can > >> do whatever it wants. > >> > >> > >> Many people interpret this as the compiler being somewhat malevolent, > >> but there's another interpretation in some cases. > >> > >> > >> > >> I have not looked in depth at the history in all the undefined > >> behaviors mentioned in the survey, but some of the undefined > >> behaviors are there because at some point in time the underlying > >> system diversity made it difficult or impossible to assign a > >> meaning. So long as the diversity that led to the desire to leave > >> something undefined still exists, programs that use those constructs > >> with certain expectations *will* fail to behave as "expected" on > >> those targets (on a system where pointers are represented > >> differently, your program *may* actually format your hard disk if > >> you do so-and-so!). 
> >> > >> > >> To put it another way, what is "expected" is actually dependent on > >> the C programmer's knowledge of the underlying system (computer > >> architecture, system architecture, etc.), and there will always be > >> tension so long as the programmer is not thinking about what the C > >> language guarantees, but rather (roughly speaking) how *they* would > >> translate their code to assembly language for the system or systems > >> that they happen to know they're targeting. An x86 programmer > >> doesn't expect unaligned loads to invoke nasal demons, but a SPARC > >> programmer does. > >> > >> > >> So if you unravel the thread of logic back through the undefined > >> behaviors made undefined for this reason, many of these cases of > >> exploiting undefined behavior are really an extension, on the > >> compiler's part, of the logic "there are some systems for which your > >> code would invoke nasal demons, so I might as well assume that it > >> will invoke nasal demons on this system (since the language standard > >> doesn't say anything about specific systems)". Or to put it another > >> way, the compiler is effectively assuming that your code is written > >> to target all the systems taken into account by the C standard, and > >> if it would invoke nasal demons on any one of them then the compiler > >> is allowed to invoke nasal demons on all of them. > > > > Honestly, I don't think this argument buys us all that much in the eyes > of most programmers. Maybe we're saving them from themselves, maybe not, > but while certainly a side effect of exploiting undefined behavior, this is > not really why we exploit undefined behavior. We exploit undefined behavior > because it helps to optimize real programs and shrink the size of generated > code. > > > > Here's a simple example: > > > > int caching_disabled; > > > > static void do_something(struct cache *c) { > > ... > > > > if (!caching_disabled) { > > ... > > ... c->something ... > > ... > > } > > > > ... 
> > } > > > > /* This is only called when caching is disabled */ > > void foo() { > > ... > > do_something(NULL); > > ... > > } > > > > a compiler might inline the call to do_something, or specialize it, such > that we know that the argument is NULL. But the code in do_something does > not check the pointer directly, but instead, checks some other mutable > state (that must be related to the pointer for the program to have defined > behavior). The code that is inlined into foo(), however, does not need to > check '!caching_disabled', but rather because c is unconditionally > dereferenced within the block guarded by that check, and because that would > necessarily be a dereference of a NULL pointer, and because that is > undefined behavior, the compiler can mark that block as unreachable, and > can eliminate the check and a potentially-large block of code when inlining. > > > > This kind of application of undefined behavior turns out to be quite > helpful in practice. The rules for integer overflow are another great > example (loop optimizations don't work nearly as well without them in many > cases). > > Yes - but there seems also to be a significant downside of aggressive > exploitation of some of these UBs, at least as reported by OS and > other systems people. The trade-offs between allowing optimisation > and having a language which can be understood (and which supports > common-but-non-ISO idioms) are difficult to manage, but somehow we > (collectively) need to do better than previous decades of C evolution > have. >In the particular example Hal gives it seems like most users would probably prefer the compiler require them to add an explicit assertion of c's validity in the block dominated by !caching_disabled. Unfortunately in other cases it is very hard to communicate what the user should assert/why they should assert it, as Chris talks about in his blog posts. So it realistically becomes sort of black and white -- either don't optimize based on UB or do. 
For what is probably just social reasons, the desire to optimize wins out; from an economic standpoint (e.g. power saved) it overall may be the right choice (I haven't run any ballpark figures though and don't claim this to be true). Ideally together with the compiler there would be a static analyzer with the invariant that it finds all situations where the compiler, while compiling, optimizes based on UB. This static analyzer would report in an intelligible way on all such situations. Unfortunately this is a really hard problem. -- Sean Silva> > thanks, > Peter > > > >> > >> This is obviously sort of a twisted logic, and I think that a lot of > >> the "malevolence" attributed to compilers is due to this. It > >> certainly removes many target-dependent checks from the mid-level > >> optimizer though. > >> > > > > This point is correct and important. Exploiting undefined behavior also > simplifies the task of creating target-independent optimizers. > > > > -Hal > > > >> > >> > >> -- Sean Silva > >> > >> > >> On Fri, Jun 26, 2015 at 1:42 AM, Peter Sewell < > >> Peter.Sewell at cl.cam.ac.uk > wrote: > >> > >> > >> As part of a project to clarify what behaviour of C implementations > >> is > >> actually relied upon in modern practice, and what behaviour is > >> guaranteed by current mainstream implementations, we recently > >> distributed a survey of 15 questions about C, https://goo.gl/AZXH3S . > >> > >> We were asking what C is in current mainstream practice: the > >> behaviour > >> that programmers assume they can rely on, the behaviour provided by > >> mainstream compilers, and the idioms used in existing code, > >> especially > >> systems code. We were *not* asking what the ISO C standard permits, > >> which is often more restrictive, or about obsolete or obscure > >> hardware > >> or compilers. We focussed on the behaviour of memory and pointers. 
> >> > >> We've had around 300 responses, including many compiler and OS > >> developers, and the results are summarised below, or on the web at > >> http://www.cl.cam.ac.uk/~pes20/cerberus (which also has more > >> details). > >> For many questions the outcome seems clear, but for some, especially > >> 1, 2, 9, 10, and 11, major open questions about current compiler > >> behaviour remain; we'd greatly appreciate informed comments on those > >> from the relevant compiler developers (or other experts). > >> > >> If you can answer these, please reply either below or by mailing the > >> Cerberus mailing list: > >> > >> cl-cerberus at lists.cam.ac.uk > >> > >> https://lists.cam.ac.uk/mailman/listinfo/cl-cerberus > >> > >> many thanks, > >> Kayvan Memarian and Peter Sewell (University of Cambridge) > >> > >> > >> What is C in practice? (Cerberus survey): Conclusions > >> =====================================================================> >> > >> ## Kayvan Memarian and Peter Sewell > >> ### University of Cambridge, 2015-06-21 > >> > >> ---------------------------------------------------------------------- > >> > >> In April-June 2015 we distributed a web survey to investigate what C > >> is, in current mainstream practice: the behaviour that programmers > >> assume they can rely on, the behaviour provided by mainstream > >> compilers, and the idioms used in existing code, especially systems > >> code. We were *not* asking what the ISO C standard permits, which is > >> often more restrictive, or about obsolete or obscure hardware or > >> compilers. We focussed on the behaviour of memory and pointers. > >> This is a step towards an unambiguous and mathematically precise > >> definition > >> of the consensus *de facto* standard (to the extent that such > >> exists): the C that is actually used by systems programmers and > >> implemented by mainstream compilers. > >> > >> This note summarises our conclusions. 
For many questions the outcome > >> seems clear, but for some, especially 1, 2, 9, 10, and 11, major open > >> questions about current compiler behaviour remain; we'd greatly > >> appreciate further comments from the relevant compiler developers or > >> other experts. > >> > >> More details of the results are available from > >> [ > >> > http://www.cl.cam.ac.uk/~pes20/cerberus](http://www.cl.cam.ac.uk/~pes20/cerberus) > >> . > >> > >> ### Responses > >> > >> Aiming for a modest-scale but technically expert audience, we > >> distributed the survey at the University of Cambridge systems > >> research > >> group, at EuroLLVM 2015, via John Regehr's blog, and via various > >> mailing lists: gcc, llvmdev, cfe-dev, libc-alpha, xorg, a FreeBSD > >> list, xen-devel, a Google C users list, and a Google C compilers > >> list. > >> In all there were around 300 responses, including around 100 printed > >> pages of textual comments. Most report expertise in C systems > >> programming and significant numbers report expertise in compiler > >> internals and in the C standard. > >> > >> > >> MAIN QUESTION RESPONSES > >> ----------------------- > >> > >> ###[1/15] How predictable are reads from padding bytes? > >> > >> **If you zero all bytes of a struct and then write some of its > >> members, do reads of the padding return zero? (e.g. for a bytewise > >> CAS or hash of the struct, or to know that no security-relevant data > >> has leaked into them.)** > >> > >> It remains unclear what behaviour compilers currently provide (or > >> should provide) for this. We see four main alternatives: > >> > >> a) Structure copies might copy padding, but structure member writes > >> never touch padding. > >> > >> b) Structure member writes might write zeros over subsequent padding. > >> > >> c) Structure member writes might write arbitrary values over > >> subsequent > >> padding, with reads seeing stable results. 
> >> > >> d) Padding bytes are regarded as always holding unspecified values, > >> irrespective of any byte writes to them, and so reads of them might > >> return arbitrary and unstable values (in which case the compiler > >> could > >> arbitrarily write to padding at any point, as that would be masked by > >> this). > >> > >> On the one hand: > >> > >> - In some circumstances it seems important to provide systems > >> programmers with a mechanism to ensure that no information is > >> leaked via padding. Rewriting structure definitions to make all > >> padding into explicit fields may not be practicable, especially if > >> one wants to do so in a platform-independent way, and so option (d) > >> is not compatible with this. Option (c) makes it possible but > >> awkward to prevent leakage, as the padding must be re-zero'd > >> after member writes. > >> > >> - In some circumstances programmers may rely on predictable padding > >> values, at least in the absence of structure member writes, > >> e.g. for memcmp, hashing, or compare-and-swap of struct values. > >> Again (d) is not compatible with this, and (a) or (b) are > >> preferable. But it's not clear whether any of those usages are > >> common or essential. > >> > >> - For Clang, one respondent suggests that by the time the > >> optimisation > >> passes operate, padding has been replaced by explicit fields, so > >> neither over-wide writes nor permanently-undefined-value behaviour > >> will occur. > >> > >> - For MSVC, one respondent suggests the compiler provides (a). > >> > >> On the other hand, others suggest plausible optimisations, e.g., > >> "to apply SRA (scalar replacement of aggregates), replacing the > >> memset with a sequence of member assignments (discarding assignments > >> to padding) in order to do so". This could require (d) to make the > >> existing compiler behaviour admissible, but it's unclear > >> to us whether it actually does at present. 
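In code, the memset-then-member-write scenario at issue looks like this (a sketch; the struct and function names are illustrative, and the layout comment assumes a typical 4-byte-int ABI):

```c
#include <string.h>

/* A struct with trailing padding on typical ABIs: with a 4-byte int and
   4-byte alignment, sizeof(struct s) is 8 and the 3 bytes after 'c' are
   padding. */
struct s { int i; char c; };

/* The scenario from the question: zero every byte, then write members.
   Whether the padding still reads back as zero afterwards is exactly what
   options (a)-(d) disagree about; only the member values are certain. */
int zero_then_write_members(void) {
    struct s v;
    memset(&v, 0, sizeof v);   /* all bytes, padding included, are 0    */
    v.i = 42;                  /* under (a) these writes never touch    */
    v.c = 'x';                 /* padding; under (c)/(d) the padding
                                  may now hold arbitrary values         */
    return v.i + v.c;          /* 42 + 'x' == 162: the defined part     */
}
```

Only the member reads are asserted on below, since the padding bytes are precisely what the four options disagree about.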
> >> > >> Note also that for structs stored to malloc'd regions, option (d) is > >> at odds with the idea that malloc'd regions can be reused, as a write > >> of a new value (of a new type) to a malloc'd region might start with > >> writes to what were padding bytes, so perhaps we could only really > >> have this semantics for the other storage-duration kinds. > >> > >> Conclusion: > >> > >> For each compiler (GCC, Clang, MSVC, ICC, ...), we ask which of these > >> it provides on mainstream platforms. If the answer is not (a), to > >> what extent is it feasible to provide compiler flags that force the > >> behaviour to be stronger? > >> > >> Our "mainstream C" semantics should provide switchable options for > >> each of (a), (b), (c), and (d). > >> > >> > >> > >> ### [2/15] Uninitialised values > >> > >> **Is reading an uninitialised variable or struct member (with a > >> current mainstream compiler):** > >> > >> **(This might either be due to a bug or be intentional, e.g. when > >> copying a partially initialised struct, or to output, hash, or set > >> some bits of a value that may have been partially initialised.)** > >> > >> a) undefined behaviour (meaning that the compiler is free to > >> arbitrarily miscompile the program, with or without a warning), > >> > >> b) going to make the result of any expression involving that value > >> unpredictable, > >> > >> c) going to give an arbitrary and unstable value (maybe with a > >> different value if you read again), or > >> > >> d) going to give an arbitrary but stable value (with the same value > >> if > >> you read again). > >> > >> Here also it remains unclear what compilers currently provide and what > >> they should provide. The survey responses are dominated by the > >> (a) "undefined behaviour" and (d) "arbitrary but stable" options. 
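The partially-initialised-struct copy mentioned in the question, which respondents say must keep working, can be sketched as follows (illustrative names; only the initialised member has a reliable value):

```c
/* Hypothetical two-member struct; 'not_set' is deliberately left
   uninitialised. */
struct partial { int used; int not_set; };

int copy_partial(void) {
    struct partial a;
    a.used = 7;              /* a.not_set stays indeterminate          */
    struct partial b = a;    /* whole-struct copy, carrying along the
                                indeterminate member                    */
    return b.used;           /* reading the initialised member is fine;
                                reading b.not_set is the contested case */
}
```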
> >> > >> It's not clear whether people are actually depending on the latter, > >> beyond the case of copying a partially initialised struct, which it > >> seems must be supported, and comparing against a partially > >> initialised > >> struct, which it seems is done sometimes. Many respondents mention > >> historical uses to attempt to get entropy, but that seems now widely > >> regarded as crazy. There is a legitimate general argument that the > >> more determinacy that can be provided the better, for debugging. > >> > >> But it seems clear from the text responses that GCC, Clang, and MSVC > >> do not at present exploit the licence the ISO standard gives (in > >> defining this to be undefined behaviour) to arbitrarily miscompile > >> code, option (a). Clang seems to be the most aggressive, propagating > >> undef in many cases, though one respondent said "LLVM is moving > >> towards treating this as UB in the cases where the standards allow it > >> to do so". But there are special cases where LLVM is a bit stronger > >> (cf the undef docs); it's unclear why they think those are useful. > >> For GCC, one respondent said > >> > >> "Going to give arbitrary, unstable values (that is, the variable > >> assigned from the uninitialised variable itself acts as > >> uninitialised and having no consistent value). (Quite possibly > >> subsequent transformations will have the effect of undefined > >> behavior.) Inconsistency of observed values is an inevitable > >> consequence of transformations PHI (undefined, X) -> X (useful in > >> practice for programs that don't actually use uninitialised > >> variables, but where the compiler can't see that)." 
> >> > >> For MSVC, one respondent said: > >> > >> "I am aware of a significant divergence between the LLVM community > >> and MSVC here; in general LLVM uses "undefined behaviour" to mean > >> "we can miscompile the program and get better benchmarks", whereas > >> MSVC regards "undefined behaviour" as "we might have a security > >> vulnerability so this is a compile error / build break". First, > >> there is reading an uninitialized variable (i.e. something which > >> does not necessarily have a memory location); that should always be > >> a compile error. Period. Second, there is reading a partially > >> initialised struct (i.e. reading some memory whose contents are only > >> partly defined). That should give a compile error/warning or static > >> analysis warning if detectable. If not detectable it should give > >> the actual contents of the memory (be stable). I am strongly with > >> the MSVC folks on this one - if the compiler can tell at compile > >> time that anything is undefined then it should error out. Security > >> problems are a real problem for the whole industry and should not be > >> included deliberately by compilers." > >> > >> > >> For each compiler we ask which of these four semantics it provides. > >> > >> It looks as if several compiler writers are saying (b), while a > >> significant number of programmers are relying on (d). > >> > >> Our "mainstream C" semantics should support both (b) and (d). > >> > >> > >> > >> ### [3/15] Can one use pointer arithmetic between separately > >> allocated > >> C objects? > >> > >> **If you calculate an offset between two separately allocated C > >> memory > >> objects (e.g. malloc'd regions or global or local variables) by > >> pointer subtraction, can you make a usable pointer to the second by > >> adding the offset to the address of the first?** > >> > >> Most respondents expect this to work, and a significant number know > >> of > >> real code that relies on it, with many concrete examples. 
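A sketch of the idiom being asked about (illustrative sizes; the arithmetic is done on uintptr_t, which mainstream implementations define, rather than by direct inter-object pointer subtraction, which ISO C leaves undefined):

```c
#include <stdint.h>
#include <stdlib.h>

/* Derive a pointer to one allocation from the address of another plus a
   computed offset.  On flat-address-space targets respondents expect
   this to behave as plain integer arithmetic. */
int offset_trick_works(void) {
    char *p = malloc(16);
    char *q = malloc(16);
    if (!p || !q) return 1;           /* treat OOM as vacuously "works" */
    uintptr_t off = (uintptr_t)q - (uintptr_t)p;   /* integer offset    */
    char *r = (char *)((uintptr_t)p + off);        /* rebuild q from p  */
    int ok = (r == q);
    free(p);
    free(q);
    return ok;
}
```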
So far we > >> see no solid reason to disallow it for "mainstream" C > >> implementations. > >> The main reason to disallow it seems to have been segmented > >> architectures, especially 8086. There are still some embedded > >> architectures with distinct address spaces but it seems that > >> "mainstream" C is not concerned with this, and those cases could be > >> identified as a language dialect or implementation-defined choice. > >> > >> For GCC, one respondent writes the following, but doesn't give a > >> reason: > >> > >> - This is not safe in practice even if the alignment is sufficient > >> (and if the alignment of the type is less than its size, obviously > >> such a subtraction can't possibly work even with a naive compiler). > >> > >> It looks reasonable to identify two language dialects, one (our > >> "mainstream C") in which pointer arithmetic between separately > >> allocated C objects is permitted and one (principally for segmented > >> architectures) in which it is not. > >> > >> > >> > >> ### [4/15] Is pointer equality sensitive to their original allocation > >> sites? > >> > >> **For two pointers derived from the addresses of two separate > >> allocations, will equality testing (with ==) of them just compare > >> their runtime values, or might it take their original allocations > >> into > >> account and assume that they do not alias, even if they happen to > >> have > >> the same runtime value? (for current mainstream compilers)** > >> > >> The responses are roughly bimodal: many believe "it will just compare > >> the runtime values", while a similar number believe that the > >> comparison might take the allocation provenance into account. > >> Of the former, 41 "know of real code that relies on it". 
> >> > >> In practice we see that GCC does sometimes take allocation provenance > >> into account, with the result of a comparison (in an n+1 case, > >> comparing &p+1 and &q) sometimes varying depending on whether the > >> compiler can see the provenance, e.g. on whether it's done in the > >> same > >> compilation unit as the allocation. We don't see any reason to forbid > >> that, especially as this n+1 case seems unlikely to arise in > >> practice, > >> though it does complicate the semantics, effectively requiring a > >> nondeterministic choice at each comparison of whether to take > >> provenance into account. But for comparisons between pointers formed > >> by more radical pointer arithmetic from pointers originally from > >> different allocations, as in Q3, it's not so clear. > >> > >> The best "mainstream C" semantics here seems to be to make a > >> nondeterministic choice at each comparison of whether to take > >> provenance into account or just compare the runtime pointer value. In > >> the vast majority of cases the two will coincide. > >> > >> > >> > >> ### [5/15] Can pointer values be copied indirectly? > >> > >> **Can you make a usable copy of a pointer by copying its > >> representation bytes with code that indirectly computes the identity > >> function on them, e.g. writing the pointer value to a file and then > >> reading it back, and using compression or encryption on the way?** > >> > >> This seems unequivocally "yes", both in usage and for current > >> compilers, at least in simple cases with direct data-flow from > >> original to computed pointer; both GCC and Clang support this. But > >> for computation via control-flow, it's not so clear. > >> > >> It looks as if a reasonable "mainstream C" semantics should allow > >> indirect pointer copying at least whenever there's a data-flow > >> provenance path, perhaps not when there's only a control-flow > >> provenance path. 
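A minimal data-flow round trip of the kind described, with an XOR scramble standing in for the file/compression/encryption step (illustrative; current mainstream compilers support this):

```c
#include <string.h>

/* Copy a pointer's representation bytes through an indirect identity
   computation and reconstitute a usable pointer from them. */
int roundtrip_pointer(void) {
    int obj = 123;
    int *p = &obj;

    unsigned char buf[sizeof p];
    memcpy(buf, &p, sizeof p);               /* copy representation bytes */
    for (size_t i = 0; i < sizeof p; i++)    /* scramble ...              */
        buf[i] ^= 0x5a;
    for (size_t i = 0; i < sizeof p; i++)    /* ... and unscramble        */
        buf[i] ^= 0x5a;

    int *q;
    memcpy(&q, buf, sizeof q);               /* reconstitute the pointer  */
    return (q == p) && (*q == 123);          /* usable copy?              */
}
```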
It should allow pointers to be marshalled to and > >> read back in, and the simplest way of doing that is to allow any > >> pointer value to be read in, with the compiler making no > >> aliasing/provenance assumptions on such, and with the semantics > >> checking that the numeric pointer value points to a suitable live object > >> only when and if it is dereferenced. > >> > >> > >> > >> ### [6/15] Pointer comparison at different types > >> > >> **Can one do == comparison between pointers to objects of different > >> types (e.g. pointers to int, float, and different struct types)?** > >> > >> The question should have been clearer about whether the pointers are > >> first cast to void* or char*. With those casts, the responses seem > >> clear that it should be allowed, modulo now-unusual architectures > >> with > >> segmented memory or where the pointer representations are different. > >> > >> Then there's a question, which we would hope applies only in the case > >> without those casts, about whether -fstrict-aliasing will treat > >> comparisons with type mismatches as unequal, e.g. > >> > >> - "Depends on strict-aliasing flags? I think LLVM TBAA might optimise > >> this sort of check away?" > >> > >> - "There are a lot of examples of this, in particular in libc, or > >> possibly implementations of vtables." > >> > >> > >> > >> > >> ### [7/15] Pointer comparison across different allocations > >> > >> **Can one do < comparison between pointers to separately allocated > >> objects?** > >> > >> This seems to be widely used for lock ordering and collections. > >> As for Q3, there's a potential issue for segmented memory systems > >> (where the implementation might only compare the offset) which seems > >> not to be relevant for current "mainstream" C. > >> > >> Apart from that, there doesn't seem to be any reason from compiler > >> implementation to forbid it. > >> > >> > >> > >> ### [8/15] Pointer values after lifetime end > >> > >> **Can you inspect (e.g. 
by comparing with ==) the value of a pointer > >> to an object after the object itself has been free'd or its scope has > >> ended?** > >> > >> The responses mostly say that this will work (modulo debugging > >> environments), with various use cases, and there is no clear reason > >> why compilers do not support it. > >> > >> Can we either establish that current mainstream compilers will > >> support > >> this or identify more specifically where and how they will fail to do > >> so? > >> > >> > >> ### [9/15] Pointer arithmetic > >> > >> **Can you (transiently) construct an out-of-bounds pointer value > >> (e.g. > >> before the beginning of an array, or more than one-past its end) by > >> pointer arithmetic, so long as later arithmetic makes it in-bounds > >> before it is used to access memory?** > >> > >> It seems clear that this is often assumed to work. But on the other > >> hand, compilers may sometimes assume otherwise: > >> > >> - "This is not safe; compilers may optimise based on pointers being > >> within bounds. In some cases, it's possible such code might not > >> even link, depending on the offsets allowed in any relocations that > >> get used in the object files." > >> > >> - "The situation has not gotten friendlier to old-school pointer > >> manipulations since https://lwn.net/Articles/278137/ was written in > >> 2008. [This is a case where GCC optimised away a comparison > >> involving an out-of-bounds pointer] The pattern could still be found > >> in code exposed to malicious interlocutors in 2013: > >> https://access.redhat.com/security/cve/CVE-2013-5607 " > >> > >> - "Pretty sure this one I've seen buggy code optimised away by real > >> compilers." > >> > >> Here the prevalence of transiently out-of-bounds pointer values in > >> real code suggests it's worth seriously asking the cost of disabling > >> whatever compiler optimisation is done based on this, to provide a > >> simple predictable semantics. 
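For contrast with the contested transiently-out-of-bounds idiom in Q9, the same backwards traversal can be written so that no intermediate pointer ever leaves the [a, a+n] range the standard allows (a sketch with illustrative values):

```c
/* Walk an array from the end without ever forming the before-the-
   beginning pointer a-1 that the contested idiom relies on: the loop
   pointer stays within [a, a+4], and a+4 (one past the end) is
   explicitly permitted by the standard. */
int sum_backwards_demo(void) {
    int a[4] = {1, 2, 3, 4};
    const int *p = a + 4;          /* one-past-the-end: allowed        */
    int s = 0;
    while (p != a)
        s += *--p;                 /* decrement happens before the load,
                                      so p never drops below a          */
    return s;                      /* 1+2+3+4                           */
}
```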
> >> > >> > >> > >> ### [10/15] Pointer casts > >> > >> **Given two structure types that have the same initial members, can > >> you use a pointer of one type to access the intial members of a value > >> of the other?** > >> > >> It's clear that this is used very commonly. On the other hand, with > >> strict aliasing: > >> > >> - "This is something that GCC tends to actually kill in practice (if > >> strict aliasing is on); I've had to fix bugs that were caused by > >> it." > >> > >> and w.r.t. GCC: > >> > >> - "This is not safe in practice (unless a union is visibly > >> used as described in 6.5.2.3#6)." > >> > >> though the latter doesn't say why, or whether that's specific to > >> strict-aliasing. > >> > >> At least with -no-strict-aliasing, it seems this should be guaranteed > >> to work. > >> > >> > >> > >> ### [11/15] Using unsigned char arrays > >> > >> **Can an unsigned character array be used (in the same way as a > >> malloc’d region) to hold values of other types?** > >> > >> Here again it's clear that it's very often relied on for statically > >> allocated (non-malloc'd) character arrays, and it should work, with > >> due care about alignment. But the ISO standard disallows it, and we > >> also see: > >> > >> - [wrt GCC] "No, this is not safe (if it's visible to the compiler > >> that the memory in question has unsigned char as its declared > >> type)." > >> > >> though the latter doesn't say why. It is a violation of the > >> strict-aliasing text of the ISO standard. > >> > >> With -no-strict-aliasing it seems clear it that it should be allowed. > >> > >> > >> > >> ### [12/15] Null pointers from non-constant expressions > >> > >> **Can you make a null pointer by casting from an expression that > >> isn't > >> a constant but that evaluates to 0?** > >> > >> This is very often assumed to work. The only exception seems to be > >> some (unidentified) embedded systems. > >> > >> Our "mainstream C" semantics should permit it. 
> >> > >> > >> > >> ### [13/15] Null pointer representations > >> > >> **Can null pointers be assumed to be represented with 0?** > >> > >> Basically an unequivocal "yes" for mainstream systems. Again there is a > >> potential exception for segmented memory, but not one relevant for > >> "mainstream" current practice. > >> > >> > >> > >> ### [14/15] Overlarge representation reads > >> > >> **Can one read the byte representation of a struct as aligned words > >> without regard for the fact that its extent might not include all of > >> the last word?** > >> > >> This is sometimes used in practice and believed to work, modulo > >> alignment, page-boundary alignment, and valgrind/MSan/etc. > >> Our "mainstream C" semantics could either forbid this entirely > >> (slightly limiting the scope of the semantics) or could allow it, for > >> sufficiently aligned cases, if some switch is set. > >> > >> For the moment we choose the former. > >> > >> ### [15/15] Union type punning > >> > >> **When is type punning - writing one union member and then reading it > >> as a different member, thereby reinterpreting its representation > >> bytes > >> - guaranteed to work (without confusing the compiler analysis and > >> optimisation passes)?** > >> > >> There's widespread doubt, disagreement and confusion here, e.g.: > >> > >> - "always" > >> - "Never" > >> - "According to the standard never; in practice always." > >> > >> Here the minimal thing it seems we should support is, broadly > >> following the GCC documentation, type punning via a union whose > >> definition is in scope and which is accessed via l-values that > >> manifestly involve the union type. > >> > >> Or, in the -no-strict-aliasing case, one could allow it everywhere. 
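The union-based form that the GCC documentation blesses can be sketched as follows (the union definition is in scope and both accesses go through the union l-value; the 0x3f800000 bit pattern for 1.0f assumes an IEEE-754 target):

```c
#include <stdint.h>

/* Type punning via a visible union, accessed through the union object
   itself: write one member, read another to reinterpret the bytes. */
union pun { float f; uint32_t u; };

uint32_t float_bits(float f) {
    union pun p;
    p.f = f;          /* write one member ...                    */
    return p.u;       /* ... read another: same representation   */
}
```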
> >> > >> > >> POSTAMBLE RESPONSES > >> =================== > >> > >> ### Other differences > >> > >> **If you know of other areas where the C used in practice differs > >> from that of the ISO standard, or where compiler optimisation limits > >> the behaviour you can rely on, please list them.** > >> > >> There were many comments here which are hard to summarise. Many > >> mention integer overflow and the behaviour of shift operators. > >> > >> > >> > >> > >> ### Acknowledgements > >> > >> We would like to thank all those who responded to the survey, those > >> who distributed it, and especially those members of the Cambridge > >> Systems Research Group who helped us tune earlier versions. This work > >> is funded by the EPSRC REMS (Rigorous Engineering for Mainstream > >> Systems) Programme Grant, EP/K008528/1. > >> > >> _______________________________________________ > >> LLVM Developers mailing list > >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >> > > > > -- > > Hal Finkel > > Assistant Computational Scientist > > Leadership Computing Facility > > Argonne National Laboratory >
Russell Wallace
2015-Jul-01 14:20 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Tue, Jun 30, 2015 at 6:21 PM, Hal Finkel <hfinkel at anl.gov> wrote:> We exploit undefined behavior because it helps to optimize real programs > and shrink the size of generated code. >That is the reason compilers exploit undefined behaviour even when they are generating code for a vanilla architecture with a flat address space, yes. However, I will suggest: 1. The performance gain from this on real programs is small. I will suggest that the total performance gain from optimisations that rely on exploiting undefined behaviour - let's call them monkey's paw optimisations for short - is practically never more than a few percent, and often less than one percent. 2. For most programs, having the program work is worth far more than having it run a few percent faster. 3. If you look at the survey answers, it's clear that most real programs, whether deliberately or accidentally, invoke undefined behaviour at some point. You could say this is the programmer's fault, but in practice, miscompiling a program typically punishes people other than the programmer who wrote the code in question (and who may at this stage be long gone). Furthermore, so little behaviour is actually defined by the letter of the C and C++ standards, that the probability of even a team of highly skilled and conscientious programmers writing a million line program without ever invoking undefined behaviour is for all practical purposes zero. I am frankly of the opinion that monkey's paw optimisations cause so much trouble and are so difficult (for all practical purposes impossible) to avoid tripping over, that it would be better to remove them entirely, but given that compiler maintainers are clearly unwilling to do that, I propose the following compromise: Group all monkey's paw optimisations together, and enable them only if an extra compiler flag is supplied. Or failing that, at least have a compiler flag that will disable all of them (while leaving all the safe optimisations enabled). 
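For concreteness, one textbook member of this class is folding x + 1 > x to true for signed x, which is licensed by signed overflow being undefined. The unsigned counterpart, where wraparound is defined and the compiler must keep the comparison, can be sketched as (illustrative function name):

```c
#include <limits.h>

/* Unsigned arithmetic wraps, so this comparison cannot be folded away:
   it is genuinely 0 when x == UINT_MAX.  The signed version
   'x + 1 > x' is the classic monkey's paw case: because signed
   overflow is undefined, a compiler may fold it to 1 even for
   x == INT_MAX. */
int unsigned_wraps(unsigned x) {
    return x + 1 > x;     /* 0 when x == UINT_MAX, 1 otherwise */
}
```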
Renato Golin
2015-Jul-01 15:41 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On 1 July 2015 at 15:20, Russell Wallace <russell.wallace at gmail.com> wrote:> Group all monkey's paw optimisations together, and enable them only if an > extra compiler flag is supplied. Or failing that, at least have a compiler > flag that will disable all of them (while leaving all the safe optimisations > enabled).So, are you suggesting we get rid of all undefined AND implementation defined behaviour from compilers? That means getting: * all compiler people to agree on a specific interpretation, then * all hardware people to re-implement their hardware, re-ship their products Unless there is a flag, say, -std=c11, which makes the compiler follow the standard? If not all, how pragmatic is good pragmatic? How much of it should we do? How are we going to get all compiler folks from all fields to agree on what's acceptable and what's not? Funny enough, there is a place where that happens already, the C/C++ standard committee. And what's left of undefined / implementation defined behaviour is what they don't agree on. I can't see how this could be different... cheers, --renato
Tim Northover
2015-Jul-01 18:08 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
> 1. The performance gain from this on real programs is small. I will suggest > that the total performance gain from optimisations that rely on exploiting > undefined behaviour - let's call them monkey's paw optimisations for short - > is practically never more than a few percent, and often less than one > percent. > > 2. For most programs, having the program work is worth far more than having > it run a few percent faster.Which may or may not be fine until you decide to switch compilers/platforms. Encouraging programmers to use Clang-specific interpretations of these constructs would promote vendor lock-in and be a blow for portability, which I think is worse than UB. At least now we can tell people "you're doing it wrong". Cheers. Tim.
Joerg Sonnenberger
2015-Jul-02 16:13 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Wed, Jul 01, 2015 at 03:20:16PM +0100, Russell Wallace wrote:> 1. The performance gain from this on real programs is small. I will suggest > that the total performance gain from optimisations that rely on exploiting > undefined behaviour - let's call them monkey's paw optimisations for short > - is practically never more than a few percent, and often less than one > percent.This is blatantly wrong. There are quite a few things where hot code would get seriously penalized if you want to "fix" UB. A very basic example is over-large shifts. If the compiler can't prove that the shift is within the size of the type, it would have to add additional branches on at least one major CPU architecture, as x86 and ARM implement different results in that case. There is code where such additional branches provide a significant penalty. It is just one example. If you say "performance gains from artificial UB", you might get more agreement. The aggressive nonnull optimisations are likely a good candidate here. I do hope that the definition of what is UB and what not can be changed in the C standard, but there are still limits. What happens on access to a NULL pointer should remain UB, as the behavior does differ between embedded systems and a desktop/server environment using virtual memory with page protection. Joerg
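The branch in question might look like the following hypothetical helper, which gives over-large shifts the ARM-style answer of 0 (x86's SHL instead masks the count to 5 bits, so 1 << 32 would yield 1 there, which is why a single defined semantics needs the extra test):

```c
#include <stdint.h>

/* What a "defined" over-large shift costs: an extra range check per
   shift whose count the compiler cannot bound at compile time. */
uint32_t shl_defined(uint32_t x, unsigned n) {
    return n < 32 ? x << n : 0;   /* in-range: normal shift; else 0 */
}
```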
Daniel Berlin
2015-Jul-02 22:36 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Wed, Jul 1, 2015 at 7:20 AM, Russell Wallace <russell.wallace at gmail.com> wrote:> On Tue, Jun 30, 2015 at 6:21 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >> We exploit undefined behavior because it helps to optimize real programs >> and shrink the size of generated code. > > > That is the reason compilers exploit undefined behaviour even when they are > generating code for a vanilla architecture with a flat address space, yes. > However, I will suggest: > > 1. The performance gain from this on real programs is small. I will suggest > that the total performance gain from optimisations that rely on exploiting > undefined behaviour - let's call them monkey's paw optimisations for short - > is practically never more than a few percent, and often less than one > percent.This is a very large assumption. If you mean "direct optimization based on undef", it may be correct. If you mean "optimization that takes advantage of knowledge of what undefined behavior is", this is surely wrong. Almost all loop analysis/etc takes advantage of knowing what both implementation and undefined behavior is to be able to do simple things like "derive loop bounds" for most loops, let alone more advanced things.
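As a sketch of the kind of bound derivation meant here (illustrative function): for the loop below a compiler can conclude the trip count is n - start only by assuming i never overflows, an assumption that signed-overflow UB licenses directly; with an unsigned induction variable, wraparound is defined and the analysis must prove more.

```c
/* A compiler models 'i' as the scalar evolution start + trips.  Because
   signed overflow is undefined, it may assume the recurrence never
   wraps and derive the exact trip count, which feeds vectorisation,
   bounds checks, and loop transformations. */
int trip_count(int start, int n) {
    int trips = 0;
    for (int i = start; i < n; i++)
        trips++;
    return trips;    /* n - start when start < n, else 0 */
}
```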