Renato Golin
2015-Jul-01 15:41 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On 1 July 2015 at 15:20, Russell Wallace <russell.wallace at gmail.com> wrote:
> Group all monkey's paw optimisations together, and enable them only if an
> extra compiler flag is supplied. Or failing that, at least have a compiler
> flag that will disable all of them (while leaving all the safe optimisations
> enabled).

So, are you suggesting we get rid of all undefined AND implementation defined behaviour from compilers? That means getting:

 * all compiler people to agree on a specific interpretation, then
 * all hardware people to re-implement their hardware, re-ship their products

Unless there is a flag, say, -std=c11, which makes the compiler follow the standard?

If not all, how pragmatic is good pragmatic? How much of it should we do? How are we going to get all compiler folks from all fields to agree on what's acceptable and what's not?

Funny enough, there is a place where that happens already, the C/C++ standard committee. And what's left of undefined / implementation defined behaviour is what they don't agree on. I can't see how this could be different...

cheers,
--renato
Russell Wallace
2015-Jul-01 16:15 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Wed, Jul 1, 2015 at 4:41 PM, Renato Golin <renato.golin at linaro.org> wrote:
> So, are you suggesting we get rid of all undefined AND implementation
> defined behaviour from compilers?

Not at all. As you say, that would require all compiler implementers to agree, and what little behaviour is defined in the standards is presumably already what all compiler implementers can agree on.

I'm proposing that LLVM unilaterally replace most undefined behaviour with implementation-defined behaviour. Note that this would give it a substantial competitive advantage over GCC: a lot of people care far more about reliability than about tiny increments of performance.
Renato Golin
2015-Jul-01 16:58 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On 1 July 2015 at 17:15, Russell Wallace <russell.wallace at gmail.com> wrote:
> I'm proposing that LLVM unilaterally replace most undefined behaviour with
> implementation-defined behaviour.

That's precisely the problem. Which behaviour?

Let's have an example:

    struct Foo {
      long a[95];
      char b[4];
      double c[2];
    };

    void fuzz(Foo &F) {
      for (int i=0; i<100; i++)
        F.a[i] = 123;
    }

There are many ways I can do this "right":

 1. Only go up to 95, since you're using an integer to set the value.
 2. Go up to 96, since char is an integer type.
 3. Go all the way to 100, but casting "123" to double from 97 onwards, in pairs
 4. Go all the way to 100, and set integer 123 bitwise (for whatever fp representation that is) from 97
 5. Do any of the above, and emit a warning
 6. Bail on error

Compilers prefer not to bail on error, since the standard permits it. A warning would be a good thing, though.

Now, since it's a warning, I *have* to output something. What? Even considering one compiler, you'll have to convince *most* <compilerX> engineers to agree on something, and that's not trivial.

Moreover, this loop is very easy to vectorise, and that would give me 4x speed improvements for 4-way vectorisation. That's too much for compilers to pass up.

If I create a vectorised loop that goes all the way to 92, I'll have to create a tail loop. If I don't want to create a tail loop, I have to overwrite 'b' (and probably 'c') on a vector write. If I implement the variations where I can do that, the vectoriser will be very happy. People generally like it when the vectoriser is happy.

Now, you have a "safe mode" where these things don't happen. Let's say you and I agree that it should only go to 95, since this is "probably what the user wants". But some programmers *use* that as a feature, and the standard allows it, so we *have* to implement *both*.

Best case scenario, you have now implemented two completely different behaviours for every undefined behaviour in each standard.

Worse still, you have divided the programmers into two classes: those that play it safe, and those that don't, essentially creating two different programming languages. Code that compiles and works with compilerA+safe_mode will not necessarily compile/work with compilerB+safe_mode or compilerA+full_mode either.

C and C++ are already complicated enough, with so many standard levels to implement (C90, C99, C11, C++03, C++11, C++14, etc) that duplicating each and every one of them, *per compiler*, is not something you want to do.

That will, ultimately, move compilers away from each other, which is not what most users really want.

cheers,
--renato
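
For readers following along, here is a rough sketch of two of the well-defined variants discussed in the message above: the "only go up to 95" loop, and a scalar stand-in for the "vectorise up to 92, then run a tail loop" shape. It is only an illustration, not what any particular compiler emits. The names fuzz_bounded, fuzz_chunked, kLen, kWidth and kMainEnd are made up for this sketch, and the width of 4 is assumed from the "4-way" figure in the message.

```cpp
#include <cassert>
#include <cstddef>

// The struct from the message above.
struct Foo {
  long a[95];
  char b[4];
  double c[2];
};

// Hypothetical helper constants for this sketch.
constexpr std::size_t kLen = sizeof(Foo::a) / sizeof(Foo::a[0]);  // elements in 'a'
constexpr std::size_t kWidth = 4;                       // assumed vector width
constexpr std::size_t kMainEnd = kLen - kLen % kWidth;  // last multiple of kWidth

static_assert(kLen == 95, "'a' has 95 elements, valid indices are 0..94");
static_assert(kMainEnd == 92, "a 4-wide main loop can cover indices 0..91");

// Variant 1: stop at the end of 'a', so 'b' and 'c' are never touched.
void fuzz_bounded(Foo &F) {
  for (std::size_t i = 0; i < kLen; i++)
    F.a[i] = 123;
}

// Scalar stand-in for a vectorised loop: a main loop that handles kWidth
// elements per iteration, followed by a tail loop for the leftovers.
void fuzz_chunked(Foo &F) {
  std::size_t i = 0;
  for (; i < kMainEnd; i += kWidth) {
    // A real vectoriser would turn these four stores into one vector store.
    for (std::size_t lane = 0; lane < kWidth; lane++)
      F.a[i + lane] = 123;
  }
  assert(i == kMainEnd);
  // Tail loop: the remaining elements (indices 92, 93 and 94).
  for (; i < kLen; i++)
    F.a[i] = 123;
}
```

Both functions stay inside 'a', so neither relies on undefined behaviour. The tail loop in fuzz_chunked is the extra work the message refers to: dropping it would mean a fourth full-width store starting at index 92, which runs past the end of 'a'.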