Linus Torvalds via llvm-dev
2016-Feb-28 16:13 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
On Sun, Feb 28, 2016 at 12:27 AM, Markus Trippelsdorf
<markus at trippelsdorf.de> wrote:
>> >
>> > -fno-strict-overflow
>>
>> -fno-strict-aliasing.
>
> Do not forget -fno-delete-null-pointer-checks.
>
> So the kernel obviously is already using its own C dialect, that is
> pretty far from standard C.
> All these options also have a negative impact on the performance of the
> generated code.

They really don't.

Have you ever seen code that cared about signed integer overflow?

Yeah, getting it right can make the compiler generate an extra ALU
instruction once in a blue moon, but trust me - you'll never notice.
You *will* notice when you suddenly have a crash or a security issue
due to bad code generation, though.

The idiotic C alias rules aren't even worth discussing. They were a
mistake. The kernel doesn't use some "C dialect pretty far from standard
C". Yeah, let's just say that the original C designers were better at
their job than a gaggle of standards people who were making bad crap up
to make some Fortran-style programs go faster.

They don't speed up normal code either, they just introduce undefined
behavior in a lot of code.

And deleting NULL pointer checks because somebody made a mistake, and
then turning that small mistake into a real and exploitable security
hole? Not so smart either.

The fact is, undefined compiler behavior is never a good idea. Not for
serious projects.

Performance doesn't come from occasional small and odd
micro-optimizations. I care about performance a lot, and I actually look
at generated code and do profiling etc. None of those three options have
*ever* shown up as issues. But the incorrect code they generate? It has.

           Linus
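To make the NULL-pointer-check deletion concrete, here is a minimal C
sketch of the pattern in question (not taken from the thread; the struct
and function names are invented). It is the kind of code that
-fno-delete-null-pointer-checks is meant to protect:

struct device {
    int flags;
};

int device_flags(struct device *dev)
{
    int flags = dev->flags;   /* the mistake: dereference before the check */

    if (!dev)                 /* since 'dev' was already dereferenced, a   */
        return -1;            /* compiler may conclude it cannot be NULL   */
                              /* here and delete this branch entirely      */
    return flags;
}

With the check removed, a caller that passes NULL no longer gets the
early error return; it gets whatever the NULL dereference happens to do,
which is where the exploitable hole comes from.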
via llvm-dev
2016-Feb-28 16:50 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
Sometimes Linus says some really flippant and funny things, but gosh, I
couldn't agree more... with one tiny nit: properly written Fortran with a
good compiler is potentially as fast as or faster than a typical C
version in HPC codes. (Yes, you may be able to get the C version faster,
but it would take some effort.)

Original Message
From: Linus Torvalds via llvm-dev
Sent: Sunday, February 28, 2016 23:13
To: Markus Trippelsdorf
Reply To: Linus Torvalds
Cc: linux-arch at vger.kernel.org; gcc at gcc.gnu.org; Jade Alglave;
parallel at lists.isocpp.org; llvm-dev at lists.llvm.org; Will Deacon;
Linux Kernel Mailing List; David Howells; Peter Zijlstra; Ramana
Radhakrishnan; Luc Maranget; Andrew Morton; Paul McKenney; Ingo Molnar
Subject: Re: [llvm-dev] [isocpp-parallel] Proposal for new
memory_order_consume definition
Michael Matz via llvm-dev
2016-Feb-29 17:37 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
Hi,

On Sun, 28 Feb 2016, Linus Torvalds wrote:

> > So the kernel obviously is already using its own C dialect, that is
> > pretty far from standard C. All these options also have a negative
> > impact on the performance of the generated code.
>
> They really don't.

They do.

> Have you ever seen code that cared about signed integer overflow?
>
> Yeah, getting it right can make the compiler generate an extra ALU
> instruction once in a blue moon, but trust me - you'll never notice.
> You *will* notice when you suddenly have a crash or a security issue
> due to bad code generation, though.

No, that's not at all the important piece of making signed overflow
undefined. The important part is with induction variables controlling
loops:

  short i; for (i = start; i < end; i++)
    vs.
  unsigned short u; for (u = start; u < end; u++)

For the former you're allowed to assume that the loop will terminate, and
that its iteration count is easily computable. For the latter you get
modulo arithmetic and (if start/end are of larger type than u, say 'int')
it might not even terminate at all. That has direct consequences for the
vectorizability of such loops (or the profitability of such a
transformation) and hence quite important performance implications in
practice. Not for the kernel, of course. Now we can endlessly debate how
(non)practical it is to write HPC code in C or C++, but there we are.

> The fact is, undefined compiler behavior is never a good idea. Not for
> serious projects.

Perhaps if these undefinednesses hadn't been put into the standard,
people wouldn't have written HPC code in C, and if that were so the world
would be a nicer place sometimes (certainly for the compiler). Alas, it
isn't.

Ciao,
Michael.
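For illustration, here is a self-contained version of the two loops above
(a sketch only; the surrounding functions and the float array are
invented, not part of Michael's mail):

void scale_signed(float *a, int start, int end, float k)
{
    /* Signed overflow is undefined, so the compiler may assume 'i'
     * never wraps: when start < end the trip count is simply
     * end - start, which makes unrolling and vectorizing easy. */
    for (int i = start; i < end; i++)
        a[i] *= k;
}

void scale_unsigned_short(float *a, int start, int end, float k)
{
    /* 'u' has well-defined modulo-65536 arithmetic; after integer
     * promotion the comparison is done in 'int', so if end > 65535
     * the condition is always true and the loop never terminates,
     * and in general the trip count is much harder to compute. */
    for (unsigned short u = (unsigned short)start; u < end; u++)
        a[u] *= k;
}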
Linus Torvalds via llvm-dev
2016-Feb-29 17:57 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
On Mon, Feb 29, 2016 at 9:37 AM, Michael Matz <matz at suse.de> wrote:
>
> The important part is with induction variables controlling loops:
>
>   short i; for (i = start; i < end; i++)
>     vs.
>   unsigned short u; for (u = start; u < end; u++)
>
> For the former you're allowed to assume that the loop will terminate, and
> that its iteration count is easily computable. For the latter you get
> modulo arithmetic and (if start/end are of larger type than u, say 'int')
> it might not even terminate at all. That has direct consequences for the
> vectorizability of such loops (or the profitability of such a
> transformation) and hence quite important performance implications in
> practice.

Stop bullshitting me.

It would generally force the compiler to add a few extra checks when you
do vectorize (or, more generally, do any kind of loop unrolling), and
yes, it would make things slightly more painful. You might, for example,
need to add code to handle the wraparound and have a more complex
non-unrolled head/tail version for that case.

In theory you could do a whole "restart the unrolled loop around the
index wraparound" if you actually cared about the performance of such a
case - but since nobody would ever care about that, it's more likely that
you'd just do it with a non-unrolled fallback (which would likely be
identical to the tail fixup).

It would be painful, yes. But it wouldn't be fundamentally hard, or hurt
actual performance fundamentally. It would be _inconvenient_ for compiler
writers, and the bad ones would argue vehemently against it.

... and it's how a "go fast" mode would be implemented by a compiler
writer initially, as a compiler option, for those HPC people. Then you
have a use case and implementation example, and can go to the standards
body and say "look, we have people who use this already, it breaks almost
no code, and it makes our compiler able to generate much faster code".

Which is why the standard was written to be good for compiler writers,
not actual users.

Of course, in real life HPC performance is often more about doing the
cache blocking etc., and I've seen people move to more parameterized
languages rather than C to get the best performance: generate the code
from a much higher-level description, be able to do a much better job,
and leave C to do the low-level work, letting people do the important
part.

But no. Instead the C compiler people still argue for bad features that
were a misdesign and a wart on the language. At the very least it should
have been left as a "go unsafe, go fast" option, and standardize *that*,
instead of screwing everybody else over. The HPC people often end up
using those options anyway, because it turns out that they'll happily get
rid of proper rounding etc. if it buys them a couple of percent on their
workload. Things like "I really want you to generate multiply-accumulate
instructions because I don't mind having intermediates with higher
precision" etc.

           Linus
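A rough sketch of the "extra check plus non-unrolled fallback" shape
described above, written out by hand as plain C (a real compiler would do
this in its IR; the function and names are invented, and it preserves the
exact modulo semantics of the unsigned short loop from the previous
message):

void scale(float *a, unsigned int start, unsigned int end, float k)
{
    unsigned int u = (unsigned short)start;  /* original induction var */

    /* The extra check: unroll only if the narrow index cannot wrap
     * before reaching 'end', i.e. 'end' fits in unsigned short and we
     * start below it. */
    if (end <= 0xFFFFu && u < end) {
        unsigned int chunks = (end - u) / 4;
        for (unsigned int c = 0; c < chunks; c++, u += 4) {
            a[u + 0] *= k;               /* unrolled, vectorizable body */
            a[u + 1] *= k;
            a[u + 2] *= k;
            a[u + 3] *= k;
        }
    }

    /* Non-unrolled tail/fallback, identical to the original loop; it
     * also covers the pathological case where 'end' does not fit in
     * unsigned short and the original loop never terminates. */
    for (unsigned short s = (unsigned short)u; s < end; s++)
        a[s] *= k;
}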
Lawrence Crowl via llvm-dev
2016-Feb-29 19:38 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
On 2/28/16, Linus Torvalds <torvalds at linux-foundation.org> wrote:
> The fact is, undefined compiler behavior is never a good idea. Not for
> serious projects.

Actually, undefined behavior is essential for serious projects, but not
for the reasons mentioned.

If the language has no undefined behavior, then from the compiler's view,
there is no such thing as a bad program. All programs will compile and
enter functional debug (possibly after shipping to customer). On the
other hand, a language with undefined behavior makes it possible for
compilers (and their run-time support) to identify a program as wrong.

The problem with the latest spate of compiler optimizations was not the
optimization, but the lack of warnings about exploiting undefined
behavior.

--
Lawrence Crowl
Toon Moene via llvm-dev
2016-Feb-29 20:45 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
On 02/28/2016 05:13 PM, Linus Torvalds wrote:

> Yeah, let's just say that the original C designers were better at their
> job than a gaggle of standards people who were making bad crap up to
> make some Fortran-style programs go faster.

The original C designers were defining a language that would make it easy
to write operating systems in (without having to rely on assembler). I
mislaid the quote where they said they first tried Fortran (and concluded
it didn't fit their purpose).

BTW, Fortran was designed around floating point arithmetic (and its
non-relation to the mathematical concept of the field of the reals). It
used integers only for counting and indexing arrays, so it had no use for
"signed integers that overflowed". Therefore, to the Fortran standard,
this was "undefined". It was literally "undefined" - as in, it was simply
not described by the standard's text.

--
Toon Moene - e-mail: toon at moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
James Y Knight via llvm-dev
2016-Feb-29 21:12 UTC
[llvm-dev] [isocpp-parallel] Proposal for new memory_order_consume definition
No, you really don't need undefined behavior in the standard in order to
enable bug-finding.

The standard could've (and still could...) make signed integer overflow
"implementation-defined" rather than "undefined". Compilers would thus be
required to have *some documented meaning* for it (e.g. wrap
2's-complement, wrap 1's-complement, saturate to min/max, trap, or
whatever...), but must not have the current "Anything goes! I can set
your cat on fire if the optimizer feels like it today!" behavior.

Such a change to the standard would not reduce any ability to do error
checking, as compilers that want to be helpful could perfectly well
define it to trap at runtime when given certain compiler flags, and
perfectly well warn you of your dependence upon unportable
implementation-defined behavior (or that your program is going to trap)
at build time.

[Sending again as a plain-text email, since a bunch of mailing lists
apparently hate on multipart messages that even contain a text/html
part...]

On Mon, Feb 29, 2016 at 2:38 PM, Lawrence Crowl via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> On 2/28/16, Linus Torvalds <torvalds at linux-foundation.org> wrote:
>> The fact is, undefined compiler behavior is never a good idea. Not for
>> serious projects.
>
> Actually, undefined behavior is essential for serious projects, but
> not for the reasons mentioned.
>
> If the language has no undefined behavior, then from the compiler's view,
> there is no such thing as a bad program. All programs will compile and
> enter functional debug (possibly after shipping to customer). On the
> other hand, a language with undefined behavior makes it possible for
> compilers (and their run-time support) to identify a program as wrong.
>
> The problem with the latest spate of compiler optimizations was not the
> optimization, but the lack of warnings about exploiting undefined behavior.
>
> --
> Lawrence Crowl
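As a small illustration of that point (a sketch only; the message itself
names no flags, but the options mentioned here are existing GCC/Clang
flags): today a compiler can already give signed overflow a documented or
checked meaning with -fwrapv (wrap as 2's complement), -ftrapv (trap on
overflow), or -fsanitize=signed-integer-overflow (report or trap at
runtime via UBSan).

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int x = INT_MAX;

    /* Undefined behavior under the default rules; with -fwrapv this
     * wraps to INT_MIN, and with -ftrapv or UBSan it aborts or reports
     * the overflow at runtime instead. */
    x = x + 1;

    printf("%d\n", x);
    return 0;
}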