Sean Silva
2015-Jul-01 22:20 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Wed, Jul 1, 2015 at 12:22 PM, Russell Wallace <russell.wallace at gmail.com> wrote:

> I am arguing in favor of a point, and I understand you disagree with it,
> but I don't think I'm dismissing any use cases except a very small
> performance increment.

I'm sure Google has numbers about how much electricity/server cost they
save for X% performance improvement. I'm sure Apple has numbers about how
much money they make with X% improved battery life. I'm not convinced that
the cost of some of these bugs is actually larger than the benefit of
faster programs. Nor am I convinced about the inverse. I'm just pointing
out that pointing to a "bad bug" caused by a certain optimization, without
comparing the cost of the bug to the benefit of the optimization, is
basically meaningless. You'll need to quantify "very small performance
improvement" and put it in context of the bugs you're talking about.

> Furthermore, the compiler would still be free to perform such
> optimisations where it can prove they won't break the program.

"won't break the program" is very hard to know...

-- Sean Silva

> That's not all cases, to be sure, but at least we would then be back to
> the normal scenario where, over the years, as the compiler gets smarter,
> things get better, as opposed to monkey's-paw optimisations which cause
> a smarter compiler to make things worse.
>
> On Wed, Jul 1, 2015 at 7:53 PM, Tim Northover <t.p.northover at gmail.com> wrote:
>
>> On 1 July 2015 at 11:34, Russell Wallace <russell.wallace at gmail.com> wrote:
>> > Why do you say spin?
>>
>> You're dismissing all use-cases other than this very narrow one I'd
>> (with my own spin) characterise as "Do What I Mean, I Can't Be
>> Bothered To Get My Code Right". Fair enough, you're arguing in favour
>> of a point; but it's not one I agree with.
>>
>> Tim.
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu  http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Chris Lattner
2015-Jul-07 17:26 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
> On Jul 1, 2015, at 3:20 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> I'm not convinced that the cost of some of these bugs is actually larger
> than the benefit of faster programs. Nor am I convinced about the
> inverse. I'm just pointing out that pointing to a "bad bug" caused by a
> certain optimization without comparing the cost of the bug to the
> benefit of the optimization is basically meaningless. You'll need to
> quantify "very small performance improvement" and put it in context of
> the bugs you're talking about.

As with many things, it is more complicated than that. The performance
effects of optimizations are often non-linear, and you can take a look at
many of the worst forms of UB in C and easily show cases where they allow
2x speedups, not just 2%.

For example, consider undefined behavior on signed integer overflow:

    for (int i = 0; i <= N; ++i) {
      ...
    }

When compiling for a 64-bit machine, you really want to promote the
induction variable to 64 bits. Further, knowing the trip count of a loop
is extremely important for many loop optimizations. Unfortunately, without
being able to assume that signed wraparound is undefined, you get neither
of these from C.

-fstrict-aliasing is another great example. In many cases it makes no
difference whatsoever. OTOH, on code like:

    void doLoopThing(float *array, int *N) {
      for (int i = 0; i < *N; ++i)
        array[i] = array[i] + 1;
    }

you can easily get a 2x or more speedup due to auto-vectorization if you
can assume -fstrict-aliasing.
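[An editor's sketch, not part of the original thread: one way to recover the same vectorization opportunity without relying on type-based aliasing rules is C99's restrict qualifier, which lets the caller assert non-aliasing explicitly. The function name follows Chris's doLoopThing example; the restrict-qualified variant is my addition.]

```c
/* Sketch: C99 'restrict' asserts that array and n do not alias, so a
   compiler may hoist the load of *n out of the loop and vectorize the
   body, with or without -fstrict-aliasing. */
void doLoopThingRestrict(float *restrict array, const int *restrict n) {
    for (int i = 0; i < *n; ++i)
        array[i] = array[i] + 1.0f;
}
```

[The trade-off is that restrict moves the burden of proof to the programmer: violating the no-alias promise is itself undefined behavior.]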
Of course, usually you wouldn't write this code yourself; you'd get it
because doLoopThing is a template and N is passed in as a reference.

Anyway, I could go on and on here, and I've spent a lot of time over the
years thinking about how to improve the situation: can we make clang
detect more of these, can we make the optimizer more conservative in
certain cases, etc.? This is why (for example) our implementation uses
simple structural points-to analysis before relying on TBAA. With GCC's
implementation (circa GCC 4.0; I have no idea what they are doing now),
GCC would "miscompile" code like:

    float bitcast(int x) {
      return *(float*)&x;
    }

This code is a TBAA violation, but it is also "obvious" what the
programmer meant. LLVM being "nicer" in this case is a feature. It is
irritating that the union version of this is also technically UB or
implementation-defined behavior, so that isn't portable either (a C
programmer needs to magically know that memcpy is the safe way to do
this).

However, as I've continued to dig into this, my feeling is that there
really is no satisfactory solution to these issues. The problems here are
pervasive structural problems in the C language: in the first example
above, it is that "int" is the default type people generally reach for,
not "long", and that array indexing is expressed with integers instead of
iterators. This isn't something that we're going to "fix" in the C
language, the C community, or the body of existing C code. Likewise,
while C++ has made definite progress here by replacing some of these
idioms (e.g. with iterators), it adds its own layers of UB on top, and
doesn't actually *subtract* the worst things in C.

My conclusion is that C, and derivatives like C++, is a very dangerous
language to write safety/correctness-critical software in, and my
personal opinion is that it is almost impossible to write *security*
critical software in it.
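[Editor's sketch of the memcpy idiom Chris mentions above, for concreteness; the function name is invented for illustration. memcpy between same-sized objects is the well-defined way to reinterpret bits in C, and optimizing compilers typically lower it to a single register move.]

```c
#include <string.h>

/* Sketch: the portable, well-defined replacement for *(float*)&x.
   Copies the bytes of x into f instead of type-punning through a
   pointer cast, so no TBAA rule is violated. */
float bitcast_safe(int x) {
    float f;
    memcpy(&f, &x, sizeof f);  /* sizeof(int) == sizeof(float) assumed, as in the cast version */
    return f;
}
```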
This isn't really C's fault if you consider where it was born and evolved
from (some joke that it started as a *very* nice high-level assembler for
the PDP-11:
https://en.wikipedia.org/wiki/C_(programming_language)#Early_developments
:-).

There are many more modern and much safer languages that either eliminate
the UB entirely through language design (e.g. using a garbage collector
to eliminate an entire class of memory-safety issues, completely
disallowing pointer casts to enable TBAA safely, etc.), or intentionally
spend a bit of performance to provide a safe and correct programming
model (e.g. by guaranteeing that integers will trap if they overflow). My
hope is that the industry will eventually move to better systems
programming languages, but that will take a very, very long time...

-Chris
David Chisnall
2015-Jul-07 17:51 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On 7 Jul 2015, at 18:26, Chris Lattner <clattner at apple.com> wrote:

> a C programmer needs to magically know that memcpy is the safe way to
> do this.

And, of course, has to know that if you're implementing memcpy (or
memmove, or malloc, or free) you have to do it in a language that's not
C. free is my favourite example here: it is UB to dereference a pointer
after a call to free, so a compiler for pure standard C is completely
free to optimise away the body of any function called free that stores
metadata inside the object.

> However, as I've continued to dig into this, my feeling is that there
> really is no satisfactory solution to these issues. The problems here
> are pervasive structural problems in the C language: in the first
> example above, it is that "int" is the default type people generally
> reach for, not "long", and that array indexing is expressed with
> integers instead of iterators.

That's because in C89 int was defined as the natural type for the target
machine. On Alpha, int was 64 bits, but it turned out that a lot of
people who wrote int really meant int32_t. What they probably really want
is int_fast32_t (the type that is at least as big as int32_t, but is the
fastest such type for the target), but I've never seen real-world code
use any of the int_fast types. At least people are starting to use
intptr_t instead of long (partly due to Win64, where sizeof(long) <
sizeof(void*)), though that's been a struggle.

David
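[Editor's sketch of the <stdint.h> types David mentions; the helper names are invented for illustration. int32_t is exactly 32 bits, int_fast32_t is at least 32 bits but whatever is fastest for the target (often 64 bits on LP64 systems), and intptr_t is an integer wide enough to round-trip a data pointer.]

```c
#include <stdint.h>
#include <stddef.h>

/* Sketch: the C99 fixed-width, fastest-at-least, and pointer-sized
   integer types. Only int32_t has a guaranteed exact size; the others
   vary by target, which is David's point. */
size_t width_of_int32(void)  { return sizeof(int32_t); }
size_t width_of_fast32(void) { return sizeof(int_fast32_t); }
size_t width_of_intptr(void) { return sizeof(intptr_t); }
```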
Peter Sewell
2015-Jul-07 19:09 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On 7 July 2015 at 18:26, Chris Lattner <clattner at apple.com> wrote:
> [...]
> My conclusion is that C, and derivatives like C++, is a very dangerous
> language to write safety/correctness-critical software in,

This is certainly true, but (as you say below) the shift to better
languages for systems code isn't going to happen any time soon, so, given
the level of disagreement and confusion about what can be relied on that
we saw in that survey, I think we have to do something to improve matters
for C more-or-less as it is. Taking those survey questions as an
(incomplete) starting point:

http://www.cl.cam.ac.uk/~pes20/cerberus/notes51-2015-06-21-survey-short.html

- for some, it seems that clang and gcc already support a
  stronger-than-ISO behaviour on "normal" platforms (e.g. perhaps the
  null-pointer cases). Can we identify and document those?

- for all the others where there is substantial code relying on them, it
  seems that the minimal thing is at least to identify whatever current
  optimisation-limiting compiler option(s) are enough to guarantee the
  behaviour is supported, where such options exist. That'll have a cost,
  of course, but in many contexts the clarity may be worth it, just as
  many of our respondents routinely use -fno-strict-aliasing.

- other cases might need new such options.

- sometimes the non-ISO behaviour may be so prevalent that there really
  isn't any reasonable sense in which compilers should be exploiting it
  (I'm suspicious about temporarily-out-of-bounds pointer construction,
  for example).

- others might really need new attributes or other extensions, e.g. to
  say that some pointers should support inter-object arithmetic, if
  there are relatively few places where that occurs in practice.

- and, of course, some UBs are really essential to normal C
  implementation; we can't hope to give sensible semantics to wild
  writes (and also some of the arithmetic issues) in any way other than
  declaring them UB.
I'd hope that identifying which of the survey questions is which can be
reasonably straightforward for the compiler development community, and,
given that, we should be in a position to precisely define what *that*
version of C is, at least.

Peter

----
Peter Sewell
Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/users/pes20
Sean Silva
2015-Jul-08 02:18 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
On Tue, Jul 7, 2015 at 10:26 AM, Chris Lattner <clattner at apple.com> wrote:
> [...]
> OTOH, on code like:
>
>     void doLoopThing(float *array, int *N) {
>       for (int i = 0; i < *N; ++i)
>         array[i] = array[i] + 1;
>     }
>
> You can easily get a 2x or more speedup due to auto-vectorization if
> you can assume -fstrict-aliasing. Of course usually you wouldn't write
> this code yourself; you'd get it because doLoopThing is a template and
> N is passed in as a reference.

You're absolutely right that it's complicated, but I don't think this is
the best example. The 2x-speedup optimizations you're talking about can
be done even without assuming strict aliasing or undefined signed
overflow: you just have to emit runtime checks. It's analogous to
emitting the "remainder" loop when doing auto-vectorization. Imagine if
the standard made it undefined for a loop over an array of more than 4
floats not to be suitable for vectorization: sure, that would be nice and
make the vectorizer's life easier, but we can mostly get the "big bang"
speedup without it. In other words, not having the standard call these
cases UB probably costs closer to 2% "slowdown" in the aggregate across
all these loops (factoring in icache, etc.) than 2x.

-- Sean Silva
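[Editor's sketch of the runtime-check transformation Sean describes, not code from the thread: guard a fast, vectorizable loop with an overlap test, and fall back to a conservative loop when the pointers may alias. Caveat: comparing addresses of unrelated objects is itself only implementation-defined in ISO C; a real compiler performs this check safely at the IR level.]

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch: hand-written version of the check a vectorizer could emit
   instead of assuming -fstrict-aliasing for Chris's doLoopThing. */
void addOneChecked(float *array, int *n) {
    uintptr_t a  = (uintptr_t)array;
    uintptr_t np = (uintptr_t)n;
    int count = *n;
    /* Does the int pointed to by n overlap the float array? */
    int overlaps = np + sizeof(int) > a &&
                   np < a + (size_t)count * sizeof(float);
    if (!overlaps) {
        for (int i = 0; i < count; ++i)   /* fast path: *n hoisted, vectorizable */
            array[i] += 1.0f;
    } else {
        for (int i = 0; i < *n; ++i)      /* conservative path: reload *n */
            array[i] += 1.0f;
    }
}
```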
Antoine Pitrou
2015-Jul-13 18:28 UTC
[LLVMdev] C as used/implemented in practice: analysis of responses
Chris Lattner <clattner <at> apple.com> writes:

> There are many more modern and much safer languages that either
> eliminate the UB entirely through language design (e.g. using a garbage
> collector to eliminate an entire class of memory safety issues,
> completely disallowing pointer casts to enable TBAA safely, etc), or by
> intentionally spending a bit of performance to provide a safe and
> correct programming model (e.g. by guaranteeing that integers will trap
> if they overflow).

For the record, in Numba, when we enabled the "nsw" flag on all
arithmetic operations, we got a sizable speedup (up to 2x, IIRC) on some
benchmarks. That's even though we implement a high-level language
(Python).

For those curious, the explanation is that negative array indices are
legal in Python: they mean indexing from the end of the array. So,
whenever a lookup such as `my_array[i+1]` was used, a check was necessary
to detect whether `i+1` was negative, even when `i` was known to be
positive. Adding the "nsw" flag helps LLVM optimize away that check in
many cases, and it turns out that can enable other optimizations such as
vectorization.

Regards

Antoine.
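[Editor's sketch, in C, of the check Antoine describes; Numba itself emits LLVM IR, not C, and the function name is invented for illustration. This is what a Python-level `my_array[i+1]` lookup has to do when negative indices must wrap to the end of the array.]

```c
/* Sketch: Python-style indexing with negative-index wraparound.
   With "nsw" on the i+1 addition, a compiler that knows i >= 0 can
   prove idx >= 0 (signed wraparound is excluded) and delete the
   branch, which in turn can unblock vectorization. */
double load_elem(const double *data, long len, long i) {
    long idx = i + 1;
    if (idx < 0)        /* negative index: count from the end */
        idx += len;
    return data[idx];
}
```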