On 28 March 2014 16:51, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> If foo doesn't read memory, then it's legal to interchange these two:
> store x, 1
> call foo()
I know that, but the question is, actually, how do you represent the
call to foo in IR in order for it *not* to move around? Isn't that the
default behaviour for function calls without any attributes?
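
To make the question concrete, here is a C-level sketch of the same
store-then-call pattern. It assumes a GCC-compatible compiler (GCC or
Clang); foo_barrier and store_then_call are hypothetical names, not
anything from the patch. The empty asm with a "memory" clobber makes the
compiler assume the function may read or write any memory, which is
roughly the conservative assumption I'd expect for a call without
attributes:

```c
int x; /* the memory location the store targets */

/* Hypothetical stand-in for foo(). The empty asm statement with a
 * "memory" clobber tells the compiler that any memory may be read or
 * written here, so memory accesses cannot be moved across it. */
void foo_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Store, then call. Because foo_barrier() may touch memory, the
 * compiler has to keep the store to x before the call. */
int store_then_call(void)
{
    x = 1;          /* store x, 1 */
    foo_barrier();  /* call foo() */
    return x;
}
```

If foo were instead known not to touch memory (for example, marked
__attribute__((const))), the compiler would be free to swap the two.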
> It means that you can, in fact, move a call around as long as some
> restrictions are met.
Then let's make this call introduce as many problems as possible, so
that the restrictions are never met.
> Even if you restrict the intrinsics to only apply to
> non-allocatable registers, can you guarantee that such apparently safe code
> motion won't alter these registers?
No, we can never guarantee that. More specifically, the semantics are
normally only valid in code order. The compiler cannot infer whether
moving these writes is safe. Ever.
> A, perhaps hypothetical, example could
> be reading "sp" on x86, while the code that have been moved around caused
> spill and "push/pop" to be generated. Even if this example is only
> hypothetical, couldn't something of that nature happen in practice?
Absolutely, and we don't really care. It's up to the user to make sure
that their code is minimal enough so that these things don't happen.
*ALL* uses of it I could find in the kernel's unwind code only read
from the stack pointer, never write to it. I think whoever uses this
kind of trick should be well aware that "here be dragons" and there
are no guarantees of any safety due to compiler allocation, spillage,
etc.
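
For reference, this is the kind of read-only use I mean, as a minimal
sketch rather than the kernel's actual code. It assumes a GCC-compatible
compiler; current_sp, read_sp and HAVE_NAMED_SP are hypothetical names
chosen for this example. On targets not listed it falls back to
__builtin_frame_address(0), which is only an approximation of the stack
pointer:

```c
#include <stdint.h>

/* Bind a global register variable to the stack pointer where we know
 * the register name. This is a GNU extension, not standard C. */
#if defined(__GNUC__) && defined(__x86_64__)
register uintptr_t current_sp __asm__("rsp");
#define HAVE_NAMED_SP 1
#elif defined(__GNUC__) && (defined(__aarch64__) || defined(__arm__))
register uintptr_t current_sp __asm__("sp");
#define HAVE_NAMED_SP 1
#endif

/* Read (never write) the stack pointer. The value is whatever the
 * register holds at this point in the generated code; spills or
 * push/pop emitted by the compiler can change it between two reads. */
uintptr_t read_sp(void)
{
#ifdef HAVE_NAMED_SP
    return current_sp;
#else
    /* Fallback (also a GNU builtin): frame address of this function. */
    return (uintptr_t)__builtin_frame_address(0);
#endif
}
```

An unwinder would typically call read_sp() once and walk the stack from
that value. Nothing here protects against the compiler adjusting the
stack around the read, which is exactly the "here be dragons" part.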
> On the other hand, if you treat the intrinsics as accessing memory, it would
> be strictly worse than inline asm.
Yup. That's the idea. ;)
I'll use the same argument I've used for __builtin___clear_cache:
Users of those extensions know *precisely* what they're getting into,
and they *only* do it because there is no better way of doing this.
Performance may be a problem in this case (it wasn't in the cache's),
but the order in which the instructions are scheduled won't matter
*that* much here, and some scheduling inefficiencies are acceptable
given the quirky nature of the extension.
Other arguments I heard:
Uses of those extensions only exist because the system can't cope with
their needs, and by extension, the compiler shouldn't be able to judge
what's best either. So the best course of action is to do exactly what
the user asks and let them figure out what's the best way to write
their own non-portable non-standard code.
We were bound to hit odd problems like these when compiling the kernel;
it was just a matter of time...
cheers,
--renato