thr3ads.net - llvm dev - [llvm-dev] setjmp/longjmp and volatile stores, but non-volatile loads [Sep 2016]

Jonas Maebe via llvm-dev

2016-Sep-16 17:13 UTC

[llvm-dev] setjmp/longjmp and volatile stores, but non-volatile loads

Hi,

In our (non-C) compiler we use setjmp/longjmp to implement exception
handling. For the initial implementation of the LLVM backend, I'm keeping that
model. In order to ensure that changes performed in a try/setjmp==0
block survive the longjmp, the changes must be done via volatile operations.

Given that volatility is a property of individual load/store
instructions rather than of memory slots in LLVM, I thought I could
optimise this by only marking the stores in try-blocks as volatile.
After all, I don't mind LLVM removing extra loads of the same variable
in a try-block.

However, if I do that (instead of making all loads/stores to the
variable volatile), then some kind of (in my case invalid) value
propagation seems to happen from the try-block to the exception block.
From what I can see with opt -print-after-all, it's the GVN pass that
does this.

I have attached a C program that demonstrates the issue: compiled with
clang -O0 or -O1 it works fine, but with -O2 it prints an error because
the "loops" variable has a wrong value on exit.

Mind you: I do not claim that the attached program is valid according to
any C standard or setjmp/longjmp documentation (it might be, but I don't
know). It's just that if you compile this program with clang -O0
-emit-llvm, you get more or less the same LLVM IR as what we generate
for our code. (*) Next you can process the resulting .ll file with e.g.
opt -O2 to reproduce the issue (i.e., without any regard to a C
standard, to the extent that LLVM IR behaviour can be considered to be
unrelated to any C standard).

My question is: is this a bug in LLVM, or is this behaviour deemed to be
by design? Also: our setjmp is not called setjmp, but we do mark its
replacement also as "returns_twice". So even if the above behaviour
would be considered "expected" for setjmp, would the same go for any
function marked as "returns_twice"?

I tested with clang/LLVM 3.7.0 and the clang from 7.0.1. I don't have
newer versions available here.

Thanks,


Jonas

(*) The only difference is that in our code there is no alias of the
"loops" variable through a pointer: we simply directly store to the
loops variable with a volatile store inside the try/setjmp==0 block. At
the LLVM IR level, that should not make a difference though since the
aliasing from the pointer to the loops variable is trivial. Removing the
aliasing manually from the generated IR does not change anything either,
as expected.


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: tint643c.c
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160916/01e8f0d3/attachment.c>
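
The attachment is not preserved in this archive. As a rough sketch only
(this is not the original tint643c.c), the pattern described in the message
could look like the C fragment below. The names jump_target, throw_now and
try_block_plain_load are hypothetical. Note that ISO C makes the value of a
non-volatile-qualified local indeterminate if it was changed between setjmp
and longjmp, so the function may return 3 or a stale 0 depending on the
compiler and optimization level.

```c
#include <setjmp.h>

static jmp_buf jump_target;            /* hypothetical: a single handler slot */

static void throw_now(void)
{
    longjmp(jump_target, 1);           /* "raise": resume at the setjmp call */
}

int try_block_plain_load(void)
{
    int loops = 0;                     /* NOT volatile-qualified */
    volatile int *loops_ptr = &loops;  /* accesses in the try block use this alias */

    if (setjmp(jump_target) == 0) {
        /* try block: every access to loops here is a volatile load or store */
        for (int i = 0; i < 3; i++)
            *loops_ptr = *loops_ptr + 1;
        throw_now();
        return -1;                     /* not reached */
    } else {
        /* exception block: plain load; an optimizer that does not see the
           longjmp edge may replace it with the initial value 0 */
        return loops;
    }
}
```

Calling try_block_plain_load() from a small main and printing the result at
different optimization levels is one way to check whether a given compiler
shows the effect discussed in this thread.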

Reid Kleckner via llvm-dev

2016-Sep-16 17:26 UTC

On Fri, Sep 16, 2016 at 10:13 AM, Jonas Maebe via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> In our (non-C) compiler we use setjmp/longjmp to implement exception
> handling. For the initial implementation LLVM backend, I'm keeping that
> model. In order to ensure that changes performed in a try/setjmp==0
> block survive the longjmp, the changes must be done via volatile
> operations.
>
If you want to observe those volatile store updates, you're really going to
need to volatilize the load operations. In your example, LLVM does not
model the CFG edge from the longjmp to the setjmp. This leads LLVM to
conclude that the only reaching definition of 'loops' at the point of the
load in the else block is 'loops = 0'.

Volatilizing all operations on local variables is going to kill your
performance, obviously. You should really emit invoke instructions in your
frontend. You can either use your own EH personality, or the existing SjLj
EH personality, which will optimize on a correct CFG and then volatilize
all values live across exceptional edges. Then the LLVM CFG will be
correct, and you'll get pretty good code.
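
As a minimal sketch of the first suggestion above (making the loads volatile
as well as the stores), a C frontend would simply declare the variable itself
volatile, so that every access to it is a volatile operation. The names
handler_env, raise_exception and try_block_volatile_var are hypothetical and
only serve the illustration; this is not code from the thread.

```c
#include <setjmp.h>

static jmp_buf handler_env;            /* hypothetical: a single handler slot */

static void raise_exception(void)
{
    longjmp(handler_env, 1);           /* transfer control back to the setjmp */
}

int try_block_volatile_var(void)
{
    /* volatile-qualified: loads AND stores are volatile, so the value
       written in the try block is well defined after the longjmp */
    volatile int loops = 0;

    if (setjmp(handler_env) == 0) {
        for (int i = 0; i < 3; i++)
            loops = loops + 1;         /* volatile load followed by volatile store */
        raise_exception();
        return -1;                     /* not reached */
    } else {
        return loops;                  /* volatile load: cannot be folded to 0 */
    }
}
```

The cost is the one mentioned above: every access to loops goes through
memory, which is why emitting invoke instructions is recommended instead.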

Jonas Maebe via llvm-dev

2016-Sep-19 11:42 UTC

Reid Kleckner wrote:
> On Fri, Sep 16, 2016 at 10:13 AM, Jonas Maebe via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>     model. In order to ensure that changes performed in a try/setjmp==0
>     block survive the longjmp, the changes must be done via volatile
>     operations.
>
> If you want to observe those volatile store updates, you're really going
> to need to volatilize the load operations. In your example, LLVM does
> not model the CFG edge from the longjmp to the setjmp. This leads LLVM
> to conclude that the only reaching definition of 'loops' at the point of
> the load in the else block is 'loops = 0'.

Ok, thanks for confirming this approach is not going to work.

> Volatilizing all operations on local variables is going to kill your
> performance, obviously. You should really emit invoke instructions in
> your frontend. You can either use your own EH personality, or the
> existing SjLj EH personality, which will optimize on a correct CFG and
> then volatilize all values live across exceptional edges. Then the LLVM
> CFG will be correct, and you'll get pretty good code.

The main reason for using setjmp/longjmp is to maintain compatibility
between code compiled with the LLVM backend and with our existing code
generators. Switching to the SjLj personality would defeat that, I think
(it seems to use LLVM-defined internal data structures for storing the
context information, such as the "five word buffer in which the calling
context is saved"). In that case it would be better to immediately
switch to ehframe-based exception handling so as to at least reap some
benefits in the process.

It is not clear to me from reading
http://llvm.org/docs/ExceptionHandling.html whether it is possible to
use our own setjmp/longjmp infrastructure without modifying LLVM. I'm
only interested in getting the LLVM CFG correct. I don't need any
runtime support, data structures (ehframe) or context information from
LLVM. All of our exception state is stored in TLS structures that can be
obtained by calling routines in our own runtime.

So, can I use invoke and landingpad without using any of the other
exception handling intrinsics (in combination with a dummy personality
function)? Or will LLVM in all cases insist on using ehframe information,
a (C++-)ABI-compliant personality function, and possibly linking in
parts of its own runtime that depend on this information being correct?

Thanks,

Jonas
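
For readers unfamiliar with the runtime model described in this message
(setjmp/longjmp unwinding with all exception state kept in thread-local
structures owned by the language runtime), the following C sketch shows the
general shape. It is an illustration under assumptions, not the author's
runtime: handler_frame, eh_push, eh_pop, eh_raise, eh_last_code and
safe_divide are all hypothetical names, and C11 _Thread_local is assumed for
the TLS storage.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-handler record, linked into a per-thread stack. */
typedef struct handler_frame {
    jmp_buf env;
    struct handler_frame *prev;
} handler_frame;

static _Thread_local handler_frame *frame_top = NULL;  /* innermost handler */
static _Thread_local int current_code = 0;             /* last raised code  */

static void eh_push(handler_frame *f) { f->prev = frame_top; frame_top = f; }
static void eh_pop(void)              { frame_top = frame_top->prev; }

int eh_last_code(void) { return current_code; }

static void eh_raise(int code)
{
    handler_frame *f = frame_top;
    if (f == NULL)
        abort();                 /* no handler installed */
    current_code = code;
    frame_top = f->prev;         /* unlink the handler before jumping to it */
    longjmp(f->env, 1);
}

static int risky_divide(int a, int b)
{
    if (b == 0)
        eh_raise(42);            /* does not return */
    return a / b;
}

int safe_divide(int a, int b, int fallback)
{
    handler_frame frame;
    volatile int result = fallback;  /* volatile: it is live across the setjmp */

    eh_push(&frame);
    if (setjmp(frame.env) == 0) {    /* try block */
        result = risky_divide(a, b);
        eh_pop();                    /* normal exit: remove our handler */
    } else {                         /* exception block */
        result = fallback;           /* eh_raise already unlinked the frame */
    }
    return result;
}
```

In this sketch safe_divide(10, 2, -1) takes the normal path, while
safe_divide(1, 0, -1) reaches the exception block through longjmp. That
second transfer of control is the edge which, according to the reply quoted
above, LLVM does not model for a plain setjmp call.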
