thr3ads.net - llvm dev - [llvm-dev] Altering the return address , for a function with multiple return paths [Jul 2019]

James Y Knight via llvm-dev

2019-Jul-21 16:29 UTC

[llvm-dev] Altering the return address , for a function with multiple return paths

Yes, indeed!

The SBCL lisp compiler (not llvm based) used to emit functions which would
return either via ret to the usual instruction after the call, or else load
the return-address from the stack, then jump 2 bytes later (which would
skip over either a nop or a short jmp at original target location). Which
one it used depended upon whether the function was doing a multi-valued
return (in which case it used ret) or a single-valued return (in which case
it did the jmp retpc+2).

While this seems like a clever and efficient hack, it actually has an
absolutely awful effect on performance, due to the unpaired call vs return,
and the unexpected return address.

SBCL stopped doing this in 2006, a decade later than it should've -- the
Pentium1 MMX from 1997 already had a hardware return stack which made this
a really bad idea!

What it does now is have the called function set or clear the carry flag
(using STC and CLC) immediately before the return. If the caller cares,
then the caller emits JNC as the first instruction after the call. (but
callers typically do not care -- most calls only consume a single value,
and any extra return-values are silently ignored).
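
To make the cost of the old scheme concrete, here is a toy C model of a return-address predictor stack. Everything in it (the depth of 16, the names `rsb_call`, `rsb_ret`, and `count_mispredicts`, the addresses) is an illustrative assumption rather than a description of any particular CPU, and it ignores the extra cost of predicting the indirect jmp itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a hardware return-address predictor. The depth and all
   names are assumptions made for illustration only. */
#define RSB_DEPTH 16

typedef struct {
    uintptr_t entries[RSB_DEPTH];
    size_t top;            /* number of valid entries */
    unsigned mispredicts;  /* returns whose target was not predicted */
} rsb_t;

/* A call pushes the address of the instruction that follows it.
   (This toy version simply drops pushes once the stack is full.) */
static void rsb_call(rsb_t *p, uintptr_t fallthrough) {
    if (p->top < RSB_DEPTH)
        p->entries[p->top++] = fallthrough;
}

/* A ret pops the predicted target and compares it with the real one. */
static bool rsb_ret(rsb_t *p, uintptr_t actual_target) {
    bool hit = p->top > 0 && p->entries[--p->top] == actual_target;
    if (!hit)
        p->mispredicts++;
    return hit;
}

/* caller -> outer -> inner, then both return. If inner leaves through an
   indirect jmp (retpc+2) instead of ret, its entry is never popped. */
unsigned count_mispredicts(bool inner_uses_ret) {
    rsb_t p = {0};
    const uintptr_t outer_ret = 0x1000, inner_ret = 0x2000;

    rsb_call(&p, outer_ret);     /* caller calls outer */
    rsb_call(&p, inner_ret);     /* outer calls inner */
    if (inner_uses_ret)
        rsb_ret(&p, inner_ret);  /* paired return */
    /* else: jmp to inner_ret + 2, invisible to the predictor stack */
    rsb_ret(&p, outer_ret);      /* outer returns to its caller */
    return p.mispredicts;
}
```

In this model the paired version predicts both returns, while the jmp version leaves a stale entry behind, so the outer function's perfectly ordinary ret is the one that mispredicts.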

On Sun, Jul 21, 2019, 6:18 AM Jacob Lifshay via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> one (non-LLVM) problem you will run into is that almost all processors
> are optimized to have functions return to the instruction right after
> the instruction that called them.
>
> The most common method is to predict where the return instruction will
> jump to by using a processor-internal stack of return addresses, which
> is separate from the in-memory call stack. This enables the processor
> to fetch, decode, and execute instructions following (in program
> order) the return instruction before the processor knows for sure what
> address the return instruction will branch to. If the return address
> turns out to be different than the processor predicted, it has to
> throw out all the instructions it started executing that it thought
> came after the return, causing massive slow-downs.
>
> For an interesting application of changing the return address, look up
> retpolines.
>
> On Sun, Jul 21, 2019 at 2:07 AM Tsur Herman via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> >
> > Playing around with calling conventions naked functions and
> > epilogue/prologue...
> > Is it possible/expressible/feasible to alter the return address the
> > function will return to?
> >
> > For example, when a function may return an Int8 or a Float64, depending
> > on some external state (user, or random variable), instead of checking
> > the returned type in the calling function, is it possible to pass 2
> > potential return addresses one suitable for Int8 and one suitable for
> > Float64 and let the function return to the right place?
> >
> > if it is possible, what are the implications? do these inhibit the
> > optimization opportunities somehow?
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

John McCall via llvm-dev

2019-Jul-24 03:42 UTC

On 21 Jul 2019, at 12:29, James Y Knight via llvm-dev wrote:
> Yes, indeed!
>
> The SBCL lisp compiler (not llvm based) used to emit functions which would
> return either via ret to the usual instruction after the call, or else load
> the return-address from the stack, then jump 2 bytes later (which would
> skip over either a nop or a short jmp at original target location). Which
> one it used depended upon whether the function was doing a multi-valued
> return (in which case it used ret) or a single-valued return (in which case
> it did the jmp retpc+2).
>
> While this seems like a clever and efficient hack, it actually has an
> absolutely awful effect on performance, due to the unpaired call vs return,
> and the unexpected return address.
>
> SBCL stopped doing this in 2006, a decade later than it should've -- the
> Pentium1 MMX from 1997 already had a hardware return stack which made this
> a really bad idea!
>
> What it does now is have the called function set or clear the carry flag
> (using STC and CLC) immediately before the return. If the caller cares,
> then the caller emits JNC as the first instruction after the call. (but
> callers typically do not care -- most calls only consume a single value,
> and any extra return-values are silently ignored).
On Swift, we've occasionally considered whether it would be useful to be
able to return values in flags.  For example, you could imagine returning
a trinary comparison result on x86_64 based on whether ZF and CF are set.
A function which compares two pairs of unsigned numbers could be compiled
to something like:

```
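  cmpq %rdi, %rdx   # compare the first elements of the two pairs
  jne end           # they differ: these flags are already the result
  cmpq %rsi, %rcx   # first elements equal: compare the second elements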
end:
  ret
```

And the caller can switch over the values just by testing the flags.
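
As a rough C model of that convention, assuming the function is meant to compare the pairs lexicographically: the `flags_t` struct stands in for ZF and CF, and the names `cmp_flags`, `compare_pairs`, and `order_from_flags` are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two x86 flags the caller would test. */
typedef struct {
    bool zf;  /* operands were equal */
    bool cf;  /* unsigned borrow: dst < src */
} flags_t;

/* Models AT&T "cmp src, dst", which computes dst - src. */
static flags_t cmp_flags(uint64_t src, uint64_t dst) {
    flags_t f = { dst == src, dst < src };
    return f;
}

/* Lexicographic compare of pair A = (a0, a1) against pair B = (b0, b1).
   The second compare only runs when the first elements are equal. */
flags_t compare_pairs(uint64_t a0, uint64_t a1, uint64_t b0, uint64_t b1) {
    flags_t f = cmp_flags(a0, b0);
    if (!f.zf)
        return f;
    return cmp_flags(a1, b1);
}

/* What the caller's branches on ZF and CF amount to:
   0 if A == B, +1 if A > B, -1 if A < B. */
int order_from_flags(flags_t f) {
    if (f.zf)
        return 0;
    return f.cf ? 1 : -1;
}
```

A real caller would not build this struct at all; it would follow the call with something like a je and a jb, which is the whole appeal.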

The main problem is that this is really elegant if you have an
instruction that sets the flags exactly right and really terrible
if you don't.  For example, if we want this function to compare two
pairs of *signed* numbers, we need to move OF to CF without disturbing
ZF, which I don't think is possible without some really ugly
instruction sequences.  (Or we could add 0x8000_0000_0000_0000 to both
operands before the comparison, but that's terrible in its own right.)
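
The bias trick in that parenthetical can be checked in C. This is only a sketch of the arithmetic, with hypothetical names `bias` and `signed_less_via_unsigned`; it says nothing about what instruction sequence a backend would actually pick.

```c
#include <stdbool.h>
#include <stdint.h>

/* Add 2^63 modulo 2^64, which flips the sign bit. This maps the signed
   range [INT64_MIN, INT64_MAX] onto [0, UINT64_MAX] in the same order. */
static uint64_t bias(int64_t x) {
    return (uint64_t)x + UINT64_C(0x8000000000000000);
}

/* Signed "a < b" answered by an unsigned compare, i.e. the condition
   that CF would report after a cmp on the biased operands. */
bool signed_less_via_unsigned(int64_t a, int64_t b) {
    return bias(a) < bias(b);
}
```

Equality is unaffected by the bias, so ZF stays meaningful; the price is the two extra additions before every comparison.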

That problem isn't as bad if it's just a single boolean in ZF or CF, but
it's still not great, at least on x86.

Now, specialized purposes like SBCL's can definitely still benefit from
being able to return in a flag.  If LLVM had had the ability to return
values in flags, we might've used it in Swift's coroutines ABI, where
(similar to SBCL) any particular return site does know exactly which
value it wants to return.  So it'd be nice if someone was interested in
adding it.

But we did ultimately decide that it wasn't even worth prototyping it
for the generic Swift CC.

John.

Philip Reames via llvm-dev

2019-Jul-24 05:46 UTC

On 7/23/19 8:42 PM, John McCall via llvm-dev wrote:
> On 21 Jul 2019, at 12:29, James Y Knight via llvm-dev wrote:
>
> > Yes, indeed!
> >
> > The SBCL lisp compiler (not llvm based) used to emit functions which would
> > return either via ret to the usual instruction after the call, or else load
> > the return-address from the stack, then jump 2 bytes later (which would
> > skip over either a nop or a short jmp at original target location). Which
> > one it used depended upon whether the function was doing a multi-valued
> > return (in which case it used ret) or a single-valued return (in which case
> > it did the jmp retpc+2).
> >
> > While this seems like a clever and efficient hack, it actually has an
> > absolutely awful effect on performance, due to the unpaired call vs return,
> > and the unexpected return address.
> >
> > SBCL stopped doing this in 2006, a decade later than it should've -- the
> > Pentium1 MMX from 1997 already had a hardware return stack which made this
> > a really bad idea!
> >
> > What it does now is have the called function set or clear the carry flag
> > (using STC and CLC) immediately before the return. If the caller cares,
> > then the caller emits JNC as the first instruction after the call. (but
> > callers typically do not care -- most calls only consume a single value,
> > and any extra return-values are silently ignored).
>
> On Swift, we've occasionally considered whether it would be useful to be
> able to return values in flags. For example, you could imagine returning
> a trinary comparison result on x86_64 based on whether ZF and CF are set.
> A function which compares two pairs of unsigned numbers could be compiled
> to something like:
>
>   cmpq %rdi, %rdx
>   jne end
>   cmpq %rsi, %rcx
> end:
>   ret
>
> And the caller can switch over the values just by testing the flags.
>
> The main problem is that this is really elegant if you have an
> instruction that sets the flags exactly right and really terrible
> if you don't. For example, if we want this function to compare two
> pairs of /signed/ numbers, we need to move OF to CF without disturbing
> ZF, which I don't think is possible without some really ugly
> instruction sequences. (Or we could add 0x8000_0000_0000_0000 to both
> operands before the comparison, but that's terrible in its own right.)
>
> That problem isn't as bad if it's just a single boolean in ZF or CF, but
> it's still not great, at least on x86.
>
> Now, specialized purposes like SBCL's can definitely still benefit from
> being able to return in a flag. If LLVM had had the ability to return
> values in flags, we might've used it in Swift's coroutines ABI, where
> (similar to SBCL) any particular return site does know exactly which
> value it wants to return. So it'd be nice if someone was interested in
> adding it.
>
> But we did ultimately decide that it wasn't even worth prototyping it
> for the generic Swift CC.

We've also got some cases where returning a value in a flag might be
useful.  Our typical use case is we have a "rare, but not *that* rare"*
slowpath which sometimes needs to run after a call from a runtime
function.  Our other compiler(s) - which use hand rolled assembly for
all of these bits - return the "take-rare" bit in ZF, and branch on that
after the call.  For our LLVM based system, we just materialize the
value into %rax and branch on that.  That naive scheme has been
surprisingly not bad performance wise.

* The "not *that* rare" part is needed to avoid having exceptional 
unwinding be the right answer.
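
For readers who want to picture the "materialize and branch" scheme, here is a small C sketch. The buffer type and the functions `runtime_push`, `slow_path_grow`, and `push` are invented for illustration and are not taken from any real runtime; the only point is that the "take-rare" bit comes back as an ordinary return value and the call site tests it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state touched by the runtime helper. */
typedef struct {
    size_t used;
    size_t capacity;
} buffer_t;

/* Hypothetical runtime function. The fast path bumps a counter; the
   return value is the "take-rare" bit, handed back in a normal register
   instead of in ZF. */
bool runtime_push(buffer_t *b) {
    if (b->used == b->capacity)
        return true;   /* caller must run the slow path */
    b->used++;
    return false;
}

/* Rare, but not that rare: make room. */
static void slow_path_grow(buffer_t *b) {
    b->capacity = b->capacity ? b->capacity * 2 : 4;
}

/* Call site: a test and a branch on the returned bit. */
void push(buffer_t *b) {
    if (runtime_push(b)) {
        slow_path_grow(b);
        runtime_push(b);  /* cannot fail right after growing */
    }
}
```

With a flag-returning convention the test on the return value would disappear and the branch would consume ZF directly, which is the small saving being weighed here.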

If we were to support something like this, you'd really want to be able 
to define individual flags in the callee's calling convention 
clobber/preserve lists.  It's really common to have a helper routine 
which sets say ZF, but leaves others unchanged.  Or to have a function 
which sets ZF, clobbers OF, and preserves all others.  But if we were 
going to do that, we'd quickly realize that the x86 backend doesn't 
track individual flags at all, and thus conclude it probably wasn't 
worth it to begin with.  :)

Philip


