Bill Wendling via llvm-dev
2019-Jun-27 18:10 UTC
[llvm-dev] [RFC] ASM Goto With Output Constraints
Now that ASM goto support has landed, Nick Desaulniers and I wrote up a
document describing how to expand Clang's implementation of ASM goto to
support output constraints. The work *should* be straightforward, but as
always it will need to be verified. Below is a copy of our whitepaper.
Please take a look and offer any comments you have.
Share and enjoy!
-bw
Overview
Support for asm goto
<https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>
with output constraints is a feature that the Linux community is interested
in having. Adding this new feature should give Clang a higher profile in
the Linux community:
- It demonstrates the Clang community's commitment to supporting Linux.
- Developers are likely to adopt it on their own, which means they will
  need to use Clang in some fashion, either as a complete replacement for or
  in addition to GCC.
Current state
Clang's implementation of asm goto converts this code:
    int vogon(unsigned a, unsigned b) {
      asm goto("poetry %0, %1" : : "r"(a), "r"(b) : : error);
      return a + b;
    error:
      return -1;
    }
into the following LLVM IR:
    define i32 @vogon(i32 %a, i32 %b) {
    entry:
      callbr void asm sideeffect "poetry $0, $1", "r,r,X"
          (i32 %a, i32 %b, i8* blockaddress(@vogon, %return))
          to label %asm.fallthrough [label %return]

    asm.fallthrough:
      %add = add i32 %b, %a
      br label %return

    return:
      %retval.0 = phi i32 [ %add, %asm.fallthrough ], [ -1, %entry ]
      ret i32 %retval.0
    }
Our proposal won't change LLVM's current behavior; i.e., a callbr without a
return value will act in the same way as the current implementation.
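For readers who want to try the existing (input-only) form on real hardware,
here is a minimal sketch that uses actual x86 instructions in place of the
fictional "poetry" mnemonic. The names is_zero and is_zero_demo are
hypothetical and not part of the proposal. The sketch assumes GCC, or Clang 9
or later, targeting x86, and falls back to plain C elsewhere. It spells the
keyword __asm__ so that strict -std= modes accept it too.

```c
#include <assert.h>

/*
 * Illustrative only: is_zero() is a hypothetical helper, not code from the RFC.
 * On x86 with a GNU-compatible compiler it uses asm goto (inputs only, no
 * outputs) to jump straight to a C label; elsewhere it falls back to plain C.
 */
int is_zero(unsigned x) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ goto("testl %0, %0\n\t"   /* sets ZF when x == 0 */
                 "jz %l[zero]"        /* branch to the C label below */
                 : /* no outputs: not permitted with asm goto today */
                 : "r"(x)
                 : "cc"
                 : zero);
    return 0;                         /* fallthrough: x was non-zero */
zero:
    return 1;
#else
    return x == 0;                    /* portable fallback, same result */
#endif
}

/* Usage: exercises both the fallthrough path and the label path. */
void is_zero_demo(void) {
    assert(is_zero(0) == 1);
    assert(is_zero(7) == 0);
}
```

Note that the only way this function can communicate a result is through
which successor is taken; that is the limitation the proposal below is meant
to lift.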
Proposal
GCC does not allow asm goto to have output constraints due to limitations in
its internal representation; i.e., GCC's control transfer instructions cannot
have outputs. For example:
    int vogon(int a, int b) {
      asm goto("poetry %0, %1" : "=r"(a), "=r"(b) : : : error);
      return a + b;
    error:
      return -1;
    }
currently fails to compile in GCC with the following error:
    <source>: In function 'vogon':
    <source>:2:29: error: expected ':' before string constant
        2 |   asm goto("poetry %0, %1" : "=r"(a), "=r"(b) : : : error);
          |                              ^~~~~
          |                              :
ToT Clang matches GCC's behavior:
    <source>:2:30: error: 'asm goto' cannot have output constraints
      asm goto("poetry %0, %1" : "=r"(a), "=r"(b) : : : error);
However, LLVM doesn't restrict control transfer instructions from having
outputs (e.g. the invoke instruction
<https://llvm.org/docs/LangRef.html#invoke-instruction>). We propose
changing LLVM's callbr instruction
<https://llvm.org/docs/LangRef.html#callbr-instruction> to allow return
values, similar to how LLVM's implementation of inline assembly (via the
call instruction <https://llvm.org/docs/LangRef.html#call-instruction>)
allows return values. Since there can potentially be zero to many output
constraints, callbr would now return an aggregate which contains an element
for each output constraint. These values would then be extracted via
extractvalue. With our proposal, the above C example will be converted to
LLVM IR like this:
    define i32 @vogon(i32 %a, i32 %b) {
    entry:
      %0 = callbr { i32, i32 } asm sideeffect "poetry $0, $1", "=r,=r,X"
          (i8* blockaddress(@vogon, %error))
          to label %asm.fallthrough [label %error]

    asm.fallthrough:
      %asmresult.a = extractvalue { i32, i32 } %0, 0
      %asmresult.b = extractvalue { i32, i32 } %0, 1
      %result = add i32 %asmresult.a, %asmresult.b
      ret i32 %result

    error:
      ret i32 -1
    }
Note that unlike the invoke instruction, callbr's return values are assumed
valid on all branches. The assumption is that the programmer knows what
their inline assembly is doing and where its output constraints are valid.
If the value isn't valid on a particular branch but is used there anyway,
then the result is a poison value. (Also, if a callbr's return values
affect a branch, it will be handled similarly to the invoke instruction's
implementation.) Here's an example of how this would work:
    int vogon(int a, int b) {
      asm goto("poetry %0, %1" : "=r"(a), "=r"(b) : : : error);
      if (a == 42)
        return 42 * b;
      return a + b;
    error:
      return b - 42;
    }
generates the following LLVM IR:
    define i32 @vogon(i32 %a, i32 %b) {
    entry:
      %0 = callbr { i32, i32 } asm sideeffect "poetry $0, $1", "=r,=r,X"
          (i8* blockaddress(@vogon, %error))
          to label %asm.fallthrough [label %error]

    asm.fallthrough:
      %asmresult.a = extractvalue { i32, i32 } %0, 0
      %tobool = icmp eq i32 %asmresult.a, 42
      br i1 %tobool, label %if.true, label %if.false

    if.true:
      %asmresult.b = extractvalue { i32, i32 } %0, 1
      %mul = mul i32 42, %asmresult.b
      ret i32 %mul

    if.false:
      %asmresult.a.1 = extractvalue { i32, i32 } %0, 0
      %asmresult.b.1 = extractvalue { i32, i32 } %0, 1
      %result = add i32 %asmresult.a.1, %asmresult.b.1
      ret i32 %result

    error:
      %asmresult.b.error = extractvalue { i32, i32 } %0, 1
      %error.result = sub i32 %asmresult.b.error, 42
      ret i32 %error.result
    }
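To make the data flow in the IR above easier to follow, here is a rough
plain-C model of it. This is only a sketch under stated assumptions: struct
asm_result stands in for the { i32, i32 } aggregate, poetry_model is a
hypothetical stand-in for the fictional "poetry" asm (its outputs and chosen
successor are passed in as parameters), and each field access corresponds to
one extractvalue. It does not model poison; it assumes both outputs are valid
on both successors.

```c
#include <assert.h>

/* Models the proposed { i32, i32 } aggregate returned by callbr. */
struct asm_result { int a; int b; };

/*
 * Hypothetical stand-in for the fictional "poetry" asm: it produces two
 * outputs and reports through *took_error which successor is taken.
 */
static struct asm_result poetry_model(int out_a, int out_b, int error,
                                      int *took_error) {
    struct asm_result r = { out_a, out_b };
    *took_error = error;
    return r;
}

/*
 * Mirrors the second vogon() example: fields are read ("extractvalue") on
 * whichever branch uses them, including the error branch.
 */
int vogon_model(int out_a, int out_b, int error) {
    int took_error;
    struct asm_result r = poetry_model(out_a, out_b, error, &took_error);

    if (took_error)
        return r.b - 42;   /* error: uses output b */
    if (r.a == 42)
        return 42 * r.b;   /* if.true */
    return r.a + r.b;      /* if.false */
}

/* Usage: one call per path through the function. */
void vogon_model_demo(void) {
    assert(vogon_model(42, 2, 0) == 84);  /* if.true */
    assert(vogon_model(1, 2, 0) == 3);    /* if.false */
    assert(vogon_model(1, 50, 1) == 8);   /* error */
}
```

The point of the model is that the aggregate is produced once, in the entry
block, and every successor is free to read whichever elements it needs.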
Implementation
Because LLVM's invoke instruction is a terminating instruction that may
have return values, we can use it as a template for callbr's changes. The
new functionality lies mostly in modifying Clang's front-end. In
particular, we need to do the following:
- Remove all error checks restricting asm goto from returning values, and
- Generate the extractvalue instructions on callbr's branches.
LLVM's middle- and back-ends need to be audited to ensure there are no
restrictions on callbr returning a value. We expect all passes to Just
Work™ without modifications, but of course this will need to be verified.
Finkel, Hal J. via llvm-dev
2019-Jun-27 21:14 UTC
[llvm-dev] [cfe-dev] [RFC] ASM Goto With Output Constraints
On 6/27/19 1:10 PM, Bill Wendling via cfe-dev wrote:
> Now that ASM goto support has landed, Nick Desaulniers and I wrote up a document
> describing how to expand clang's implementation of ASM goto to support
> output constraints. The work should be straightforward, but as always will need
> to be verified to work. Below is a copy of our whitepaper. Please take a look
> and offer any comments you have.
This all sounds fairly straightforward and removes a technically unnecessary
restriction to produce a more general capability - LLVM terminators can have
return values, and so we have no problem representing the underlying concept.
There is no governing standard here, and we made a fairly invasive change to
LLVM already to support this extension in the first place. We should leverage
that work to make the extension as useful as possible.
-Hal
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory