thr3ads.net - llvm dev - [llvm-dev] [inline-asm][asm-goto] Supporting "asm goto" in inline assembly [Feb 2018]

Matthias Braun via llvm-dev

2017-Apr-04 20:13 UTC

[llvm-dev] [inline-asm][asm-goto] Supporting "asm goto" in inline assembly

> On Apr 4, 2017, at 11:44 AM, John McCall via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>
>> On Apr 4, 2017, at 2:12 PM, Matthias Braun <matze at braunis.de> wrote:
>> My two cents:
>>
>> - I think inline assembly should work even if the compiler cannot parse
>>   the contents. This would rule out MSVC inline assembly (or
>>   alternatively put all the parsing and interpretation burden on the
>>   frontend), but would work with GCC asm goto, which specifies possible
>>   targets separately.
>> - Supporting control flow in inline assembly by allowing jumps out of an
>>   assembly block seems natural to me.
>> - Jumping into an inline assembly block seems like an unnecessary
>>   feature to me.
>> - To have this working in lib/CodeGen we would need an alternative
>>   opcode with the terminator flag set. (There should also be
>>   opportunities to remodel some instruction flags in the backend, to be
>>   part of the MachineInstr instead of the opcode, but that is an
>>   orthogonal discussion.)
>> - I don't foresee big problems in CodeGen; we should take a look at how
>>   computed goto is implemented to find ways to reference arbitrary basic
>>   blocks.
>> - The register allocator fails when the terminator instruction also
>>   writes a register which is subsequently spilled (none of the existing
>>   targets does that, but you could specify this situation in inline
>>   assembly).
>> - I'd always prefer intrinsics over inline assembly. Hey, why don't we
>>   add a -Wassembly that warns on inline assembly usage and is enabled by
>>   default...
>> - I still think inline assembly is valuable for new architecture
>>   bringup/experimentation situations.
>
> To me, this feels like a great example of "we really wanted a language
> feature, but we figured out that we could hack it in using inline assembly
> in a way that's ultimately significantly harder for the compiler to
> support than a language feature, and now it's your problem."  I agree with
> Chandler that we should just design and implement the language feature.
>
> I would recommend:
>
>   if (__builtin_patchable_branch("section name")) {
>     trace();
>   }
>
> ==>
>
>   %0 = call i1 @llvm.patchable_branch(i8* @sectionNameString)
>   br %0, ...
>
> where @llvm.patchable_branch has the semantics of appending whatever
> patching information is necessary to the given section such that, if you
> apply the patch, it will change the result of the call from 0 to 1.  That
> can then typically be pattern-matched in the backend to get the optimal
> codegen.
>
> If I might recommend a better ABI for the patching information: consider
> using a pair of relative pointers, one from the patching information to
> the patchable instruction, and one from the patchable instruction to the
> new target.  That would allow the patching information to be relocated at
> zero cost.
>
> The actual details of how to apply the patch, and what the inline
> patchable-instruction sequence needs to be in order to accept the patch,
> would be target-specific.  The documented motivating example seems to
> assume that a single nop is always big enough, which is pretty
> questionable.
>
> This feature could be made potentially interesting to e.g. JIT authors by
> allowing the patching information to be embellished with additional
> information to identify the source branch.

I completely agree that for this example we would rather have a proper
intrinsic. As a matter of fact we have a similar mechanism in CodeGen already
to support the XRay feature.

- Matthias
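
The "pair of relative pointers" layout suggested in the quoted message can be sketched in C as follows. This is only an illustration under stated assumptions: the struct, its field names, the helper functions, and the toy offsets are all hypothetical and do not come from any LLVM or kernel interface. The point it tries to show is that a record storing only self-relative offsets keeps resolving correctly after the whole image is copied to a new address, with no fix-ups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout; names are illustrative only. */
struct patch_record {
    int32_t insn_rel;   /* from the record to the patchable instruction */
    int32_t target_rel; /* from the patchable instruction to the new target */
};

/* Toy image layout (assumed for the example): byte offsets in a buffer. */
enum { SITE_OFF = 16, TARGET_OFF = 40, REC_OFF = 64 };

/* Resolve the patchable instruction for the record stored at rec_addr. */
static const uint8_t *patch_site(const uint8_t *rec_addr)
{
    struct patch_record rec;
    assert(rec_addr != NULL);
    memcpy(&rec, rec_addr, sizeof rec); /* avoids unaligned loads */
    return rec_addr + rec.insn_rel;
}

/* Resolve the new branch target, relative to the patchable instruction. */
static const uint8_t *patch_target(const uint8_t *rec_addr)
{
    struct patch_record rec;
    assert(rec_addr != NULL);
    memcpy(&rec, rec_addr, sizeof rec);
    return rec_addr + rec.insn_rel + rec.target_rel;
}

/* Write one record into the toy image; only relative offsets are stored. */
static void build_image(uint8_t *image)
{
    struct patch_record rec = { SITE_OFF - REC_OFF, TARGET_OFF - SITE_OFF };
    memcpy(image + REC_OFF, &rec, sizeof rec);
}
```

Copying the image to a second buffer and resolving the record there yields the same offsets inside the copy, which is the "relocated at zero cost" property described above.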

Dean Michael Berris via llvm-dev

2017-Apr-07 02:05 UTC

> On 5 Apr 2017, at 06:13, Matthias Braun via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> I completely agree that for this example we would rather have a proper
> intrinsic. As a matter of fact we have a similar mechanism in CodeGen
> already to support the XRay feature.

I for one would really like that intrinsic. I have something similar under
review, which wraps a function call for XRay's custom event logging feature
(I should've sent an RFC on this, I realise; I'll do that next). The patch
doing this for XRay's requirements is at https://reviews.llvm.org/D27503,
wherein we do the following:

- In LLVM IR, lower calls to the @llvm.xray.customevent(...) intrinsic into
something like:

  # align to 2 byte address
  .xray_sled_N
  jmp +NN
  # calling convention setup
  call <XRay's trampoline>

  We also mark the point where this sled is in the instrumentation map.

- At runtime we overwrite the jump to become nops.

If we get this patchable branch intrinsic, then we can certainly use it when
lowering the XRay built-in we're trying to add to Clang as well.
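
The "overwrite the jump to become nops" step can be simulated on a plain byte buffer. The sketch below is not XRay's runtime code: the helper names are hypothetical, and it assumes x86, where `EB xx` encodes a short `jmp rel8` and `66 90` is the two-byte nop. A real runtime would also have to make the code page writable and would likely use a single atomic two-byte store, which is presumably why the sled is aligned to a 2-byte address.

```c
#include <stdint.h>
#include <string.h>

/* x86 encodings: EB xx is "jmp rel8"; 66 90 is the two-byte nop. */
#define JMP_REL8_OPCODE 0xEB
static const uint8_t NOP2[2] = { 0x66, 0x90 };

/* Hypothetical helper: replace the sled's leading short jump with a nop so
 * that execution falls through into the trampoline call.
 * Returns 1 on success, 0 if the sled does not start with a short jump. */
static int sled_enable(uint8_t *sled)
{
    if (sled[0] != JMP_REL8_OPCODE)
        return 0;
    memcpy(sled, NOP2, sizeof NOP2);
    return 1;
}

/* Hypothetical inverse: restore a short jump that skips `skip` bytes.
 * Returns 1 on success, 0 if the sled is not currently the two-byte nop. */
static int sled_disable(uint8_t *sled, uint8_t skip)
{
    if (memcmp(sled, NOP2, sizeof NOP2) != 0)
        return 0;
    sled[0] = JMP_REL8_OPCODE;
    sled[1] = skip;
    return 1;
}
```

Because the jump and the nop have the same length, enabling or disabling the sled never moves the instructions that follow it.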

/me goes writing up the RFC for the custom event logging.

Cheers

-- Dean

David Woodhouse via llvm-dev

2018-Feb-12 16:19 UTC

FYI there is now serious talk of the Linux kernel dropping support for
compilers that *don't* support asm goto.

Peter Zijlstra via llvm-dev

2018-Feb-12 17:11 UTC

On Mon, Feb 12, 2018 at 04:19:30PM +0000, David Woodhouse wrote:
> FYI there is now serious talk of the Linux kernel dropping support for
> compilers that *don't* support asm goto.

> > I completely agree that for this example we would rather have a proper
> > intrinsic. As a matter of fact we have a similar mechanism in CodeGen
> > already to support the XRay feature.

Also, we're very much _NOT_ going to support anything 'similar' but
completely different. It's asm-goto or nothing.
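
For readers who have not seen the construct under discussion, here is a minimal sketch of GCC-style `asm goto`. The function name `is_nonzero` is made up for the example, and the x86 instructions are only one possible body. It is not the kernel's jump-label code. What it illustrates is the property noted earlier in the thread: the possible jump targets are listed separately, in the fifth operand section, so the compiler can model the control flow without parsing the assembly text. On other architectures the sketch falls back to plain C.

```c
/* Hypothetical example function: returns 1 if x is nonzero, else 0. */
static int is_nonzero(int x)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto("testl %0, %0\n\t"
                 "jnz %l[nonzero]"
                 : /* no outputs */
                 : "r"(x)     /* input: the value to test */
                 : "cc"       /* the test clobbers the flags */
                 : nonzero);  /* every label the asm may jump to */
    return 0;
nonzero:
    return 1;
#else
    /* Fallback so the sketch still builds on other targets. */
    return x != 0;
#endif
}
```

This needs a compiler that implements the extension; GCC has supported it for a long time, while Clang gained support only in releases made after this thread.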
