thr3ads.net - llvm dev - [LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler [Oct 2013]

Sean Silva

2013-Oct-26 00:22 UTC

[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler

On Fri, Oct 25, 2013 at 7:30 PM, Jim Grosbach <grosbach at apple.com> wrote:
>
> On Oct 25, 2013, at 3:53 PM, David Peixotto <dpeixott at codeaurora.org>
> wrote:
>
> Hi Renato, Thanks for the thoughtful reply. Please find my thoughts below.
>
> -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
>
>
> *From:* Renato Golin [mailto:renato.golin at linaro.org]
> *Sent:* Friday, October 25, 2013 1:11 PM
> *To:* David Peixotto
> *Cc:* LLVM Dev; Logan Chien; Gabor Ballabas; Rafael Espíndola; Richard
> Barton; Amara Emerson
> *Subject:* Re: Add support for ldr pseudo instruction in ARM integrated
> assembler
>
> On 25 October 2013 18:33, David Peixotto <dpeixott at codeaurora.org> wrote:
>
> Both armasm and gnu as support an ldr pseudo instruction for loading
> constants that lowers to either a mov, movn, or a pc-relative ldr from the
> constant pool. It would be great if the llvm integrated assembler could
> support this feature as well.
>
> Hi David,
>
> As much as I think that it's important to add support for old codebases to
> be compiled, I also have to consider the importance of compatibility and
> compiler sanity.
>
> Adding support for this GNU extension (that ARMCC also seems to support)
> can:
>
> 1. Add multiple representations to the same operation, which is fine when
> you're converting ASM into binary, but not so much when you're doing the
> reverse.
>
> By the reverse, do you mean converting binary to asm? If so I do not think
> there would be an issue here because we would never need to generate the
> ldr pseudo instruction. It is really a convenience for the programmer much
> like macros. Once it has been converted to mov/ldr+constant pool there
> would be no need to go the opposite direction. I think it is similar to
> macros in that regard in that we do not need to reconstruct the macro after
> it has been processed by the assembler to generate the final output.
>
>
> I’m not sure macros are a good analogy, but there are other
> pseudo-instructions that we’re not always able to reconstruct in
> disassembled code back to how the user wrote them. Or if we do, it’s purely
> via heuristic methods. I don’t see this as a big issue.
>
> Do the ARM usages include allowing a single pseudo-instruction to expand
> to multiple real instructions? For example, a movw/movt pair? If so, I’m
> *very* opposed to that part. A single assembler instruction, pseudo or
> otherwise, should represent a single instruction in the final output. Even
> with a single instruction, I’m still leery, as it makes the source unclear
> whether a constant load is a plain ‘move’ instruction or a memory
> reference. That makes it very difficult to read the assembly and do any
> sort of thinking about performance considerations.
>
x86 has this issue to an extent that goes far beyond what you describe
here, and FWIW I've never seen a situation where it has been a problem.
Usually when doing instruction-level/uarch-level optimization I find myself
disassembling raw bytes in memory or in linked executables (or showing
relocations in object files). The point of source code (even assembler) is
to abstract over what is happening in the machine; when you specifically
want to know what is happening in the machine you should use a tool
designed to show you that, i.e. a disassembler (that shows raw bytes too).

Also, I think the fact that there are high-profile users (well, I guess
they are potential users since this is currently broken) that use this
feature overrides any "elegance"/"simplicity" concern about an instruction
expanding differently, for the purposes of "is it acceptable to support
this feature in LLVM if someone will do the work to implement and maintain
it?".

-- Sean Silva





>
> We have been trying to deprecate pre-UAL and GNU-extensions as much as
> possible, and this is a move that is supported by most of the ARM LLVM
> community. The main reason is, as I said, to be able to come to and from
> ASM, having the same canonical representation. This will make it a lot
> easier to test the compiler and to avoid silent asm/disasm failures not
> covered by the tests. We also have been avoiding removing pre-UAL support
> just because "bacon", but if it impacts a needed change, or if removing it
> fixes a bug, we tend to trim it off.
>
> 2. Increment the complexity of the assembler and disassembler in areas
> that you cannot predict right now.
>
> Yes, I was wondering about this part. I’m not familiar with whether the
> assembler can easily create constant pools, which is something that would
> definitely be needed. From my first glance it looks like creating and
> placing the constant pool is the tricky part in actually implementing this
> feature.
>
>
> This is where things get interesting. IIRC, the ARM tools rely on the user
> to locate the constant pools (there’s a .constpool directive or something
> like that which spits out whatever constants have been collected up to that
> point). It’s up to the user to make sure that the pool is in range for any
> load instructions referencing it. I assume a diagnostic is expected if
> they’re not, which is the part that would be tricky here. Not undoable,
> just tricky to thread the information through from where it starts to where
> it’ll be diagnosable (post-layout and relaxation in the object writer,
> probably).
>
> Tangentially related, I would also very strongly oppose changing the
> compiler’s asm printer to use these constructs for its constant pool
> references. I don’t think that’s being proposed, but throwing it out there
> just in case that’s a direction anyone is considering.
>
>
> It seems that this syntax allows for you to define symbols and constant
> pools, and both have interactions with the rest of the code generation,
> relocation symbols, etc. I can't guarantee right now that there will be no
> conflicts with anything elsewhere, and to make sure we're covering all
> bases, a huge amount of tests will need to be added. Of course, people are
> free to work on things that make their work compile and run, but you have
> to be aware that GNU-extensions and pre-UAL features are not taken lightly.
>
>
>
> If I understand the history correctly (very hazy, so I may not), this
> originated in ARM’s compiler, and GNU as adopted it from there.
>
> Ultimately, if there is a feature that cannot be done any other way in ARM
> UAL, then we shall consider the options and implement the best. But if this
> is just syntactic sugar, then I'd strongly suggest you upgrade your
> assembly files. The argument that "GCC does it" is used far too often in
> the LLVM list, and I can't say I'm a big fan of it. GCC did not take *only*
> smart decisions in its life (neither did LLVM), and copying everything from
> one side to the other blindly is not a good strategy.
>
>
> In the general case, I vehemently agree with this. I am deeply concerned
> about the direction of some of the target assemblers in this regard.
>
> I’m a little more sympathetic to this specific case because it’s a feature
> of not just binutils, but the ARM tools also and last I checked, well
> documented. That said, great care and deliberation need to be taken.
>
> In summary, I’m very skeptical of the feature, but can see value. It’s on
> the line for me. I’m interested in hearing more of what y’all think about
> it.
>
> -Jim
>
>
> I may be wrong, correct me if I'm wrong, but as far as I got it, this
> looks just like syntactic sugar.
>
> I cannot think of why it would be more than syntactic sugar at this point.
> For me it is a new feature that I encountered when trying to compile some
> of our code bases here so it is possible that there could be something more
> that I’m missing. That said, I do think it is a useful sugar when you
> actually have to write assembly.
>
> I think it is probably possible to blindly convert each instance like this:
>
>     ldr r0, =foo
>     @continuation point
> ==>
>     ldr r0, [pc]
>     b 2f
> 1:
>     .word foo
> 2:
>     @continuation point
>
> I think the first is much easier for the programmer to write and maintain.
> Btw, the local numeric label syntax (1:, b 1f) was new to me as well.
> Luckily for me the integrated assembler already supports this syntax :)
>
> cheers,
> --renato
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
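
The manual rewrite David describes above is mechanical enough to script. What follows is only a minimal Python sketch of that source-to-source conversion, not code from LLVM or binutils. The function name expand_ldr_pseudo and the regular expression are hypothetical. It assumes ARM (A32) mode, where reading pc yields the address of the ldr plus 8, so [pc] lands on the .word placed two instructions later. It also assumes unconditional ldr forms with no trailing comment on the line.

```python
import re

# Hypothetical pattern: "ldr <reg>, =<expression>" with optional indentation.
LDR_PSEUDO = re.compile(r"^(\s*)ldr\s+(\w+)\s*,\s*=\s*(.+?)\s*$", re.IGNORECASE)


def expand_ldr_pseudo(source):
    """Rewrite each 'ldr rX, =expr' line into the inline-literal pattern
    from the thread: load from [pc], branch over the word, emit the word."""
    out = []
    for line in source.splitlines():
        match = LDR_PSEUDO.match(line)
        if not match:
            out.append(line)  # every other line passes through untouched
            continue
        indent, reg, expr = match.groups()
        out += [
            f"{indent}ldr {reg}, [pc]",  # A32: pc reads as this address + 8
            f"{indent}b 2f",             # skip over the inline literal
            "1:",
            f"{indent}.word {expr}",
            "2:",
        ]
    return "\n".join(out)


if __name__ == "__main__":
    print(expand_ldr_pseudo("    ldr r0, =foo\n    bx lr"))
```

Numeric local labels can be redefined, so reusing 1 and 2 for every expansion is fine. The cost of this blind conversion is a branch per constant, which is what a shared constant pool avoids.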

Chris Lattner

2013-Oct-27 00:02 UTC

[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler

On Oct 25, 2013, at 5:22 PM, Sean Silva <chisophugis at gmail.com> wrote:
> I’m not sure macros are a good analogy, but there are other
> pseudo-instructions that we’re not always able to reconstruct in
> disassembled code back to how the user wrote them. Or if we do, it’s purely
> via heuristic methods. I don’t see this as a big issue.

I agree.  These pseudo instructions seem like pure syntactic sugar that should
never be produced by the disassembler.  That doesn't make them bad, in fact
it makes them simpler to implement and reason about.

> Do the ARM usages include allowing a single pseudo-instruction to expand to
> multiple real instructions? For example, a movw/movt pair? If so, I’m *very*
> opposed to that part.

Why?  For people writing assembly manually, having pseudo instructions to
encapsulate common patterns is very useful.

> A single assembler instruction, pseudo or otherwise, should represent a
> single instruction in the final output. Even with a single instruction, I’m
> still leery, as it makes the source unclear whether a constant load is a
> plain ‘move’ instruction or a memory reference. That makes it very difficult
> to read the assembly and do any sort of thinking about performance
> considerations.

No one is compelled to use these if they don't want to.

> x86 has this issue to an extent that goes far beyond what you describe
> here, and FWIW I've never seen a situation where it has been a problem.
> Usually when doing instruction-level/uarch-level optimization I find myself
> disassembling raw bytes in memory or in linked executables (or showing
> relocations in object files). The point of source code (even assembler) is
> to abstract over what is happening in the machine; when you specifically
> want to know what is happening in the machine you should use a tool
> designed to show you that, i.e. a disassembler (that shows raw bytes too).
>
> Also, I think the fact that there are high-profile users (well, I guess
> they are potential users since this is currently broken) that use this
> feature overrides any "elegance"/"simplicity" concern about an instruction
> expanding differently, for the purposes of "is it acceptable to support
> this feature in LLVM if someone will do the work to implement and maintain
> it?".

Given that this pseudo instruction is widely implemented and empirically used by
important code bases like the Linux kernel, it seems like a no-brainer to
support it IMO.

-Chris


JF Bastien

2013-Oct-27 00:45 UTC

[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler

>> Do the ARM usages include allowing a single pseudo-instruction to expand
>> to multiple real instructions? For example, a movw/movt pair? If so, I’m
>> *very* opposed to that part.
>
> Why?  For people writing assembly manually, having pseudo instructions to
> encapsulate common patterns is very useful.

Would it be acceptable for this pseudo-instruction to expand to
movw/movt for some targets, and ldr for others? AFAIK the former
performs better on some ARM implementations which the OP may have an
interest in.
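
To make the choice being debated concrete, here is a rough Python sketch of how an assembler could pick an expansion for a known 32-bit constant. It is an illustration under stated assumptions, not the behaviour of armasm, GNU as, or LLVM. The names is_modified_immediate and choose_expansion and the flags has_movw and prefer_movw_movt are all hypothetical. The only architectural fact relied on is that an A32 data-processing immediate is an 8-bit value rotated right by an even amount. A symbol such as =foo has no value known at assembly time, so it is outside this sketch.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def is_modified_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount,
    i.e. something an A32 mov/mvn immediate operand can encode."""
    value &= MASK32
    # Undo every even rotation and see whether 8 bits are enough.
    return any(rotl32(value, rot) < 0x100 for rot in range(0, 32, 2))


def choose_expansion(value, has_movw=False, prefer_movw_movt=False):
    """Hypothetical policy for 'ldr rX, =value' (the flags are assumptions)."""
    value &= MASK32
    if is_modified_immediate(value):
        return "mov"
    if is_modified_immediate(~value & MASK32):
        return "mvn"                  # load the bitwise complement instead
    if has_movw and value <= 0xFFFF:
        return "movw"                 # a single 16-bit immediate move
    if has_movw and prefer_movw_movt:
        return "movw+movt"            # the two-instruction form Jim objects to
    return "ldr (literal pool)"


if __name__ == "__main__":
    for constant in (0xFF, 0xFFFFFFFF, 0x1234, 0x12345678):
        print(hex(constant), choose_expansion(constant, has_movw=True))
```

Keeping the two-instruction form behind an explicit flag means the default path still honours the "one source line, one instruction" position, while a target that prefers movw/movt can opt in.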

Renato Golin

2013-Oct-27 05:57 UTC

[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler

So, it seems there are enough people on the plus side, I just wanted to
make sure we evaluate all sides before taking a decision to add syntactic
sugar to LLVM assembler.

My main concern is still the same as earlier this year: the integrated
assembler for ARM is still not complete, and the more extensions we add to
the back-end, the harder it'll be to get it into production quality.

That said...

On 27 October 2013 01:02, Chris Lattner <clattner at apple.com> wrote:
> I agree.  These pseudo instructions seem like pure syntactic sugar that
> should never be produced by the disassembler.  That doesn't make them bad,
> in fact it makes them simpler to implement and reason about.
>

I agree with this line of thought, though it's not necessarily simple to
implement this specific one, because of the constant pools. You have to pay
attention if there aren't many pools next to each other, or where is the
best placement (due to proximity, relocations, alignment), etc. I'm not
sure we've got all that logic already in, so this patch might end up a lot
bigger than just adding a few parser lines.

Ultimately, I'm not against it, but I'd be a lot more comfortable if I saw
lots of tests and lots of people looking at it (from different angles),
just to make sure we're not missing anything obvious and introducing major
regressions because of syntactic sugar.

cheers,
--renato
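
The pool placement and range concern raised in this thread can be modelled in a few lines. The Python sketch below is a toy layout pass under simplifying assumptions, not LLVM's MC layer: every instruction is 4 bytes (A32), a "pool" item stands for the user's pool directive (GNU as spells it .ltorg), leftover constants are flushed at the end of the input, and the reach of an A32 pc-relative ldr is taken as plus or minus 4095 bytes from the ldr address plus 8. The names layout and out_of_range and the item tuples are hypothetical.

```python
def layout(items):
    """items is a list of ("ldr", value), ("insn",) or ("pool",) tuples.
    Returns (ldr_address, pool_entry_address, pc_relative_offset) triples."""
    address = 0
    pending = []    # (ldr_address, value) pairs still waiting for a pool
    resolved = []

    def flush():
        nonlocal address
        slots = {}  # value -> pool entry address, so duplicates share a slot
        for ldr_address, value in pending:
            if value not in slots:
                slots[value] = address
                address += 4
            # In A32 state the pc reads as the instruction address plus 8.
            offset = slots[value] - (ldr_address + 8)
            resolved.append((ldr_address, slots[value], offset))
        pending.clear()

    for item in items:
        if item[0] == "pool":
            flush()
        else:
            if item[0] == "ldr":
                pending.append((address, item[1]))
            address += 4
    flush()         # assumption of this sketch: implicit pool at the end
    return resolved


def out_of_range(resolved, limit=4095):
    """Entries whose offset would not fit the ldr immediate field."""
    return [entry for entry in resolved if abs(entry[2]) > limit]


if __name__ == "__main__":
    near = [("ldr", 0x12345678), ("insn",), ("pool",)]
    far = [("ldr", 1)] + [("insn",)] * 1025 + [("pool",)]
    print(layout(near))              # offset 0, the 'ldr r0, [pc]' case
    print(out_of_range(layout(far)))
```

The offsets are only known once every address is fixed, which is why the diagnostic has to wait until after layout, as Jim notes.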

Kristof Beyls

2013-Oct-27 09:41 UTC

[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler

I agree that this pseudo instruction should be supported.

 
From having talked with a number of people very experienced in writing
assembler for ARM targets, they consistently find this one of the most
important features missing in the integrated assembler.

 

Kristof

 


Jim Grosbach

2013-Oct-29 17:21 UTC

[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler

On Oct 26, 2013, at 5:02 PM, Chris Lattner <clattner at apple.com> wrote:
> On Oct 25, 2013, at 5:22 PM, Sean Silva <chisophugis at gmail.com> wrote:
>> I’m not sure macros are a good analogy, but there are other
>> pseudo-instructions that we’re not always able to reconstruct in
>> disassembled code back to how the user wrote them. Or if we do, it’s
>> purely via heuristic methods. I don’t see this as a big issue.
>
> I agree.  These pseudo instructions seem like pure syntactic sugar that
> should never be produced by the disassembler.  That doesn't make them bad,
> in fact it makes them simpler to implement and reason about.
>
>> Do the ARM usages include allowing a single pseudo-instruction to expand
>> to multiple real instructions? For example, a movw/movt pair? If so, I’m
>> *very* opposed to that part.
>
> Why?  For people writing assembly manually, having pseudo instructions to
> encapsulate common patterns is very useful.

An assembler is not a compiler. When reading the assembly code, it should be
clear what instructions are actually going into the output.

>> A single assembler instruction, pseudo or otherwise, should represent a
>> single instruction in the final output. Even with a single instruction,
>> I’m still leery, as it makes the source unclear whether a constant load is
>> a plain ‘move’ instruction or a memory reference. That makes it very
>> difficult to read the assembly and do any sort of thinking about
>> performance considerations.
>
> No one is compelled to use these if they don't want to.
>
>> x86 has this issue to an extent that goes far beyond what you describe
>> here, and FWIW I've never seen a situation where it has been a problem.
>> Usually when doing instruction-level/uarch-level optimization I find
>> myself disassembling raw bytes in memory or in linked executables (or
>> showing relocations in object files). The point of source code (even
>> assembler) is to abstract over what is happening in the machine; when you
>> specifically want to know what is happening in the machine you should use
>> a tool designed to show you that, i.e. a disassembler (that shows raw
>> bytes too).
>>
>> Also, I think the fact that there are high-profile users (well, I guess
>> they are potential users since this is currently broken) that use this
>> feature overrides any "elegance"/"simplicity" concern about an instruction
>> expanding differently, for the purposes of "is it acceptable to support
>> this feature in LLVM if someone will do the work to implement and maintain
>> it?".
>
> Given that this pseudo instruction is widely implemented and empirically
> used by important code bases like the Linux kernel, it seems like a
> no-brainer to support it IMO.
>
> -Chris