David Peixotto
2013-Oct-26  01:14 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
From: Jim Grosbach [mailto:grosbach at apple.com] 
Sent: Friday, October 25, 2013 4:31 PM
To: David Peixotto
Cc: Renato Golin; LLVM Dev
Subject: Re: [LLVMdev] Add support for ldr pseudo instruction in ARM
integrated assembler
 
 
On Oct 25, 2013, at 3:53 PM, David Peixotto <dpeixott at codeaurora.org>
wrote:
Hi Renato, Thanks for the thoughtful reply. Please find my thoughts below.
 
-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation
 
 
From: Renato Golin [mailto:renato.golin at linaro.org]
Sent: Friday, October 25, 2013 1:11 PM
To: David Peixotto
Cc: LLVM Dev; Logan Chien; Gabor Ballabas; Rafael Espíndola; Richard Barton;
Amara Emerson
Subject: Re: Add support for ldr pseudo instruction in ARM integrated
assembler
 
On 25 October 2013 18:33, David Peixotto <dpeixott at codeaurora.org> wrote:
Both armasm and gnu as support an ldr pseudo instruction for loading
constants that lowers to either a mov, mvn, or a pc-relative ldr from the
constant pool. It would be great if the llvm integrated assembler could
support this feature as well.
 
Hi David,
 
As much as I think that it's important to add support for old codebases to
be compiled, I also have to consider the importance of compatibility and
compiler sanity.
 
Adding support for this GNU extension (that ARMCC also seems to support) can:
 
1. Add multiple representations to the same operation, which is fine when
you're converting ASM into binary, but not so much when you're doing the
reverse.
 
By the reverse, do you mean converting binary to asm? If so I do not think
there would be an issue here because we would never need to generate the ldr
pseudo instruction. It is really a convenience for the programmer much like
macros. Once it has been converted to mov/ldr+constant pool there would be
no need to go the opposite direction. I think it is similar to macros in
that regard in that we do not need to reconstruct the macro after it has
been processed by the assembler to generate the final output.
 
I'm not sure macros are a good analogy, but there are other
pseudo-instructions that we're not always able to reconstruct in
disassembled code back to how the user wrote them. Or if we do, it's purely
via heuristic methods. I don't see this as a big issue.
 
Do the ARM usages include allowing a single pseudo-instruction to expand to
multiple real instructions? For example, a movw/movt pair? If so, I'm *very*
opposed to that part. A single assembler instruction, pseudo or otherwise,
should represent a single instruction in the final output. Even with a
single instruction, I'm still leery, as it makes the source unclear whether
a constant load is a plain 'move' instruction or a memory reference. That
makes it very difficult to read the assembly and do any sort of thinking
about performance considerations.
 
I believe that the ldr pseudo will always expand to a single instruction
(mov, mvn, or ldr). I see that armasm also supports a mov32 pseudo
instruction that expands to a movw/movt pair. GCC does not support mov32 and
I'm not proposing that we add it to llvm.

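To make the single-instruction claim concrete, here is a minimal Python sketch of how an assembler might pick one of the three forms for a numeric constant in ARM (A32) state. It assumes the A32 modified-immediate rule (an 8-bit value rotated right by an even amount); Thumb encodings differ and are not modelled. The function names are hypothetical, and this is not a description of how armasm, GNU as, or LLVM implement the feature.

```python
MASK32 = 0xFFFFFFFF


def is_arm_modified_imm(value: int) -> bool:
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & MASK32
        if rotated < 0x100:
            return True
    return False


def lower_ldr_pseudo(value: int):
    """Return (mnemonic, operand) for `ldr rX, =value`; always one instruction."""
    value &= MASK32
    if is_arm_modified_imm(value):
        return ("mov", value)      # mov rX, #value
    inverted = ~value & MASK32
    if is_arm_modified_imm(inverted):
        return ("mvn", inverted)   # mvn rX, #~value
    return ("ldr", value)          # pc-relative load from the constant pool
```

Under these assumptions, 0xFF000000 takes the mov form, 0xFFFFFFFF becomes an mvn of 0, and 0x12345678 falls through to the constant-pool load. Symbolic operands such as `=foo` would presumably always take the load path, since their value is not known until link time.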
We have been trying to deprecate pre-UAL and GNU-extensions as much as
possible, and this is a move that is supported by most of the ARM LLVM
community. The main reason is, as I said, to be able to come to and from
ASM, having the same canonical representation. This will make it a lot
easier to test the compiler and to avoid silent asm/disasm failures not
covered by the tests. We have also avoided removing pre-UAL support
just because "bacon", but if it impacts a needed change, or if removing it
fixes a bug, we tend to trim it off.
 
2. Increase the complexity of the assembler and disassembler in areas that
you cannot predict right now.
 
Yes, I was wondering about this part. I'm not familiar with whether the
assembler can easily create constant pools which is something that would
definitely be needed.  From my first glance it looks like creating and
placing the constant pool is the tricky part in actually implementing this
feature.
 
This is where things get interesting. IIRC, the ARM tools rely on the user
to locate the constant pools (there's a .constpool directive or something
like that which spits out whatever constants have been collected up to that
point). It's up to the user to make sure that the pool is in range for any
load instructions referencing it. I assume a diagnostic is expected if
they're not, which is the part that would be tricky here. Not undoable, just
tricky to thread the information through from where it starts to where it'll
be diagnosable (post-layout and relaxation in the object writer, probably).
 
I see. Yes the ARM documents say it is the user's responsibility to ensure
the constant can be placed in a constant pool that is reachable by the
instruction. LTORG is the arm directive that forces the assembler to emit
the current constant pool. The corresponding gcc directive is .ltorg. Both
armasm and gcc give an error if the constant pool is placed too far away.
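The pool behaviour discussed here (collect constants, emit them at an explicit directive, reject loads that cannot reach the pool) can be sketched in Python as follows. This is an illustration under stated assumptions, not any real assembler's implementation: it assumes ARM (A32) state, where pc reads as the instruction address plus 8 and a literal ldr can reach at most 4095 bytes either way. The class and method names are hypothetical, and sharing one slot between equal values is a choice made for this sketch.

```python
class LiteralPool:
    """Collects constants until an explicit flush, as with LTORG / .ltorg."""

    LDR_LITERAL_RANGE = 4095  # assumed A32 ldr (literal) offset limit, in bytes

    def __init__(self):
        self.pending = []  # list of (address of the ldr, constant value)

    def add(self, ldr_address, value):
        """Record that the ldr at ldr_address needs `value` from the pool."""
        self.pending.append((ldr_address, value))

    def flush(self, pool_address):
        """Place pending constants at pool_address.

        Returns {ldr_address: pc-relative offset}, or raises ValueError if
        some load cannot reach its constant.
        """
        slots = {}    # value -> address of its 4-byte slot in the pool
        offsets = {}
        for ldr_address, value in self.pending:
            if value not in slots:
                slots[value] = pool_address + 4 * len(slots)
            # In A32 state pc reads as the instruction address + 8.
            offset = slots[value] - (ldr_address + 8)
            if abs(offset) > self.LDR_LITERAL_RANGE:
                raise ValueError(
                    f"constant pool out of range for ldr at {ldr_address:#x}")
            offsets[ldr_address] = offset
        self.pending.clear()
        return offsets
```

In a real assembler the final addresses are only known after layout and relaxation, which is why the diagnostic is harder to produce than this sketch suggests.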
 
Tangentially related, I would also very strongly oppose changing the
compiler's asm printer to use these constructs for its constant pool
references. I don't think that's being proposed, but throwing it out there
just in case that's a direction anyone is considering.
 
No, I'm not proposing to change the printer for constant pool references. I
just want to support reading assembly with ldr pseudos.
 
It seems that this syntax allows for you to define symbols and constant
pools, and both have interactions with the rest of the code generation,
relocation symbols, etc. I can't guarantee right now that there will be no
conflicts with anything elsewhere, and to make sure we're covering all
bases, a huge amount of tests will need to be added. Of course, people are
free to work on things that make their work compile and run, but you have to
be aware that GNU-extensions and pre-UAL features are not taken lightly.
 
 
If I understand the history correctly (very hazy, so I may not), this
originated in ARM's compiler, and GNU as adopted it from there.
 
Ultimately, if there is a feature that cannot be done any other way in ARM
UAL, then we shall consider the options and implement the best. But if this
is just syntactic sugar, then I'd strongly suggest you upgrade your
assembly files. The argument that "GCC does it" is used far too often on the
LLVM list, and I can't say I'm a big fan of it. GCC did not take *only*
smart decisions in its life (neither did LLVM), and copying everything from
one side to the other blindly is not a good strategy.
 
In the general case, I vehemently agree with this. I am deeply concerned
about the direction of some of the target assemblers in this regard.
 
I'm a little more sympathetic to this specific case because it's a feature
of not just binutils, but the ARM tools also and last I checked, well
documented. That said, great care and deliberation need to be taken.
 
In summary, I'm very skeptical of the feature, but can see value. It's on
the line for me. I'm interested in hearing more of what y'all think about
it.
 
-Jim
 
 
I may be wrong, correct me if I'm wrong, but as far as I got it, this looks
just like syntactic sugar.
 
I cannot think of why it would be more than syntactic sugar at this point.
For me it is a new feature that I encountered when trying to compile some of
our code bases here so it is possible that there could be something more
that I'm missing. That said, I do think it is a useful sugar when you
actually have to write assembly.
 
I think the alternative is probably to blindly convert each instance like this:
 
    ldr r0, =foo
    @continuation point
==>
    ldr r0, [pc]    @ ARM state: pc reads as this instruction + 8, i.e. the .word
    b 2f            @ branch over the inline constant
1:
    .word foo
2:
    @continuation point
 
I think the first is much easier for the programmer to write and maintain.
Btw, the local numeric label syntax (1:, b 1f) was new to me as well.
Luckily for me the integrated assembler already supports this syntax :)
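The blind conversion described above could be automated with a small source-rewriting script. The Python sketch below is only an illustration: the regular expression and the function name are hypothetical, it handles only lines with nothing after the operand, and it assumes ARM state (the `[pc]` trick relies on pc reading as the instruction address plus 8).

```python
import re

# Matches e.g. "    ldr r0, =foo" with nothing else on the line.
LDR_PSEUDO = re.compile(r"^(\s*)ldr\s+(\w+)\s*,\s*=(\S+)\s*$")


def expand_ldr_pseudo(lines):
    """Rewrite each `ldr rX, =expr` into an inline constant plus a branch."""
    out = []
    for line in lines:
        m = LDR_PSEUDO.match(line)
        if not m:
            out.append(line)  # leave every other line untouched
            continue
        indent, reg, expr = m.groups()
        out += [
            f"{indent}ldr {reg}, [pc]",   # loads the .word two instructions on
            f"{indent}b 2f",              # skip over the inline constant
            "1:",
            f"{indent}.word {expr}",
            "2:",
        ]
    return out
```

Because local numeric labels can be redefined, the same `1:` and `2:` can be emitted at every expansion. One caveat is that a rewrite like this would capture any `2f` or `1b` reference in the surrounding code that crosses an expansion, so a careful tool would need to pick label numbers the file does not already use.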
 
cheers,
--renato
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
 
Jim Grosbach
2013-Oct-29  17:58 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
On Oct 25, 2013, at 6:14 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

> I believe that the ldr pseudo will always expand to a single instruction
> (mov, mvn, or ldr). I see that armasm also supports a mov32 pseudo
> instruction that expands to a movw/movt pair. GCC does not support mov32 and
> I'm not proposing that we add it to llvm.

OK. If we can restrict this to always only generating a single instruction,
I'm less concerned. Still not a huge fan, but not actively opposed. Thanks
for checking into that.

> I see. Yes the ARM documents say it is the user's responsibility to ensure
> the constant can be placed in a constant pool that is reachable by the
> instruction. LTORG is the arm directive that forces the assembler to emit
> the current constant pool. The corresponding gcc directive is .ltorg. Both
> armasm and gcc give an error if the constant pool is placed too far away.

Sounds good. I'm not hugely fond of the assembler doing that sort of thing,
but it's not too horrible. The implementation to get a good diagnostic is
where it'll get "fun."

> No, I'm not proposing to change the printer for constant pool references.
> I just want to support reading assembly with ldr pseudos.

OK, cool. We're on the same page, then.
Eric Christopher
2013-Oct-29  18:05 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
On Tue, Oct 29, 2013 at 10:58 AM, Jim Grosbach <grosbach at apple.com> wrote:

> On Oct 25, 2013, at 6:14 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
>
>> No, I'm not proposing to change the printer for constant pool references.
>> I just want to support reading assembly with ldr pseudos.
>
> OK, cool. We're on the same page, then.

Agreed, I'd worry more about the "let's pick some instructions to figure
out what the programmer meant!" rather than "eh, it's a shortcut".

-eric