David Peixotto
2013-Oct-25  17:33 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
Both armasm and GNU as support an ldr pseudo instruction for loading
constants that lowers to either a mov, an mvn, or a pc-relative ldr from the
constant pool. It would be great if the LLVM integrated assembler could
support this feature as well.
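As a rough illustration of how an assembler might pick between the three lowerings for a numeric constant, here is a minimal Python sketch. It assumes ARM (not Thumb) mode, where a data-processing immediate is an 8-bit value rotated right by an even amount. The function names are hypothetical and are not taken from GNU as, armasm, or LLVM. Symbol operands are not covered: their value is unknown at assembly time, so they would always go to the constant pool.

```python
def rotate_left32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_modified_immediate(value):
    """True if `value` is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    # Undo every possible even right-rotation and check whether 8 bits remain.
    return any(rotate_left32(value, rot) < 0x100 for rot in range(0, 32, 2))


def choose_lowering(value):
    """Pick 'mov', 'mvn', or 'ldr' (constant pool) for a numeric constant."""
    value &= 0xFFFFFFFF
    if is_arm_modified_immediate(value):
        return "mov"
    if is_arm_modified_immediate(~value):
        return "mvn"  # the bitwise complement is encodable instead
    return "ldr"      # fall back to a pc-relative load from the pool
```

With these assumptions, choose_lowering(0x1), choose_lowering(-0x1) and choose_lowering(0x1000001) select mov, mvn and ldr respectively.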
For example, using gnu as to compile this code:
    .text
    foo:
      ldr r0, =0x1
      ldr r0, =-0x1
      ldr r0, =0x1000001
      ldr r0, =bar
      ldr r0, =baz
    bar:
Produces an object file like this (with relocations shown):
    Disassembly of section .text:
      00000000 <foo>:
         0:   e3a00001        mov     r0, #1
         4:   e3e00000        mvn     r0, #0
         8:   e59f0004        ldr     r0, [pc, #4]    ; 14 <bar>
         c:   e59f0004        ldr     r0, [pc, #4]    ; 18 <bar+0x4>
        10:   e59f0004        ldr     r0, [pc, #4]    ; 1c <bar+0x8>
      00000014 <bar>:
        14:   01000001        .word   0x01000001
        18:   00000014        .word   0x00000014
                              18: R_ARM_ABS32 .text
        1c:   00000000        .word   0x00000000
                              1c: R_ARM_ABS32 baz
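The three ldr encodings in the listing all carry the same offset (#4) yet point at different pool entries, because in ARM mode the pc reads as the address of the current instruction plus 8. The following Python sketch decodes such a word to show the arithmetic. The function name is hypothetical, and only the immediate form with pc as the base register is handled.

```python
def decode_ldr_literal(word, address):
    """Return the address loaded by an ARM-mode `ldr rd, [pc, #imm]` word."""
    if (word >> 16) & 0xF != 0xF:
        raise ValueError("base register is not pc")
    imm12 = word & 0xFFF     # 12-bit byte offset
    add = (word >> 23) & 1   # U bit: 1 adds the offset, 0 subtracts it
    pc = address + 8         # ARM mode: pc reads two instructions ahead
    return pc + imm12 if add else pc - imm12
```

Applied to the word e59f0004 at addresses 0x8, 0xc and 0x10, this gives 0x14, 0x18 and 0x1c, matching the comments in the disassembly.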
Currently the llvm integrated assembler fails on this input with an error
like:
    error: unexpected token in operand
      ldr r0, =0x1
              ^
I am interested in seeing support for this feature added to the integrated
assembler. Is anybody planning to add it? I am willing to do the work, but I
am not sure exactly where to start, so I thought I would bring it up here in
case anybody has suggestions (or wants to implement it themselves :).
CC'ing some recent contributors to the llvm arm assembler.
gnu as reference:
https://sourceware.org/binutils/docs/as/ARM-Opcodes.html#ARM-Opcodes
-David
-- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation
Renato Golin
2013-Oct-25  20:11 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
On 25 October 2013 18:33, David Peixotto <dpeixott at codeaurora.org> wrote:
> Both armasm and gnu as support an ldr pseudo instruction for loading
> constants that lowers to either a mov, mvn, or a pc-relative ldr from the
> constant pool. It would be great if the llvm integrated assembler could
> support this feature as well.

Hi David,

As much as I think that it's important to add support for old codebases to
be compiled, I also have to consider the importance of compatibility and
compiler sanity.

Adding support for this GNU extension (that ARMCC also seems to support) can:

1. Add multiple representations of the same operation, which is fine when
you're converting ASM into binary, but not so much when you're doing the
reverse.

We have been trying to deprecate pre-UAL and GNU extensions as much as
possible, and this is a move that is supported by most of the ARM LLVM
community. The main reason is, as I said, to be able to go to and from ASM
with the same canonical representation. This will make it a lot easier to
test the compiler and to avoid silent asm/disasm failures not covered by the
tests. We have also avoided removing pre-UAL support just because "bacon",
but if it impacts a needed change, or if removing it fixes a bug, we tend to
trim it off.

2. Increase the complexity of the assembler and disassembler in areas that
you cannot predict right now.

It seems that this syntax allows you to define symbols and constant pools,
and both have interactions with the rest of the code generation, relocation
symbols, etc. I can't guarantee right now that there will be no conflicts
with anything elsewhere, and to make sure we're covering all bases, a huge
amount of tests will need to be added. Of course, people are free to work on
things that make their work compile and run, but you have to be aware that
GNU extensions and pre-UAL features are not taken lightly.

Ultimately, if there is a feature that cannot be done any other way in ARM
UAL, then we shall consider the options and implement the best. But if this
is just syntactic sugar, then I'd strongly suggest you upgrade your assembly
files. The argument that "GCC does it" is used far too often on the LLVM
list, and I can't say I'm a big fan of it. GCC did not make *only* smart
decisions in its life (neither did LLVM), and copying everything from one
side to the other blindly is not a good strategy.

I may be wrong, correct me if I'm wrong, but as far as I understand it, this
looks just like syntactic sugar.

cheers,
--renato
Sean Silva
2013-Oct-25  21:35 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
On Fri, Oct 25, 2013 at 1:33 PM, David Peixotto <dpeixott at codeaurora.org> wrote:
> Both armasm and gnu as support an ldr pseudo instruction for loading
> constants that lowers to either a mov, mvn, or a pc-relative ldr from the
> constant pool. It would be great if the llvm integrated assembler could
> support this feature as well.
>
> [...]
>
> I am interested in seeing support for this feature added to the integrated
> assembler. Is anybody planning to add this feature? I am willing to do the
> work, but I am not sure exactly where to start so I thought I would bring it
> up here in case anybody has some suggestions (or wants to implement it
> themselves :)

I would like to see this feature supported. I have run into code in the wild
that cannot be handled by the LLVM toolchain due to this issue.

-- Sean Silva
Renato Golin
2013-Oct-25  22:04 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
On 25 October 2013 22:35, Sean Silva <chisophugis at gmail.com> wrote:
> I would like to see this features supported.

Hi Sean,

I'm not opposing, I'm just saying that we'll need critical reasons for
adding support that introduces GCC extensions blindly.

> I have run into code in the wild that cannot be handled by the LLVM
> toolchain due to this issue.

What kind of situation? Was it possible to change the source file? Or was it
something that only this extension can express?

cheers,
--renato
Sean Silva
2013-Oct-25  22:38 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
On Fri, Oct 25, 2013 at 4:11 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 25 October 2013 18:33, David Peixotto <dpeixott at codeaurora.org> wrote:
>> Both armasm and gnu as support an ldr pseudo instruction for loading
>> constants that lowers to either a mov, mvn, or a pc-relative ldr from the
>> constant pool. It would be great if the llvm integrated assembler could
>> support this feature as well.
>
> Hi David,
>
> As much as I think that it's important to add support for old codebases to
> be compiled, I also have to consider the importance of compatibility and
> compiler sanity.

FYI, Linux has at least 40 uses of this feature. And

    git log --oneline --pickaxe-regex -S'ldr.*=' --since='1 year ago' arch/arm | wc -l

gives 55 uses. I'm not sure you can dismiss this as "support for old
codebases". There are also at least 3 uses in x264 and >60 in ffmpeg. These
were just the first 3 projects that I could think of that would have ARM
assembler in them; put another way, 3/3 projects that I sampled need this
feature.

> [...]
>
> Ultimately, if there is a feature that cannot be done any other way in ARM
> UAL, then we shall consider the options and implement the best. But if this
> is just syntactic sugar, then I'd strongly suggest you upgrade your
> assembly files. The argument that "GCC does it" is used far too often on
> the LLVM list, and I can't say I'm a big fan of it. GCC did not make *only*
> smart decisions in its life (neither did LLVM), and copying everything from
> one side to the other blindly is not a good strategy.
>
> I may be wrong, correct me if I'm wrong, but as far as I understand it,
> this looks just like syntactic sugar.

If the syntax sugar is part of the input format that users of the assembler
expect, then I think it makes sense to support it. The concerns you have
raised seem valid, but appear minor next to the reason for an assembler's
existence: to assemble people's code (*not* its own test suite; that would
be masturbatory). A person using the LLVM toolchain for its assembler
doesn't care at all about those concerns, and just wants their code to work.

A developer of one of the 3 projects I sampled would not be lying if they
told their colleague: "I tried the LLVM toolchain for ARM, but the assembler
is broken so I wasn't able to finish the build; I'll try again once the next
version comes out and hopefully they will have fixed it by then".

-- Sean Silva
David Peixotto
2013-Oct-25  22:53 UTC
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
Hi Renato, Thanks for the thoughtful reply. Please find my thoughts below.
 
 
From: Renato Golin [mailto:renato.golin at linaro.org] 
Sent: Friday, October 25, 2013 1:11 PM
To: David Peixotto
Cc: LLVM Dev; Logan Chien; Gabor Ballabas; Rafael Espíndola; Richard Barton;
Amara Emerson
Subject: Re: Add support for ldr pseudo instruction in ARM integrated
assembler
 
> On 25 October 2013 18:33, David Peixotto <dpeixott at codeaurora.org> wrote:
>> Both armasm and gnu as support an ldr pseudo instruction for loading
>> constants that lowers to either a mov, mvn, or a pc-relative ldr from the
>> constant pool. It would be great if the llvm integrated assembler could
>> support this feature as well.
>
> Hi David,
>
> As much as I think that it's important to add support for old codebases to
> be compiled, I also have to consider the importance of compatibility and
> compiler sanity.
>
> Adding support for this GNU extension (that ARMCC also seems to support)
> can:
>
> 1. Add multiple representations of the same operation, which is fine when
> you're converting ASM into binary, but not so much when you're doing the
> reverse.
 
By the reverse, do you mean converting binary to asm? If so I do not think
there would be an issue here because we would never need to generate the ldr
pseudo instruction. It is really a convenience for the programmer much like
macros. Once it has been converted to mov/ldr+constant pool there would be
no need to go the opposite direction. I think it is similar to macros in
that regard in that we do not need to reconstruct the macro after it has
been processed by the assembler to generate the final output.
 
> We have been trying to deprecate pre-UAL and GNU extensions as much as
> possible, and this is a move that is supported by most of the ARM LLVM
> community. The main reason is, as I said, to be able to go to and from ASM
> with the same canonical representation. This will make it a lot easier to
> test the compiler and to avoid silent asm/disasm failures not covered by
> the tests. We have also avoided removing pre-UAL support just because
> "bacon", but if it impacts a needed change, or if removing it fixes a bug,
> we tend to trim it off.
>
> 2. Increase the complexity of the assembler and disassembler in areas that
> you cannot predict right now.
 
Yes, I was wondering about this part. I'm not familiar with whether the
assembler can easily create constant pools, which is something that would
definitely be needed. From my first glance, it looks like creating and
placing the constant pool is the tricky part in actually implementing this
feature.
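To make the placement concern concrete, here is a small Python sketch of the bookkeeping a constant pool needs. It assumes ARM mode, where the ldr offset field is 12 bits (so the pool entry must lie within 4095 bytes of the pc value) and the pc reads as the instruction address plus 8. The class and method names are hypothetical, sharing identical entries is just one possible design choice, and none of this describes how GNU as or LLVM actually implement pools. GNU as lets the programmer choose the flush point with the .ltorg directive.

```python
class LiteralPool:
    MAX_OFFSET = 4095  # reach of the 12-bit offset in an ARM-mode ldr

    def __init__(self):
        self.entries = []  # pool values (ints or symbol names), in order
        self.users = []    # (instruction address, entry index) pairs

    def add(self, insn_address, value):
        """Record that the ldr at insn_address needs `value` from the pool."""
        if value not in self.entries:
            self.entries.append(value)  # identical constants share a slot
        self.users.append((insn_address, self.entries.index(value)))

    def flush(self, pool_address):
        """Place the pool at pool_address; return (offsets, words)."""
        offsets = {}
        for insn_address, index in self.users:
            # Offset is relative to pc, which reads as instruction + 8.
            offset = pool_address + 4 * index - (insn_address + 8)
            if abs(offset) > self.MAX_OFFSET:
                raise ValueError(f"pool out of range for ldr at {insn_address:#x}")
            offsets[insn_address] = offset
        words = list(self.entries)
        self.entries, self.users = [], []
        return offsets, words
```

Feeding it the three pool loads from the original example (at 0x8, 0xc and 0x10) and flushing at 0x14 yields an offset of 4 for each, as in the GNU as output. A load placed too far from the flush point raises an error, which is the case an assembler would have to diagnose or avoid by flushing earlier.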
 
> It seems that this syntax allows you to define symbols and constant pools,
> and both have interactions with the rest of the code generation,
> relocation symbols, etc. I can't guarantee right now that there will be no
> conflicts with anything elsewhere, and to make sure we're covering all
> bases, a huge amount of tests will need to be added. Of course, people are
> free to work on things that make their work compile and run, but you have
> to be aware that GNU extensions and pre-UAL features are not taken lightly.
>
> Ultimately, if there is a feature that cannot be done any other way in ARM
> UAL, then we shall consider the options and implement the best. But if this
> is just syntactic sugar, then I'd strongly suggest you upgrade your
> assembly files. The argument that "GCC does it" is used far too often on
> the LLVM list, and I can't say I'm a big fan of it. GCC did not make *only*
> smart decisions in its life (neither did LLVM), and copying everything from
> one side to the other blindly is not a good strategy.
>
> I may be wrong, correct me if I'm wrong, but as far as I understand it,
> this looks just like syntactic sugar.
 
I cannot think of why it would be more than syntactic sugar at this point.
For me it is a new feature that I encountered when trying to compile some of
our code bases here, so it is possible that there could be something more
that I'm missing. That said, I do think it is a useful sugar when you
actually have to write assembly.
 
I think one could probably just blindly convert each instance like this:
 
    ldr r0, =foo
    @continuation point
==>
    ldr r0, [pc]
    b 2f
1:
    .word foo
2:
   @continuation point
 
I think the first is much easier for the programmer to write and maintain.
Btw, the local numeric label syntax (1:, b 1f) was new to me as well.
Luckily for me the integrated assembler already supports this syntax :)
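
The blind conversion above is mechanical enough to script. Below is a minimal Python sketch of such a source-to-source rewrite. It assumes ARM mode (where [pc] addresses the word two instructions ahead, which is exactly the .word placed after the branch), and it handles only plain "ldr reg, =expr" lines with no label, condition code, or trailing comment. The function name is hypothetical.

```python
import re

# Matches e.g. "    ldr r0, =foo" and captures indent, register, expression.
PSEUDO_LDR = re.compile(r"^(\s*)ldr\s+(\w+)\s*,\s*=\s*(.+?)\s*$")


def expand_ldr_pseudo(lines):
    """Rewrite each `ldr reg, =expr` line into ldr/b/.word with local labels."""
    out = []
    for line in lines:
        match = PSEUDO_LDR.match(line)
        if not match:
            out.append(line)  # leave every other line untouched
            continue
        indent, reg, expr = match.groups()
        out += [
            f"{indent}ldr {reg}, [pc]",  # pc reads as this address + 8
            f"{indent}b 2f",             # jump over the inline constant
            "1:",
            f"{indent}.word {expr}",
            "2:",
        ]
    return out
```

Numeric local labels can be redefined, so reusing 1: and 2: for every expansion is safe. The cost compared with a shared constant pool is an extra branch per load, which is one reason the pseudo instruction is nicer to have in the assembler itself.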
 
> cheers,
> --renato