Jim Grosbach
2013-Nov-01 19:32 UTC
[LLVMdev] Implementing the ldr pseudo instruction in ARM integrated assembler
On Nov 1, 2013, at 12:15 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

>>>>> I was thinking that without the .ltorg directive the constant pool
>>>>> would go at the end of the section.
>>>>>
>>>> So where does the assembler place the constant pool(s) if that
>>>> directive isn't present? I was under the impression it was always
>>>> required.
>>>
>>> From my understanding it is not required. I see that GCC will place it
>>> at the end of the section. I don't know if it will ever place it
>>> anywhere besides the end of the section when there is no .ltorg
>>> directive.
>>>
>>> Here is the relevant section from the gcc docs
>>> (https://sourceware.org/binutils/docs/as/ARM-Directives.html):
>>>
>>> """
>>> .ltorg
>>> This directive causes the current contents of the literal pool to be
>>> dumped into the current section (which is assumed to be the .text
>>> section) at the current location (aligned to a word boundary). GAS
>>> maintains a separate literal pool for each section and each
>>> sub-section. The .ltorg directive will only affect the literal pool of
>>> the current section and sub-section. At the end of assembly all
>>> remaining, un-empty literal pools will automatically be dumped.
>>> """
>>>
>>
>> What does ARM's documentation say?
>
> The ARM documentation says that the assembler puts the current literal pool
> at the end of every code section, where the sections are determined by the
> AREA directive or the end of the assembly. If the default literal pool will
> be out of range the programmer can use the LTORG directive to assemble the
> current literal pool immediately.

Well, they’re consistent at least, so that’s good. That’s also pretty well-defined. I was afraid there was going to be some requirement that the assembler try to analyze things and figure out where good places were (a-la the constant island pass). Glad to hear that’s not the case.
There’s still a problem for Darwin, or any other platform that uses subsections-via-symbols type layout tricks, though. There’s no assembler-time way to know how far apart the atoms in the section will be at runtime, as the linker can, and will, move things around.

The quick thought would be to emit them when the next atom begins, but that’ll fall over due to the typical “.align” which precedes the next function. The constant pool for the previous function would end up being emitted after the alignment directive for the following function, which will make that next symbol potentially not sufficiently aligned. For example,

_foo:
  ldr r1, =0x12345678
  …
  bx lr
  .align 4
_bar:
  …

The result will be that _bar is not 16-byte aligned, but only 4-byte aligned, which will come as quite the surprise to the programmer.

The next thought is to detect subsections-via-symbols and require the directive if another atom is seen and there is a non-empty constant pool. That gets a bit of chicken-and-egg, though, as the subsections-via-symbols directive is typically the last line of the .s file.

We could, perhaps, always require an explicit directive for all constant pools when using subsections-via-symbols and add a diagnostic check at the end of parsing (when we’re spitting out the non-empty pools) to see if there was a subsections-via-symbols directive in there anywhere.

Anyways, the main point of all of this is to reinforce the “there be dragons here” nature of this feature. It interacts with other parts of the assembler and the underlying assumptions of the platform in interesting ways. Lots of *really* careful test cases will be necessary.
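The alignment problem described above can be modeled with a simple location counter. This is only a sketch in Python, not assembler code from LLVM: the function names are hypothetical, and it assumes ARM mode (4-byte instructions), a body for _foo of just two instructions (8 bytes), and a single 4-byte literal.

```python
def align_up(addr, boundary):
    """Round addr up to the next multiple of boundary (a power of two)."""
    return (addr + boundary - 1) & ~(boundary - 1)


def bar_addr_pool_after_align(foo_size, pool_size, align):
    """Pool is dumped when the next atom's label is seen, i.e. after .align."""
    addr = foo_size                 # end of _foo's instructions
    addr = align_up(addr, align)    # .align 4 for the following function
    addr = align_up(addr, 4)        # literal pool is word-aligned
    addr += pool_size               # pool lands between .align and _bar
    return addr                     # address assigned to _bar


def bar_addr_pool_before_align(foo_size, pool_size, align):
    """Pool is dumped before the alignment directive is processed."""
    addr = align_up(foo_size, 4)    # word-align, then dump the pool
    addr += pool_size
    return align_up(addr, align)    # .align 4 now applies to _bar itself


# Assumed sizes: _foo is 8 bytes, one 4-byte literal, .align 4 means 16 bytes.
late = bar_addr_pool_after_align(8, 4, 16)
early = bar_addr_pool_before_align(8, 4, 16)
print(late, late % 16)    # _bar is only word-aligned
print(early, early % 16)  # _bar keeps its 16-byte alignment
```

Under these assumptions, the pool emitted after the alignment padding pushes _bar off its 16-byte boundary, while dumping the pool before the .align keeps it aligned.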
David Peixotto
2013-Nov-01 20:34 UTC
[LLVMdev] Implementing the ldr pseudo instruction in ARM integrated assembler
> There's still a problem for Darwin, or any other platform that uses
> subsections-via-symbols type layout tricks, though. There's no assembler-
> time way to know how far apart the atoms in the section will be at
> runtime, as the linker can, and will, move things around.

Hmm, yes that does sound quite tricky. How do we currently deal with that for other pc-relative loads? Say the programmer writes something like this

_foo:
  ldr r1, [pc, #392]
_bar:
  .space 200
_baz:
  .space 200
_some_important_constants:
  .word 0x12345678

If bar and baz get deleted by the linker the offset would obviously be wrong. Do we do anything special for this or just rely on the programmer not to write this code?

> We could, perhaps, always require an explicit directive for all constant
> pools when using subsections-via-symbols and add a diagnostic check at the
> end of parsing (when we're spitting out the non-empty pools) to see if
> there was a subsections-via-symbols directive in there anywhere.

This seems to be a reasonable choice. It seems it would still be difficult to detect all the cases where we could run into trouble. For example, something like

_foo:
  ldr r1, =0x12345678
_bar:
  ...
  .ltorg

If bar gets deleted we would still have the wrong offsets.

> Anyways, the main point of all of this is to reinforce the "there be
> dragons here" nature of this feature. It interacts with other parts of the
> assembler and the underlying assumptions of the platform in interesting
> ways. Lots of *really* careful test cases will be necessary.

Yes, I see your point. Thanks for bringing .subsections_via_symbols to my attention.
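To make the failure concrete, here is a small Python sketch of how a hard-coded pc-relative offset goes stale when the linker drops atoms. The helper names are hypothetical, and the model assumes ARM mode, where reading PC yields the instruction address plus 8, with atoms laid out back to back. The exact immediate depends on the instruction set mode and layout, so the value computed here need not match the illustrative #392 in the message.

```python
ARM_PC_BIAS = 8  # assumption: ARM mode, PC reads as instruction address + 8


def layout(atoms):
    """Assign addresses to (name, size) atoms placed back to back from 0."""
    addrs, addr = {}, 0
    for name, size in atoms:
        addrs[name] = addr
        addr += size
    return addrs


def pc_relative_offset(addrs, load_sym, target_sym):
    """Offset a load at load_sym needs in order to reach target_sym."""
    return addrs[target_sym] - (addrs[load_sym] + ARM_PC_BIAS)


atoms = [
    ("_foo", 4),                       # the single ldr instruction
    ("_bar", 200),                     # .space 200
    ("_baz", 200),                     # .space 200
    ("_some_important_constants", 4),  # .word 0x12345678
]

# Offset the assembler would see with every atom present.
before = pc_relative_offset(layout(atoms), "_foo", "_some_important_constants")

# The linker dead-strips _bar and _baz; the encoded offset is not updated.
kept = [a for a in atoms if a[0] not in ("_bar", "_baz")]
after = pc_relative_offset(layout(kept), "_foo", "_some_important_constants")

print(before, after)  # the offset needed after stripping differs
```

The immediate baked into the instruction at assembly time corresponds to `before`, but the load would need `after` once the intervening atoms are gone, which is why a pc-relative reference must not cross an atom boundary.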
Jim Grosbach
2013-Nov-01 20:43 UTC
[LLVMdev] Implementing the ldr pseudo instruction in ARM integrated assembler
On Nov 1, 2013, at 1:34 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

>> There's still a problem for Darwin, or any other platform that uses
>> subsections-via-symbols type layout tricks, though. There's no assembler-
>> time way to know how far apart the atoms in the section will be at
>> runtime, as the linker can, and will, move things around.
>
> Hmm, yes that does sound quite tricky. How do we currently deal with that
> for other pc-relative loads? Say the programmer writes something like this
>
> _foo:
>   ldr r1, [pc, #392]
> _bar:
>   .space 200
> _baz:
>   .space 200
> _some_important_constants:
>   .word 0x12345678
>
> If bar and baz get deleted by the linker the offset would obviously be
> wrong. Do we do anything special for this or just rely on the programmer not
> to write this code?

That code is basically undefined behaviour with subsections-via-symbols. PC-relative loads can’t cross atoms. This is why, for example, all labels inside a function must be assembler-local labels. You’ll see all sorts of “interesting” code sequences and relocation tricks for pc-relative stuff.

>> We could, perhaps, always require an explicit directive for all constant
>> pools when using subsections-via-symbols and add a diagnostic check at the
>> end of parsing (when we're spitting out the non-empty pools) to see if
>> there was a subsections-via-symbols directive in there anywhere.
>
> This seems to be a reasonable choice. It seems it would still be difficult
> to detect all the cases where we could run into trouble. For example,
> something like
>
> _foo:
>   ldr r1, =0x12345678
> _bar:
>   ...
>   .ltorg
>
> If bar gets deleted we would still have the wrong offsets.

Yeah, that would need to be illegal as well. Fun times…

It sounds like this is getting constrained to a pretty reasonable set of features for the assembler to deal with. It’ll definitely be a good usability thing and will make lots of existing code a lot happier with clang, which is a good thing.
Thanks for taking on the challenge and working on it.

-Jim

>> Anyways, the main point of all of this is to reinforce the "there be
>> dragons here" nature of this feature. It interacts with other parts of the
>> assembler and the underlying assumptions of the platform in interesting
>> ways. Lots of *really* careful test cases will be necessary.
>
> Yes I see your point. Thanks for bringing .subsections_via_symbols to my
> attention.
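The end-of-parse diagnostic discussed in this thread could look roughly like the following. This is a hedged Python sketch of the rule only, not LLVM's implementation: the statement tuples and the function name are hypothetical, and treating any label that starts with "L" as assembler-local is a simplifying assumption.

```python
def check_literal_pools(statements):
    """Return diagnostics for pools left pending across an atom boundary.

    Errors are only reported if .subsections_via_symbols appears anywhere,
    since that directive is typically the last line of the file.
    """
    pending = 0        # literals requested by 'ldr rX, =value', not yet dumped
    crossings = []     # atom labels reached while the pool was non-empty
    saw_svs = False
    for stmt in statements:
        kind = stmt[0]
        if kind == "ldr_pseudo":
            pending += 1
        elif kind == "ltorg":
            pending = 0
        elif kind == "label" and not stmt[1].startswith("L"):
            # A non-local label starts a new atom (simplified assumption).
            if pending:
                crossings.append(stmt[1])
        elif kind == "directive" and stmt[1] == ".subsections_via_symbols":
            saw_svs = True

    errors = []
    if saw_svs:
        errors += [f"literal pool not flushed before atom '{n}'" for n in crossings]
        if pending:
            errors.append("literal pool not flushed before end of file")
    return errors


# The problematic example from the thread: .ltorg comes after _bar begins.
bad = [
    ("label", "_foo"),
    ("ldr_pseudo", 0x12345678),
    ("label", "_bar"),
    ("ltorg",),
    ("directive", ".subsections_via_symbols"),
]
# Same code with the pool flushed inside _foo's atom.
good = [
    ("label", "_foo"),
    ("ldr_pseudo", 0x12345678),
    ("ltorg",),
    ("label", "_bar"),
    ("directive", ".subsections_via_symbols"),
]
print(check_literal_pools(bad))
print(check_literal_pools(good))
```

Deferring the error until the whole file has been scanned sidesteps the chicken-and-egg issue: boundary crossings are recorded as they occur, and they only become diagnostics once the directive has actually been seen.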