thr3ads.net - llvm dev - [LLVMdev] R_ARM_ABS32 disassembly with integrated-as [Oct 2012]

Jim Grosbach

2012-Oct-09 22:58 UTC

[LLVMdev] R_ARM_ABS32 disassembly with integrated-as

On Oct 7, 2012, at 3:14 AM, Renato Golin <rengolin at systemcall.org> wrote:
> On 5 October 2012 17:48, Jim Grosbach <grosbach at apple.com> wrote:
>> The recent MachO data-in-code support should have fixed a lot of the
>> problems. There's probably still some quirks in the specifics ($a vs. $t
>> and making sure the symbols get into the ELF properly), but the core
>> functionality to know how to mark data regions is there and works very well.
> 
> Hi Jim,
> 
> I'm trying to help Greg crack it down. From your recent commits, I
> take it you're re-using a data-in-code detection previously used only
> for ASM output, to object output, via the
> EmitDataRegion/EmitDataRegionEnd.
> 
It's a bit more than that. Those Emit* methods are new for this support.
There was spotty support for the raw $a/$t/$d stuff before; this change
abstracted and extended it to support both asm and binary emission, and added
calls to the new methods in the various places in the ARM backend where
data-in-code regions get created (jump tables, constant pools, et al.).

> I haven't looked too deep in the MC, but I'm supposing that will work
> automatically when the output streamer is printing object code and
> meets a non-code region, so in theory, changing MCELFStreamer
> accordingly (overriding those functions in there) would take care of
> data vs. code issue in ELF.

Yep. They'll likely be implemented as, effectively, an EmitLabel().

> Assuming LLVM doesn't generate ARM/Thumb veneers inside the same
> function (i.e. a Thumb function has only Thumb code), Greg could use
> the EmitDataRegion and EmitDataRegionEnd, with the former saving the
> state of the current code (Thumb/ARM) and the latter restoring it, by
> emitting the $d and $a/$t respectively.
> 
> Does it seem like a good initial approach?
> 
> Continuing... It seems MCELFStreamer already has an EmitThumbFunc,
> which looks to me like the wrong place for it.

That's just the handler for the .thumb_func directive. It has nothing to do
with emitting the contents of the actual function.

> I'd imagine MCELFStreamer
> would have EmitFunc and MCARMELCStreamer (or whatever) would identify
> its type and call the appropriate EmitThumbFunc/EmitARMFunc. Being
> pedantic, even that is still too high level because of the ARM/Thumb
> veneers, but we don't want to worry about that if LLVM doesn't even
> try to mix ARM and Thumb (and assuming external libraries would have
> the symbols, if they do).

This is complicated a bit by needing to work for plain .s files, not just
compiler-generated files. Those can intermix ARM and Thumb code in crazy ways.

The assembler already has a Thumb vs. ARM mode state (which gets adjusted via
the .arm/.thumb directives and the .code synonyms). ELF will want to check that
state and use it to determine whether a data-region-end directive should result
in a $a or a $t in the output ELF.
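
As a rough illustration of that last point, the sketch below models a streamer
that records mapping symbols. It is not LLVM code: MappingSymbolRecorder,
IsaMode, and every method name are hypothetical stand-ins, and offsets are
tracked by hand rather than through MC fragments. The only behavior taken from
the discussion is that the start of a data region yields $d and the end yields
$a or $t according to the assembler mode at that point.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the assembler's ARM/Thumb mode state.
enum class IsaMode { Arm, Thumb };

// Toy recorder for ELF mapping symbols. This is an illustration only and
// does not reflect the real MCELFStreamer interface.
class MappingSymbolRecorder {
public:
  using Symbol = std::pair<std::uint32_t, std::string>;

  // Mirrors the effect of .arm/.thumb (or the .code synonyms).
  void setMode(IsaMode mode) { mode_ = mode; }

  // Advance the current section offset as if `size` bytes were emitted.
  void emitBytes(std::uint32_t size) { offset_ += size; }

  // Start of a data-in-code region: record $d at the current offset.
  void emitDataRegion() {
    assert(!inData_ && "nested data regions are not modeled in this sketch");
    inData_ = true;
    symbols_.emplace_back(offset_, "$d");
  }

  // End of the region: consult the mode state now, so a mode directive seen
  // inside hand-written assembly is still honored.
  void emitDataRegionEnd() {
    assert(inData_ && "no data region is open");
    inData_ = false;
    symbols_.emplace_back(offset_, mode_ == IsaMode::Thumb ? "$t" : "$a");
  }

  const std::vector<Symbol> &symbols() const { return symbols_; }

private:
  IsaMode mode_ = IsaMode::Arm;
  std::uint32_t offset_ = 0;
  bool inData_ = false;
  std::vector<Symbol> symbols_;
};
```

In this model, switching to Thumb, emitting 8 bytes of code, opening a data
region, emitting 4 bytes, and closing the region records $d at offset 8 and $t
at offset 12. Reading the mode when the region ends, instead of restoring a
copy saved when it began, is one possible way to cope with .s files that
change mode in unexpected places.
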

> Generating or not, LLVM's disassembler should know about those symbols
> and should be able to mark them accordingly. Where would be the best
> place to put those symbols (in an enum or table), so that the
> MCStreamer and the disassembler could reference a single place?

It's not the disassembler itself that should know about them, but the driver
for the disassembler. In this case, llvm-objdump. The disassembler doesn't
have that kind of gestalt knowledge.
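
To make the driver-side idea concrete, here is a minimal sketch of how a tool
could use mapping symbols to decide how to treat each address before calling
the disassembler. It is not taken from llvm-objdump: RegionKind, MappingSymbol,
kindOf, and regionAt are hypothetical names, and the symbol list is assumed to
have been collected from the object file's symbol table and sorted by address
beforehand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// What the driver should do with the bytes at a given address.
enum class RegionKind { Arm, Thumb, Data, Unknown };

// Hypothetical record for one mapping symbol read from the symbol table.
struct MappingSymbol {
  std::uint64_t address;
  std::string name; // "$a", "$t", or "$d", optionally followed by ".suffix"
};

// Classify a symbol name. Anything that is not a mapping symbol is Unknown.
RegionKind kindOf(const std::string &name) {
  if (name.size() < 2 || name[0] != '$')
    return RegionKind::Unknown;
  if (name.size() > 2 && name[2] != '.')
    return RegionKind::Unknown;
  switch (name[1]) {
  case 'a': return RegionKind::Arm;
  case 't': return RegionKind::Thumb;
  case 'd': return RegionKind::Data;
  default:  return RegionKind::Unknown;
  }
}

// Return the kind of the region containing `address`: the last mapping
// symbol at or before that address decides. `sorted` must be ordered by
// ascending address.
RegionKind regionAt(const std::vector<MappingSymbol> &sorted,
                    std::uint64_t address) {
  assert(std::is_sorted(sorted.begin(), sorted.end(),
                        [](const MappingSymbol &l, const MappingSymbol &r) {
                          return l.address < r.address;
                        }));
  // Find the first symbol strictly after `address`, then step back one.
  auto it = std::upper_bound(
      sorted.begin(), sorted.end(), address,
      [](std::uint64_t addr, const MappingSymbol &s) {
        return addr < s.address;
      });
  if (it == sorted.begin())
    return RegionKind::Unknown; // no mapping symbol covers this address
  return kindOf(std::prev(it)->name);
}
```

A driver built along these lines would call regionAt for each offset, hand Arm
and Thumb ranges to the matching disassembler, and print Data ranges as raw
words, which keeps the disassembler itself unaware of the symbols.
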

-Jim

> 
> -- 
> cheers,
> --renato
> 
> http://systemcall.org/

Renato Golin

2012-Oct-10 19:05 UTC

Thanks Jim!

I have updated the bug with your comments, I think it's a good start.

Greg, let me know if that's not enough, I think I can help you from now on.

cheers,
--renato


-- 
cheers,
--renato

http://systemcall.org/

Jim Grosbach

2012-Oct-10 20:05 UTC

Cool; glad to help.

When I added the data region bits, I tried to keep the ARM-style annotations in
mind a bit, so hopefully things will fit together without too much trouble.

-Jim

