thr3ads.net - llvm dev - [LLVMdev] [lld] ARM/Thumb atom forming [Jan 2015]


Denis Protivensky

2015-Jan-12 14:22 UTC

[LLVMdev] [lld] ARM/Thumb atom forming

Thanks, Shankar.

I needed to override all the places where st_value had been used, and it worked.

But then another problem appeared: after correcting all the atoms, I can no
longer distinguish between ARM and Thumb symbols in the later stages when
applying relocation fixups.

I used to check targetVAddress (in terms of the relocation handler), since its
least significant bit was 1 when addressing Thumb symbols. Now the least
significant bit of targetVAddress is always 0, because atoms are properly
aligned and have proper contents.
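
To illustrate the check described above, here is a minimal sketch of testing bit 0 of a target address, and of how that information disappears once the bit is stripped. The helper names hasThumbBit and stripThumbBit are hypothetical and are not part of lld.

```cpp
#include <cstdint>

// Hypothetical helper: true when an address carries the Thumb marker in bit 0.
constexpr bool hasThumbBit(std::uint64_t addr) { return (addr & 1u) != 0; }

// Hypothetical helper: the aligned address once the marker is stripped.
constexpr std::uint64_t stripThumbBit(std::uint64_t addr) {
  return addr & ~std::uint64_t{1};
}

// A Thumb symbol value as stored in st_value (bit 0 set).
static_assert(hasThumbBit(0x8001), "raw value still identifies Thumb code");

// After alignment the address is correct, but the marker is gone.
static_assert(stripThumbBit(0x8001) == 0x8000, "aligned address");
static_assert(!hasThumbBit(stripThumbBit(0x8001)), "Thumb information lost");
```

Once the atoms are built from the stripped value, the instruction set has to be recorded somewhere other than the address itself.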

I tried a workaround that uses dyn_cast to retrieve the information from the
overridden ARMELFDefinedAtoms, but the subclasses of DefinedAtom do not support
dyn_cast.

More generally, I would describe the issue as an inability to pass extra
information between linking stages (passes).
Is there a way to do that?

The solution I see is to add a kind of custom context with an abstract
interface that is passed along through the different stages and cast directly
to a specific implementation where needed. That's a lot of changes, though, so
I'd like to hear more thoughts.
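
The custom-context idea could look roughly like the sketch below. It is only an illustration of the proposal, not lld code: StageContext, ARMStageContext, readSymbol and relocTarget are hypothetical names, and the sketch assumes for simplicity that an aligned address identifies a symbol uniquely.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical abstract context handed from one linking stage to the next.
class StageContext {
public:
  virtual ~StageContext() = default;
};

// Hypothetical ARM implementation: remembers which aligned addresses are Thumb.
class ARMStageContext : public StageContext {
public:
  void markThumb(std::uint64_t addr) { thumb_.insert(addr); }
  bool isThumb(std::uint64_t addr) const { return thumb_.count(addr) != 0; }

private:
  std::set<std::uint64_t> thumb_;
};

// Reader stage: record the Thumb marker before it is stripped from st_value.
void readSymbol(StageContext &ctx, std::uint64_t stValue) {
  // Direct cast to the target-specific implementation, as proposed.
  auto &arm = static_cast<ARMStageContext &>(ctx);
  if (stValue & 1)
    arm.markThumb(stValue & ~std::uint64_t{1});
}

// Relocation stage: restore bit 0 for Thumb targets.
std::uint64_t relocTarget(StageContext &ctx, std::uint64_t vaddr) {
  assert((vaddr & 1) == 0 && "atom addresses are expected to be aligned");
  auto &arm = static_cast<ARMStageContext &>(ctx);
  return arm.isThumb(vaddr) ? (vaddr | 1) : vaddr;
}
```

The unchecked static_cast is safe only if every stage is guaranteed to receive the ARM context, which is part of why this approach touches many places.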

Regards,
  Denis.

On 01/02/2015 09:32 PM, Shankar Easwaran wrote:

You could just override symbolContentSize in the ARM Reader (strip the
least significant bit, which indicates Thumb).

Shankar Easwaran

On 12/24/2014 3:09 AM, Denis Protivensky wrote:
> Hi guys,
>
> I'm working on ARM architecture support for lld.
> I ran into the problem with ARM/Thumb symbols described below.
>
> The ARM ELF reference specifies that symbols addressing Thumb instructions
> have bit 0 of the st_value field set (see 4.5.3).
> The general ELF reference says that st_value holds an offset from the
> beginning of the section in relocatable files, and a virtual address in
> executable files and shared objects (see Chapter 4 - Symbol Values).
>
> When atoms are created in ELFFile::createAtoms, their content size,
> content data, and addresses are formed using st_value.
> Since st_value has bit 0 set for symbols addressing Thumb
> instructions, the corresponding atoms' addresses are always
> one byte ahead of the real values.
> Content size and, therefore, content data may also be wrong for both ARM
> and Thumb symbols depending on their order (see ELFFile::symbolContentSize):
> when the content size is calculated, it takes the difference between the
> offsets of two adjacent symbols, and if one of them is Thumb and the other
> is not, the resulting value will be one byte smaller or one byte larger than
> expected.
> The atom's content data is therefore also malformed, since it is read using
> the miscalculated content size.
>
> This wrong behavior results in:
> - situations where the very first instruction of an atom has its first
> byte set to zero
> (if there's a gap between the previous atom and the current one, the
> initial instruction's first byte is skipped)
> - situations where the very first instruction is split between two atoms
> (the right atom, which should hold the instruction, and the
> previous one, which "stole" the very first byte of the initial instruction)
>
> Is there a way to override this behavior so that both ARM and Thumb atoms
> are formed correctly, and so that I can distinguish between them in later
> stages for proper relocation calculations?
>
> Regards!
>
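
The size miscalculation described in the quoted message can be reproduced with a few lines of arithmetic. The sketch below is not lld's symbolContentSize; Sym, rawSize and fixedSize are hypothetical names, and it assumes three adjacent function symbols that are each 8 bytes long, with the middle one being Thumb.

```cpp
#include <cstdint>

// Hypothetical stand-in for an ELF symbol: only the raw st_value matters here.
struct Sym {
  std::uint64_t stValue;
};

// Size taken as the difference between adjacent raw symbol values.
constexpr std::uint64_t rawSize(Sym cur, Sym next) {
  return next.stValue - cur.stValue;
}

// Symbol value with the Thumb marker (bit 0) stripped.
constexpr std::uint64_t alignedValue(Sym s) {
  return s.stValue & ~std::uint64_t{1};
}

// Size taken as the difference between adjacent aligned symbol values.
constexpr std::uint64_t fixedSize(Sym cur, Sym next) {
  return alignedValue(next) - alignedValue(cur);
}

// ARM function at offset 0x00, Thumb function at 0x08 (stored as 0x09),
// ARM function at 0x10.
constexpr Sym armFn{0x00}, thumbFn{0x09}, nextArmFn{0x10};

static_assert(rawSize(armFn, thumbFn) == 9, "ARM atom is one byte too large");
static_assert(rawSize(thumbFn, nextArmFn) == 7, "Thumb atom is one byte too small");
static_assert(fixedSize(armFn, thumbFn) == 8, "correct after stripping bit 0");
static_assert(fixedSize(thumbFn, nextArmFn) == 8, "correct after stripping bit 0");
```

The extra byte in the ARM atom is the first byte of the Thumb function, which matches the "stolen byte" symptom in the message.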

Shankar Easwaran

2015-Jan-12 16:08 UTC


[LLVMdev] [lld] ARM/Thumb atom forming

You could use the codemodel to mark Thumb code as Thumb.
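
As a rough sketch of this suggestion, the instruction set can be stored as a per-atom attribute when the symbol is read and consulted later by the relocation handler. The names CodeModel, AtomInfo, makeAtomInfo and targetForRelocation are hypothetical; lld's actual code model attribute and its values may differ.

```cpp
#include <cstdint>

// Hypothetical code model tag carried by each atom.
enum class CodeModel { ARM, Thumb };

// Hypothetical per-atom record: aligned address plus code model.
struct AtomInfo {
  std::uint64_t address;
  CodeModel model;
};

// Reader side: strip bit 0 from st_value and remember what it meant.
constexpr AtomInfo makeAtomInfo(std::uint64_t stValue) {
  return AtomInfo{stValue & ~std::uint64_t{1},
                  (stValue & 1) ? CodeModel::Thumb : CodeModel::ARM};
}

// Relocation side: put bit 0 back for Thumb targets.
constexpr std::uint64_t targetForRelocation(AtomInfo atom) {
  return atom.model == CodeModel::Thumb ? (atom.address | 1) : atom.address;
}

static_assert(makeAtomInfo(0x8001).address == 0x8000, "atom is aligned");
static_assert(makeAtomInfo(0x8001).model == CodeModel::Thumb, "marker kept");
static_assert(targetForRelocation(makeAtomInfo(0x8001)) == 0x8001, "Thumb");
static_assert(targetForRelocation(makeAtomInfo(0x9000)) == 0x9000, "ARM");
```

Compared with a separate context object, the attribute travels with the atom, so no extra state has to be threaded through the passes.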

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the
Linux Foundation

Denis Protivensky

2015-Jan-13 06:44 UTC


[LLVMdev] [lld] ARM/Thumb atom forming

Shankar, thank you again.
Now I see how this trick is done in MIPS, so I'll use it as a reference.

- Denis.

