thr3ads.net - llvm dev - [LLVMdev] [lld][ELF] How to transfer st_other field value from input to output file [Nov 2014]

Nick Kledzik

2014-Nov-11 19:19 UTC

[LLVMdev] [lld][ELF] How to transfer st_other field value from input to output file

I had a similar issue with arm vs thumb in mach-o.  Each function’s thumbness is
marked in its symbol table entry.

But it is even worse, a function could change encoding in the middle (only hand
coded assembly could do this).

My solution was to add a new Reference Kind for mach-o which is the current
instruction encoding.  The offsetInAtom() is the offset where the encoding kind
changes.  Usually there is just one at offset zero that sets the encoding for
the whole function.  So determining the thumbness requires scanning the
References.  But it turns out in practice the scan is rarely done because the
result can be cached by whatever algorithm needs that info.
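
As a rough illustration of the scan described above, here is a minimal C++ sketch. The `EncodingRef` struct, the `Encoding` enum, and both helper functions are hypothetical stand-ins, not lld's actual Reference API; only the idea of a marker carrying an `offsetInAtom` comes from this message. It assumes the markers are sorted by offset and that code with no marker defaults to ARM.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; lld's real Reference and
// DefinedAtom classes look different.
enum class Encoding { Arm, Thumb };

struct EncodingRef {
  uint32_t offsetInAtom;  // offset within the atom where the encoding changes
  Encoding encoding;      // encoding in effect from that offset onward
};

// Returns the encoding in effect at `offset`. Assumes `refs` is sorted by
// offsetInAtom. If no marker lies at or before `offset`, ARM is assumed.
inline Encoding encodingAt(const std::vector<EncodingRef> &refs,
                           uint32_t offset) {
  Encoding current = Encoding::Arm;
  for (const EncodingRef &ref : refs) {
    if (ref.offsetInAtom > offset)
      break;  // later markers do not affect this offset
    current = ref.encoding;
  }
  return current;
}

// Common case: a single marker at offset zero sets the encoding for the
// whole function, so the entry-point encoding answers the question.
inline bool isThumbFunction(const std::vector<EncodingRef> &refs) {
  return encodingAt(refs, 0) == Encoding::Thumb;
}
```

A caller that needs the answer repeatedly could store the result of `isThumbFunction` once per atom instead of rescanning, which is the caching mentioned above.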

-Nick


On Nov 11, 2014, at 6:50 AM, Simon Atanasyan <simon at atanasyan.com> wrote:
> I was too optimistic. It is possible to use the contentTypes field for
> handling STO_MICROMIPS and I have a working solution, but the solution
> is really ugly. This approach has at least the following two
> shortcomings:
> 
> 1. A MIPS ELF symbol can hold multiple STO_xxx flags stored in the
> st_other field (STO_MIPS_PIC, STO_MIPS_MICROMIPS, STO_MIPS_MIPS16
> ...). Sometimes these flags can even be combined. If we use the
> contentTypes field, we have to define a separate ContentType flag for
> each such combination. So we get a combinatorial explosion.
> 
> 2. If we handle MIPS specific ContentType flags together with other
> flags, it pollutes the common ELF code. If we factor out the
> processing of MIPS specific flags, we have to duplicate code because a
> symbol with, say, the STO_MICROMIPS flag should be processed (setup size,
> permissions etc) the same way as a regular DefinedAtom::typeCode
> symbol.
> 
> I considered creating a map symbol name => symbol flags, filling this
> map while reading object files, and using the map while writing a linked
> file. But I need to handle both local and global symbols and it is
> possible to get symbols with the same name.
> 
> It looks like the only solution (if I am not missing anything else) is to
> add one more field to the DefinedAtom class to hold a
> target/architecture specific set of flags and modify the Native and YAML
> formats correspondingly. Interpretation of this field is completely
> target/architecture dependent.
> 
> Any opinions?
> 
> On Thu, Nov 6, 2014 at 7:09 PM, Simon Atanasyan <simon at atanasyan.com> wrote:
>> The STO_MIPS16 and STO_MICROMIPS flags denote that the symbol uses a
>> different "compressed" instruction encoding. Both these flags can be
>> combined with the usual "visibility" flags.
>> 
>> It looks like adding a new flag to the contentTypes set might solve
>> the problem. Thanks for the idea. I will try to implement it.
>> 
>> On Thu, Nov 6, 2014 at 6:52 PM, Shankar Easwaran
>> <shankare at codeaurora.org> wrote:
>>> One way to do that is to add a new visibility / contentTypes value
>>> (whatever is relevant) for each of the values st_other picks?
>>> 
>>> What are the other values st_other can take on MIPS?
>>> 
>>> On 11/6/2014 8:50 AM, Simon Atanasyan wrote:
>>>> On MIPS, the st_other field in the ELF symbol table might contain some
>>>> additional MIPS-specific flags besides the visibility ones. These flags
>>>> should be copied to the output linked file. If YAML => Native
>>>> conversion is switched off, there is no problem. But in case of the
>>>> conversion we lose the st_other field values.
>>>> 
>>>> So I need advice on how to keep this information. Is it a good idea to
>>>> extend the YAML and Native formats to store these data? Are there any
>>>> alternative solutions?
> 
> -- 
> Simon Atanasyan
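
The st_other round trip discussed in the quoted messages can be sketched as follows. This is only an illustration under stated assumptions: the `SymbolOther` struct and both helper functions are hypothetical names, not part of lld. The one fact relied on is that the generic ELF ABI keeps symbol visibility in the low two bits of st_other; everything above those bits is treated here as an opaque, target-defined value, in the spirit of the "target/architecture specific set of flags" proposal.

```cpp
#include <cstdint>

// The generic ELF ABI stores symbol visibility (STV_*) in the low two bits
// of st_other; the remaining bits are available to the target.
constexpr uint8_t kVisibilityMask = 0x3;

// Hypothetical container: visibility handled by common code, the rest kept
// as an opaque value that only the target (e.g. MIPS) interprets.
struct SymbolOther {
  uint8_t visibility;   // STV_* value
  uint8_t targetFlags;  // target-specific bits, copied through unchanged
};

// Reader side: split the raw st_other byte from the input symbol table.
inline SymbolOther splitStOther(uint8_t stOther) {
  return {static_cast<uint8_t>(stOther & kVisibilityMask),
          static_cast<uint8_t>(stOther & ~kVisibilityMask)};
}

// Writer side: rebuild st_other for the output symbol table.
inline uint8_t joinStOther(const SymbolOther &s) {
  return static_cast<uint8_t>((s.visibility & kVisibilityMask) |
                              (s.targetFlags & ~kVisibilityMask));
}
```

Because the target bits travel as one value rather than as one enumerator per combination, combined flags need no extra cases in the common code.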

Shankar Easwaran

2014-Nov-11 19:39 UTC


On 11/11/2014 1:19 PM, Nick Kledzik wrote:
> I had a similar issue with arm vs thumb in mach-o.  Each function’s
> thumbness is marked in its symbol table entry.
>
> But it is even worse, a function could change encoding in the middle (only
> hand coded assembly could do this).
>
> My solution was to add a new Reference Kind for mach-o which is the current
> instruction encoding.  The offsetInAtom() is the offset where the encoding kind
> changes.  Usually there is just one at offset zero that sets the encoding for
> the whole function.  So determining the thumbness requires scanning the
> References.  But it turns out in practice the scan is rarely done because the
> result can be cached by whatever algorithm needs that info.

lld needs to have some way to encode flavor specific attributes/target
specific attributes. This is becoming more important IMHO.

Shankar Easwaran

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the
Linux Foundation

Simon Atanasyan

2014-Nov-11 19:44 UTC


I did the same trick to mark some sort of MIPS GOT entries. And sure I
can do the same thing in the current case. But I think using Reference
as a flag is not a good solution.

Anyway if I do not find a better solution, I will have to use this
workaround once again.


-- 
Simon Atanasyan

Nick Kledzik

2014-Nov-11 19:51 UTC


On Nov 11, 2014, at 11:44 AM, Simon Atanasyan <simon at atanasyan.com> wrote:
> I did the same trick to mark some sort of MIPS GOT entries. And sure I
> can do the same thing in the current case. But I think using Reference
> as a flag is not a good solution.
> 
> Anyway if I do not find a better solution, I will have to use this
> workaround once again.

If you really have multiple different code types in one atom, then you must have
a list of transition points.  The Reference list is a great match for that,
whereas an atom level attribute would not work.

-Nick

Simon Atanasyan

2014-Nov-11 19:51 UTC


On Tue, Nov 11, 2014 at 10:39 PM, Shankar Easwaran
<shankare at codeaurora.org> wrote:
> lld needs to have some way to encode flavor specific attributes/target
> specific attributes. This is becoming more important IMHO.

I agree. I think we cannot fit all our supported architectures and
targets into the common format without any arch/tgt specific
information.

-- 
Simon Atanasyan

Rui Ueyama

2014-Nov-11 19:54 UTC


This falls into the usual question of whether or not we should have a
generic map attached to an atom. You used a reference as an alternative to
the map in this case, but the basic idea is the same.

Although using a reference would be practical, it still feels like a hack to
me. It's awkward at least. Why don't you add an accessor to DefinedAtom for
the attribute you want? We'll have a few or maybe ten more member functions in
DefinedAtom, but it's not bad -- architectures that don't need them are
able to just not use them. And the number of attributes we want is limited
because the number of architectures we want to support in LLD is not that
many.


Nick Kledzik

2014-Nov-11 20:31 UTC


On Nov 11, 2014, at 11:54 AM, Rui Ueyama <ruiu at google.com> wrote:
> This falls into the usual topic that whether or not we should have a
> generic map attached to an atom. You used a reference as an alternative for the
> map in this case but the basic idea is the same.
> 
> Although using a reference would be practical, it still feels a hack to me.
> It's awkward at least. Why don't you add an accessor to the attribute
> you want to DefinedAtom? We'll have a few or maybe ten more member functions
> in DefinedAtom, but it's not bad -- architectures that don't need them
> are able to just not use them. And the number of attributes we want is limited
> because the number of architectures we want to support in LLD is not that many.

If there are architecture/platform specific atom attributes, I’m fine with
adding more accessors to DefinedAtom.  We just need to review them to see if
there are similar needs on multiple flavors and design names and values that are
clear.

Regarding References, the ELF flavor puts the raw ELF relocation type as the
Reference Kind.  Mach-o does not do that.  The mach-o relocation type is only 4
bits.  You need to process lots of other information (including other bits in
the reloc record, the instruction content, and perhaps a “paired” relocation) to
determine the “kind”.  So, Mach-O Reference Kind values are abstract and
internal to the mach-o ArchHandler.  Given that, using a Reference Kind to track
thumbness (which only ArchHandler_arm cares about) works well.

That said, the ability to handle thumb and arm within a function is probably
over engineering.  I’d be fine with adding to DefinedAtom something like:

  enum CodeModel {
    // Note: all these values need word smithing
    codeNA,
    codeMIPS_PIC,
    codeMIPS_micro,
    codeMIPS_16,
    codeARM_16,
    codeARM_32,
  };

virtual CodeModel codeModel() { return codeNA; }
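
To make the proposal above concrete, here is a minimal sketch of how such an accessor might be overridden and consulted. It is an illustration only: the stripped-down `DefinedAtom`, the `MicroMipsAtom` subclass, and the `usesMipsCompressedEncoding` helper are hypothetical and much simpler than lld's real classes. The accessor is also declared `const` here, which is an assumption beyond the snippet above.

```cpp
// Enumerator names taken from the proposal above; still provisional.
enum CodeModel {
  codeNA,
  codeMIPS_PIC,
  codeMIPS_micro,
  codeMIPS_16,
  codeARM_16,
  codeARM_32,
};

// Hypothetical, heavily reduced base class: atoms that do not care about a
// code model simply inherit the default.
class DefinedAtom {
public:
  virtual ~DefinedAtom() = default;
  virtual CodeModel codeModel() const { return codeNA; }
};

// Hypothetical MIPS atom created by the reader for a symbol whose st_other
// carried the microMIPS flag.
class MicroMipsAtom : public DefinedAtom {
public:
  CodeModel codeModel() const override { return codeMIPS_micro; }
};

// Example of target code consulting the accessor: MIPS16 and microMIPS are
// the "compressed" encodings mentioned earlier in the thread.
inline bool usesMipsCompressedEncoding(const DefinedAtom &atom) {
  switch (atom.codeModel()) {
  case codeMIPS_micro:
  case codeMIPS_16:
    return true;
  default:
    return false;
  }
}
```

Common code never needs to inspect the value; only the target-specific writer would translate it back into st_other bits.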

-Nick

