thr3ads.net - llvm dev - [LLVMdev] [lld] Undefined symbols postprocessing [Feb 2015]

Shankar Easwaran

2015-Feb-19 17:15 UTC

[LLVMdev] [lld] Undefined symbols postprocessing

+ Nick

On 2/19/2015 9:00 AM, Shankar Easwaran wrote:
> On 2/19/2015 3:58 AM, Denis Protivensky wrote:
>> Joerg:
>>> I propose to add the ability to ignore undefined symbols during initial
>>> resolution, and then postprocess only those undefines for the second
>>> time after the pass manager execution.
>> Do you want to do that before or after dead code elimination?
>> I think dead code elimination should be performed after all possible 
>> object code modifications done by lld. Therefore, it should be done 
>> after undefines' postprocessing as well.
> GNU does dead code elimination before undefines are reported. So if a
> function is not called and it has an undefined reference, that reference
> would not be reported as an undef.
>>
>> Shankar:
>>> I propose to add the ability to ignore undefined symbols during initial
>>> resolution, and then postprocess only those undefines for the second
>>> time after the pass manager execution.
>> I came across this same problem, and was planning on adding a
>> notifyUndefinedSymbol to the LinkingContext. If the linker wants to add
>> a defined symbol and coalesce it, that would then be possible.
>>
>> Do you think this will work for your case too?
>> With this option, I don't see:
>> - how to postpone processing and reaction on undefines. If the 
>> callback is called from within Resolver::resolve(), you should react 
>> on it immediately, because otherwise the code will still fail in 
>> Resolver::resolve().
>> - how to know if a symbol is needed within the callback body. The 
>> need of any symbol is determined in some other place. So I need to 
>> keep a sort of indication (boolean flags, whatever) to know which 
>> symbols are really needed.
>> - the exact interface of the notifyUndefinedSymbol callback. If it
>> receives the `StringRef` name of the undefined symbol, what should the
>> reaction be? Should it return new symbols to add back to the caller as
>> `const Atom*`?
> notifyUndefinedSymbol will allow the context to coalesce the undefined 
> atom with a defined atom.
>
> Atom *notifyUndefinedSymbol(StringRef name) could be the interface.
>
>> Thanks,
>>    Denis.
>>
>
>

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the
Linux Foundation
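
The hook proposed above, `Atom *notifyUndefinedSymbol(StringRef name)`, could look roughly like the sketch below. This is not lld code: `Atom`, `LinkingContext`, `ELFLikeContext` and `resolveUndefines` are simplified stand-ins invented for illustration, `std::string` replaces `StringRef`, and the only special name handled is `_GLOBAL_OFFSET_TABLE_`. It assumes the resolver asks the context once per undefined name and coalesces with whatever defined atom comes back.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for lld's Atom; only the name and definedness matter here.
struct Atom {
  std::string name;
  bool defined;
};

// Hypothetical context that owns any atoms it creates on demand.
class LinkingContext {
public:
  virtual ~LinkingContext() = default;

  // Returns a defined atom to coalesce with the undefined one, or nullptr
  // if the context has nothing to offer for this name.
  virtual Atom *notifyUndefinedSymbol(const std::string & /*name*/) {
    return nullptr;
  }

protected:
  Atom *makeDefined(const std::string &name) {
    owned_.push_back(std::make_unique<Atom>(Atom{name, true}));
    return owned_.back().get();
  }

private:
  std::vector<std::unique_ptr<Atom>> owned_;
};

// Hypothetical target context that knows one linker-provided symbol.
class ELFLikeContext : public LinkingContext {
public:
  Atom *notifyUndefinedSymbol(const std::string &name) override {
    if (name == "_GLOBAL_OFFSET_TABLE_")
      return makeDefined(name);
    return nullptr;
  }
};

// Simplified resolver step: offers every undefined name to the context and
// returns the names that are still undefined afterwards.
std::vector<std::string>
resolveUndefines(LinkingContext &ctx, std::map<std::string, Atom *> &symtab) {
  std::vector<std::string> unresolved;
  for (auto &entry : symtab) {
    assert(entry.second && "symbol table entries must point at an atom");
    if (entry.second->defined)
      continue;
    if (Atom *replacement = ctx.notifyUndefinedSymbol(entry.first))
      entry.second = replacement; // coalesce with the defined atom
    else
      unresolved.push_back(entry.first);
  }
  return unresolved;
}
```

Note that in this shape the hook has to answer immediately, from inside resolution. That is the timing constraint Denis raises in his reply.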

Denis Protivensky

2015-Feb-20 06:40 UTC

Shankar,

Okay, I guessed the correct interface.
But what about the moment at which the function is called?
If it's called from Resolver::resolve(), it doesn't help me, because I
cannot determine at that point whether specific symbols are needed.

- Denis.

Denis Protivensky

2015-Feb-20 06:59 UTC

Joerg:
> How do you then make sure to not export redundant symbols? Consider
> _GLOBAL_OFFSET_TABLE_ -- if the only user is in a dead function, it
> should not be in the symbol table. Same for __tls_get_addr.

I agree that dead code elimination needs additional consideration, but my
problem is that lld pollutes the symbol table by inserting symbols
unconditionally. I'd like to solve this problem first, as it generates even
more redundant symbols right now.
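
The ordering discussed in this thread, where dead code is stripped before undefined symbols are considered, can be sketched as a reachability walk followed by a scan of the surviving atoms only. This is a toy model, not lld's implementation: `AtomGraph`, `markLive` and `liveUndefines` are hypothetical names, and atoms are reduced to a name plus the names they reference.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical atom graph: each defined atom lists the names it references.
using AtomGraph = std::map<std::string, std::vector<std::string>>;

// Marks every defined atom reachable from the roots.
std::set<std::string> markLive(const AtomGraph &defined,
                               const std::vector<std::string> &roots) {
  std::set<std::string> live;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string name = worklist.back();
    worklist.pop_back();
    auto it = defined.find(name);
    if (it == defined.end() || !live.insert(name).second)
      continue; // undefined name, or atom already visited
    for (const std::string &target : it->second)
      worklist.push_back(target);
  }
  return live;
}

// Collects the names referenced by live atoms that have no definition.
// References held only by dead atoms never show up here.
std::set<std::string> liveUndefines(const AtomGraph &defined,
                                    const std::vector<std::string> &roots) {
  std::set<std::string> result;
  for (const std::string &name : markLive(defined, roots)) {
    assert(defined.count(name) && "live atoms are always defined atoms");
    for (const std::string &target : defined.at(name))
      if (!defined.count(target))
        result.insert(target);
  }
  return result;
}
```

With a graph in which only a dead function refers to `_GLOBAL_OFFSET_TABLE_`, that name is never collected, so nothing would need to be created or exported for it.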

Rui:
> I don't know if this is directly applicable to your problem, but for PE/COFF
> I needed to add symbols conditionally. If you have a function func and
> there's a reference to __imp_func, the linker needs to create a data atom
> containing the address of func as the content of __imp_func. It's rarely
> used, so I wanted to create the __imp_ atom only when there's an unresolved
> reference to that symbol.
>
> What I did at the time was to define a (virtual) library file which
> dynamically creates an atom. The virtual library file is added at the end of
> the input file list, and if the core linker looks it up for a symbol starting
> with __imp_, the library creates an object file containing the symbol on the
> fly and returns it.
>
> My experience is that it worked but might have been too tricky. If this
> trick is directly applicable to your problem, you may want to do that. If
> not, I'm perhaps okay with your suggestion (although I haven't thought hard
> about it yet).

Looks like your trick won't work for me, because the virtual library you add
is parsed in the Resolver::resolve() method, where I don't have enough
information to decide whether to add specific symbols. I can only do that in
the relocation pass (or some other pass if needed), which runs after symbol
resolution.

Thanks,
  Denis.
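
For reference, the virtual-library trick Rui describes can be modelled as an archive-like object that is searched last and builds its members lazily. The sketch below is an illustration only: `DataAtom` and `ImpVirtualArchive` are invented names, and a real implementation would return a file containing the atom rather than the atom itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Stand-in for the atom produced by the virtual archive.
struct DataAtom {
  std::string name;   // e.g. "__imp_func"
  std::string target; // symbol whose address the atom holds, e.g. "func"
};

// Hypothetical "virtual library": placed at the end of the input list and
// consulted only for names that nothing else defined.
class ImpVirtualArchive {
public:
  // Returns an atom only for names of the form __imp_<sym>. The result is
  // cached so repeated lookups yield the same atom.
  const DataAtom *find(const std::string &name) {
    static const std::string prefix = "__imp_";
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0)
      return nullptr;
    auto &slot = cache_[name];
    if (!slot)
      slot = std::make_unique<DataAtom>(
          DataAtom{name, name.substr(prefix.size())});
    assert(slot->name == name);
    return slot.get();
  }

  // Number of atoms synthesized so far.
  std::size_t created() const { return cache_.size(); }

private:
  std::map<std::string, std::unique_ptr<DataAtom>> cache_;
};
```

Because `find` is driven by the resolver's lookups, the atom is created during symbol resolution. That is why the approach does not cover Denis's case, where the need for a symbol only becomes known in a later pass.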

Shankar Easwaran

2015-Feb-23 19:33 UTC

On 2/20/2015 12:59 AM, Denis Protivensky wrote:
> Joerg:
>> How do you then make sure to not export redundant symbols? Consider
>> _GLOBAL_OFFSET_TABLE_ -- if the only user is in a dead function, it
>> should not be in the symbol table. Same for __tls_get_addr.
>
> I agree that dead code elimination needs additional consideration, but my
> problem is that lld pollutes the symbol table by inserting symbols
> unconditionally. I'd like to solve this problem first, as it generates even
> more redundant symbols right now.
>
> Rui:
>> I don't know if this is directly applicable to your problem, but for
>> PE/COFF I needed to add symbols conditionally. If you have a function func
>> and there's a reference to __imp_func, the linker needs to create a data
>> atom containing the address of func as the content of __imp_func. It's
>> rarely used, so I wanted to create the __imp_ atom only when there's an
>> unresolved reference to that symbol.
>>
>> What I did at the time was to define a (virtual) library file which
>> dynamically creates an atom. The virtual library file is added at the end
>> of the input file list, and if the core linker looks it up for a symbol
>> starting with __imp_, the library creates an object file containing the
>> symbol on the fly and returns it.
>>
>> My experience is that it worked but might have been too tricky. If this
>> trick is directly applicable to your problem, you may want to do that. If
>> not, I'm perhaps okay with your suggestion (although I haven't thought
>> hard about it yet).
>
> Looks like your trick won't work for me, because the virtual library you
> add is parsed in the Resolver::resolve() method, where I don't have enough
> information to decide whether to add specific symbols. I can only do that
> in the relocation pass (or some other pass if needed), which runs after
> symbol resolution.

Not sure why you want to call the Resolver again; wouldn't this API
suffice? You can have a new API in the symbol table class, called from
the resolver, with the list of undefined symbols.

Shankar Easwaran

Michael Spencer

2015-Feb-23 22:52 UTC

On Thu, Feb 19, 2015 at 10:40 PM, Denis Protivensky
<dprotivensky at accesssoftek.com> wrote:
> Shankar,
>
> Okay, I guessed the correct interface.
> But what about the moment at which the function is called?
> If it's called from Resolver::resolve(), it doesn't make any difference to
> me as I cannot determine the need of specific symbols at that time.
>
> - Denis.

None of the symbols we are looking up require the full resolver, and
they are all special linker symbols. I propose two things.

1. Provide a hook for the resolver, as Shankar suggested. User
references to linker-defined symbols such as _GLOBAL_OFFSET_TABLE_ get
created and possibly dead-stripped here. The linking context owns the
atom.
2. The ELFLinkingContext gains
`Atom *getOrCreateLinkerDefinedAtom(StringRef);`. This can be used in passes
to get the symbols. The hook in (1) would call this to create the
atoms.

This gives a single place where linker defined atoms are actually
created, and allows correct deadstripping and object file references
without doing multiple resolver passes.

- Michael Spencer
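
A rough sketch of how the two pieces proposed above might fit together is given below. It is only an illustration: `ElfContextSketch` is an invented stand-in for ELFLinkingContext, `std::string` replaces `StringRef`, and the set of linker-defined names is an assumption containing just `_GLOBAL_OFFSET_TABLE_`. The point it shows is the single creation point: the resolver hook and any later pass both go through `getOrCreateLinkerDefinedAtom`, so they end up with the same context-owned atom.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <string>

// Stand-in for lld's Atom.
struct Atom {
  std::string name;
};

// Hypothetical stand-in for ELFLinkingContext.
class ElfContextSketch {
public:
  // The list of linker-defined names is illustrative, not lld's real list.
  ElfContextSketch() : linkerDefined_{"_GLOBAL_OFFSET_TABLE_"} {}

  // Single creation point; the context owns every atom it hands out.
  // Passes can call this directly once they know a symbol is needed.
  Atom *getOrCreateLinkerDefinedAtom(const std::string &name) {
    assert(linkerDefined_.count(name) && "not a linker-defined symbol");
    auto &slot = atoms_[name];
    if (!slot)
      slot = std::make_unique<Atom>(Atom{name});
    return slot.get();
  }

  // Resolver hook from item (1): handles user references to linker-defined
  // symbols and declines everything else.
  Atom *notifyUndefinedSymbol(const std::string &name) {
    return linkerDefined_.count(name) ? getOrCreateLinkerDefinedAtom(name)
                                      : nullptr;
  }

  // Number of linker-defined atoms created so far.
  std::size_t atomCount() const { return atoms_.size(); }

private:
  std::set<std::string> linkerDefined_;
  std::map<std::string, std::unique_ptr<Atom>> atoms_;
};
```

Nothing is created until either the resolver sees a user reference or a pass asks for the symbol, which is what avoids the unconditional symbol-table entries discussed earlier in the thread.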
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
