Hi All,

I have a design question about how suitable your linker would be for modeling ELF semantics.

The ELF linker needs to be able to read relocations ahead of symbol resolution for the following use cases:

- Add linker-defined symbols if there is a relocation to the symbol (examples: defsym, PROVIDE).
- Don't halt the link if there are undefined symbols that are not reachable from the root set (do garbage collection first, then report whether the symbols are really undefined).
- A reference to a symbol inside a group, from outside the group, needs to go through an undefined symbol.
- For string merging, relocations are needed in advance, before the strings can be merged.
- For identical code folding, relocations are needed in advance, before the code can be folded.

There are also use cases that involve a section rather than a symbol, for example:

- Sections that contain mergeable strings (.rodata).
- Sections that contain EH frame information, where FDEs are discarded for functions that are garbage collected.

So I was trying to figure out how the Chunks and relocations would be related in the Reader, which means that it would be very similar to what we have with the Atom model.

Thoughts / opinions?

Shankar Easwaran
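To make the string-merging case concrete, here is a minimal Python sketch. It is not lld code: the function names, the section names, and the NUL-terminated layout are assumptions made for illustration. It shows why relocations matter here: once identical strings are merged, every relocation that pointed at an old (section, offset) pair has to be redirected through a remapping table.

```python
def split_strings(data):
    """Map each string's start offset to its bytes, NUL terminator included.

    Assumes the section is well formed, i.e. every string ends with a NUL.
    """
    pieces = {}
    start = 0
    while start < len(data):
        end = data.index(b"\0", start) + 1
        pieces[start] = data[start:end]
        start = end
    return pieces


def merge_string_sections(sections):
    """Merge mergeable-string sections and record where each string moved.

    sections: dict of hypothetical section name -> raw bytes.
    Returns (merged bytes, remap), where remap maps
    (section name, old offset) -> offset in the merged output.
    """
    merged = bytearray()
    offset_of = {}  # string bytes -> offset in the merged output
    remap = {}
    for name, data in sections.items():
        for old_offset, piece in split_strings(data).items():
            if piece not in offset_of:
                offset_of[piece] = len(merged)
                merged += piece
            remap[(name, old_offset)] = offset_of[piece]
    return bytes(merged), remap
```

A relocation that targeted offset 3 of the second input section would be rewritten to use `remap[(second_section, 3)]`. A real linker would also have to handle relocation addends that point into the middle of a string, which this sketch ignores.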
On Fri, Jun 5, 2015 at 4:48 PM, Shankar Easwaran <shankarke at gmail.com> wrote:

> Hi All,
>
> I have a design question about how suitable your linker would be for modeling ELF semantics.
>
> The ELF linker needs to be able to read relocations ahead of symbol resolution for the following use cases:
>
> - Add linker-defined symbols if there is a relocation to the symbol (examples: defsym, PROVIDE).

The symbol table contains both undefined and defined symbols. We know which symbols need to be resolved to link that file correctly without reading the relocation table.

> - Don't halt the link if there are undefined symbols that are not reachable from the root set (do garbage collection first, then report whether the symbols are really undefined).

Dead-stripping is done after eliminating duplicate COMDAT symbols. Unreferenced symbols are naturally ignored.

> - A reference to a symbol inside a group, from outside the group, needs to go through an undefined symbol.

I don't get the meaning of the question.

> - For string merging, relocations are needed in advance, before the strings can be merged.
> - For identical code folding, relocations are needed in advance, before the code can be folded.

These happen after symbol resolution, and you do need to read the relocation table for them.

> There are also use cases that involve a section rather than a symbol, for example:
>
> - Sections that contain mergeable strings (.rodata).
> - Sections that contain EH frame information, where FDEs are discarded for functions that are garbage collected.
>
> So I was trying to figure out how the Chunks and relocations would be related in the Reader, which means that it would be very similar to what we have with the Atom model.
>
> Thoughts / opinions?
>
> Shankar Easwaran
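The point about the symbol table can be sketched as follows. This is a simplified Python illustration, not the actual resolver: the `resolve` function and its input shape are assumptions. Each object file contributes its defined and undefined symbol names, and a PROVIDE-style symbol is added only when something still references it after resolution, all without looking at any relocation.

```python
def resolve(files, provided):
    """Resolve symbols using only per-file symbol tables.

    files: list of (defined_names, undefined_names) pairs, one per object file.
    provided: dict of PROVIDE-style symbol name -> value (hypothetical input).
    Returns (added, still_undefined): the provided symbols that were actually
    needed, and the names that remain undefined afterwards.
    """
    defined = set()
    undefined = set()
    for defs, undefs in files:
        defined |= set(defs)
        undefined |= set(undefs)

    unresolved = undefined - defined
    # A provided symbol is defined only if it is referenced and not
    # already defined by an input file.
    added = {name: value for name, value in provided.items() if name in unresolved}
    still_undefined = unresolved - set(added)
    return added, still_undefined
```

For example, with one file defining `main` and referencing `foo` and `_end`, a second file defining `foo`, and `_end` available as a provided symbol, only `_end` is added and nothing remains undefined. Archives, weak symbols, and symbol versioning are left out of this sketch.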
Thanks for the reply.

On Jun 5, 2015, at 20:45, Rui Ueyama <ruiu at google.com> wrote:

>> On Fri, Jun 5, 2015 at 4:48 PM, Shankar Easwaran <shankarke at gmail.com> wrote:
>> Hi All,
>>
>> I have a design question about how suitable your linker would be for modeling ELF semantics.
>>
>> The ELF linker needs to be able to read relocations ahead of symbol resolution for the following use cases:
>>
>> - Add linker-defined symbols if there is a relocation to the symbol (examples: defsym, PROVIDE).
>
> The symbol table contains both undefined and defined symbols. We know which symbols need to be resolved to link that file correctly without reading the relocation table.
>
>> - Don't halt the link if there are undefined symbols that are not reachable from the root set (do garbage collection first, then report whether the symbols are really undefined).
>
> Dead-stripping is done after eliminating duplicate COMDAT symbols. Unreferenced symbols are naturally ignored.
>
>> - A reference to a symbol inside a group, from outside the group, needs to go through an undefined symbol.
>
> I don't get the meaning of the question.

If foo is in a group and bar, which is outside the group, calls foo, then bar cannot refer to foo directly; it has to use an undefined symbol to refer to it.

>> - For string merging, relocations are needed in advance, before the strings can be merged.
>> - For identical code folding, relocations are needed in advance, before the code can be folded.
>
> These happen after symbol resolution, and you do need to read the relocation table for them.
>
>> There are also use cases that involve a section rather than a symbol, for example:
>>
>> - Sections that contain mergeable strings (.rodata).
>> - Sections that contain EH frame information, where FDEs are discarded for functions that are garbage collected.
>>
>> So I was trying to figure out how the Chunks and relocations would be related in the Reader, which means that it would be very similar to what we have with the Atom model.
>>
>> Thoughts / opinions?
>>
>> Shankar Easwaran
On Jun 5, 2015, at 6:45 PM, Rui Ueyama <ruiu at google.com> wrote:

>> - Don't halt the link if there are undefined symbols that are not reachable from the root set (do garbage collection first, then report whether the symbols are really undefined).
>
> Dead-stripping is done after eliminating duplicate COMDAT symbols. Unreferenced symbols are naturally ignored.

Global symbols that did not make it into the final symbol table (because of coalescing or group COMDAT) are easy to discard. But Darwin supports “dead code stripping”. The way it works is that you start with the atoms (symbols) that must be preserved (for an executable program, that would be “main”), mark them live, and then recursively mark live the atoms they reference. In order to do this, you must parse the relocations and figure out:

1) which function/data each relocation applies to, and
2) what function/data each relocation references.

A couple of interesting points:

a) The master symbol table does not help with dead code stripping, because functions/data are often static or anonymous and thus are not in the master symbol table.
b) It is OK for dead code to reference undefined symbols, since the dead code will be stripped away. The resolver phase normally ends either when there are no undefined symbols remaining or with an error about undefined symbols. But with dead stripping, it is not an error to end with undefined symbols.
c) Once the dead code is identified, any symbols the dead code added to the master symbol table need to be removed.

-Nick
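The marking pass described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the Darwin linker's implementation: atoms are represented as a dict from a hypothetical atom name to the names its relocations reference, and a referenced name with no atom stands for an undefined symbol.

```python
from collections import deque


def dead_strip(atoms, roots):
    """Mark atoms reachable from the roots and classify the rest.

    atoms: dict of atom name -> list of names referenced by its relocations.
    roots: names that must be preserved (for an executable, e.g. "main").
    Returns (live, dead, undefined). Only undefined names reachable from
    live atoms are reported; references made by dead atoms are ignored.
    """
    live = set()
    undefined = set()
    work = deque(roots)
    while work:
        name = work.popleft()
        if name in live or name in undefined:
            continue
        if name not in atoms:
            # Referenced from live code but never defined: a real error.
            undefined.add(name)
            continue
        live.add(name)
        work.extend(atoms[name])
    dead = set(atoms) - live
    return live, dead, undefined
```

With `atoms = {"main": ["helper"], "helper": [], "unused": ["missing"]}` and `roots = ["main"]`, the atom `unused` ends up in `dead` and `missing` is never reported as undefined, which matches point (b). The `dead` set is also what a linker would use for point (c), to remove the entries that dead atoms contributed to the master symbol table.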