Hi Nick,

We have a few symbols, such as __bss_start and __bss_end, that are undefined symbols in the code.

I want a way for the Reader to create specific atoms before the linker bootstraps. I didn't find a way to do that with the existing interfaces.

It needs to work as follows:

1) ReaderELF creates absolute symbols (for __bss_start, __bss_end, etc.)
2) ReaderELF reads each file and adds atoms to the list
3) If the atoms the linker defined were Global, the atoms that the Reader created should get overridden with the linker created ones.

This may also be needed to pull in specific symbols from archive libraries.

I was thinking of adding an interface to ReaderELF that the driver would call, but the problem is that a DefinedAtom/AbsoluteAtom records which file owns the atom.

I was discussing this with Michael, and he proposed adding a Pre-Read file.

Do you have any other opinions?

Thanks
Shankar Easwaran

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
On Dec 7, 2012, at 11:51 AM, Shankar Easwaran wrote:
> I was discussing with Michael on this, and he was proposing to add a Pre-Read file.
>
> Do you have any other opinions too ?

We have a similar requirement in darwin's ld64 linker, but even more general. Any binary can do the following to introspect itself:

    struct stuff { int a; int b; };

    extern struct stuff* stuff_start __asm("section$start$__DATA$__my");
    extern struct stuff* stuff_end __asm("section$end$__DATA$__my");

    void examineSection() {
      const struct stuff* p;
      for (p = stuff_start; p < stuff_end; ++p) {
        // do stuff with p
      }
    }

That is, there are magic symbol names which reference the beginning or ending of any particular section. To support this, the linker lazily creates atoms when references to these magic symbols are discovered during resolving.

I have some hooks for this already in place in lld:

1) There is Writer::addFiles(). This method gives any writer a chance to add files/atoms to the set of atoms the Resolver works on. The Writer::addFiles() method is called after all input files are added. If you want to add something lazily (like the darwin linker does for section$start$ symbols), the writer returns a File object akin to a static library. That is, it provides no initial atoms, but can provide atoms as a last resort (so a .o file would override it). The WriterMachO already uses the addFiles() method to add CRuntime symbols.

2) DefinedAtom::ContentType already has typeFirstInSection and typeLastInSection. These are intended to be used as the content type of the atoms which represent the magic symbols for the start and end of a section. The key here is that the Pass (not written yet) which sorts atoms knows to sort these atoms to the start or end of their respective sections.

If you don't want this fully general, lazy approach, you could have your WriterELF::addFiles() return a regular object file that has atoms named __bss_start and __bss_end, but marked mergeAsWeak so that any user-defined atoms will override them.

-Nick
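The two options Nick describes (a lazy, archive-like file consulted as a last resort, and regular atoms marked mergeAsWeak) can be illustrated with a small model. This is only a sketch under stated assumptions: `Atom`, `Merge`, `LazyFile`, `SymbolTable`, and `sectionBoundsFile` are hypothetical names invented for illustration and are not lld's actual classes or its Resolver.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not lld's real Atom/File/Resolver types.
enum class Merge { Strong, Weak };  // Weak plays the role of mergeAsWeak

struct Atom {
  std::string name;
  Merge merge;
  std::string owner;  // which file supplied the atom
};

// An archive-like file: it offers no atoms up front and is only asked
// for a definition when a name is still undefined.
using LazyFile = std::function<bool(const std::string &, Atom &)>;

class SymbolTable {
public:
  // A strong definition replaces a weak one; a weak definition never
  // replaces an existing atom. Duplicate strong definitions would be an
  // error in a real linker; this sketch simply keeps the first one.
  void add(const Atom &a) {
    auto it = atoms_.find(a.name);
    if (it == atoms_.end()) {
      atoms_.emplace(a.name, a);
    } else if (it->second.merge == Merge::Weak && a.merge == Merge::Strong) {
      it->second = a;
    }
  }

  // Look up a name; consult the lazy files only if nothing defines it yet.
  const Atom *resolve(const std::string &name,
                      const std::vector<LazyFile> &lazyFiles) {
    auto it = atoms_.find(name);
    if (it != atoms_.end())
      return &it->second;
    for (const LazyFile &file : lazyFiles) {
      Atom made;
      if (file(name, made)) {
        add(made);
        return &atoms_.at(name);
      }
    }
    return nullptr;  // still undefined
  }

private:
  std::map<std::string, Atom> atoms_;
};

// Hypothetical lazy file that fabricates atoms for the magic
// section$start$ / section$end$ names on demand.
bool sectionBoundsFile(const std::string &name, Atom &out) {
  if (name.rfind("section$start$", 0) != 0 &&
      name.rfind("section$end$", 0) != 0)
    return false;
  out = Atom{name, Merge::Strong, "<linker>"};
  return true;
}
```

In this model, a weak `__bss_start` added by the writer is replaced when a user object supplies a strong definition, while `section$start$__DATA$__my` is only created when something actually references it.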
Thanks for the reply, Nick.

I will use the Writer::addFiles functionality. Do you want to move the SimpleFile class to lld/Core? It might be useful for other types of object files too (like ELF here).

How does typeFirstInSection/typeLastInSection know that the addresses to use for those symbols are the section start and section end? I didn't see references to typeFirstInSection/typeLastInSection in the MachO part of lld either, so any pointers to how you are doing that would be helpful. Otherwise I would need to duplicate that piece of code, which does not make sense.

Thanks
Shankar Easwaran
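For the question about how typeFirstInSection/typeLastInSection atoms end up with the section's bounds: Nick's reply says the sorting Pass was not yet written, so the following is only a guess at the idea, not lld code. It assumes zero-sized marker atoms, sections laid out back to back with no alignment, and hypothetical names (`Atom`, `ContentType`, `layout`). Once the markers are sorted to the ends of their section, ordinary address assignment gives them the start and end addresses without special-casing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model. The enumerator order doubles as the sort rank
// within a section: first marker, regular content, last marker.
enum class ContentType { FirstInSection, Regular, LastInSection };

struct Atom {
  std::string name;
  std::string section;
  ContentType type;
  uint64_t size;     // marker atoms are zero-sized
  uint64_t address;  // filled in by layout()
};

// Group atoms by section, push marker atoms to the section's edges, then
// assign addresses sequentially starting at `base`.
// Assumption: no alignment padding between atoms or sections.
void layout(std::vector<Atom> &atoms, uint64_t base) {
  // stable_sort keeps regular atoms in their original input order.
  std::stable_sort(atoms.begin(), atoms.end(),
                   [](const Atom &a, const Atom &b) {
                     if (a.section != b.section)
                       return a.section < b.section;
                     return static_cast<int>(a.type) < static_cast<int>(b.type);
                   });
  uint64_t addr = base;
  for (Atom &a : atoms) {
    a.address = addr;
    addr += a.size;
  }
}
```

Because the markers have size zero, the first marker shares its address with the first regular atom and the last marker gets the address one past the final byte of the section.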
Hi Nick,

On 12/7/2012 4:59 PM, Nick Kledzik wrote:
> If you don't want this fully general, lazy approach, you could have your WriterELF::addFiles() return a regular object file that has atoms named __bss_start and __bss_end, but marked mergeAsWeak so that any user-defined atoms will override them.

The case I have is a bit different now.

I added the symbols __bss_start/__bss_end/_end using WriterELF::addFiles(). The symbols get overridden appropriately, but their values are known only after the sections have been merged and virtual addresses have been assigned.

So when I write these atoms to the output file, I want to set the values of these symbols to the values computed by the ELF Writer.

These atoms are NativeAtoms, and I don't see a function to set the value of an atom. How do I go about accomplishing this?

Thanks
Shankar Easwaran
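One possible shape for this, offered as a sketch rather than as what lld does: leave the atoms immutable and have the writer keep a side table of final values, computed after layout and consulted when the symbol table is emitted. Everything here is hypothetical (`Section`, `AbsoluteAtom`, `computeLinkerSymbols`, `symbolValue`), and treating `_end` as the highest address of any laid-out section is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical post-layout view of an output section.
struct Section {
  std::string name;
  uint64_t address;
  uint64_t size;
};

// Hypothetical immutable atom: its value is a placeholder fixed at
// creation time and cannot be changed afterwards.
struct AbsoluteAtom {
  const std::string name;
  const uint64_t value;
};

// After layout, the writer derives the final values of the symbols it
// owns. Assumption: _end is the highest address of any section.
std::map<std::string, uint64_t>
computeLinkerSymbols(const std::vector<Section> &sections) {
  std::map<std::string, uint64_t> values;
  uint64_t imageEnd = 0;
  for (const Section &s : sections) {
    if (s.name == ".bss") {
      values["__bss_start"] = s.address;
      values["__bss_end"] = s.address + s.size;
    }
    imageEnd = std::max(imageEnd, s.address + s.size);
  }
  values["_end"] = imageEnd;
  return values;
}

// When emitting the symbol table, prefer the writer's computed value and
// fall back to the atom's own value for every other symbol.
uint64_t symbolValue(const AbsoluteAtom &atom,
                     const std::map<std::string, uint64_t> &overrides) {
  auto it = overrides.find(atom.name);
  return it != overrides.end() ? it->second : atom.value;
}
```

The design choice is that the atom graph stays read-only; only the writer, which already knows the layout, decides what value is written for the names it injected.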