Hi,

Currently, for ELF, lld ties all atoms in a section together. This proposal relaxes that by handling sections that are known to be safe differently.

This requires NO ELF ABI changes.

Definitions:

A section is not considered safe if code appears between function boundaries, or if data has been placed at the beginning or end of the section without being covered by a symbol.

A section is considered safe if every symbol in the section has been given its appropriate size and there is no data between function boundaries.

An example of a safe section is code generated by a compiler. An example of an unsafe section is hand-written assembly code.

Changes needed:

The compiler emits a section called .safe_sections that lists, by section index, which sections are safe.

The section would carry the SHF_EXCLUDE flag, so that other linkers do not consume it and it does not make it into the output file.

Data structure for this:

    .safe_sections
    <total size>
    <section index> <boolean flag -- safe/unsafe>
    ...
    ...

Advantages:

Atoms within a safe section could be allocated individually in the output file, which means better output file layout and better performance. The layout could be driven by:

a) looking at profile information
b) taking an order file

This would also result in more atoms getting gc'ed.

Changes needed in the assembler:

a) Add an additional flag on the section so that people writing assembly code can mark a section safe or unsafe.

Changes needed in lld:

a) Read the .safe_sections section if it is present in the object file
b) Tie atoms together within a section if the section is not safe

Thanks,

Shankar Easwaran

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
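To make the proposed table concrete, here is a minimal sketch of how a linker-side reader might decode it. The proposal only names the fields (total size, section index, safe/unsafe flag); the little-endian byte order, the 32-bit widths, the one-byte flag, and the helper names `parse_safe_sections` and `sections_to_tie` are all assumptions made for illustration.

```python
import struct

# Assumed encoding (not specified in the proposal):
#   u32 total size of the payload in bytes, including this field
#   then repeated entries of: u32 section index, u8 flag (1 = safe, 0 = unsafe)
HEADER = struct.Struct("<I")
ENTRY = struct.Struct("<IB")


def parse_safe_sections(payload):
    """Decode a hypothetical .safe_sections payload into {section index: is_safe}."""
    (total_size,) = HEADER.unpack_from(payload, 0)
    if total_size != len(payload):
        raise ValueError("total size field does not match payload length")
    if (total_size - HEADER.size) % ENTRY.size != 0:
        raise ValueError("payload ends in the middle of an entry")
    table = {}
    for offset in range(HEADER.size, total_size, ENTRY.size):
        index, flag = ENTRY.unpack_from(payload, offset)
        table[index] = bool(flag)
    return table


def sections_to_tie(all_sections, safe_table):
    """Sections whose atoms must stay tied together: anything not marked safe."""
    return [s for s in all_sections if not safe_table.get(s, False)]


# Example payload: section 1 is safe, section 4 is unsafe.
payload = (
    HEADER.pack(HEADER.size + 2 * ENTRY.size)
    + ENTRY.pack(1, 1)
    + ENTRY.pack(4, 0)
)
table = parse_safe_sections(payload)
tied = sections_to_tie([1, 2, 4], table)
```

Note that a section missing from the table (section 2 here) is treated as unsafe, which matches the conservative default of tying atoms together.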
I think I share your goal of laying the foundation for better dead-stripping, so thank you for the suggestion. I'm not sure that marking a whole section as "safe" or "unsafe" is the best approach, though. Some comments:

- If compiler-generated code is always "safe", and if we can distinguish it from hand-written assembly code by checking whether there is a gap between symbols, can we just assume that a section with no gaps is always "safe"?

- "Safeness" is an attribute not of the section but of the symbol, I think. A symbol is "safe" if there is no direct reference to the symbol's data; all references should go through relocations. A section may contain both "safe" and "unsafe" symbols.

- How about making the compiler create a new section for each "safe" atom, as it does for inline functions?

On Thu, Jul 25, 2013 at 10:54 AM, Shankar Easwaran <shankare at codeaurora.org> wrote:
> [...]
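The gap check suggested in the first comment could look roughly like the sketch below. It assumes the linker already has each symbol's offset and size within the section; the function name `find_gaps` and the tuple representation are hypothetical.

```python
def find_gaps(symbols, section_size):
    """Return (start, end) byte ranges of a section not covered by any symbol.

    symbols: iterable of (offset, size) pairs, as read from the symbol table.
    """
    gaps = []
    cursor = 0  # first byte not yet covered by a symbol
    for offset, size in sorted(symbols):
        if offset > cursor:
            gaps.append((cursor, offset))
        cursor = max(cursor, offset + size)
    if cursor < section_size:
        gaps.append((cursor, section_size))
    return gaps


# Two back-to-back functions: no gaps, so the section would be assumed "safe".
compiler_like = find_gaps([(0, 16), (16, 8)], 24)

# Bytes before, between, and after the symbols: the section would stay tied.
assembly_like = find_gaps([(4, 8), (16, 8)], 32)
```

This only detects uncovered bytes; it says nothing about whether references into the section go through relocations, which is the concern of the second comment.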
On 7/25/2013 3:56 PM, Rui Ueyama wrote:
> I think I share the goal with you to make the foundation for better
> dead-strip, so thank you for suggesting. I'm not sure if marking a section
> as a whole as "safe" or "unsafe" is the best approach, though. Some
> comments.
>
> - If the compiler generated code is always "safe", and if we can
> distinguish it from hand-written assembly code by checking if there's a gap
> between symbols, can we just assume a section with no gap is always "safe"?

Gaps may simply be caused by alignment padding while the code is still safe, and the compiler knows this very well.

> - "Safeness" is not an attribute of the section but of the symbol, I
> think. The symbol is "safe" if there's no direct reference to the symbol
> data. All references should go through relocations. A section may contain
> both "safe" and "unsafe" symbols.

Sections contain symbols. In the context of ELF, marking sections as safe or not is more desirable because of the switches that are already available (-ffunction-sections and -fdata-sections).

> - How about making the compiler to create a new section for each "safe"
> atom, as it does for inline functions?

You already have the switches -ffunction-sections and -fdata-sections to put functions and data in separate sections.
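As a small illustration of the alignment point, a gap-based heuristic would at least have to tolerate padding. The sketch below assumes the caller somehow knows the alignment of the symbol that follows the gap; as far as I know the ELF symbol table does not record a per-function alignment, which is part of why the compiler is better placed to make this call. The name `is_alignment_padding` is hypothetical.

```python
def is_alignment_padding(gap_start, gap_end, alignment):
    """True if the gap is exactly the padding needed to round gap_start
    up to the next multiple of `alignment` (assumed known by the caller)."""
    aligned = (gap_start + alignment - 1) // alignment * alignment
    return gap_start < gap_end == aligned


# A function ends at offset 12 and the next one is 16-byte aligned:
# the 4-byte gap is plain padding.
padding = is_alignment_padding(12, 16, 16)

# The same gap with 4-byte alignment cannot be padding, since offset 12
# was already aligned; something else occupies those bytes.
not_padding = is_alignment_padding(12, 16, 4)
```

Even when the arithmetic matches, the linker cannot tell from offsets alone whether the bytes are padding or unlabelled code, so this is only a necessary condition.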
On 25/07/13 18:54, Shankar Easwaran wrote:
> Hi,
>
> Currently lld ties up all atoms in a section for ELF together. This
> proposal just breaks it by handling it differently.
>
> This requires NO ELF ABI changes.
>
> Definitions :-
>
> A section is not considered safe if there is some code that appears to
> be present between function boundaries (or) optimizes sections to
> place data at the end or beginning of a section (that contains no symbol).
>
> A section is considered safe if symbols contained within the section
> have been associated with their appropriate sizes and there is no data
> present between function boundaries.

I'd like to see a more precise definition of "safe". For example, just from the above description it is not clear that "safe" disallows one function falling through into another, but based on the intended use cases this clearly isn't allowed.

How is alignment handled? If I have two functions in the same section with different .align directives, will these be respected when the section is split apart? Is it OK for a loop within a function to have a .align?

What about relocations? If calls are implemented with branches taking pc-relative offsets, then the assembler might patch in the branch offset and not emit a relocation. This clearly prevents functions from being removed or reordered, so I assume it is a requirement that a safe section always uses relocations for branches between functions, and that if it has a choice of long or short branches it always conservatively uses a long branch. This should be made explicit in the description of safe.

If you have a symbol at the same address as a function, how do you decide whether it should be associated with this function or with the end of the previous function?

Is it a requirement that there are no references to symbols defined inside the function except for the function symbol itself? If so, how does this work when you have debug info (which might have references to addresses within the function)?

--
Richard Osborne | XMOS
http://www.xmos.com
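The relocation concern can be shown with a few lines of arithmetic. This is a simplified sketch: the displacement is taken relative to the address of the call instruction itself (real architectures differ, for instance by measuring from the following instruction), and the addresses, the `0x08` call offset, and the helper names are made up for illustration.

```python
def encode_call(call_site, target):
    """Displacement an assembler would bake in if it resolved the call itself."""
    return target - call_site


def resolve_call(call_site, displacement):
    """Address the branch actually reaches at run time."""
    return call_site + displacement


CALL_OFFSET_IN_F = 0x08  # hypothetical offset of the call inside f

# Original layout inside one section: f calls g.
layout = {"f": 0x00, "g": 0x20}
baked = encode_call(layout["f"] + CALL_OFFSET_IN_F, layout["g"])

# The linker splits the section and moves g. With no relocation to tell it
# about the call, the baked-in displacement still points at the old place.
new_layout = {"f": 0x00, "g": 0x40}
stale_target = resolve_call(new_layout["f"] + CALL_OFFSET_IN_F, baked)

# With a relocation, the linker recomputes the displacement after layout.
fixed = encode_call(new_layout["f"] + CALL_OFFSET_IN_F, new_layout["g"])
fixed_target = resolve_call(new_layout["f"] + CALL_OFFSET_IN_F, fixed)
```

Here `stale_target` is the old address of g rather than its new one, which is the failure a "safe" section has to rule out by always emitting relocations for branches between functions.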
Thanks for your very detailed analysis.

From other email conversations, it looks like -ffunction-sections and -fdata-sections already do what the original proposal describes.

On 7/31/2013 5:38 AM, Richard Osborne wrote:
> I'd like to see a more precise definition of "safe". For example just
> from the above description it is not clear that "safe" disallows one
> function falling through into another, but based on the intended use
> cases this clearly isn't allowed.

Doesn't this break the model even with ELF? For example, if the code had been compiled with -ffunction-sections, falling through into another function would only work by chance, when the linker happens to merge similar sections together.

> How is alignment handled? If I have two functions in the same section
> with different .align directives will these be respected when the
> section is split apart? Is it OK for a loop within a function to have
> a .align?

Yes, alignment is handled. Each atom has its own alignment, which is derived from the position the atom had in the section and from the alignment of the section itself.

> What about relocations? If calls are implemented with branches taking
> pc-relative offsets then the assembler might patch in the branch
> offset and not emit a relocation. This clearly prevents functions from
> being removed / reordered, so I assume it is a requirement that a safe
> section always uses relocations for branches between functions and if
> it has a choice of long or short branches it aways conservatively uses
> a long branch. This should be made explicit in the description of safe.

Yes, you are right.

> If you have a symbol at the same address as a function how do you
> decide if it should be associated with this function or the end of the
> last function?

Are you talking about weak symbols here?

> Is it a requirement that there are no references to symbols defined
> inside the function except for the function symbol itself? If so how
> does this work when you have debug info (which might have references
> to addresses within the function)?

The model needs to read the debug information that corresponds to the function and keep it housed within the atom data structure itself.

Thanks,

Shankar Easwaran
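One way to read the alignment answer above is that an atom carries the section's alignment together with the remainder of its original offset, and the linker then places the atom at any address with that same remainder. The sketch below follows that reading; it is an interpretation of the sentence, not a description of lld's actual code, and the names `atom_alignment` and `place` are hypothetical.

```python
def atom_alignment(section_align, offset_in_section):
    """Return (alignment, modulus): the atom may be placed at any address
    with address % alignment == modulus, preserving its original skew."""
    return section_align, offset_in_section % section_align


def place(cursor, alignment, modulus):
    """Smallest address >= cursor with address % alignment == modulus."""
    return cursor + (modulus - cursor) % alignment


# An atom that sat at offset 0x24 in a 16-byte-aligned section.
align, modulus = atom_alignment(16, 0x24)

# Placing it after some unrelated content that ends at 0x105.
address = place(0x105, align, modulus)
```

Preserving the remainder keeps any alignment inside the atom (such as an aligned loop head) intact, because every byte of the atom keeps the same offset modulo the section alignment.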