Rui Ueyama via llvm-dev
2017-Oct-26 20:43 UTC
[llvm-dev] [RFC] Making .eh_frame more linker-friendly
On Thu, Oct 26, 2017 at 1:13 PM, Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> wrote:

> Hi,
>
> There will be problems with eh_frame_hdr. Eh_frame_hdr is needed to use
> the binary search instead of the linear search. Having eh_frame per a
> function will cause no eh_frame_hdr or multiple eh_frame_hdr and will
> degrade search from binary to linear.

Linkers would combine .eh_frame sections into one .eh_frame, so that's not an issue, no?

> As we create eh_frame_hdr in most cases there is no problem to filter out
> garbage eh_frame sections. If there is information about unused symbols,
> the implementation is very simple. BTW there is no need to do full decoding
> of eh_frame records to remove garbage.
>
> Paul is right there will be code size overhead. Eh_frame is usually
> created per a compilation module with common information in CFI. Multiple
> eh_frames will cause a lot of redundant CFI. There might be a case when the
> total size of redundant CFIs will be greater than the total size of removed
> garbage.

As I wrote in the previous message, I don't think there's a size issue in link results because even existing linkers merge CIEs by contents.

> Thanks,
>
> Evgeny Astigeevich
>
> The Arm Compiler Optimization team
>
> *From: *llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of
> "Robinson, Paul via llvm-dev" <llvm-dev at lists.llvm.org>
> *Reply-To: *"Robinson, Paul" <paul.robinson at sony.com>
> *Date: *Thursday, 26 October 2017 at 19:58
> *To: *Rui Ueyama <ruiu at google.com>, Reid Kleckner <rnk at google.com>
> *Cc: *"llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>
> *Subject: *Re: [llvm-dev] [RFC] Making .eh_frame more linker-friendly
>
> The .eh_frame section (which is basically a DWARF .debug_frame section)
> was not designed with deduplication/gc in mind. I haven't studied it
> closely, but it looks like the bulk of it is frame descriptions which are
> divided up basically per-function, with some common overhead factored out.
> If you want to put each per-function part into its own ELF section, there's
> overhead for that which you are more aware of than I am, and then either
> you need to replicate the common part into each per-function section or
> accept a relocation from each per-function section into the separate common
> section.
>
> Looking at my latest clang build in Ubuntu, the executable has 96320 frame
> descriptions of which all but one use the same common part; in this case,
> that common part is 24 bytes. The size is not fixed, but is guaranteed to
> be a multiple of the target address size, and it probably can't be any
> smaller than 24 on a normal machine. This might help give you some
> estimates about the size effect of different choices.
>
> HTH,
>
> --paulr
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *Rui
> Ueyama via llvm-dev
> *Sent:* Thursday, October 26, 2017 11:19 AM
> *To:* Reid Kleckner
> *Cc:* llvm-dev
> *Subject:* Re: [llvm-dev] [RFC] Making .eh_frame more linker-friendly
>
> No I haven't. Thank you for the pointer.
>
> Looks like the problem of the inverted edges was discussed there. But I
> guess my bigger question is this: why do we still create one big .eh_frame
> even if -ffunction-sections is given?
>
> When the option is given, Clang creates .text, .rela.text and
> .gcc_exception_table sections for each function, but it still creates a
> monolithic .eh_frame that covers all function sections, which seems odd to
> me.
>
> On Thu, Oct 26, 2017 at 9:47 AM, Reid Kleckner <rnk at google.com> wrote:
>
> Have you seen the discussion of SHF_LINK_ORDER on the generic-abi@
> mailing list? I think it implements exactly what you describe. My
> understanding is that ARM EHABI leverages this for the same purpose.
>
> https://groups.google.com/forum/#!topic/generic-abi/_CbBM6T6WeM
>
> On Wed, Oct 25, 2017 at 6:42 PM, Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> Many linkers including lld have a feature to eliminate unused sections
> from output to make output smaller (which is essentially a mark-sweep gc
> where sections are vertices and relocations are edges). lld and GNU gold
> have yet another feature, ICF, to merge functions by contents to save more
> space.
>
> When we remove or merge a function, we want to eliminate its exception
> handling information as well. But that isn't very easy to do due to the
> format of .eh_frame. Here are reasons:
>
> 1. Linkers have to parse, split, eliminate exception handling information
> for dead functions, and then reconstruct an .eh_frame section. It is
> tedious, and it doesn't feel very much like a task that linkers have to do
> (linkers usually handle sections as opaque blobs and are agnostic of
> section contents.) That is contrary to other data where section is the
> atomic unit of inclusion/elimination.
>
> 2. From the viewpoint of gc, .eh_frame has reverse edges to sections.
> Usually, if section A depends on section B, there's a relocation in A
> pointing to B. But that isn't the case for .eh_frame, but opposite. If
> section A has exception handling information in .eh_frame section B, B has
> a relocation against A. This makes implementing a gc tricky, and when it is
> combined to (1), it is more tricky.
>
> 3. Comparing .eh_frame contents for equivalence is hard. In order to merge
> functions by contents, we need to verify that their exception handling
> information is also the same, but doing it isn't easy given the current
> .eh_frame format.
>
> So, I don't feel .eh_frame needed to be designed that way. Maybe we can
> improve. Here is my rough idea:
>
> 1. We can emit an .eh_frame section for each .text section. So, if you
> pass -ffunction-sections, the resulting object file would have multiple
> .eh_frame sections. This makes .eh_frame a unit of garbage collection and
> eliminates the need to parse .eh_frame contents. It also makes it very easy
> to compare .eh_frame sections for function merging.
>
> 2. Make each .eh_frame section have a link to its .text section. We could
> set a section index of a .text section to its corresponding .eh_frame's
> sh_link field. This would make gc much easier. (If text section A is
> pointed by an .eh_frame section B via sh_link, that A is alive means B is
> alive. It is still reverse, but this is much more manageable.)
>
> I think doing the above things doesn't break the compatibility with
> existing linkers, and new linkers can take advantage of the format that is
> more friendly to the linker. I don't think of any obvious disadvantage of
> doing them, except that we would have more sections, but I may be wrong as
> I'm no expert of .eh_frame.
>
> What do you guys think?
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
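
Editorial illustration (not part of the original message): the gc idea in the proposal quoted above can be sketched roughly as follows. This is a toy model under stated assumptions, not how lld or any other linker is implemented. The `Section` class, its `relocs` and `link` fields, the `mark_live` function, and the per-function names such as `.eh_frame.main` are all hypothetical; in a real object file every such section would simply be named `.eh_frame`, and `link` stands in for the `sh_link` field.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Section:
    name: str
    # Forward edges: sections this one has relocations against.
    relocs: List[str] = field(default_factory=list)
    # Hypothetical stand-in for sh_link: the .text section this
    # per-function .eh_frame section describes.
    link: Optional[str] = None


def mark_live(sections: List[Section], roots: List[str]) -> Set[str]:
    """Mark phase of a mark-sweep gc over sections."""
    by_name: Dict[str, Section] = {s.name: s for s in sections}

    # Index the reverse edges once: text section -> sections linked to it.
    linked_from: Dict[str, List[str]] = {}
    for s in sections:
        if s.link is not None:
            linked_from.setdefault(s.link, []).append(s.name)

    live: Set[str] = set()
    work = deque(roots)
    while work:
        name = work.popleft()
        if name in live:
            continue
        live.add(name)
        # Ordinary rule: if A is live, everything A relocates against is live.
        work.extend(by_name[name].relocs)
        # Proposed rule: if A is live, any section linked to A is live too.
        work.extend(linked_from.get(name, []))
    return live


sections = [
    Section(".text.main", relocs=[".text.used"]),
    Section(".text.used"),
    Section(".text.dead"),
    Section(".eh_frame.main", relocs=[".text.main"], link=".text.main"),
    Section(".eh_frame.used", relocs=[".text.used"], link=".text.used"),
    Section(".eh_frame.dead", relocs=[".text.dead"], link=".text.dead"),
]
live = mark_live(sections, roots=[".text.main"])
print(sorted(live))
```

In this model the `.eh_frame` sections are never roots, so the relocation from `.eh_frame.dead` to `.text.dead` cannot keep the dead function alive, and no section contents are parsed at any point.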
Evgeny Astigeevich via llvm-dev
2017-Nov-10 11:41 UTC
[llvm-dev] [RFC] Making .eh_frame more linker-friendly
Hi Igor,

> It sounds like the linker has to be aware of the .eh_frame section details to be able to generate .eh_frame_hdr and eliminate duplicate CIEs, right?

Yes, a linker needs some details but not all of them. It needs to know sizes of records and initial locations (PC Begin) to find out which functions FDEs belong to.

> So, is there any difference whether it knows that in one place or two?

What do you mean “one place or two”? If .eh_frame_hdr is not created a linker does not need to parse .eh_frame sections. It simply merges them into one section. The format of .eh_frame allows to do this without parsing .eh_frame sections.

Thanks,
Evgeny Astigeevich

From: Igor Kudrin <ikudrin.dev at gmail.com>
Date: Thursday, 9 November 2017 at 11:29
To: Rui Ueyama <ruiu at google.com>, Evgeny Astigeevich <Evgeny.Astigeevich at arm.com>
Cc: "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>, nd <nd at arm.com>
Subject: Re: [llvm-dev] [RFC] Making .eh_frame more linker-friendly

It sounds like the linker has to be aware of the .eh_frame section details to be able to generate .eh_frame_hdr and eliminate duplicate CIEs, right? So, is there any difference whether it knows that in one place or two?

Best Regards,
Igor Kudrin
C++ Developer, Access Softek, Inc.

On 27-Oct-17 3:43, Rui Ueyama via llvm-dev wrote:

On Thu, Oct 26, 2017 at 1:13 PM, Evgeny Astigeevich <Evgeny.Astigeevich at arm.com> wrote:

Hi,

There will be problems with eh_frame_hdr. Eh_frame_hdr is needed to use the binary search instead of the linear search. Having eh_frame per a function will cause no eh_frame_hdr or multiple eh_frame_hdr and will degrade search from binary to linear.

Linkers would combine .eh_frame sections into one .eh_frame, so that's not an issue, no?

As we create eh_frame_hdr in most cases there is no problem to filter out garbage eh_frame sections. If there is information about unused symbols, the implementation is very simple. BTW there is no need to do full decoding of eh_frame records to remove garbage.

Paul is right there will be code size overhead. Eh_frame is usually created per a compilation module with common information in CFI. Multiple eh_frames will cause a lot of redundant CFI. There might be a case when the total size of redundant CFIs will be greater than the total size of removed garbage.

As I wrote in the previous message, I don't think there's a size issue in link results because even existing linkers merge CIEs by contents.
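
Editorial illustration (not part of the original message): the points in this message about record sizes, merging by concatenation, and binary search can be sketched as below. It is a simplified model under several assumptions: little-endian data, no relocations, no alignment padding, and toy record bodies. The helper names `split_records`, `record` and `find_fde`, and all table values, are hypothetical. The sketch relies only on the record length field and the 4-byte ID field that follows it (zero for a CIE, a self-relative CIE pointer for an FDE).

```python
import bisect
import struct
from typing import Iterator, List, Optional, Tuple


def split_records(data: bytes) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (offset, kind, raw bytes) without decoding record bodies."""
    off = 0
    while off + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, off)
        if length == 0:  # zero-length terminator record
            break
        header = 4
        if length == 0xFFFFFFFF:  # 8-byte extended length follows
            (length,) = struct.unpack_from("<Q", data, off + 4)
            header = 12
        (id_field,) = struct.unpack_from("<I", data, off + header)
        kind = "CIE" if id_field == 0 else "FDE"
        end = off + header + length
        yield off, kind, data[off:end]
        off = end


def record(id_field: int, body: bytes) -> bytes:
    """Build a toy record: length, ID field, then an opaque body."""
    return struct.pack("<II", 4 + len(body), id_field) + body


def find_fde(table: List[Tuple[int, int]], pc: int) -> Optional[int]:
    """Binary search a table of (pc_begin, fde_offset) sorted by pc_begin."""
    keys = [pc_begin for pc_begin, _ in table]
    i = bisect.bisect_right(keys, pc) - 1
    return table[i][1] if i >= 0 else None


cie = record(0, b"common-part")
# Each input section carries its own copy of the same CIE. The FDE's CIE
# pointer is the distance from that field back to the start of its CIE.
sec1 = cie + record(len(cie) + 4, b"fde-for-f")
sec2 = cie + record(len(cie) + 4, b"fde-for-g")
merged = sec1 + sec2  # plain concatenation; the relative pointers stay valid

kinds = [kind for _, kind, _ in split_records(merged)]
unique_cies = {raw for _, kind, raw in split_records(merged) if kind == "CIE"}
print(kinds, len(unique_cies))

# Hypothetical sorted lookup table of the kind .eh_frame_hdr provides.
table = [(0x1000, 0x14), (0x1040, 0x30), (0x2000, 0x4C)]
print(find_fde(table, 0x1044), find_fde(table, 0x0FFF))
```

A linker that actually drops duplicate CIEs would also have to rewrite the CIE pointers of the affected FDEs, and a real lookup would check the FDE's address range after the search; both steps are left out here.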