Krzysztof Pszeniczny via llvm-dev
2019-Oct-02 19:19 UTC
[llvm-dev] [RFC] Propeller: A frame work for Post Link Optimizations
On Wed, Oct 2, 2019 at 8:41 PM Maksim Panchenko via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> *Pessimization/overhead for stack unwinding used by system-wide profilers and
> for exception handling*
>
> Larger CFI programs put an extra burden on unwinding at runtime as more CFI
> (and thus native) instructions have to be executed. This will cause more
> overhead for any profiler that records stack traces, and, as you correctly note
> in the proposal, for any program that heavily uses exceptions.

The number of CFI instructions that have to be executed when unwinding any given stack stays the same. The CFI instructions for a function have to be duplicated in every basic block section, but when unwinding, only one such set is executed -- the copy for the current basic block. That copy contains precisely the same CFI instructions as the ones that would have to be executed if there were no basic block sections.

--
Krzysztof Pszeniczny
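The argument above can be checked on a toy model. The sketch below is not LLVM or Propeller code: `Fde`, `executed_ops`, and `split_by_blocks` are hypothetical names, the addresses and CFI ops are made up, and the model assumes an unwinder simply interprets every CFI op located at or before the PC. Under those assumptions, splitting one FDE into per-block FDEs leaves the ops interpreted for any given PC unchanged, while the total number of ops stored grows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fde:
    """Toy FDE: an address range plus (address, op) CFI entries sorted by address."""
    start: int
    end: int  # one past the last covered address
    cfi: tuple


def executed_ops(fde, pc):
    """Ops a (simplified) unwinder interprets to recover the frame state at pc."""
    assert fde.start <= pc < fde.end
    return [op for addr, op in fde.cfi if addr <= pc]


def split_by_blocks(fde, boundaries):
    """Build one FDE per block; each replays the state in effect at block entry."""
    edges = [fde.start, *boundaries, fde.end]
    parts = []
    for lo, hi in zip(edges, edges[1:]):
        # Ops that took effect before this block are repeated at its first address.
        inherited = [(lo, op) for addr, op in fde.cfi if addr < lo]
        local = [(addr, op) for addr, op in fde.cfi if lo <= addr < hi]
        parts.append(Fde(lo, hi, tuple(inherited + local)))
    return parts


# Hypothetical function: frame setup in the prologue, teardown near the end.
whole = Fde(0x1000, 0x1040, (
    (0x1001, "def_cfa_offset 16"),
    (0x1001, "offset rbp, -16"),
    (0x1004, "def_cfa_register rbp"),
    (0x103F, "def_cfa rsp, 8"),
))
parts = split_by_blocks(whole, [0x1010, 0x1020, 0x1030])

stored_whole = len(whole.cfi)
stored_split = sum(len(p.cfi) for p in parts)
```

In this toy layout the four per-block FDEs store 13 ops in total against 4 for the single FDE, yet `executed_ops` returns the same list for every PC in either layout. That is the distinction being drawn in the message: duplication costs space, not interpreted instructions per unwind step.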
Maksim Panchenko via llvm-dev
2019-Oct-02 20:24 UTC
[llvm-dev] [RFC] Propeller: A frame work for Post Link Optimizations
Thanks for clarifying. This means that once you move to the next basic block (or any other basic block in the function), you have to execute an entirely new set of CFI instructions, except for the common CIE part. While this is indeed not as bad, on average the overall active memory footprint will increase.

Creating one FDE per basic block means that .eh_frame_hdr, an allocatable section, will be bloated too. This will increase the FDE lookup time. I don't see .eh_frame_hdr being mentioned in the proposal.

Maksim
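To make the .eh_frame_hdr concern concrete, here is a rough sketch of how the section's binary search table scales with the number of FDEs. It is a model, not a measurement: `SearchTable`, `make_fdes`, the addresses, and the block counts are hypothetical, and `ENTRY_BYTES = 8` assumes the common layout of two 4-byte fields per table entry. A real unwinder would also verify that the PC falls inside the selected FDE's range, which is omitted here.

```python
from bisect import bisect_right

ENTRY_BYTES = 8  # assumption: (initial location, FDE address) as two 4-byte fields


class SearchTable:
    """Toy model of a sorted lookup table with one entry per FDE."""

    def __init__(self, fdes):
        ordered = sorted(fdes)  # sorted by initial location
        self.starts = [loc for loc, _ in ordered]
        self.ids = [fde_id for _, fde_id in ordered]

    def find(self, pc):
        """Return the id of the last FDE starting at or before pc, else None."""
        i = bisect_right(self.starts, pc) - 1
        return self.ids[i] if i >= 0 else None

    def size_bytes(self):
        return ENTRY_BYTES * len(self.starts)

    def max_steps(self):
        """Upper bound on binary-search iterations for this many entries."""
        return len(self.starts).bit_length()


def make_fdes(n_funcs, blocks_per_func, block_size=16, base=0x400000):
    """Lay out FDE start addresses back to back; ids are (function, block)."""
    fdes, addr = [], base
    for f in range(n_funcs):
        for b in range(blocks_per_func):
            fdes.append((addr, (f, b)))
            addr += block_size
    return fdes


# Same code size either way: 1000 functions of 160 bytes each.
per_function = SearchTable(make_fdes(1000, 1, block_size=160))
per_block = SearchTable(make_fdes(1000, 10, block_size=16))
```

With these made-up numbers the table grows from 8000 to 80000 bytes (linear in the FDE count), while the search bound grows from 10 to 14 steps (logarithmic). Whether the added steps or the larger resident table dominate in practice depends on the workload, which this sketch does not attempt to model.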
James Y Knight via llvm-dev
2019-Oct-02 20:58 UTC
[llvm-dev] [RFC] Propeller: A frame work for Post Link Optimizations
I'm a bit confused by this subthread -- doesn't BOLT have the exact same CFI bloat issue? From my cursory reading of the Propeller doc, the CFI duplication is _necessary_ to represent discontiguous functions; it is not particular to the way Propeller happens to generate those discontiguous functions. And emitting discontiguous functions is a fundamental goal of this, right?