Sterling Augustine via llvm-dev
2020-Aug-13 22:35 UTC
[llvm-dev] Exceptions and performance
There is a fair amount of dispute and detail here, and real benchmarks can be difficult to write, because you often end up in arguments about whether or not the two styles of coding are equivalent or not. But I agree with Dave--exceptions generally inhibit optimization. One way to think about this is that, generally speaking, the less the compiler can prove about a program, the less aggressive it can be with optimizations. If it doesn't know what the control flow will exactly look like, it can't do certain transformations. Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) Worse, inlining doesn't really save you, because any function call that gets inlined also will have an implied conditional goto for any function it calls, now to two possible landing pads. Is it safe to move code across this goto? It's comparatively hard to prove things about that. Worse, a compiler generally know if a function call throws or not, or event the type of exception that will get thrown. It definitely knows what the return type and value of a function is. It is very easy for a compiler to reason about the error checking around a function call--the code is all there and explicit. All of this restricts what it can prove about the program. And the less it can prove, the less aggressive it can be. On Thu, Aug 13, 2020 at 3:13 PM David Blaikie via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On Thu, Aug 13, 2020 at 2:59 PM Haoran Xu via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > > Hello, > > > > I have been long taught that exceptions are considered the bad parts in > the design of C++ language, and one should avoid using it since it's hard > to write exception-safe code (I have no disagreement on this part) and it > impedes compiler optimization. > > > > However, when recently I dig into the implementation of exception > handling mechanism in LLVM recently, my impression is that using exceptions > should instead improve performance (in the common case that no exception is > thrown out), compared with the traditional approach of returning an error > code in every function that can fail: no error-code-checking logic is > executed in the fastpath, and error-handling code are moved from the main > binary to the exception table, so the CPU is doing less work, and also > instruction cache locality should be improved. Is my understanding correct? > > (this is probably a bit too broad a topic for this forum/tends to end > up in heated discussions, etc, but let's see how we go) > > I believe one of the main reasons your understanding there might be > incorrect: Not every function returns an error code. But essentially > every function can throw (yes, you can annotate them - but it's one of > those cases where C++ may've got the defaults wrong (like const, etc) > and it's impractical/unlikely someone's going to effectively maintain > noexcept on most of their codebase (and there are performance problems > with noexcept - since it's not UB to throw from such a function, it's > guaranteed to call terminate, so you have to generate exception > handling code in every noexcept function)) > > > So my question is: > > (1) Is the argument that 'exception hurts compiler optimization, and > should not be used in perf-sensitive code' outdated now? > > I don't believe so - having every function call being potentially > control flow changing inhibits certain optimizations that would > otherwise be possible (certain code motion, reordering, etc can't be > done) compared to a relatively smaller number of explicit > error-handling calls. > > > Or more generally, what are the pros and cons of using exception vs > error code, from a LLVM backend developer's perspective? > > Exceptions inhibit code motion, but allow for smaller code/keeping the > cold path/handling further away. I think that's basically the > mechanical tradeoff (setting aside issues of readability, > maintainability, etc - having an explicit keyword/notation on calls > that can throw could help a lot from a maintainability perspective, > for instance) > > > (2) In current LLVM design, is there still any optimization that is > prevented by using exceptions? Or is the current LLVM already able to > provide the same optimization quality for code using exceptions, compared > with code that does not? > > I believe that exceptions do still hinder optimizations when compared > to code that doesn't throw or use a return code*. While that may seem > like an unfair comparison, it's my understanding that it's part of the > excessive cost of exceptions. Not to mention how expensive they can be > when they are thrown, compared to explicit error handling. > > * Note this isn't a case of missed optimization opportunities (well, > perhaps in LTO it's a missed opportunity - if we did whole program > analysis to prove certain call trees never throw in practice, then we > could propagate that information and improve optimizations) - but a > limitation of exceptions. > > - Dave > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200813/a56a4f4f/attachment.html>
Hello David and Sterling, thanks for the reply! I have no intention to heat up the discussion, please pardon me if I asked questions that might sound silly to you -- it's just that I truly didn't understand since I'm not expert in llvm or optimizer or exception handling at all. I believe one of the main reasons your understanding there might be> incorrect: Not every function returns an error code. But essentially every > function can throw >I think if a function could throw, then it *should* return an error code in the no-except approach, otherwise there would be no way to convey the error to caller. So I did't get the point that "essentially every function can throw". Indeed by default most of the C++ functions may throw (and some of them may actually be non-throwing), but LLVM should (at least in theory) be able to correctly figure out which functions may actually throw as long as it has the definition, is that correct? So if a function does not call external functions, the whole control flow graph (including the throw parts) should be available to LLVM even if there are exceptions, so I think there should be no limitation in moving code around? In the case that the function calls external functions, then it should already be hard to move code across that external function call, so I thought having exceptions does not pose an additional limitation. Maybe I misunderstood some part? Exceptions add a conditional goto from every single function call (without> good noexcept hygiene, which practically no program has), to the exception > handler or cleanup landing pad (whether the exception handler is written in > the code explicitly or not, something has to run the destructors of objects > that are going out of scope.) >I did't quite understand what is different between this and an error-checking 'if'. Let's say you have a function call f(). Then with error code, the logic would look like error_code = f() if (error_code) { (call all destructors) <-- generated by llvm return error_code; } and with exception, the code would (conceptually) look like: exceptions = f() if (exceptions) { <-- generated by llvm call all destructors; <-- generated by llvm resume exceptions; <-- generated by llvm } <-- generated by llvm The only difference is that the code for 'if' branch would physically reside in the exception table, but I couldn't see why this would make a difference to the optimizer. The optimizer could in theory just optimize the code as if it were a physically-existing 'if' branch (and achieve the same optimized code as the error-code approach), and then finally put the 'if' branch into the exception table. Am I missing some piece? Thanks, Haoran Sterling Augustine <saugustine at google.com> 于2020年8月13日周四 下午3:35写道:> There is a fair amount of dispute and detail here, and real benchmarks can > be difficult to write, because you often end up in arguments about whether > or not the two styles of coding are equivalent or not. > > But I agree with Dave--exceptions generally inhibit optimization. > > One way to think about this is that, generally speaking, the less the > compiler can prove about a program, the less aggressive it can be with > optimizations. If it doesn't know what the control flow will exactly look > like, it can't do certain transformations. Exceptions add a conditional > goto from every single function call (without good noexcept hygiene, which > practically no program has), to the exception handler or cleanup landing > pad (whether the exception handler is written in the code explicitly or > not, something has to run the destructors of objects that are going out of > scope.) Worse, inlining doesn't really save you, because any function call > that gets inlined also will have an implied conditional goto for any > function it calls, now to two possible landing pads. Is it safe to move > code across this goto? It's comparatively hard to prove things about that. > > Worse, a compiler generally know if a function call throws or not, or > event the type of exception that will get thrown. It definitely knows what > the return type and value of a function is. > > It is very easy for a compiler to reason about the error checking around a > function call--the code is all there and explicit. All of this restricts > what it can prove about the program. And the less it can prove, the less > aggressive it can be. > > > On Thu, Aug 13, 2020 at 3:13 PM David Blaikie via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> On Thu, Aug 13, 2020 at 2:59 PM Haoran Xu via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >> > >> > Hello, >> > >> > I have been long taught that exceptions are considered the bad parts in >> the design of C++ language, and one should avoid using it since it's hard >> to write exception-safe code (I have no disagreement on this part) and it >> impedes compiler optimization. >> > >> > However, when recently I dig into the implementation of exception >> handling mechanism in LLVM recently, my impression is that using exceptions >> should instead improve performance (in the common case that no exception is >> thrown out), compared with the traditional approach of returning an error >> code in every function that can fail: no error-code-checking logic is >> executed in the fastpath, and error-handling code are moved from the main >> binary to the exception table, so the CPU is doing less work, and also >> instruction cache locality should be improved. Is my understanding correct? >> >> (this is probably a bit too broad a topic for this forum/tends to end >> up in heated discussions, etc, but let's see how we go) >> >> I believe one of the main reasons your understanding there might be >> incorrect: Not every function returns an error code. But essentially >> every function can throw (yes, you can annotate them - but it's one of >> those cases where C++ may've got the defaults wrong (like const, etc) >> and it's impractical/unlikely someone's going to effectively maintain >> noexcept on most of their codebase (and there are performance problems >> with noexcept - since it's not UB to throw from such a function, it's >> guaranteed to call terminate, so you have to generate exception >> handling code in every noexcept function)) >> >> > So my question is: >> > (1) Is the argument that 'exception hurts compiler optimization, and >> should not be used in perf-sensitive code' outdated now? >> >> I don't believe so - having every function call being potentially >> control flow changing inhibits certain optimizations that would >> otherwise be possible (certain code motion, reordering, etc can't be >> done) compared to a relatively smaller number of explicit >> error-handling calls. >> >> > Or more generally, what are the pros and cons of using exception vs >> error code, from a LLVM backend developer's perspective? >> >> Exceptions inhibit code motion, but allow for smaller code/keeping the >> cold path/handling further away. I think that's basically the >> mechanical tradeoff (setting aside issues of readability, >> maintainability, etc - having an explicit keyword/notation on calls >> that can throw could help a lot from a maintainability perspective, >> for instance) >> >> > (2) In current LLVM design, is there still any optimization that is >> prevented by using exceptions? Or is the current LLVM already able to >> provide the same optimization quality for code using exceptions, compared >> with code that does not? >> >> I believe that exceptions do still hinder optimizations when compared >> to code that doesn't throw or use a return code*. While that may seem >> like an unfair comparison, it's my understanding that it's part of the >> excessive cost of exceptions. Not to mention how expensive they can be >> when they are thrown, compared to explicit error handling. >> >> * Note this isn't a case of missed optimization opportunities (well, >> perhaps in LTO it's a missed opportunity - if we did whole program >> analysis to prove certain call trees never throw in practice, then we >> could propagate that information and improve optimizations) - but a >> limitation of exceptions. >> >> - Dave >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200813/181ed0fb/attachment.html>
Sterling Augustine via llvm-dev
2020-Aug-13 23:58 UTC
[llvm-dev] Exceptions and performance
On Thu, Aug 13, 2020 at 4:38 PM Haoran Xu <haoranxu510 at gmail.com> wrote:> I think if a function could throw, then it *should* return an error code > in the no-except approach, otherwise there would be no way to convey the > error to caller. >And here is the rub. What a function *should* do, and what it is required to do by the standard (and therefore what the the compiler can rely on), are very different things.> Am I missing some piece? >I think the thing you are missing is what the compiler *actually* knows and can prove, versus what it might know if the code is well structured. (Most code is not well structured.) What does the standard say the following function could possibly return? What type will the error be? void Foo(); Now, according to the standard, what exceptions can it throw, and what are the possible types of those exceptions? Remember, we are talking about what the compiler can actually prove. Anyway, I won't get into this further. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200813/a0c93cb1/attachment-0001.html>
On Thu, Aug 13, 2020 at 4:38 PM Haoran Xu <haoranxu510 at gmail.com> wrote:> > Hello David and Sterling, thanks for the reply! > > I have no intention to heat up the discussion, please pardon me if I asked questions that might sound silly to you -- it's just that I truly didn't understand since I'm not expert in llvm or optimizer or exception handling at all. > >> I believe one of the main reasons your understanding there might be >> incorrect: Not every function returns an error code. But essentially every function can throw > > I think if a function could throw, then it *should* return an error code in the no-except approach, otherwise there would be no way to convey the error to caller.I don't know of any codebase that uses "nothrow" ubiquitously enough to represent this situation. In general in C++ code there will be many more functions that can in theory throw (as far as the compiler's concerned - ie: The functions are not marked "nothrow") but never do in practice, than there would be functions that actually need error handling/failures.> So I did't get the point that "essentially every function can throw". Indeed by default most of the C++ functions may throw (and some of them may actually be non-throwing), but LLVM should (at least in theory) be able to correctly figure out which functions may actually throw as long as it has the definition, is that correct?As long as it has the definition of the entire call graph - which is unlikely without whole program analysis.> So if a function does not call external functions, the whole control flow graph (including the throw parts) should be available to LLVM even if there are exceptions, so I think there should be no limitation in moving code around?That's a fairly uncommon property to hold - most code calls other external functions. (certainly external to a file - and in many cases external even to whole program optimization, depending on how you define "whole program" and how you build your code, whether you have access to the source of every library to rebuild them, etc)> In the case that the function calls external functions, then it should already be hard to move code across that external function call, so I thought having exceptions does not pose an additional limitation. Maybe I misunderstood some part?There's still things that could be done moving across a call to external code - eg: int f1() { static int i = 3; ++i; f2(); return i; } - a compiler could move the increment back and forth over f2 depending on its needs. If 'f2' can throw, then it's unsafe to move the increment.>> Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) > > I did't quite understand what is different between this and an error-checking 'if'. Let's say you have a function call f(). Then with error code, the logic would look like > error_code = f() > if (error_code) { > (call all destructors) <-- generated by llvm > return error_code; > } > and with exception, the code would (conceptually) look like: > exceptions = f() > if (exceptions) { <-- generated by llvm > call all destructors; <-- generated by llvm > resume exceptions; <-- generated by llvm > } <-- generated by llvm > The only difference is that the code for 'if' branch would physically reside in the exception table, but I couldn't see why this would make a difference to the optimizer. The optimizer could in theory just optimize the code as if it were a physically-existing 'if' branch (and achieve the same optimized code as the error-code approach), and then finally put the 'if' branch into the exception table. Am I missing some piece?The ABI - the agreement between compilation of different files/objects limits what the compiler can do to change the mechanism for exception handling. The compiler building 'f()' and the compiler building f()'s caller must agree on how the exception will be modeled - so the compiler can't readily make a decision about tradeoffs of different schemes for exception propagation depending on profiles or other things. Yes, with a full ABI break you could have a completely different exception handling scheme that doesn't use tables at all - adds an extra implicit return value from every function call wvhich is the "did this throw" bit, with some other sidetable of the actual data that was thrown, etc - I believe this is the sort of model Swift uses, for instance. But that has a different set of performance tradeoffs (& again, the defaults are problematic - C++ compilers must assume all functions throw unless annotated otherwise, and C++ writers don't annotate them enough to help much there) - Dave> > Thanks, > Haoran > > > > Sterling Augustine <saugustine at google.com> 于2020年8月13日周四 下午3:35写道: >> >> There is a fair amount of dispute and detail here, and real benchmarks can be difficult to write, because you often end up in arguments about whether or not the two styles of coding are equivalent or not. >> >> But I agree with Dave--exceptions generally inhibit optimization. >> >> One way to think about this is that, generally speaking, the less the compiler can prove about a program, the less aggressive it can be with optimizations. If it doesn't know what the control flow will exactly look like, it can't do certain transformations. Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) Worse, inlining doesn't really save you, because any function call that gets inlined also will have an implied conditional goto for any function it calls, now to two possible landing pads. Is it safe to move code across this goto? It's comparatively hard to prove things about that. >> >> Worse, a compiler generally know if a function call throws or not, or event the type of exception that will get thrown. It definitely knows what the return type and value of a function is. >> >> It is very easy for a compiler to reason about the error checking around a function call--the code is all there and explicit. All of this restricts what it can prove about the program. And the less it can prove, the less aggressive it can be. >> >> >> On Thu, Aug 13, 2020 at 3:13 PM David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote: >>> >>> On Thu, Aug 13, 2020 at 2:59 PM Haoran Xu via llvm-dev >>> <llvm-dev at lists.llvm.org> wrote: >>> > >>> > Hello, >>> > >>> > I have been long taught that exceptions are considered the bad parts in the design of C++ language, and one should avoid using it since it's hard to write exception-safe code (I have no disagreement on this part) and it impedes compiler optimization. >>> > >>> > However, when recently I dig into the implementation of exception handling mechanism in LLVM recently, my impression is that using exceptions should instead improve performance (in the common case that no exception is thrown out), compared with the traditional approach of returning an error code in every function that can fail: no error-code-checking logic is executed in the fastpath, and error-handling code are moved from the main binary to the exception table, so the CPU is doing less work, and also instruction cache locality should be improved. Is my understanding correct? >>> >>> (this is probably a bit too broad a topic for this forum/tends to end >>> up in heated discussions, etc, but let's see how we go) >>> >>> I believe one of the main reasons your understanding there might be >>> incorrect: Not every function returns an error code. But essentially >>> every function can throw (yes, you can annotate them - but it's one of >>> those cases where C++ may've got the defaults wrong (like const, etc) >>> and it's impractical/unlikely someone's going to effectively maintain >>> noexcept on most of their codebase (and there are performance problems >>> with noexcept - since it's not UB to throw from such a function, it's >>> guaranteed to call terminate, so you have to generate exception >>> handling code in every noexcept function)) >>> >>> > So my question is: >>> > (1) Is the argument that 'exception hurts compiler optimization, and should not be used in perf-sensitive code' outdated now? >>> >>> I don't believe so - having every function call being potentially >>> control flow changing inhibits certain optimizations that would >>> otherwise be possible (certain code motion, reordering, etc can't be >>> done) compared to a relatively smaller number of explicit >>> error-handling calls. >>> >>> > Or more generally, what are the pros and cons of using exception vs error code, from a LLVM backend developer's perspective? >>> >>> Exceptions inhibit code motion, but allow for smaller code/keeping the >>> cold path/handling further away. I think that's basically the >>> mechanical tradeoff (setting aside issues of readability, >>> maintainability, etc - having an explicit keyword/notation on calls >>> that can throw could help a lot from a maintainability perspective, >>> for instance) >>> >>> > (2) In current LLVM design, is there still any optimization that is prevented by using exceptions? Or is the current LLVM already able to provide the same optimization quality for code using exceptions, compared with code that does not? >>> >>> I believe that exceptions do still hinder optimizations when compared >>> to code that doesn't throw or use a return code*. While that may seem >>> like an unfair comparison, it's my understanding that it's part of the >>> excessive cost of exceptions. Not to mention how expensive they can be >>> when they are thrown, compared to explicit error handling. >>> >>> * Note this isn't a case of missed optimization opportunities (well, >>> perhaps in LTO it's a missed opportunity - if we did whole program >>> analysis to prove certain call trees never throw in practice, then we >>> could propagate that information and improve optimizations) - but a >>> limitation of exceptions. >>> >>> - Dave >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev