On Thu, Aug 13, 2020 at 6:11 PM Haoran Xu <haoranxu510 at gmail.com> wrote:> > Thanks for the insights David! > > For your first 3 points, is it correct to understand it as following: the external function prototypes are missing reliable information on whether the function throws and what exceptions it may throw (due to C++'s design failures and that it is impractical to maintain such information in a large codebase), which is the main cause that code using exceptions cannot be optimized as effectively as code that does not.That (lack of precise throw information for external functions due to requiring manual annotation of nothrow), I believe, is a significant issue, yes.>> The ABI - the agreement between compilation of different files/objects >> limits what the compiler can do to change the mechanism for exception >> handling. The compiler building 'f()' and the compiler building f()'s >> caller must agree on how the exception will be modeled - so the > > I don't have much expertise on optimizer, but I think the compiler doesn't need to break ABI to optimize those code..? > The two code snippets in my example has effectively the same structure, whatever optimization (e.g. move code around) that would work in the error-code case should also work in the exception case? > For example, if the compile decides to move some code after f() to before f(), it would do below for the error-code case (just to demonstrate that the same transformation could work for both exception and no-exception code): > [moved logic...] > if (err = f()) { > [undo side effect of moved logic..] > call destructors > return err > } > and for code that uses exception: > [moved logic...] > if (exception = f()) { > [undo side effect of moved logic..] > call destructors > resume exception > } > and this is not breaking the ABI at all, right?Sorry, yes. No difference between explicit error handling and exceptions. The difference is in all the functions that don't have explicit error handling but (in the exception-using equivalent code) aren't marked nothrow (where, without exceptions, the compiler could move the increment across the call). - Dave> > > > David Blaikie <dblaikie at gmail.com> 于2020年8月13日周四 下午5:06写道: >> >> On Thu, Aug 13, 2020 at 4:38 PM Haoran Xu <haoranxu510 at gmail.com> wrote: >> > >> > Hello David and Sterling, thanks for the reply! >> > >> > I have no intention to heat up the discussion, please pardon me if I asked questions that might sound silly to you -- it's just that I truly didn't understand since I'm not expert in llvm or optimizer or exception handling at all. >> > >> >> I believe one of the main reasons your understanding there might be >> >> incorrect: Not every function returns an error code. But essentially every function can throw >> > >> > I think if a function could throw, then it *should* return an error code in the no-except approach, otherwise there would be no way to convey the error to caller. >> >> I don't know of any codebase that uses "nothrow" ubiquitously enough >> to represent this situation. In general in C++ code there will be many >> more functions that can in theory throw (as far as the compiler's >> concerned - ie: The functions are not marked "nothrow") but never do >> in practice, than there would be functions that actually need error >> handling/failures. >> >> > So I did't get the point that "essentially every function can throw". Indeed by default most of the C++ functions may throw (and some of them may actually be non-throwing), but LLVM should (at least in theory) be able to correctly figure out which functions may actually throw as long as it has the definition, is that correct? >> >> As long as it has the definition of the entire call graph - which is >> unlikely without whole program analysis. >> >> > So if a function does not call external functions, the whole control flow graph (including the throw parts) should be available to LLVM even if there are exceptions, so I think there should be no limitation in moving code around? >> >> That's a fairly uncommon property to hold - most code calls other >> external functions. (certainly external to a file - and in many cases >> external even to whole program optimization, depending on how you >> define "whole program" and how you build your code, whether you have >> access to the source of every library to rebuild them, etc) >> >> > In the case that the function calls external functions, then it should already be hard to move code across that external function call, so I thought having exceptions does not pose an additional limitation. Maybe I misunderstood some part? >> >> There's still things that could be done moving across a call to >> external code - eg: int f1() { static int i = 3; ++i; f2(); return i; >> } - a compiler could move the increment back and forth over f2 >> depending on its needs. If 'f2' can throw, then it's unsafe to move >> the increment. >> >> >> Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) >> > >> > I did't quite understand what is different between this and an error-checking 'if'. Let's say you have a function call f(). Then with error code, the logic would look like >> > error_code = f() >> > if (error_code) { >> > (call all destructors) <-- generated by llvm >> > return error_code; >> > } >> > and with exception, the code would (conceptually) look like: >> > exceptions = f() >> > if (exceptions) { <-- generated by llvm >> > call all destructors; <-- generated by llvm >> > resume exceptions; <-- generated by llvm >> > } <-- generated by llvm >> > The only difference is that the code for 'if' branch would physically reside in the exception table, but I couldn't see why this would make a difference to the optimizer. The optimizer could in theory just optimize the code as if it were a physically-existing 'if' branch (and achieve the same optimized code as the error-code approach), and then finally put the 'if' branch into the exception table. Am I missing some piece? >> >> The ABI - the agreement between compilation of different files/objects >> limits what the compiler can do to change the mechanism for exception >> handling. The compiler building 'f()' and the compiler building f()'s >> caller must agree on how the exception will be modeled - so the >> compiler can't readily make a decision about tradeoffs of different >> schemes for exception propagation depending on profiles or other >> things. Yes, with a full ABI break you could have a completely >> different exception handling scheme that doesn't use tables at all - >> adds an extra implicit return value from every function call wvhich is >> the "did this throw" bit, with some other sidetable of the actual data >> that was thrown, etc - I believe this is the sort of model Swift uses, >> for instance. But that has a different set of performance tradeoffs (& >> again, the defaults are problematic - C++ compilers must assume all >> functions throw unless annotated otherwise, and C++ writers don't >> annotate them enough to help much there) >> >> - Dave >> >> > >> > Thanks, >> > Haoran >> > >> > >> > >> > Sterling Augustine <saugustine at google.com> 于2020年8月13日周四 下午3:35写道: >> >> >> >> There is a fair amount of dispute and detail here, and real benchmarks can be difficult to write, because you often end up in arguments about whether or not the two styles of coding are equivalent or not. >> >> >> >> But I agree with Dave--exceptions generally inhibit optimization. >> >> >> >> One way to think about this is that, generally speaking, the less the compiler can prove about a program, the less aggressive it can be with optimizations. If it doesn't know what the control flow will exactly look like, it can't do certain transformations. Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) Worse, inlining doesn't really save you, because any function call that gets inlined also will have an implied conditional goto for any function it calls, now to two possible landing pads. Is it safe to move code across this goto? It's comparatively hard to prove things about that. >> >> >> >> Worse, a compiler generally know if a function call throws or not, or event the type of exception that will get thrown. It definitely knows what the return type and value of a function is. >> >> >> >> It is very easy for a compiler to reason about the error checking around a function call--the code is all there and explicit. All of this restricts what it can prove about the program. And the less it can prove, the less aggressive it can be. >> >> >> >> >> >> On Thu, Aug 13, 2020 at 3:13 PM David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >>> >> >>> On Thu, Aug 13, 2020 at 2:59 PM Haoran Xu via llvm-dev >> >>> <llvm-dev at lists.llvm.org> wrote: >> >>> > >> >>> > Hello, >> >>> > >> >>> > I have been long taught that exceptions are considered the bad parts in the design of C++ language, and one should avoid using it since it's hard to write exception-safe code (I have no disagreement on this part) and it impedes compiler optimization. >> >>> > >> >>> > However, when recently I dig into the implementation of exception handling mechanism in LLVM recently, my impression is that using exceptions should instead improve performance (in the common case that no exception is thrown out), compared with the traditional approach of returning an error code in every function that can fail: no error-code-checking logic is executed in the fastpath, and error-handling code are moved from the main binary to the exception table, so the CPU is doing less work, and also instruction cache locality should be improved. Is my understanding correct? >> >>> >> >>> (this is probably a bit too broad a topic for this forum/tends to end >> >>> up in heated discussions, etc, but let's see how we go) >> >>> >> >>> I believe one of the main reasons your understanding there might be >> >>> incorrect: Not every function returns an error code. But essentially >> >>> every function can throw (yes, you can annotate them - but it's one of >> >>> those cases where C++ may've got the defaults wrong (like const, etc) >> >>> and it's impractical/unlikely someone's going to effectively maintain >> >>> noexcept on most of their codebase (and there are performance problems >> >>> with noexcept - since it's not UB to throw from such a function, it's >> >>> guaranteed to call terminate, so you have to generate exception >> >>> handling code in every noexcept function)) >> >>> >> >>> > So my question is: >> >>> > (1) Is the argument that 'exception hurts compiler optimization, and should not be used in perf-sensitive code' outdated now? >> >>> >> >>> I don't believe so - having every function call being potentially >> >>> control flow changing inhibits certain optimizations that would >> >>> otherwise be possible (certain code motion, reordering, etc can't be >> >>> done) compared to a relatively smaller number of explicit >> >>> error-handling calls. >> >>> >> >>> > Or more generally, what are the pros and cons of using exception vs error code, from a LLVM backend developer's perspective? >> >>> >> >>> Exceptions inhibit code motion, but allow for smaller code/keeping the >> >>> cold path/handling further away. I think that's basically the >> >>> mechanical tradeoff (setting aside issues of readability, >> >>> maintainability, etc - having an explicit keyword/notation on calls >> >>> that can throw could help a lot from a maintainability perspective, >> >>> for instance) >> >>> >> >>> > (2) In current LLVM design, is there still any optimization that is prevented by using exceptions? Or is the current LLVM already able to provide the same optimization quality for code using exceptions, compared with code that does not? >> >>> >> >>> I believe that exceptions do still hinder optimizations when compared >> >>> to code that doesn't throw or use a return code*. While that may seem >> >>> like an unfair comparison, it's my understanding that it's part of the >> >>> excessive cost of exceptions. Not to mention how expensive they can be >> >>> when they are thrown, compared to explicit error handling. >> >>> >> >>> * Note this isn't a case of missed optimization opportunities (well, >> >>> perhaps in LTO it's a missed opportunity - if we did whole program >> >>> analysis to prove certain call trees never throw in practice, then we >> >>> could propagate that information and improve optimizations) - but a >> >>> limitation of exceptions. >> >>> >> >>> - Dave >> >>> _______________________________________________ >> >>> LLVM Developers mailing list >> >>> llvm-dev at lists.llvm.org >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Thanks for the reply. Sorry, yes. No difference between explicit error handling and> exceptions. The difference is in all the functions that don't have > explicit error handling but (in the exception-using equivalent code) > aren't marked nothrow (where, without exceptions, the compiler could > move the increment across the call). >According to your argument, assuming that correctly annotating throw specifications is impractical, if one wants to mitigate the performance implications of exceptions, not using exceptions oneself is not going to be helpful (since as you said, explicit error-handling and exceptions have no difference), and instead one must completely disable exceptions with -fno-exceptions? On the other hand, if the project is not built with -fno-exceptions, using exceptions to replace explicit error-handling would not degrade performance. In other words, compare the 3 settings: (1) A project built with -fno-exception, and use explicit error-code-handling (2) A project built without -fno-exception, but does not use exception either, and use explicit error-code-handling (3) A project that use exceptions for error handling. Then setting (1) would be fastest, and setting (2)(3) would have similar performance (maybe (3) would be even faster than (2) due to better code locality?). Do you agree with this? David Blaikie <dblaikie at gmail.com> 于2020年8月13日周四 下午6:14写道:> On Thu, Aug 13, 2020 at 6:11 PM Haoran Xu <haoranxu510 at gmail.com> wrote: > > > > Thanks for the insights David! > > > > For your first 3 points, is it correct to understand it as following: > the external function prototypes are missing reliable information on > whether the function throws and what exceptions it may throw (due to C++'s > design failures and that it is impractical to maintain such information in > a large codebase), which is the main cause that code using exceptions > cannot be optimized as effectively as code that does not. > > That (lack of precise throw information for external functions due to > requiring manual annotation of nothrow), I believe, is a significant > issue, yes. > > >> The ABI - the agreement between compilation of different files/objects > >> limits what the compiler can do to change the mechanism for exception > >> handling. The compiler building 'f()' and the compiler building f()'s > >> caller must agree on how the exception will be modeled - so the > > > > I don't have much expertise on optimizer, but I think the compiler > doesn't need to break ABI to optimize those code..? > > The two code snippets in my example has effectively the same structure, > whatever optimization (e.g. move code around) that would work in the > error-code case should also work in the exception case? > > For example, if the compile decides to move some code after f() to > before f(), it would do below for the error-code case (just to demonstrate > that the same transformation could work for both exception and no-exception > code): > > [moved logic...] > > if (err = f()) { > > [undo side effect of moved logic..] > > call destructors > > return err > > } > > and for code that uses exception: > > [moved logic...] > > if (exception = f()) { > > [undo side effect of moved logic..] > > call destructors > > resume exception > > } > > and this is not breaking the ABI at all, right? > > Sorry, yes. No difference between explicit error handling and > exceptions. The difference is in all the functions that don't have > explicit error handling but (in the exception-using equivalent code) > aren't marked nothrow (where, without exceptions, the compiler could > move the increment across the call). > > - Dave > > > > > > > > > David Blaikie <dblaikie at gmail.com> 于2020年8月13日周四 下午5:06写道: > >> > >> On Thu, Aug 13, 2020 at 4:38 PM Haoran Xu <haoranxu510 at gmail.com> > wrote: > >> > > >> > Hello David and Sterling, thanks for the reply! > >> > > >> > I have no intention to heat up the discussion, please pardon me if I > asked questions that might sound silly to you -- it's just that I truly > didn't understand since I'm not expert in llvm or optimizer or exception > handling at all. > >> > > >> >> I believe one of the main reasons your understanding there might be > >> >> incorrect: Not every function returns an error code. But essentially > every function can throw > >> > > >> > I think if a function could throw, then it *should* return an error > code in the no-except approach, otherwise there would be no way to convey > the error to caller. > >> > >> I don't know of any codebase that uses "nothrow" ubiquitously enough > >> to represent this situation. In general in C++ code there will be many > >> more functions that can in theory throw (as far as the compiler's > >> concerned - ie: The functions are not marked "nothrow") but never do > >> in practice, than there would be functions that actually need error > >> handling/failures. > >> > >> > So I did't get the point that "essentially every function can throw". > Indeed by default most of the C++ functions may throw (and some of them may > actually be non-throwing), but LLVM should (at least in theory) be able to > correctly figure out which functions may actually throw as long as it has > the definition, is that correct? > >> > >> As long as it has the definition of the entire call graph - which is > >> unlikely without whole program analysis. > >> > >> > So if a function does not call external functions, the whole control > flow graph (including the throw parts) should be available to LLVM even if > there are exceptions, so I think there should be no limitation in moving > code around? > >> > >> That's a fairly uncommon property to hold - most code calls other > >> external functions. (certainly external to a file - and in many cases > >> external even to whole program optimization, depending on how you > >> define "whole program" and how you build your code, whether you have > >> access to the source of every library to rebuild them, etc) > >> > >> > In the case that the function calls external functions, then it > should already be hard to move code across that external function call, so > I thought having exceptions does not pose an additional limitation. Maybe I > misunderstood some part? > >> > >> There's still things that could be done moving across a call to > >> external code - eg: int f1() { static int i = 3; ++i; f2(); return i; > >> } - a compiler could move the increment back and forth over f2 > >> depending on its needs. If 'f2' can throw, then it's unsafe to move > >> the increment. > >> > >> >> Exceptions add a conditional goto from every single function call > (without good noexcept hygiene, which practically no program has), to the > exception handler or cleanup landing pad (whether the exception handler is > written in the code explicitly or not, something has to run the destructors > of objects that are going out of scope.) > >> > > >> > I did't quite understand what is different between this and an > error-checking 'if'. Let's say you have a function call f(). Then with > error code, the logic would look like > >> > error_code = f() > >> > if (error_code) { > >> > (call all destructors) <-- generated by llvm > >> > return error_code; > >> > } > >> > and with exception, the code would (conceptually) look like: > >> > exceptions = f() > >> > if (exceptions) { <-- generated by llvm > >> > call all destructors; <-- generated by llvm > >> > resume exceptions; <-- generated by llvm > >> > } <-- generated by llvm > >> > The only difference is that the code for 'if' branch would physically > reside in the exception table, but I couldn't see why this would make a > difference to the optimizer. The optimizer could in theory just optimize > the code as if it were a physically-existing 'if' branch (and achieve the > same optimized code as the error-code approach), and then finally put the > 'if' branch into the exception table. Am I missing some piece? > >> > >> The ABI - the agreement between compilation of different files/objects > >> limits what the compiler can do to change the mechanism for exception > >> handling. The compiler building 'f()' and the compiler building f()'s > >> caller must agree on how the exception will be modeled - so the > >> compiler can't readily make a decision about tradeoffs of different > >> schemes for exception propagation depending on profiles or other > >> things. Yes, with a full ABI break you could have a completely > >> different exception handling scheme that doesn't use tables at all - > >> adds an extra implicit return value from every function call wvhich is > >> the "did this throw" bit, with some other sidetable of the actual data > >> that was thrown, etc - I believe this is the sort of model Swift uses, > >> for instance. But that has a different set of performance tradeoffs (& > >> again, the defaults are problematic - C++ compilers must assume all > >> functions throw unless annotated otherwise, and C++ writers don't > >> annotate them enough to help much there) > >> > >> - Dave > >> > >> > > >> > Thanks, > >> > Haoran > >> > > >> > > >> > > >> > Sterling Augustine <saugustine at google.com> 于2020年8月13日周四 下午3:35写道: > >> >> > >> >> There is a fair amount of dispute and detail here, and real > benchmarks can be difficult to write, because you often end up in arguments > about whether or not the two styles of coding are equivalent or not. > >> >> > >> >> But I agree with Dave--exceptions generally inhibit optimization. > >> >> > >> >> One way to think about this is that, generally speaking, the less > the compiler can prove about a program, the less aggressive it can be with > optimizations. If it doesn't know what the control flow will exactly look > like, it can't do certain transformations. Exceptions add a conditional > goto from every single function call (without good noexcept hygiene, which > practically no program has), to the exception handler or cleanup landing > pad (whether the exception handler is written in the code explicitly or > not, something has to run the destructors of objects that are going out of > scope.) Worse, inlining doesn't really save you, because any function call > that gets inlined also will have an implied conditional goto for any > function it calls, now to two possible landing pads. Is it safe to move > code across this goto? It's comparatively hard to prove things about that. > >> >> > >> >> Worse, a compiler generally know if a function call throws or not, > or event the type of exception that will get thrown. It definitely knows > what the return type and value of a function is. > >> >> > >> >> It is very easy for a compiler to reason about the error checking > around a function call--the code is all there and explicit. All of this > restricts what it can prove about the program. And the less it can prove, > the less aggressive it can be. > >> >> > >> >> > >> >> On Thu, Aug 13, 2020 at 3:13 PM David Blaikie via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> >>> > >> >>> On Thu, Aug 13, 2020 at 2:59 PM Haoran Xu via llvm-dev > >> >>> <llvm-dev at lists.llvm.org> wrote: > >> >>> > > >> >>> > Hello, > >> >>> > > >> >>> > I have been long taught that exceptions are considered the bad > parts in the design of C++ language, and one should avoid using it since > it's hard to write exception-safe code (I have no disagreement on this > part) and it impedes compiler optimization. > >> >>> > > >> >>> > However, when recently I dig into the implementation of exception > handling mechanism in LLVM recently, my impression is that using exceptions > should instead improve performance (in the common case that no exception is > thrown out), compared with the traditional approach of returning an error > code in every function that can fail: no error-code-checking logic is > executed in the fastpath, and error-handling code are moved from the main > binary to the exception table, so the CPU is doing less work, and also > instruction cache locality should be improved. Is my understanding correct? > >> >>> > >> >>> (this is probably a bit too broad a topic for this forum/tends to > end > >> >>> up in heated discussions, etc, but let's see how we go) > >> >>> > >> >>> I believe one of the main reasons your understanding there might be > >> >>> incorrect: Not every function returns an error code. But essentially > >> >>> every function can throw (yes, you can annotate them - but it's one > of > >> >>> those cases where C++ may've got the defaults wrong (like const, > etc) > >> >>> and it's impractical/unlikely someone's going to effectively > maintain > >> >>> noexcept on most of their codebase (and there are performance > problems > >> >>> with noexcept - since it's not UB to throw from such a function, > it's > >> >>> guaranteed to call terminate, so you have to generate exception > >> >>> handling code in every noexcept function)) > >> >>> > >> >>> > So my question is: > >> >>> > (1) Is the argument that 'exception hurts compiler optimization, > and should not be used in perf-sensitive code' outdated now? > >> >>> > >> >>> I don't believe so - having every function call being potentially > >> >>> control flow changing inhibits certain optimizations that would > >> >>> otherwise be possible (certain code motion, reordering, etc can't be > >> >>> done) compared to a relatively smaller number of explicit > >> >>> error-handling calls. > >> >>> > >> >>> > Or more generally, what are the pros and cons of using exception > vs error code, from a LLVM backend developer's perspective? > >> >>> > >> >>> Exceptions inhibit code motion, but allow for smaller code/keeping > the > >> >>> cold path/handling further away. I think that's basically the > >> >>> mechanical tradeoff (setting aside issues of readability, > >> >>> maintainability, etc - having an explicit keyword/notation on calls > >> >>> that can throw could help a lot from a maintainability perspective, > >> >>> for instance) > >> >>> > >> >>> > (2) In current LLVM design, is there still any optimization that > is prevented by using exceptions? Or is the current LLVM already able to > provide the same optimization quality for code using exceptions, compared > with code that does not? > >> >>> > >> >>> I believe that exceptions do still hinder optimizations when > compared > >> >>> to code that doesn't throw or use a return code*. While that may > seem > >> >>> like an unfair comparison, it's my understanding that it's part of > the > >> >>> excessive cost of exceptions. Not to mention how expensive they can > be > >> >>> when they are thrown, compared to explicit error handling. > >> >>> > >> >>> * Note this isn't a case of missed optimization opportunities (well, > >> >>> perhaps in LTO it's a missed opportunity - if we did whole program > >> >>> analysis to prove certain call trees never throw in practice, then > we > >> >>> could propagate that information and improve optimizations) - but a > >> >>> limitation of exceptions. > >> >>> > >> >>> - Dave > >> >>> _______________________________________________ > >> >>> LLVM Developers mailing list > >> >>> llvm-dev at lists.llvm.org > >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200813/4a1c1224/attachment-0001.html>
On Thu, Aug 13, 2020 at 6:35 PM Haoran Xu <haoranxu510 at gmail.com> wrote:> > Thanks for the reply. > >> Sorry, yes. No difference between explicit error handling and >> exceptions. The difference is in all the functions that don't have >> explicit error handling but (in the exception-using equivalent code) >> aren't marked nothrow (where, without exceptions, the compiler could >> move the increment across the call). > > According to your argument, assuming that correctly annotating throw specifications is impractical, if one wants to mitigate the performance implications of exceptions, not using exceptions oneself is not going to be helpful (since as you said, explicit error-handling and exceptions have no difference), and instead one must completely disable exceptions with -fno-exceptions? > > On the other hand, if the project is not built with -fno-exceptions, using exceptions to replace explicit error-handling would not degrade performance.Once you get past the nothrow default problems, then you probably have to deal with the performance tradeoffs between the current strategy for exception implementations (table based, etc) compared to the tradeoffs for explicit error handling. You'd probably find that using exceptions for /every/ error return would not be the right perf tradeoff for many use cases where errors are at least somewhat common. So you'd probably end up with a hybrid solution - some things using exceptions where the failure is rare/going to be costly anyway (stopping the process, showing an error to a user/asking for user input about retrying, etc) and then still using explicit error handling for other things.> In other words, compare the 3 settings: > (1) A project built with -fno-exception, and use explicit error-code-handling > (2) A project built without -fno-exception, but does not use exception either, and use explicit error-code-handling > (3) A project that use exceptions for error handling. > Then setting (1) would be fastest, and setting (2)(3) would have similar performance (maybe (3) would be even faster than (2) due to better code locality?). Do you agree with this? > > > David Blaikie <dblaikie at gmail.com> 于2020年8月13日周四 下午6:14写道: >> >> On Thu, Aug 13, 2020 at 6:11 PM Haoran Xu <haoranxu510 at gmail.com> wrote: >> > >> > Thanks for the insights David! >> > >> > For your first 3 points, is it correct to understand it as following: the external function prototypes are missing reliable information on whether the function throws and what exceptions it may throw (due to C++'s design failures and that it is impractical to maintain such information in a large codebase), which is the main cause that code using exceptions cannot be optimized as effectively as code that does not. >> >> That (lack of precise throw information for external functions due to >> requiring manual annotation of nothrow), I believe, is a significant >> issue, yes. >> >> >> The ABI - the agreement between compilation of different files/objects >> >> limits what the compiler can do to change the mechanism for exception >> >> handling. The compiler building 'f()' and the compiler building f()'s >> >> caller must agree on how the exception will be modeled - so the >> > >> > I don't have much expertise on optimizer, but I think the compiler doesn't need to break ABI to optimize those code..? >> > The two code snippets in my example has effectively the same structure, whatever optimization (e.g. move code around) that would work in the error-code case should also work in the exception case? >> > For example, if the compile decides to move some code after f() to before f(), it would do below for the error-code case (just to demonstrate that the same transformation could work for both exception and no-exception code): >> > [moved logic...] >> > if (err = f()) { >> > [undo side effect of moved logic..] >> > call destructors >> > return err >> > } >> > and for code that uses exception: >> > [moved logic...] >> > if (exception = f()) { >> > [undo side effect of moved logic..] >> > call destructors >> > resume exception >> > } >> > and this is not breaking the ABI at all, right? >> >> Sorry, yes. No difference between explicit error handling and >> exceptions. The difference is in all the functions that don't have >> explicit error handling but (in the exception-using equivalent code) >> aren't marked nothrow (where, without exceptions, the compiler could >> move the increment across the call). >> >> - Dave >> >> > >> > >> > >> > David Blaikie <dblaikie at gmail.com> 于2020年8月13日周四 下午5:06写道: >> >> >> >> On Thu, Aug 13, 2020 at 4:38 PM Haoran Xu <haoranxu510 at gmail.com> wrote: >> >> > >> >> > Hello David and Sterling, thanks for the reply! >> >> > >> >> > I have no intention to heat up the discussion, please pardon me if I asked questions that might sound silly to you -- it's just that I truly didn't understand since I'm not expert in llvm or optimizer or exception handling at all. >> >> > >> >> >> I believe one of the main reasons your understanding there might be >> >> >> incorrect: Not every function returns an error code. But essentially every function can throw >> >> > >> >> > I think if a function could throw, then it *should* return an error code in the no-except approach, otherwise there would be no way to convey the error to caller. >> >> >> >> I don't know of any codebase that uses "nothrow" ubiquitously enough >> >> to represent this situation. In general in C++ code there will be many >> >> more functions that can in theory throw (as far as the compiler's >> >> concerned - ie: The functions are not marked "nothrow") but never do >> >> in practice, than there would be functions that actually need error >> >> handling/failures. >> >> >> >> > So I did't get the point that "essentially every function can throw". Indeed by default most of the C++ functions may throw (and some of them may actually be non-throwing), but LLVM should (at least in theory) be able to correctly figure out which functions may actually throw as long as it has the definition, is that correct? >> >> >> >> As long as it has the definition of the entire call graph - which is >> >> unlikely without whole program analysis. >> >> >> >> > So if a function does not call external functions, the whole control flow graph (including the throw parts) should be available to LLVM even if there are exceptions, so I think there should be no limitation in moving code around? >> >> >> >> That's a fairly uncommon property to hold - most code calls other >> >> external functions. (certainly external to a file - and in many cases >> >> external even to whole program optimization, depending on how you >> >> define "whole program" and how you build your code, whether you have >> >> access to the source of every library to rebuild them, etc) >> >> >> >> > In the case that the function calls external functions, then it should already be hard to move code across that external function call, so I thought having exceptions does not pose an additional limitation. Maybe I misunderstood some part? >> >> >> >> There's still things that could be done moving across a call to >> >> external code - eg: int f1() { static int i = 3; ++i; f2(); return i; >> >> } - a compiler could move the increment back and forth over f2 >> >> depending on its needs. If 'f2' can throw, then it's unsafe to move >> >> the increment. >> >> >> >> >> Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) >> >> > >> >> > I did't quite understand what is different between this and an error-checking 'if'. Let's say you have a function call f(). Then with error code, the logic would look like >> >> > error_code = f() >> >> > if (error_code) { >> >> > (call all destructors) <-- generated by llvm >> >> > return error_code; >> >> > } >> >> > and with exception, the code would (conceptually) look like: >> >> > exceptions = f() >> >> > if (exceptions) { <-- generated by llvm >> >> > call all destructors; <-- generated by llvm >> >> > resume exceptions; <-- generated by llvm >> >> > } <-- generated by llvm >> >> > The only difference is that the code for 'if' branch would physically reside in the exception table, but I couldn't see why this would make a difference to the optimizer. The optimizer could in theory just optimize the code as if it were a physically-existing 'if' branch (and achieve the same optimized code as the error-code approach), and then finally put the 'if' branch into the exception table. Am I missing some piece? >> >> >> >> The ABI - the agreement between compilation of different files/objects >> >> limits what the compiler can do to change the mechanism for exception >> >> handling. The compiler building 'f()' and the compiler building f()'s >> >> caller must agree on how the exception will be modeled - so the >> >> compiler can't readily make a decision about tradeoffs of different >> >> schemes for exception propagation depending on profiles or other >> >> things. Yes, with a full ABI break you could have a completely >> >> different exception handling scheme that doesn't use tables at all - >> >> adds an extra implicit return value from every function call wvhich is >> >> the "did this throw" bit, with some other sidetable of the actual data >> >> that was thrown, etc - I believe this is the sort of model Swift uses, >> >> for instance. But that has a different set of performance tradeoffs (& >> >> again, the defaults are problematic - C++ compilers must assume all >> >> functions throw unless annotated otherwise, and C++ writers don't >> >> annotate them enough to help much there) >> >> >> >> - Dave >> >> >> >> > >> >> > Thanks, >> >> > Haoran >> >> > >> >> > >> >> > >> >> > Sterling Augustine <saugustine at google.com> 于2020年8月13日周四 下午3:35写道: >> >> >> >> >> >> There is a fair amount of dispute and detail here, and real benchmarks can be difficult to write, because you often end up in arguments about whether or not the two styles of coding are equivalent or not. >> >> >> >> >> >> But I agree with Dave--exceptions generally inhibit optimization. >> >> >> >> >> >> One way to think about this is that, generally speaking, the less the compiler can prove about a program, the less aggressive it can be with optimizations. If it doesn't know what the control flow will exactly look like, it can't do certain transformations. Exceptions add a conditional goto from every single function call (without good noexcept hygiene, which practically no program has), to the exception handler or cleanup landing pad (whether the exception handler is written in the code explicitly or not, something has to run the destructors of objects that are going out of scope.) Worse, inlining doesn't really save you, because any function call that gets inlined also will have an implied conditional goto for any function it calls, now to two possible landing pads. Is it safe to move code across this goto? It's comparatively hard to prove things about that. >> >> >> >> >> >> Worse, a compiler generally know if a function call throws or not, or event the type of exception that will get thrown. It definitely knows what the return type and value of a function is. >> >> >> >> >> >> It is very easy for a compiler to reason about the error checking around a function call--the code is all there and explicit. All of this restricts what it can prove about the program. And the less it can prove, the less aggressive it can be. >> >> >> >> >> >> >> >> >> On Thu, Aug 13, 2020 at 3:13 PM David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> >>> >> >> >>> On Thu, Aug 13, 2020 at 2:59 PM Haoran Xu via llvm-dev >> >> >>> <llvm-dev at lists.llvm.org> wrote: >> >> >>> > >> >> >>> > Hello, >> >> >>> > >> >> >>> > I have been long taught that exceptions are considered the bad parts in the design of C++ language, and one should avoid using it since it's hard to write exception-safe code (I have no disagreement on this part) and it impedes compiler optimization. >> >> >>> > >> >> >>> > However, when recently I dig into the implementation of exception handling mechanism in LLVM recently, my impression is that using exceptions should instead improve performance (in the common case that no exception is thrown out), compared with the traditional approach of returning an error code in every function that can fail: no error-code-checking logic is executed in the fastpath, and error-handling code are moved from the main binary to the exception table, so the CPU is doing less work, and also instruction cache locality should be improved. Is my understanding correct? >> >> >>> >> >> >>> (this is probably a bit too broad a topic for this forum/tends to end >> >> >>> up in heated discussions, etc, but let's see how we go) >> >> >>> >> >> >>> I believe one of the main reasons your understanding there might be >> >> >>> incorrect: Not every function returns an error code. But essentially >> >> >>> every function can throw (yes, you can annotate them - but it's one of >> >> >>> those cases where C++ may've got the defaults wrong (like const, etc) >> >> >>> and it's impractical/unlikely someone's going to effectively maintain >> >> >>> noexcept on most of their codebase (and there are performance problems >> >> >>> with noexcept - since it's not UB to throw from such a function, it's >> >> >>> guaranteed to call terminate, so you have to generate exception >> >> >>> handling code in every noexcept function)) >> >> >>> >> >> >>> > So my question is: >> >> >>> > (1) Is the argument that 'exception hurts compiler optimization, and should not be used in perf-sensitive code' outdated now? >> >> >>> >> >> >>> I don't believe so - having every function call being potentially >> >> >>> control flow changing inhibits certain optimizations that would >> >> >>> otherwise be possible (certain code motion, reordering, etc can't be >> >> >>> done) compared to a relatively smaller number of explicit >> >> >>> error-handling calls. >> >> >>> >> >> >>> > Or more generally, what are the pros and cons of using exception vs error code, from a LLVM backend developer's perspective? >> >> >>> >> >> >>> Exceptions inhibit code motion, but allow for smaller code/keeping the >> >> >>> cold path/handling further away. I think that's basically the >> >> >>> mechanical tradeoff (setting aside issues of readability, >> >> >>> maintainability, etc - having an explicit keyword/notation on calls >> >> >>> that can throw could help a lot from a maintainability perspective, >> >> >>> for instance) >> >> >>> >> >> >>> > (2) In current LLVM design, is there still any optimization that is prevented by using exceptions? Or is the current LLVM already able to provide the same optimization quality for code using exceptions, compared with code that does not? >> >> >>> >> >> >>> I believe that exceptions do still hinder optimizations when compared >> >> >>> to code that doesn't throw or use a return code*. While that may seem >> >> >>> like an unfair comparison, it's my understanding that it's part of the >> >> >>> excessive cost of exceptions. Not to mention how expensive they can be >> >> >>> when they are thrown, compared to explicit error handling. >> >> >>> >> >> >>> * Note this isn't a case of missed optimization opportunities (well, >> >> >>> perhaps in LTO it's a missed opportunity - if we did whole program >> >> >>> analysis to prove certain call trees never throw in practice, then we >> >> >>> could propagate that information and improve optimizations) - but a >> >> >>> limitation of exceptions. >> >> >>> >> >> >>> - Dave >> >> >>> _______________________________________________ >> >> >>> LLVM Developers mailing list >> >> >>> llvm-dev at lists.llvm.org >> >> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev