Andrew Kelley via llvm-dev
2016-Dec-18 02:43 UTC
[llvm-dev] LLD status update and performance chart
On Sat, Dec 17, 2016 at 9:32 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On Fri, Dec 16, 2016 at 12:31 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Dec 16, 2016, at 11:46 AM, Rui Ueyama <ruiu at google.com> wrote:
>>
>> On Fri, Dec 16, 2016 at 11:18 AM, Pete Cooper <peter_cooper at apple.com> wrote:
>>
>>> Hi Rui
>>>
>>> I agree separating the components out in to libraries only makes sense
>>> when there is a clear reason to do so. However, just this year there was a
>>> very involved discussion about what it means to be a library.
>>> Specifically, I don't think your current 'main-as-library' argument is
>>> valid while you call exit or (if you) rely on mutable global state. Having
>>> a single entry point via a main function is fine, but that function cannot
>>> then kill the process which its linked in to.
>>>
>>
>> Our main function returns as long as input object files are not
>> corrupted. If you are doing in-memory linking, I think it is unlikely that
>> the object files in memory are corrupted (especially when you just created
>> them using LLVM), so I think this satisfies most users needs in practice.
>> Do you have a concern about that?
>>
>> Ultimately my concern is that there is *any* code path calling exit. I
>> would say that this prevents the lld library from being used in-process.
>> But others opinions may differ, and I honestly don't have a use case in
>> mind, just that I don't think library code should ever call exit.
>>
>
> I agreed with the sentiment at first, but after thinking about it for a
> while, I actually have convinced myself that it doesn't hold water under
> closer inspection.
>
> The fundamental thing is that the LLVM libraries actually do have tons of
> fatal errors; they're just in the form of assert's (or we'll dereference a
> null pointer, or run off the end of a data structure, or go into an
> infinite loop, etc.).
>
> If you pass a corrupted Module to LLVM through the library API, you can
> certainly trip tons of "fatal errors" (in the form of failed assertions or
> UB). The way that LLVM gets around this is by having a policy of "if you
> pass it corrupted Module that doesn't pass the verifier, it's your fault,
> you're using our API wrong". Why can't an LLD library API have that same
> requirement?

I agree that if an API user violates the API of a library, it is appropriate for the library to abort with a fatal error.

However if the API is used correctly, but some error occurs, this error should be reported back to the API consumer.

> If it is safe for clang to uses the LLVM library API without running the
> verifier as its default configuration for non-development builds, why would
> it be unsafe for (say) clang to pass an object file directly to LLD as a
> library without verification? Like Rui said, it's absolutely possible to
> create a verifier pass for LLD; it just hasn't been written because most
> object files we've seen so far seem to come from a small number of
> well-tested codepaths that always (as in the `assert` meaning of "always")
> create valid ELF files. In fact, we've added graceful recovery as
> appropriate (e.g. r259831), which is a step above error handling!
>
> Also, I'd like to point out that Clang, even when it does run the LLVM
> verifier (which is not the default except in development builds), runs it
> with fatal error handling. Is anybody aware of a program that uses LLVM as
> a library, produces IR in memory, runs the verifier, and does not simply
> abort if the verifier fails in non-development builds?

I'm doing the same as clang:

    #ifndef NDEBUG
    char *error = nullptr;
    LLVMVerifyModule(g->module, LLVMAbortProcessAction, &error);
    #endif

However the LLVM API is defined such that trying to call codegen functions on an invalid module is undefined, and aborting in this case makes sense. This is really just a more sophisticated assert().

I'm a fan of assert(); I like having assertions on during development. It makes sense for a library to assert if API is violated. But errors not due to API violations should be reported back to the caller instead of aborting.
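
The split argued for in this message (assert on API violations, report everything else to the caller) can be sketched as follows. This is a minimal illustration only: `ObjectFile`, `LinkResult`, and `link_objects` are hypothetical names invented for the example and are not part of LLD or LLVM. An empty output path is treated as a caller bug, while an undefined symbol is treated as an ordinary error on well-formed input.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed, well-formed object file.
struct ObjectFile {
    std::vector<std::string> defined;     // symbols this object defines
    std::vector<std::string> referenced;  // symbols this object needs
};

// Hypothetical result type: `error` is empty on success.
struct LinkResult {
    bool ok;
    std::string error;
};

// Hypothetical in-process link entry point (not LLD's real API).
LinkResult link_objects(const std::vector<ObjectFile>& objects,
                        const std::string& output_path) {
    // API violation: the caller broke the contract, so assert.
    assert(!output_path.empty() && "API violation: output path is required");

    // Collect every symbol defined by any input.
    std::set<std::string> defined;
    for (const ObjectFile& obj : objects)
        defined.insert(obj.defined.begin(), obj.defined.end());

    // An unresolved reference is not an API violation: the inputs are
    // well-formed, so the failure is returned instead of killing the process.
    for (const ObjectFile& obj : objects)
        for (const std::string& sym : obj.referenced)
            if (defined.count(sym) == 0)
                return {false, "undefined symbol: " + sym};

    return {true, ""};
}
```

With this shape, a host application that links in-process can show "undefined symbol: helper" to its user and keep running, while a genuinely wrong call still trips the assertion during development.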
Sean Silva via llvm-dev
2016-Dec-18 02:49 UTC
[llvm-dev] LLD status update and performance chart
On Sat, Dec 17, 2016 at 6:43 PM, Andrew Kelley <superjoe30 at gmail.com> wrote:

> On Sat, Dec 17, 2016 at 9:32 PM, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Fri, Dec 16, 2016 at 12:31 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>> On Dec 16, 2016, at 11:46 AM, Rui Ueyama <ruiu at google.com> wrote:
>>>
>>> On Fri, Dec 16, 2016 at 11:18 AM, Pete Cooper <peter_cooper at apple.com> wrote:
>>>
>>>> Hi Rui
>>>>
>>>> I agree separating the components out in to libraries only makes sense
>>>> when there is a clear reason to do so. However, just this year there was a
>>>> very involved discussion about what it means to be a library.
>>>> Specifically, I don't think your current 'main-as-library' argument is
>>>> valid while you call exit or (if you) rely on mutable global state. Having
>>>> a single entry point via a main function is fine, but that function cannot
>>>> then kill the process which its linked in to.
>>>>
>>>
>>> Our main function returns as long as input object files are not
>>> corrupted. If you are doing in-memory linking, I think it is unlikely that
>>> the object files in memory are corrupted (especially when you just created
>>> them using LLVM), so I think this satisfies most users needs in practice.
>>> Do you have a concern about that?
>>>
>>> Ultimately my concern is that there is *any* code path calling exit. I
>>> would say that this prevents the lld library from being used in-process.
>>> But others opinions may differ, and I honestly don't have a use case in
>>> mind, just that I don't think library code should ever call exit.
>>>
>>
>> I agreed with the sentiment at first, but after thinking about it for a
>> while, I actually have convinced myself that it doesn't hold water under
>> closer inspection.
>>
>> The fundamental thing is that the LLVM libraries actually do have tons of
>> fatal errors; they're just in the form of assert's (or we'll dereference a
>> null pointer, or run off the end of a data structure, or go into an
>> infinite loop, etc.).
>>
>> If you pass a corrupted Module to LLVM through the library API, you can
>> certainly trip tons of "fatal errors" (in the form of failed assertions or
>> UB). The way that LLVM gets around this is by having a policy of "if you
>> pass it corrupted Module that doesn't pass the verifier, it's your fault,
>> you're using our API wrong". Why can't an LLD library API have that same
>> requirement?
>>
>
> I agree that if an API user violates the API of a library, it is
> appropriate for the library to abort with a fatal error.
>
> However if the API is used correctly, but some error occurs, this error
> should be reported back to the API consumer.
>
>
>> If it is safe for clang to uses the LLVM library API without running the
>> verifier as its default configuration for non-development builds, why would
>> it be unsafe for (say) clang to pass an object file directly to LLD as a
>> library without verification? Like Rui said, it's absolutely possible to
>> create a verifier pass for LLD; it just hasn't been written because most
>> object files we've seen so far seem to come from a small number of
>> well-tested codepaths that always (as in the `assert` meaning of "always")
>> create valid ELF files. In fact, we've added graceful recovery as
>> appropriate (e.g. r259831), which is a step above error handling!
>>
>> Also, I'd like to point out that Clang, even when it does run the LLVM
>> verifier (which is not the default except in development builds), runs it
>> with fatal error handling. Is anybody aware of a program that uses LLVM as
>> a library, produces IR in memory, runs the verifier, and does not simply
>> abort if the verifier fails in non-development builds?
>>
>
> I'm doing the same as clang:
>
> #ifndef NDEBUG
> char *error = nullptr;
> LLVMVerifyModule(g->module, LLVMAbortProcessAction, &error);
> #endif
>
> However the LLVM API is defined such that trying to call codegen functions
> on an invalid module is undefined, and aborting in this case makes sense.
> This is really just a more sophisticated assert().
>
> I'm a fan of assert(); I like having assertions on during development. It
> makes sense for a library to assert if API is violated. But errors not due
> to API violations should be reported back to the caller instead of aborting.

Well, LLD/ELF's API is also documented to not be guaranteed to return if you pass it corrupted object files:

```
The current policy is that it is your responsibility to give trustworthy object
files. The function is guaranteed to return as long as you do not pass corrupted
or malicious object files. A corrupted file could cause a fatal error or SEGV.
That being said, you don't need to worry too much about it if you create object
files in the usual way and give them to the linker. It is naturally expected to
work, or otherwise it's a linker's bug.
```

-- Sean Silva
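
Given a contract like the one quoted above, a caller that cannot fully trust its inputs could run its own check before handing a buffer to the linker. The following is a minimal sketch of the shallowest such check, assuming a hypothetical helper named `check_elf_header` that is not part of LLD. It only tests the size of the identification block and the ELF magic bytes, so it is nowhere near the full verifier pass mentioned in the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical pre-link check (not an LLD function). Returns a diagnostic
// string, or an empty string if this shallow check passes.
std::string check_elf_header(const std::vector<std::uint8_t>& buf) {
    // Every ELF file begins with the identification bytes 0x7f 'E' 'L' 'F'.
    static const std::uint8_t kMagic[4] = {0x7f, 'E', 'L', 'F'};
    // The identification block (e_ident) is 16 bytes long.
    const std::size_t kIdentSize = 16;

    if (buf.size() < kIdentSize)
        return "file too small to hold an ELF identification block";
    for (std::size_t i = 0; i < 4; ++i)
        if (buf[i] != kMagic[i])
            return "bad ELF magic";
    return "";
}
```

A real verifier would also have to validate section headers, symbol tables, and relocations. The point of the sketch is only that rejection happens in the caller, as a returned diagnostic, before the linker's "trusted input" contract comes into play.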
Antoine Pitrou via llvm-dev
2016-Dec-18 14:53 UTC
[llvm-dev] LLD status update and performance chart
On Sat, 17 Dec 2016 21:43:16 -0500 Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I agree that if an API user violates the API of a library, it is
> appropriate for the library to abort with a fatal error.

<unlurking>

Is it? If you pass an invalid fd to the libc, it replies with a EBADF, it doesn't crash hard. Most mature libraries have guards against invalid or inconsistent parameter values, and return error codes to the caller.

As someone who maintains and uses an LLVM binding to Python (llvmlite), it's one of the annoyances we have faced: if someone makes a mistake when calling one of the exposed APIs, that API may crash the process (while, as Python programmers, they would rather get an exception, which at least makes it easier to debug and diagnose the issue). Getting a crude assert-induced crash on a CI machine or a user's machine is no fun.

Of course, a C or C++ library cannot guard against everything, especially not against invalid pointers or corrupted memory. But large classes of user errors may be better served by actually returning an error code rather than failing on an assert.

</unlurking>

Regards

Antoine.
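
The libc behaviour described above, and the way a binding layer can turn it into an exception, can be sketched like this. It assumes a POSIX system (for `close` and `<unistd.h>`); the wrapper name `close_or_throw` is hypothetical and only stands in for the kind of shim a language binding might provide.

```cpp
#include <cerrno>
#include <system_error>
#include <unistd.h>

// Hypothetical binding-style wrapper: libc reports an invalid descriptor
// through its return value and errno, and the wrapper converts that into
// an exception the caller can catch and diagnose.
void close_or_throw(int fd) {
    if (::close(fd) == -1) {
        // errno is EBADF when fd is not a valid open file descriptor.
        throw std::system_error(errno, std::generic_category(), "close");
    }
}
```

Calling `close_or_throw(-1)` throws a `std::system_error` whose code is `EBADF`. This conversion is only possible because the library returned an error; an assert-induced abort would leave the binding nothing to translate.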
Andrew Kelley via llvm-dev
2016-Dec-19 00:02 UTC
[llvm-dev] LLD status update and performance chart
On Sun, Dec 18, 2016 at 9:53 AM, Antoine Pitrou via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On Sat, 17 Dec 2016 21:43:16 -0500
> Andrew Kelley via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > I agree that if an API user violates the API of a library, it is
> > appropriate for the library to abort with a fatal error.
>
> Is it? If you pass an invalid fd to the libc, it replies with a EBADF,
> it doesn't crash hard. Most mature libraries have guards against invalid
> or inconsistent parameter values, and return error codes to the caller.
>
> As someone who maintains and uses an LLVM binding to Python (llvmlite),
> it's one of the annoyances we have faced: if someone makes a mistake
> when calling one of the exposed APIs, that API may crash the process
> (while, as Python programmers, they would rather get an exception,
> which at least makes it easier to debug and diagnose the issue).
> Getting a crude assert-induced crash on a CI machine or a user's
> machine is no fun.
>
> Of course, a C or C++ library cannot guard against everything,
> especially not against invalid pointers or corrupted memory. But large
> classes of user errors may be better served by actually returning an
> error code rather than failing on an assert.

Going above and beyond to avoid crashing on invalid use of the API is nice, and it results in fewer bug reports being filed against the library that should have been filed against the upstream application. However, it's not required for a robust library.

The weaker guarantee, that adhering to the API never triggers an exit path in the library, is easier to uphold, while still letting the upstream application developer create software that does not crash in the library regardless of the user input. Trying to go beyond this guarantee is the kind of thing that will make development less pleasant for Rafael and co, and while nice, is not really necessary.