thr3ads.net - llvm dev - [llvm-dev] [RFC] Error handling in LLVM libraries. [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Lang Hames via llvm-dev

2016-Feb-10 00:23 UTC

[llvm-dev] [RFC] Error handling in LLVM libraries.

> > I don't think these are really independent. Whether or not you
need to
emit> > a diagnostic depends on whether a caller can handle the corresponding
error,> > which isn't something you know at the point where the error is
raised.
> But you do in the diag handler. For example, if you are trying to open
> multiple files, some of which are bitcode, you know to ignore
> InvalidBitcodeSignature.
That means anticipating the kinds of errors you do/don't want to recover
from at the point at which you insatiate your diagnostic handler. At that
point you may not have the information that you need to know whether you
want to recover. E.g. What if one path through library code can recover,
and another can't? Worse still, what if the divergence point between these
two paths is in library code itself? Now you need to thread two diagnostic
handlers up to that point. This scales very poorly.

Simply put: diagnostic handlers aren't library friendly. It's tough
enough
to get all libraries to agree on an error type, without having them all
agree on a diagnostic handlers too, and thread them everywhere.
> This highlights why I think it is important to keep diagnostics and
> errors separate. In the solution you have there is a need to allocate
> a std::string, even if that is never used.
Errors are only constructed on the error path. There is no construction on
the success path.
> If it was always a
> std::string it would not be that bad, but the framework seems designed
> to allow even more context to be stored, opening the way for fun cases
> of error handling interfering with lifetime management.
This is a real problem, but I think the solution is to have a style
guideline: You can't use reference types (in the general sense -
references, pointers, etc.) in errors. Errors that would otherwise require
references should have their error-message stringified at the point of
creation.
> > Consider someone who wants to use libObject to write an object-file
repair> > tool: A "bad_symbol" error code tells you what went wrong,
but not
where or> > why, so it's not much use in attempting a fix. On the other hand
"bad_symbol> > { idx = 10 }" gives you enough information to pop up a dialog,
ask the
user> > what the symbol at index 10 should be, then re-write that symbol table
> > entry.
> Using error results to find deep information you want to patch seems
> like a bad idea. A tool that wants to patch symbol indexes should be
> reading them directly.
I'm not sure I agree, but I don't have strong feelings on this
particular
example - it was intended as a straw man. My point is that error recover is
useful in general: there is a reason things like exceptions exist.
Diagnostic handlers are very poorly suited to general error recovery.
> > void check(TypedError Err) {
> >   if (!Err)
> >     return;
> >
> >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
> >   exit(1);
> > }
> That is not the same that you get with a diagnostic handler. What you
> get is an exit after the error was propagated over the library layer,
> instead of an exit at the point where the issue is found.
As long as you log the error I'm not sure I see a meaningful difference? If
you're not debugging you shouldn't be crashing in your library. If you
are
debugging you want to set a breakpoint on the error constructor instead.
> I would prefer the diagnostic handler for exactly the same reason :-)
> Generality has a cost...
Could you explain what you see as the cost in this instance?

The scheme has a higher cost that std::error_code alone when you're
returning an error, but I don't think optimizing the error case is a
sensible thing to do. It has a potentially *lower* cost on the success
path, as you don't need to take up an argument for the diagnostic handler.

The most serious objection that you've raised, to my mind, is the potential
lifetime issues if people mistakenly use references in their error types.
I'm planning to write additions to llvm's coding guidelines to cover
that,
which I think should be sufficient.

On the flip side, compared to a diagnostic handler, I think the scheme I've
proposed has the following benefits:

- No need to thread a diagnostic handler type through the world (and
consequently no need to agree on its type). Retrofitting rich error logging
to an interface that already returns errors requires less work than adding
a diagnostic handler.
- The ability to describe error hierarchies (e.g. the class of all errors
that represent malformed objects, which may be further subclassed to
describe specific errors).
- The ability for client code to meaningfully distinguish between error
instances, rather than just error types.
> , and I don't think we should invest on it until
> we have a concrete case where that is needed.
The error infrastructure investment is largely done, and I'm volunteering
to maintain it. Myself and Kevin Enderby are signing up to weave this
through libObject and the JIT. Other libraries could be converted as people
see fit, with ECError providing an interface between the std::error_code
world and TypedError.
> Given that I would
> suggest going the more incremental way. Add a CheckedError that wraps
> error_code and use to make sure the errors are checked. When a better
> diagnostic is needed, pass in a diagnostic handler.
Our first use-case for this system is libObject, where we want to use this
to enable richer error messages. An incremental approach would involve
threading a diagnostic handler through everything (and returning
CheckedError), then un-threading it again would involve a lot of churn. I'd
prefer to move directly to the scheme I'm proposing.

Cheers,
Lang.


On Tue, Feb 9, 2016 at 3:27 PM, Rafael Espíndola <rafael.espindola at
gmail.com> wrote:
> > I don't think these are really independent. Whether or not you
need to
> emit
> > a diagnostic depends on whether a caller can handle the corresponding
> error,
> > which isn't something you know at the point where the error is
raised.
>
> But you do in the diag handler. For example, if you are trying to open
> multiple files, some of which are bitcode, you know to ignore
> InvalidBitcodeSignature.
>
>
> > That's the idea behind the 'log' method on TypedErrorInfo:
It lets you
> > transform meaningful information about the problem into a log message
> > *after* you know whether it can be handled or not. Recoverable errors
> can be
> > logged (if the client wants to do so), but they don't have to be.
>
> ...
>
> > By contrast, in my system this would be:
> >
> > class InvalidBitcodeSignature :
TypedErrorInfo<InvalidBitcodeSignature>
> {};
> > class CorruptedBitcode : TypedErrorInfo<CorruptedBitcode> {
> > public:
> >   CorruptedBitcode(std::string Msg) : Msg(Msg) {}
> >   void log(raw_ostream &OS) const override { OS << Msg; }
> > private:
> >   std::string Msg;
> > };
>
> This highlights why I think it is important to keep diagnostics and
> errors separate. In the solution you have there is a need to allocate
> a std::string, even if that is never used. If it was always a
> std::string it would not be that bad, but the framework seems designed
> to allow even more context to be stored, opening the way for fun cases
> of error handling interfering with lifetime management.
>
> > Consider someone who wants to use libObject to write an object-file
> repair
> > tool: A "bad_symbol" error code tells you what went wrong,
but not where
> or
> > why, so it's not much use in attempting a fix. On the other hand
> "bad_symbol
> > { idx = 10 }" gives you enough information to pop up a dialog,
ask the
> user
> > what the symbol at index 10 should be, then re-write that symbol table
> > entry.
>
> Using error results to find deep information you want to patch seems
> like a bad idea. A tool that wants to patch symbol indexes should be
> reading them directly.
>
> >
> >> Other advantages are:
> >>
> >> * It is easy to use fatal error in simple programs by just using a
> >> diganostic handler that exits.
> >> * Very debugger friendly. It is easy to put a breakpoint at the
diag
> >> handler.
> >
> > This proposal provides the same advantages. I noted the second point
in
> the
> > original RFC. The first is easily implemented using an idiom I've
seen in
> > llvm-objdump and elsewhere:
> >
> > void check(TypedError Err) {
> >   if (!Err)
> >     return;
> >
> >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
> >   exit(1);
> > }
>
> That is not the same that you get with a diagnostic handler. What you
> get is an exit after the error was propagated over the library layer,
> instead of an exit at the point where the issue is found.
>
> We use the above code now because lib/Object has no diagnostic handler.
>
> > Side-note: you can actually go one better and replace this idiom with:
> >
> > template <typename T>
> > T check(TypedErrorOr<T> ValOrErr) {
> >   if (!ValOrErr) {
> >     logAllUnhandledErrors(ValOrErr.takeError(), errs(), "<tool
name>");
> >     exit(1);
> >   }
> >   return std::move(*ValOrErr);
> > }
>
> Yes, we do pretty much that in ELF/COFF lld.
>
>
> > Using TypedError as a wrapper for std::error_code would prevent errors
> from
> > be silently dropped, which is a huge step forward. (Though TypedError
> would
> > be a misnomer - we'd need a new name. :)
>
> CheckedError?
>
> > Using a diagnostic handler would allow richer diagnostics from
libObject,
> > but it supports a narrower class of error-recovery than my proposal.
I'm
> > inclined to favor my proposal as a strictly more general solution.
>
> I would prefer the diagnostic handler for exactly the same reason :-)
> Generality has a cost, and I don't think we should invest on it until
> we have a concrete case where that is needed. Given that I would
> suggest going the more incremental way. Add a CheckedError that wraps
> error_code and use to make sure the errors are checked. When a better
> diagnostic is needed, pass in a diagnostic handler.
>
> If we ever get to the point were we really want to *return* more than
> an error_code, it will be a much smaller refactoring to
> s/CheckedError/TypedError/.
>
> Cheers,
> Rafael
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160209/3dddfdfc/attachment.html>

David Blaikie via llvm-dev

2016-Feb-10 01:06 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

On Tue, Feb 9, 2016 at 4:23 PM, Lang Hames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> > > I don't think these are really independent. Whether or not
you need to
> emit
> > > a diagnostic depends on whether a caller can handle the
corresponding
> error,
> > > which isn't something you know at the point where the error
is raised.
>
> > But you do in the diag handler. For example, if you are trying to open
> > multiple files, some of which are bitcode, you know to ignore
> > InvalidBitcodeSignature.
>
> That means anticipating the kinds of errors you do/don't want to
recover
> from at the point at which you insatiate your diagnostic handler. At that
> point you may not have the information that you need to know whether you
> want to recover. E.g. What if one path through library code can recover,
> and another can't?
>
+1 to this (I had the same thought - stateful handlers would be pretty
tricky, doubly so with \/ this point)

> Worse still, what if the divergence point between these two paths is in
> library code itself? Now you need to thread two diagnostic handlers up to
> that point. This scales very poorly.
>
> Simply put: diagnostic handlers aren't library friendly. It's tough
enough
> to get all libraries to agree on an error type, without having them all
> agree on a diagnostic handlers too, and thread them everywhere.
>
I think it depends a bit on the library & how generic it is - likely lower
level libraries (like libObject) will want a granular error-based result,
not a diagnostic engine. (the clients will be more widely varied (than,
say, Clang) & have different diagnostic/behavioral needs, etc)

But I think it makes total sense for Clang to use a diagnostic handling
approach for the most part (all the possible ways that C++ can be written
incorrectly essentially amount to the same thing for a consumer of Clang -
code was bad & I need some text (and fixits, notes, etc) to tell the user
why), the IR verifier similarly - and possibly LLD, for example, depending
on how broad the use cases become.

>
> > This highlights why I think it is important to keep diagnostics and
> > errors separate. In the solution you have there is a need to allocate
> > a std::string, even if that is never used.
>
> Errors are only constructed on the error path. There is no construction on
> the success path.
>
> > If it was always a
> > std::string it would not be that bad, but the framework seems designed
> > to allow even more context to be stored, opening the way for fun cases
> > of error handling interfering with lifetime management.
>
> This is a real problem, but I think the solution is to have a style
> guideline: You can't use reference types (in the general sense -
> references, pointers, etc.) in errors. Errors that would otherwise require
> references should have their error-message stringified at the point of
> creation.
>
> > > Consider someone who wants to use libObject to write an
object-file
> repair
> > > tool: A "bad_symbol" error code tells you what went
wrong, but not
> where or
> > > why, so it's not much use in attempting a fix. On the other
hand
> "bad_symbol
> > > { idx = 10 }" gives you enough information to pop up a
dialog, ask the
> user
> > > what the symbol at index 10 should be, then re-write that symbol
table
> > > entry.
>
> > Using error results to find deep information you want to patch seems
> > like a bad idea. A tool that wants to patch symbol indexes should be
> > reading them directly.
>
> I'm not sure I agree, but I don't have strong feelings on this
particular
> example - it was intended as a straw man. My point is that error recover is
> useful in general: there is a reason things like exceptions exist.
> Diagnostic handlers are very poorly suited to general error recovery.
>
> > > void check(TypedError Err) {
> > >   if (!Err)
> > >     return;
> > >
> > >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
> > >   exit(1);
> > > }
>
> > That is not the same that you get with a diagnostic handler. What you
> > get is an exit after the error was propagated over the library layer,
> > instead of an exit at the point where the issue is found.
>
> As long as you log the error I'm not sure I see a meaningful
difference?
> If you're not debugging you shouldn't be crashing in your library.
If you
> are debugging you want to set a breakpoint on the error constructor
instead.
>
> > I would prefer the diagnostic handler for exactly the same reason :-)
> > Generality has a cost...
>
> Could you explain what you see as the cost in this instance?
>
> The scheme has a higher cost that std::error_code alone when you're
> returning an error, but I don't think optimizing the error case is a
> sensible thing to do. It has a potentially *lower* cost on the success
> path, as you don't need to take up an argument for the diagnostic
handler.
>
> The most serious objection that you've raised, to my mind, is the
> potential lifetime issues if people mistakenly use references in their
> error types. I'm planning to write additions to llvm's coding
guidelines to
> cover that, which I think should be sufficient.
>
> On the flip side, compared to a diagnostic handler, I think the scheme
> I've proposed has the following benefits:
>
> - No need to thread a diagnostic handler type through the world (and
> consequently no need to agree on its type). Retrofitting rich error logging
> to an interface that already returns errors requires less work than adding
> a diagnostic handler.
> - The ability to describe error hierarchies (e.g. the class of all errors
> that represent malformed objects, which may be further subclassed to
> describe specific errors).
> - The ability for client code to meaningfully distinguish between error
> instances, rather than just error types.
>
> > , and I don't think we should invest on it until
> > we have a concrete case where that is needed.
>
> The error infrastructure investment is largely done, and I'm
volunteering
> to maintain it. Myself and Kevin Enderby are signing up to weave this
> through libObject and the JIT. Other libraries could be converted as people
> see fit, with ECError providing an interface between the std::error_code
> world and TypedError.
>
> > Given that I would
> > suggest going the more incremental way. Add a CheckedError that wraps
> > error_code and use to make sure the errors are checked. When a better
> > diagnostic is needed, pass in a diagnostic handler.
>
> Our first use-case for this system is libObject, where we want to use this
> to enable richer error messages. An incremental approach would involve
> threading a diagnostic handler through everything (and returning
> CheckedError), then un-threading it again would involve a lot of churn.
I'd
> prefer to move directly to the scheme I'm proposing.
>
> Cheers,
> Lang.
>
>
> On Tue, Feb 9, 2016 at 3:27 PM, Rafael Espíndola <
> rafael.espindola at gmail.com> wrote:
>
>> > I don't think these are really independent. Whether or not you
need to
>> emit
>> > a diagnostic depends on whether a caller can handle the
corresponding
>> error,
>> > which isn't something you know at the point where the error is
raised.
>>
>> But you do in the diag handler. For example, if you are trying to open
>> multiple files, some of which are bitcode, you know to ignore
>> InvalidBitcodeSignature.
>>
>>
>> > That's the idea behind the 'log' method on
TypedErrorInfo: It lets you
>> > transform meaningful information about the problem into a log
message
>> > *after* you know whether it can be handled or not. Recoverable
errors
>> can be
>> > logged (if the client wants to do so), but they don't have to
be.
>>
>> ...
>>
>> > By contrast, in my system this would be:
>> >
>> > class InvalidBitcodeSignature :
TypedErrorInfo<InvalidBitcodeSignature>
>> {};
>> > class CorruptedBitcode : TypedErrorInfo<CorruptedBitcode> {
>> > public:
>> >   CorruptedBitcode(std::string Msg) : Msg(Msg) {}
>> >   void log(raw_ostream &OS) const override { OS << Msg;
}
>> > private:
>> >   std::string Msg;
>> > };
>>
>> This highlights why I think it is important to keep diagnostics and
>> errors separate. In the solution you have there is a need to allocate
>> a std::string, even if that is never used. If it was always a
>> std::string it would not be that bad, but the framework seems designed
>> to allow even more context to be stored, opening the way for fun cases
>> of error handling interfering with lifetime management.
>>
>> > Consider someone who wants to use libObject to write an
object-file
>> repair
>> > tool: A "bad_symbol" error code tells you what went
wrong, but not
>> where or
>> > why, so it's not much use in attempting a fix. On the other
hand
>> "bad_symbol
>> > { idx = 10 }" gives you enough information to pop up a
dialog, ask the
>> user
>> > what the symbol at index 10 should be, then re-write that symbol
table
>> > entry.
>>
>> Using error results to find deep information you want to patch seems
>> like a bad idea. A tool that wants to patch symbol indexes should be
>> reading them directly.
>>
>> >
>> >> Other advantages are:
>> >>
>> >> * It is easy to use fatal error in simple programs by just
using a
>> >> diganostic handler that exits.
>> >> * Very debugger friendly. It is easy to put a breakpoint at
the diag
>> >> handler.
>> >
>> > This proposal provides the same advantages. I noted the second
point in
>> the
>> > original RFC. The first is easily implemented using an idiom
I've seen
>> in
>> > llvm-objdump and elsewhere:
>> >
>> > void check(TypedError Err) {
>> >   if (!Err)
>> >     return;
>> >
>> >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
>> >   exit(1);
>> > }
>>
>> That is not the same that you get with a diagnostic handler. What you
>> get is an exit after the error was propagated over the library layer,
>> instead of an exit at the point where the issue is found.
>>
>> We use the above code now because lib/Object has no diagnostic handler.
>>
>> > Side-note: you can actually go one better and replace this idiom
with:
>> >
>> > template <typename T>
>> > T check(TypedErrorOr<T> ValOrErr) {
>> >   if (!ValOrErr) {
>> >     logAllUnhandledErrors(ValOrErr.takeError(), errs(),
"<tool name>");
>> >     exit(1);
>> >   }
>> >   return std::move(*ValOrErr);
>> > }
>>
>> Yes, we do pretty much that in ELF/COFF lld.
>>
>>
>> > Using TypedError as a wrapper for std::error_code would prevent
errors
>> from
>> > be silently dropped, which is a huge step forward. (Though
TypedError
>> would
>> > be a misnomer - we'd need a new name. :)
>>
>> CheckedError?
>>
>> > Using a diagnostic handler would allow richer diagnostics from
>> libObject,
>> > but it supports a narrower class of error-recovery than my
proposal. I'm
>> > inclined to favor my proposal as a strictly more general solution.
>>
>> I would prefer the diagnostic handler for exactly the same reason :-)
>> Generality has a cost, and I don't think we should invest on it
until
>> we have a concrete case where that is needed. Given that I would
>> suggest going the more incremental way. Add a CheckedError that wraps
>> error_code and use to make sure the errors are checked. When a better
>> diagnostic is needed, pass in a diagnostic handler.
>>
>> If we ever get to the point were we really want to *return* more than
>> an error_code, it will be a much smaller refactoring to
>> s/CheckedError/TypedError/.
>>
>> Cheers,
>> Rafael
>>
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160209/b33641cf/attachment-0001.html>

Lang Hames via llvm-dev

2016-Feb-10 01:29 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

Hi David,
> But I think it makes total sense for Clang to use a diagnostic handling
approach for the most part (all the possible ways that C++ can be written
incorrectly essentially amount to the same thing for a consumer of Clang - code
was bad & I need some text (and fixits, notes, etc) to tell the user why),
the IR verifier similarly - and possibly LLD, for example, depending on how
broad the use cases become.

Absolutely. The way I phrased my objection is overly broad. Diagnostic handlers
have valid use-cases, but they're not good at conveying structured
information to clients, which is a useful thing to do in general. I'm
definitely not proposing that we replace clang's diagnostic handler with my
scheme.

- Lang.

Sent from my iPhone
> On Feb 9, 2016, at 5:06 PM, David Blaikie <dblaikie at gmail.com>
wrote:
> 
> 
> 
>> On Tue, Feb 9, 2016 at 4:23 PM, Lang Hames via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
>> > > I don't think these are really independent. Whether or
not you need to emit
>> > > a diagnostic depends on whether a caller can handle the
corresponding error,
>> > > which isn't something you know at the point where the
error is raised.
>> 
>> > But you do in the diag handler. For example, if you are trying to
open
>> > multiple files, some of which are bitcode, you know to ignore
>> > InvalidBitcodeSignature.
>> 
>> That means anticipating the kinds of errors you do/don't want to
recover from at the point at which you insatiate your diagnostic handler. At
that point you may not have the information that you need to know whether you
want to recover. E.g. What if one path through library code can recover, and
another can't?
> 
> +1 to this (I had the same thought - stateful handlers would be pretty
tricky, doubly so with \/ this point)
>  
>> Worse still, what if the divergence point between these two paths is in
library code itself? Now you need to thread two diagnostic handlers up to that
point. This scales very poorly.
>> 
>> Simply put: diagnostic handlers aren't library friendly. It's
tough enough to get all libraries to agree on an error type, without having them
all agree on a diagnostic handlers too, and thread them everywhere.
> 
> I think it depends a bit on the library & how generic it is - likely
lower level libraries (like libObject) will want a granular error-based result,
not a diagnostic engine. (the clients will be more widely varied (than, say,
Clang) & have different diagnostic/behavioral needs, etc)
> 
> But I think it makes total sense for Clang to use a diagnostic handling
approach for the most part (all the possible ways that C++ can be written
incorrectly essentially amount to the same thing for a consumer of Clang - code
was bad & I need some text (and fixits, notes, etc) to tell the user why),
the IR verifier similarly - and possibly LLD, for example, depending on how
broad the use cases become.
>  
>> 
>> > This highlights why I think it is important to keep diagnostics
and
>> > errors separate. In the solution you have there is a need to
allocate
>> > a std::string, even if that is never used.
>> 
>> Errors are only constructed on the error path. There is no construction
on the success path.
>> 
>> > If it was always a
>> > std::string it would not be that bad, but the framework seems
designed
>> > to allow even more context to be stored, opening the way for fun
cases
>> > of error handling interfering with lifetime management.
>> 
>> This is a real problem, but I think the solution is to have a style
guideline: You can't use reference types (in the general sense - references,
pointers, etc.) in errors. Errors that would otherwise require references should
have their error-message stringified at the point of creation.
>> 
>> > > Consider someone who wants to use libObject to write an
object-file repair
>> > > tool: A "bad_symbol" error code tells you what went
wrong, but not where or
>> > > why, so it's not much use in attempting a fix. On the
other hand "bad_symbol
>> > > { idx = 10 }" gives you enough information to pop up a
dialog, ask the user
>> > > what the symbol at index 10 should be, then re-write that
symbol table
>> > > entry.
>> 
>> > Using error results to find deep information you want to patch
seems
>> > like a bad idea. A tool that wants to patch symbol indexes should
be
>> > reading them directly.
>> 
>> I'm not sure I agree, but I don't have strong feelings on this
particular example - it was intended as a straw man. My point is that error
recover is useful in general: there is a reason things like exceptions exist.
Diagnostic handlers are very poorly suited to general error recovery.
>> 
>> > > void check(TypedError Err) {
>> > >   if (!Err)
>> > >     return;
>> > >
>> > >   logAllUnhandledErrors(std::move(Err), errs(),
"<tool name>");
>> > >   exit(1);
>> > > }
>> 
>> > That is not the same that you get with a diagnostic handler. What
you
>> > get is an exit after the error was propagated over the library
layer,
>> > instead of an exit at the point where the issue is found.
>> 
>> As long as you log the error I'm not sure I see a meaningful
difference? If you're not debugging you shouldn't be crashing in your
library. If you are debugging you want to set a breakpoint on the error
constructor instead.
>> 
>> > I would prefer the diagnostic handler for exactly the same reason
:-)
>> > Generality has a cost...
>> 
>> Could you explain what you see as the cost in this instance?
>> 
>> The scheme has a higher cost that std::error_code alone when you're
returning an error, but I don't think optimizing the error case is a
sensible thing to do. It has a potentially *lower* cost on the success path, as
you don't need to take up an argument for the diagnostic handler.
>> 
>> The most serious objection that you've raised, to my mind, is the
potential lifetime issues if people mistakenly use references in their error
types. I'm planning to write additions to llvm's coding guidelines to
cover that, which I think should be sufficient.
>> 
>> On the flip side, compared to a diagnostic handler, I think the scheme
I've proposed has the following benefits:
>> 
>> - No need to thread a diagnostic handler type through the world (and
consequently no need to agree on its type). Retrofitting rich error logging to
an interface that already returns errors requires less work than adding a
diagnostic handler.
>> - The ability to describe error hierarchies (e.g. the class of all
errors that represent malformed objects, which may be further subclassed to
describe specific errors).
>> - The ability for client code to meaningfully distinguish between error
instances, rather than just error types.
>> 
>> > , and I don't think we should invest on it until
>> > we have a concrete case where that is needed.
>> 
>> The error infrastructure investment is largely done, and I'm
volunteering to maintain it. Myself and Kevin Enderby are signing up to weave
this through libObject and the JIT. Other libraries could be converted as people
see fit, with ECError providing an interface between the std::error_code world
and TypedError.
>> 
>> > Given that I would
>> > suggest going the more incremental way. Add a CheckedError that
wraps
>> > error_code and use to make sure the errors are checked. When a
better
>> > diagnostic is needed, pass in a diagnostic handler.
>> 
>> Our first use-case for this system is libObject, where we want to use
this to enable richer error messages. An incremental approach would involve
threading a diagnostic handler through everything (and returning CheckedError),
then un-threading it again would involve a lot of churn. I'd prefer to move
directly to the scheme I'm proposing.
>> 
>> Cheers,
>> Lang.
>> 
>> 
>>> On Tue, Feb 9, 2016 at 3:27 PM, Rafael Espíndola
<rafael.espindola at gmail.com> wrote:
>>> > I don't think these are really independent. Whether or not
you need to emit
>>> > a diagnostic depends on whether a caller can handle the
corresponding error,
>>> > which isn't something you know at the point where the
error is raised.
>>> 
>>> But you do in the diag handler. For example, if you are trying to
open
>>> multiple files, some of which are bitcode, you know to ignore
>>> InvalidBitcodeSignature.
>>> 
>>> 
>>> > That's the idea behind the 'log' method on
TypedErrorInfo: It lets you
>>> > transform meaningful information about the problem into a log
message
>>> > *after* you know whether it can be handled or not. Recoverable
errors can be
>>> > logged (if the client wants to do so), but they don't have
to be.
>>> 
>>> ...
>>> 
>>> > By contrast, in my system this would be:
>>> >
>>> > class InvalidBitcodeSignature :
TypedErrorInfo<InvalidBitcodeSignature> {};
>>> > class CorruptedBitcode :
TypedErrorInfo<CorruptedBitcode> {
>>> > public:
>>> >   CorruptedBitcode(std::string Msg) : Msg(Msg) {}
>>> >   void log(raw_ostream &OS) const override { OS <<
Msg; }
>>> > private:
>>> >   std::string Msg;
>>> > };
>>> 
>>> This highlights why I think it is important to keep diagnostics and
>>> errors separate. In the solution you have there is a need to
allocate
>>> a std::string, even if that is never used. If it was always a
>>> std::string it would not be that bad, but the framework seems
designed
>>> to allow even more context to be stored, opening the way for fun
cases
>>> of error handling interfering with lifetime management.
>>> 
>>> > Consider someone who wants to use libObject to write an
object-file repair
>>> > tool: A "bad_symbol" error code tells you what went
wrong, but not where or
>>> > why, so it's not much use in attempting a fix. On the
other hand "bad_symbol
>>> > { idx = 10 }" gives you enough information to pop up a
dialog, ask the user
>>> > what the symbol at index 10 should be, then re-write that
symbol table
>>> > entry.
>>> 
>>> Using error results to find deep information you want to patch
seems
>>> like a bad idea. A tool that wants to patch symbol indexes should
be
>>> reading them directly.
>>> 
>>> >
>>> >> Other advantages are:
>>> >>
>>> >> * It is easy to use fatal error in simple programs by just
using a
>>> >> diganostic handler that exits.
>>> >> * Very debugger friendly. It is easy to put a breakpoint
at the diag
>>> >> handler.
>>> >
>>> > This proposal provides the same advantages. I noted the second
point in the
>>> > original RFC. The first is easily implemented using an idiom
I've seen in
>>> > llvm-objdump and elsewhere:
>>> >
>>> > void check(TypedError Err) {
>>> >   if (!Err)
>>> >     return;
>>> >
>>> >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
>>> >   exit(1);
>>> > }
>>> 
>>> That is not the same that you get with a diagnostic handler. What
you
>>> get is an exit after the error was propagated over the library
layer,
>>> instead of an exit at the point where the issue is found.
>>> 
>>> We use the above code now because lib/Object has no diagnostic
handler.
>>> 
>>> > Side-note: you can actually go one better and replace this
idiom with:
>>> >
>>> > template <typename T>
>>> > T check(TypedErrorOr<T> ValOrErr) {
>>> >   if (!ValOrErr) {
>>> >     logAllUnhandledErrors(ValOrErr.takeError(), errs(),
"<tool name>");
>>> >     exit(1);
>>> >   }
>>> >   return std::move(*ValOrErr);
>>> > }
>>> 
>>> Yes, we do pretty much that in ELF/COFF lld.
>>> 
>>> 
>>> > Using TypedError as a wrapper for std::error_code would
prevent errors from
>>> > be silently dropped, which is a huge step forward. (Though
TypedError would
>>> > be a misnomer - we'd need a new name. :)
>>> 
>>> CheckedError?
>>> 
>>> > Using a diagnostic handler would allow richer diagnostics from
libObject,
>>> > but it supports a narrower class of error-recovery than my
proposal. I'm
>>> > inclined to favor my proposal as a strictly more general
solution.
>>> 
>>> I would prefer the diagnostic handler for exactly the same reason
:-)
>>> Generality has a cost, and I don't think we should invest on it
until
>>> we have a concrete case where that is needed. Given that I would
>>> suggest going the more incremental way. Add a CheckedError that
wraps
>>> error_code and use to make sure the errors are checked. When a
better
>>> diagnostic is needed, pass in a diagnostic handler.
>>> 
>>> If we ever get to the point were we really want to *return* more
than
>>> an error_code, it will be a much smaller refactoring to
>>> s/CheckedError/TypedError/.
>>> 
>>> Cheers,
>>> Rafael
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160209/10da87e0/attachment.html>

Rafael Espíndola via llvm-dev

2016-Feb-10 14:47 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

>> This highlights why I think it is important to keep diagnostics and
>> errors separate. In the solution you have there is a need to allocate
>> a std::string, even if that is never used.
>
> Errors are only constructed on the error path. There is no construction on
> the success path.
But they are always created, even if it as error the caller wants to
ignore. For example, you will always create a "file foo.o in bar.a is
not a bitcode" message (or copy sufficient information for that to be
created). With a dignostic handler no copying is needed, since the
call happens in the context where the error is found. It is easy to
see us in a position where a lot of context is copied because some
client somewhere might want it.

So I am worried we are coding for the hypothetical and adding
complexity. In particular, if we are going this way I think it is
critical that your patch *removes* the existing diagnostic handlers
(while making sure test/Bitcode/invalid.ll still passes) so that we
don't end up with two overlapping solutions. We were still not even
done converting away from bool+std::string :-(

Cheers,
Rafael

David Blaikie via llvm-dev

2016-Feb-10 16:10 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

On Wed, Feb 10, 2016 at 6:47 AM, Rafael Espíndola <llvm-dev at
lists.llvm.org>
wrote:
> >> This highlights why I think it is important to keep diagnostics
and
> >> errors separate. In the solution you have there is a need to
allocate
> >> a std::string, even if that is never used.
> >
> > Errors are only constructed on the error path. There is no
construction
> on
> > the success path.
>
> But they are always created, even if it as error the caller wants to
> ignore. For example, you will always create a "file foo.o in bar.a is
> not a bitcode" message (or copy sufficient information for that to be
> created). With a dignostic handler no copying is needed, since the
> call happens in the context where the error is found. It is easy to
> see us in a position where a lot of context is copied because some
> client somewhere might want it.
>
> So I am worried we are coding for the hypothetical and adding
> complexity. In particular, if we are going this way I think it is
> critical that your patch *removes* the existing diagnostic handlers
> (while making sure test/Bitcode/invalid.ll still passes) so that we
> don't end up with two overlapping solutions.

Removes diagnostic handlers in other parts of LLVM (& Clang) - the IR
verifier's diagnostic handling is what you're referring to here ^?

I don't think that would be an improvement - it seems like different
situations call for different tools (as I was saying yesterday on this
thread). In some places a diagnostic handler is the right tool, and in some
places error codes/results/etc are the right tool. We already live in that
world & it seems like a reasonable one (there doesn't seem to be a
fundamental conflict between our bool+std::string or error_code returns and
existing diagnostic handlers - I think they can reasonably coexist in
different parts of the codebase for different use cases)

> We were still not even
> done converting away from bool+std::string :-(
>
> Cheers,
> Rafael
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160210/b546b2ef/attachment.html>

Lang Hames via llvm-dev

2016-Feb-10 17:37 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

Hi Rafael,
> But they are always created, even if it as error the caller wants to
> ignore. For example, you will always create a "file foo.o in bar.a is
> not a bitcode" message (or copy sufficient information for that to be
> created). With a dignostic handler no copying is needed, since the
> call happens in the context where the error is found. It is easy to
> see us in a position where a lot of context is copied because some
> client somewhere might want it.
I'm not saying that this system would replace diagnostic handlers in
general, or that it should be shoehorned into places where it doesn't fit.
This system is a replacement for std::error_code, because std::error_code
is deficient as an error return. It happens that once you have a generic
error return value you can use it for a lot of simple diagnostic cases,
rather than threading a diagnostic handler everywhere. If you want to mix
and match my error scheme with a diagnostic handler you still can.

As to the performance concern - you're trying to optimizing the error case.
As noted, my scheme is slower than std::error_code on the error_path, but
not prohibitively so. I've thought about this, and cannot imagine a
reasonable use of this system that would lead to unacceptable overhead on
the error path.
> So I am worried we are coding for the hypothetical and adding complexity.
Don't worry - I'm too busy with real problems to tackle hypothetical
ones.
Here's another example of a problem that my proposal solves:

I recently added support for remote JITing to ORC. There are in-tree
library facilities for JITing code into a process on the other end of an
abstract "RPCChannel". What happens if something goes wrong at one
end? We
want to be able to communicate an error back across the RPCChannel
(assuming it's still intact) so the other side can recover or fail
gracefully. That means we need to be able to serialize an error with enough
information to describe what went wrong. There's no practical way to
maintain a serialization routine for all possible std::error_codes that
might come up, even if they were powerful enough to describe everything
that could go wrong (which again, being static kinds, they're not). With my
proposal however, a JITError base class can be defined as:

class JITError : public TypedErrorInfo<JITError> {
public:
  virtual void serialize(RPCChannel &C) const = 0;
};

Now you just describe serialization/deserialization for each error when you
define it. :)

(Yes - this hand waves over deserialization. It's solvable. The gory
details will add nothing to this discussion).
> In particular, if we are going this way I think it is
> critical that your patch *removes* the existing diagnostic handlers
> (while making sure test/Bitcode/invalid.ll still passes) so that we
> don't end up with two overlapping solutions. We were still not even
> done converting away from bool+std::string :-(
I think I've covered this now, but once more for emphasis: I'm not going
to
start tearing all the diagnostic handlers out. I absolutely see the value
of diagnostic handlers for producing intricate diagnostics. My proposal
tackles the error return problem, and it turns out that once you have a
customizable error return value you can use it to produce decent
diagnostics, as long as the client doesn't need to be able to control the
formatting of those diagnostics. This property will allow us to produce
better diagnostics that we currently do in several tools, without the need
to introduce diagnostic handlers there. We can still introduce them later
if we want to.

Cheers,
Lang.

On Wed, Feb 10, 2016 at 6:47 AM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:
> >> This highlights why I think it is important to keep diagnostics
and
> >> errors separate. In the solution you have there is a need to
allocate
> >> a std::string, even if that is never used.
> >
> > Errors are only constructed on the error path. There is no
construction
> on
> > the success path.
>
> But they are always created, even if it as error the caller wants to
> ignore. For example, you will always create a "file foo.o in bar.a is
> not a bitcode" message (or copy sufficient information for that to be
> created). With a dignostic handler no copying is needed, since the
> call happens in the context where the error is found. It is easy to
> see us in a position where a lot of context is copied because some
> client somewhere might want it.
>
> So I am worried we are coding for the hypothetical and adding
> complexity. In particular, if we are going this way I think it is
> critical that your patch *removes* the existing diagnostic handlers
> (while making sure test/Bitcode/invalid.ll still passes) so that we
> don't end up with two overlapping solutions. We were still not even
> done converting away from bool+std::string :-(
>
> Cheers,
> Rafael
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160210/383d585c/attachment.html>

Maybe Matching Threads

Search for more reasonably related threads

llvm dev - Feb 2016 - [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

Maybe Matching Threads