thr3ads.net - llvm dev - [llvm-dev] [RFC] Error handling in LLVM libraries. [Feb 2016]

If this information is useful, please help other people find it:
Share via:

Lang Hames via llvm-dev

2016-Feb-09 23:05 UTC

[llvm-dev] [RFC] Error handling in LLVM libraries.

Hi Rafael,
> The main thing I like about the diagnostic system is that it lets us
> differentiate two related but independent concepts:
>
> * Giving the human using the program diagnostics about what went wrong.
> * Propagating an error to the caller so that the upper library layer
> can handle it or pass it up the stack.
I don't think these are really independent. Whether or not you need to emit
a diagnostic depends on whether a caller can handle the corresponding
error, which isn't something you know at the point where the error is
raised. That's the idea behind the 'log' method on TypedErrorInfo:
It lets
you transform meaningful information about the problem into a log message
*after* you know whether it can be handled or not. Recoverable errors can
be logged (if the client wants to do so), but they don't have to be.

Using TypedError for diagnostics also means one fewer friction point when
composing library code: Rather than having to agree on both the error
return type and the diagnostic context type you only have to get agreement
on the error return type.
> That is way more than what we need to pass to the caller. In fact, the
> only cases we need to differentiate are "this is not a bitcode file at
> all" and "the file is broken", which we do with the
following enum +
> std::error_code.
By contrast, in my system this would be:

class InvalidBitcodeSignature : TypedErrorInfo<InvalidBitcodeSignature>
{};
class CorruptedBitcode : TypedErrorInfo<CorruptedBitcode> {
public:
  CorruptedBitcode(std::string Msg) : Msg(Msg) {}
  void log(raw_ostream &OS) const override { OS << Msg; }
private:
  std::string Msg;
};

Once you factor out the need to define an error category, I suspect my
system actually requires less code, not that either of them require much.
> Note that with error_code one can define arbitrary error categories.
The problem with error categories is that they're limited to static
groupings of kinds of errors, but the user might need dynamic information
to recover.

Consider someone who wants to use libObject to write an object-file repair
tool: A "bad_symbol" error code tells you what went wrong, but not
where or
why, so it's not much use in attempting a fix. On the other hand
"bad_symbol { idx = 10 }" gives you enough information to pop up a
dialog,
ask the user what the symbol at index 10 should be, then re-write that
symbol table entry.
> Other advantages are:
>
> * It is easy to use fatal error in simple programs by just using a
> diganostic handler that exits.
> * Very debugger friendly. It is easy to put a breakpoint at the diaghandler.

This proposal provides the same advantages. I noted the second point in the
original RFC. The first is easily implemented using an idiom I've seen in
llvm-objdump and elsewhere:

void check(TypedError Err) {
  if (!Err)
    return;

  logAllUnhandledErrors(std::move(Err), errs(), "<tool name>");
  exit(1);
}

...
TypedErrorOr<Result> R = doSomething();
check(R.takeError());
... *R;

Side-note: you can actually go one better and replace this idiom with:

template <typename T>
T check(TypedErrorOr<T> ValOrErr) {
  if (!ValOrErr) {
    logAllUnhandledErrors(ValOrErr.takeError(), errs(), "<tool
name>");
    exit(1);
  }
  return std::move(*ValOrErr);
}

...
Result R = check(doSomething());
...

Mind you, this new idiom works equally well for ErrorOr/std::error_code.
> So, could you achieve your objectives by:
>
> * Moving the diagnostic handling code to Support so that all of llvm can
use it.> * Defining TypedError as a wrapper over std::error_code with a
> destructor that verifies the error was handled?
Using TypedError as a wrapper for std::error_code would prevent errors from
be silently dropped, which is a huge step forward. (Though TypedError would
be a misnomer - we'd need a new name. :)

Using a diagnostic handler would allow richer diagnostics from libObject,
but it supports a narrower class of error-recovery than my proposal. I'm
inclined to favor my proposal as a strictly more general solution.

Cheers,
Lang.


On Tue, Feb 9, 2016 at 11:12 AM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:
> On 3 February 2016 at 02:33, Lang Hames via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Hi Mehdi,
> >
> >> If you subclass a diagnostic right now, isn’t the RTTI information
> >> available to the handler, which can then achieve the same
dispatching /
> >> custom handling per type of diagnostic?
> >> (I’m not advocating the diagnostic system, which I found less
convenient
> >> to use than what you are proposing)
> >
> > I have to confess I haven't looked at the diagnostic classes
closely.
> I'll
> > take a look and get back to you on this one. :)
>
> The main thing I like about the diagnostic system is that it lets us
> differentiate two related but independent concepts:
>
> * Giving the human using the program diagnostics about what went wrong.
> * Propagating an error to the caller so that the upper library layer
> can handle it or pass it up the stack.
>
> For example, when given a broken bitcode file we are able to produce
> the diagnostic "Unknown attribute kind (52)" which shows exactly
what
> we are complaining about. We are able to do that because the
> diagnostic is produced close to the spot where the error is detected.
>
> That is way more than what we need to pass to the caller. In fact, the
> only cases we need to differentiate are "this is not a bitcode file at
> all" and "the file is broken", which we do with the
following enum +
> std::error_code.
>
>   enum class BitcodeError { InvalidBitcodeSignature = 1, CorruptedBitcode
> };
>
> So we use the context to produce a diagnostic, but the error doesn't
> need to store it. Compare that with what we had before r225562.
>
> Note that with error_code one can define arbitrary error categories.
>
> Other advantages are:
>
> * It is easy to use fatal error in simple programs by just using a
> diganostic handler that exits.
> * Very debugger friendly. It is easy to put a breakpoint at the diag
> handler.
>
> Given that distinction, what I agree that is really missing is
> infrastructure to make sure std::error_code is handled and for
> filtering it.
>
> So, could you achieve your objectives by:
>
> * Moving the diagnostic handling code to Support so that all of llvm can
> use it.
> * Defining TypedError as a wrapper over std::error_code with a
> destructor that verifies the error was handled?
>
> Thanks,
> Rafael
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160209/9c69fd2c/attachment.html>

Rafael Espíndola via llvm-dev

2016-Feb-09 23:27 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

> I don't think these are really independent. Whether or not you need to
emit
> a diagnostic depends on whether a caller can handle the corresponding
error,
> which isn't something you know at the point where the error is raised.
But you do in the diag handler. For example, if you are trying to open
multiple files, some of which are bitcode, you know to ignore
InvalidBitcodeSignature.

> That's the idea behind the 'log' method on TypedErrorInfo: It
lets you
> transform meaningful information about the problem into a log message
> *after* you know whether it can be handled or not. Recoverable errors can
be
> logged (if the client wants to do so), but they don't have to be.
...
> By contrast, in my system this would be:
>
> class InvalidBitcodeSignature :
TypedErrorInfo<InvalidBitcodeSignature> {};
> class CorruptedBitcode : TypedErrorInfo<CorruptedBitcode> {
> public:
>   CorruptedBitcode(std::string Msg) : Msg(Msg) {}
>   void log(raw_ostream &OS) const override { OS << Msg; }
> private:
>   std::string Msg;
> };
This highlights why I think it is important to keep diagnostics and
errors separate. In the solution you have there is a need to allocate
a std::string, even if that is never used. If it was always a
std::string it would not be that bad, but the framework seems designed
to allow even more context to be stored, opening the way for fun cases
of error handling interfering with lifetime management.
> Consider someone who wants to use libObject to write an object-file repair
> tool: A "bad_symbol" error code tells you what went wrong, but
not where or
> why, so it's not much use in attempting a fix. On the other hand
"bad_symbol
> { idx = 10 }" gives you enough information to pop up a dialog, ask the
user
> what the symbol at index 10 should be, then re-write that symbol table
> entry.
Using error results to find deep information you want to patch seems
like a bad idea. A tool that wants to patch symbol indexes should be
reading them directly.
>
>> Other advantages are:
>>
>> * It is easy to use fatal error in simple programs by just using a
>> diganostic handler that exits.
>> * Very debugger friendly. It is easy to put a breakpoint at the diag
>> handler.
>
> This proposal provides the same advantages. I noted the second point in the
> original RFC. The first is easily implemented using an idiom I've seen
in
> llvm-objdump and elsewhere:
>
> void check(TypedError Err) {
>   if (!Err)
>     return;
>
>   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
>   exit(1);
> }
That is not the same that you get with a diagnostic handler. What you
get is an exit after the error was propagated over the library layer,
instead of an exit at the point where the issue is found.

We use the above code now because lib/Object has no diagnostic handler.
> Side-note: you can actually go one better and replace this idiom with:
>
> template <typename T>
> T check(TypedErrorOr<T> ValOrErr) {
>   if (!ValOrErr) {
>     logAllUnhandledErrors(ValOrErr.takeError(), errs(), "<tool
name>");
>     exit(1);
>   }
>   return std::move(*ValOrErr);
> }
Yes, we do pretty much that in ELF/COFF lld.

> Using TypedError as a wrapper for std::error_code would prevent errors from
> be silently dropped, which is a huge step forward. (Though TypedError would
> be a misnomer - we'd need a new name. :)
CheckedError?
> Using a diagnostic handler would allow richer diagnostics from libObject,
> but it supports a narrower class of error-recovery than my proposal.
I'm
> inclined to favor my proposal as a strictly more general solution.
I would prefer the diagnostic handler for exactly the same reason :-)
Generality has a cost, and I don't think we should invest on it until
we have a concrete case where that is needed. Given that I would
suggest going the more incremental way. Add a CheckedError that wraps
error_code and use to make sure the errors are checked. When a better
diagnostic is needed, pass in a diagnostic handler.

If we ever get to the point were we really want to *return* more than
an error_code, it will be a much smaller refactoring to
s/CheckedError/TypedError/.

Cheers,
Rafael

Lang Hames via llvm-dev

2016-Feb-10 00:23 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

> > I don't think these are really independent. Whether or not you
need to
emit> > a diagnostic depends on whether a caller can handle the corresponding
error,> > which isn't something you know at the point where the error is
raised.
> But you do in the diag handler. For example, if you are trying to open
> multiple files, some of which are bitcode, you know to ignore
> InvalidBitcodeSignature.
That means anticipating the kinds of errors you do/don't want to recover
from at the point at which you insatiate your diagnostic handler. At that
point you may not have the information that you need to know whether you
want to recover. E.g. What if one path through library code can recover,
and another can't? Worse still, what if the divergence point between these
two paths is in library code itself? Now you need to thread two diagnostic
handlers up to that point. This scales very poorly.

Simply put: diagnostic handlers aren't library friendly. It's tough
enough
to get all libraries to agree on an error type, without having them all
agree on a diagnostic handlers too, and thread them everywhere.
> This highlights why I think it is important to keep diagnostics and
> errors separate. In the solution you have there is a need to allocate
> a std::string, even if that is never used.
Errors are only constructed on the error path. There is no construction on
the success path.
> If it was always a
> std::string it would not be that bad, but the framework seems designed
> to allow even more context to be stored, opening the way for fun cases
> of error handling interfering with lifetime management.
This is a real problem, but I think the solution is to have a style
guideline: You can't use reference types (in the general sense -
references, pointers, etc.) in errors. Errors that would otherwise require
references should have their error-message stringified at the point of
creation.
> > Consider someone who wants to use libObject to write an object-file
repair> > tool: A "bad_symbol" error code tells you what went wrong,
but not
where or> > why, so it's not much use in attempting a fix. On the other hand
"bad_symbol> > { idx = 10 }" gives you enough information to pop up a dialog,
ask the
user> > what the symbol at index 10 should be, then re-write that symbol table
> > entry.
> Using error results to find deep information you want to patch seems
> like a bad idea. A tool that wants to patch symbol indexes should be
> reading them directly.
I'm not sure I agree, but I don't have strong feelings on this
particular
example - it was intended as a straw man. My point is that error recover is
useful in general: there is a reason things like exceptions exist.
Diagnostic handlers are very poorly suited to general error recovery.
> > void check(TypedError Err) {
> >   if (!Err)
> >     return;
> >
> >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
> >   exit(1);
> > }
> That is not the same that you get with a diagnostic handler. What you
> get is an exit after the error was propagated over the library layer,
> instead of an exit at the point where the issue is found.
As long as you log the error I'm not sure I see a meaningful difference? If
you're not debugging you shouldn't be crashing in your library. If you
are
debugging you want to set a breakpoint on the error constructor instead.
> I would prefer the diagnostic handler for exactly the same reason :-)
> Generality has a cost...
Could you explain what you see as the cost in this instance?

The scheme has a higher cost that std::error_code alone when you're
returning an error, but I don't think optimizing the error case is a
sensible thing to do. It has a potentially *lower* cost on the success
path, as you don't need to take up an argument for the diagnostic handler.

The most serious objection that you've raised, to my mind, is the potential
lifetime issues if people mistakenly use references in their error types.
I'm planning to write additions to llvm's coding guidelines to cover
that,
which I think should be sufficient.

On the flip side, compared to a diagnostic handler, I think the scheme I've
proposed has the following benefits:

- No need to thread a diagnostic handler type through the world (and
consequently no need to agree on its type). Retrofitting rich error logging
to an interface that already returns errors requires less work than adding
a diagnostic handler.
- The ability to describe error hierarchies (e.g. the class of all errors
that represent malformed objects, which may be further subclassed to
describe specific errors).
- The ability for client code to meaningfully distinguish between error
instances, rather than just error types.
> , and I don't think we should invest on it until
> we have a concrete case where that is needed.
The error infrastructure investment is largely done, and I'm volunteering
to maintain it. Myself and Kevin Enderby are signing up to weave this
through libObject and the JIT. Other libraries could be converted as people
see fit, with ECError providing an interface between the std::error_code
world and TypedError.
> Given that I would
> suggest going the more incremental way. Add a CheckedError that wraps
> error_code and use to make sure the errors are checked. When a better
> diagnostic is needed, pass in a diagnostic handler.
Our first use-case for this system is libObject, where we want to use this
to enable richer error messages. An incremental approach would involve
threading a diagnostic handler through everything (and returning
CheckedError), then un-threading it again would involve a lot of churn. I'd
prefer to move directly to the scheme I'm proposing.

Cheers,
Lang.


On Tue, Feb 9, 2016 at 3:27 PM, Rafael Espíndola <rafael.espindola at
gmail.com> wrote:
> > I don't think these are really independent. Whether or not you
need to
> emit
> > a diagnostic depends on whether a caller can handle the corresponding
> error,
> > which isn't something you know at the point where the error is
raised.
>
> But you do in the diag handler. For example, if you are trying to open
> multiple files, some of which are bitcode, you know to ignore
> InvalidBitcodeSignature.
>
>
> > That's the idea behind the 'log' method on TypedErrorInfo:
It lets you
> > transform meaningful information about the problem into a log message
> > *after* you know whether it can be handled or not. Recoverable errors
> can be
> > logged (if the client wants to do so), but they don't have to be.
>
> ...
>
> > By contrast, in my system this would be:
> >
> > class InvalidBitcodeSignature :
TypedErrorInfo<InvalidBitcodeSignature>
> {};
> > class CorruptedBitcode : TypedErrorInfo<CorruptedBitcode> {
> > public:
> >   CorruptedBitcode(std::string Msg) : Msg(Msg) {}
> >   void log(raw_ostream &OS) const override { OS << Msg; }
> > private:
> >   std::string Msg;
> > };
>
> This highlights why I think it is important to keep diagnostics and
> errors separate. In the solution you have there is a need to allocate
> a std::string, even if that is never used. If it was always a
> std::string it would not be that bad, but the framework seems designed
> to allow even more context to be stored, opening the way for fun cases
> of error handling interfering with lifetime management.
>
> > Consider someone who wants to use libObject to write an object-file
> repair
> > tool: A "bad_symbol" error code tells you what went wrong,
but not where
> or
> > why, so it's not much use in attempting a fix. On the other hand
> "bad_symbol
> > { idx = 10 }" gives you enough information to pop up a dialog,
ask the
> user
> > what the symbol at index 10 should be, then re-write that symbol table
> > entry.
>
> Using error results to find deep information you want to patch seems
> like a bad idea. A tool that wants to patch symbol indexes should be
> reading them directly.
>
> >
> >> Other advantages are:
> >>
> >> * It is easy to use fatal error in simple programs by just using a
> >> diganostic handler that exits.
> >> * Very debugger friendly. It is easy to put a breakpoint at the
diag
> >> handler.
> >
> > This proposal provides the same advantages. I noted the second point
in
> the
> > original RFC. The first is easily implemented using an idiom I've
seen in
> > llvm-objdump and elsewhere:
> >
> > void check(TypedError Err) {
> >   if (!Err)
> >     return;
> >
> >   logAllUnhandledErrors(std::move(Err), errs(), "<tool
name>");
> >   exit(1);
> > }
>
> That is not the same that you get with a diagnostic handler. What you
> get is an exit after the error was propagated over the library layer,
> instead of an exit at the point where the issue is found.
>
> We use the above code now because lib/Object has no diagnostic handler.
>
> > Side-note: you can actually go one better and replace this idiom with:
> >
> > template <typename T>
> > T check(TypedErrorOr<T> ValOrErr) {
> >   if (!ValOrErr) {
> >     logAllUnhandledErrors(ValOrErr.takeError(), errs(), "<tool
name>");
> >     exit(1);
> >   }
> >   return std::move(*ValOrErr);
> > }
>
> Yes, we do pretty much that in ELF/COFF lld.
>
>
> > Using TypedError as a wrapper for std::error_code would prevent errors
> from
> > be silently dropped, which is a huge step forward. (Though TypedError
> would
> > be a misnomer - we'd need a new name. :)
>
> CheckedError?
>
> > Using a diagnostic handler would allow richer diagnostics from
libObject,
> > but it supports a narrower class of error-recovery than my proposal.
I'm
> > inclined to favor my proposal as a strictly more general solution.
>
> I would prefer the diagnostic handler for exactly the same reason :-)
> Generality has a cost, and I don't think we should invest on it until
> we have a concrete case where that is needed. Given that I would
> suggest going the more incremental way. Add a CheckedError that wraps
> error_code and use to make sure the errors are checked. When a better
> diagnostic is needed, pass in a diagnostic handler.
>
> If we ever get to the point were we really want to *return* more than
> an error_code, it will be a much smaller refactoring to
> s/CheckedError/TypedError/.
>
> Cheers,
> Rafael
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160209/3dddfdfc/attachment.html>

Pete Cooper via llvm-dev

2016-Feb-10 01:48 UTC

head link

[llvm-dev] [RFC] Error handling in LLVM libraries.

Hi Rafael, Lang
> On Feb 9, 2016, at 3:27 PM, Rafael Espíndola via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> If we ever get to the point were we really want to *return* more than
> an error_code, it will be a much smaller refactoring to
> s/CheckedError/TypedError/.So i think you’re right that we should use a CheckedError (ECError in Lang’s
patch.  I don’t mind on the final name), but since a CheckedError isa
TypedError, seems like we don’t have anything to lose with the generality of
TypedError.

That is, if we provide TypedError as a base, then anyone can extend it as
needed, and clients who only need ECError can use that.  This is more or less
covered in one of Lang’s patches where he changes the return type to TypedError,
but the error being returned is 'return
make_typed_error<ECError>(EC);’.  If a client then decided to return
something other than an ECError, you wouldn’t have to change that function
return type, or any of its callers.

Cheers,
Pete

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160209/f924eba1/attachment.html>

llvm dev - Feb 2016 - [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.

[llvm-dev] [RFC] Error handling in LLVM libraries.