thr3ads.net - llvm dev - [LLVMdev] object construction patterns and unique

If this information is useful, please help other people find it:
Share via:

Chandler Carruth

2012-Jun-20 07:28 UTC

[LLVMdev] object construction patterns and unique_ptr

On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at
google.com>wrote:
> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at
apple.com> wrote:
> >
> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote:
> > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote:
> > >
> > >> In lld we have a Reader class which is a factory that takes
.o
> file(s) and produces an in memory lld::File object(s).  But in doing so,
> there could be I/O errors (file not readable) or file may be malformed.  We
> are also using C++11 in lld, so we use std::unique_ptr for managing object
> ownership.
> > >>
> > >> The Reader class currently has an interface that can be
simplified
> down to:
> > >>
> > >>   virtual error_code readFile(StringRef path,
> std::unique_ptr<lld::File> &result);
> > >>
> > >> But this interface has become awkward to use. There are two
"return"
> values.  This method is a unique_ptr "source" but the use of a
by-refernce
> parameter means you have to start with an uninitialized unique_ptr and the
> readFile() has to std::move() into it.   unique_ptr is much nicer to use as
> a return value.
> > >
> > > Not a c++'11 expert here, but does a returning an std::tuple
work?
> > I talked to Howard about that.  It does work, but it is cumbersome.
>  Either:
> >
> > std::tuple<std::unique_ptr<lld::File>, error_code>  result
> reader.readFile(path);
> >
>
> I suggest an error_or<T> class that can be combined with any result
type,
> not just File. It could be implemented with a fully generic variant class,
> but the interface would likely be better with some specialization to
errors.
>I also really like the variant design. It has a few important advantages
IMO:

1) No pointers or heap allocations. These make code more bug prone and can
needlessly hurt optimizers. (although the latter is hopefully not relevant
in this specific case)

2) No "error" state in the file object. Either you have a fully
constructed
object or you don't.

3) An explicit API to check and instrument whether an error has occured.
For example, it is harder to "forget" to check the error, and carry on
manipulating the file. The implementation can assert in debug builds to
catch this early, etc.

4) Doesn't require polymorphic types (but they can be supported if needed).
Essentially, you can wrap a *specific* file type, rather than being forced
to use some generic file type.

One interesting possibility is that you can grow the same pattern to
different levels of complexity as needed (and thus keep it as simple as
possible as long as possible):

error_or<T>  -- wraps the existing error_code with a nice interface
linker_error_or<T> -- wraps both an error_code and some linker specific
error handling logic
file_error_or<T> -- provides completely custom error information specific
to file-loading-errors.

I'm not saying which (if any) of these would be appropriate, just pointing
out that this pattern can be followed to greater and lesser degrees of
specificity.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120620/ffc3b999/attachment.html>

Nick Kledzik

2012-Jun-20 19:36 UTC

head link

[LLVMdev] object construction patterns and unique_ptr

On Jun 20, 2012, at 12:28 AM, Chandler Carruth wrote:> On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at
google.com> wrote:
> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at
apple.com> wrote:
> >
> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote:
> > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote:
> > >
> > >> In lld we have a Reader class which is a factory that takes
.o file(s) and produces an in memory lld::File object(s).  But in doing so,
there could be I/O errors (file not readable) or file may be malformed.  We are
also using C++11 in lld, so we use std::unique_ptr for managing object
ownership.
> > >>
> > >> The Reader class currently has an interface that can be
simplified down to:
> > >>
> > >>   virtual error_code readFile(StringRef path,
std::unique_ptr<lld::File> &result);
> > >>
> > >> But this interface has become awkward to use. There are two
"return" values.  This method is a unique_ptr "source" but
the use of a by-refernce parameter means you have to start with an uninitialized
unique_ptr and the readFile() has to std::move() into it.   unique_ptr is much
nicer to use as a return value.
> > >
> > > Not a c++'11 expert here, but does a returning an std::tuple
work?
> > I talked to Howard about that.  It does work, but it is cumbersome. 
Either:
> >
> > std::tuple<std::unique_ptr<lld::File>, error_code>  result
= reader.readFile(path);
> >
> 
> I suggest an error_or<T> class that can be combined with any result
type, not just File. It could be implemented with a fully generic variant class,
but the interface would likely be better with some specialization to errors.
I too think that error_or<T> is better than std::pair<error_code,
T>:
* more compact
* can name fields (e.g. error, object instead of first, second)
* can embed asserts that object field is not accessed if there is an error.

But there are some drawbacks:
* lld uses C++11, so we thought we'd add explicit ownership by using
std::unique_ptr.  That means we'd need to layer the templates (e.g.
error_or<std::unique_ptr<lld::File>>), or forgo using unique_ptr, or
cause error_or to always internally use unique_ptr.
* We still have not figured out how to encapsulate errors for object file
parsing.  The last thread was an enum *and* and a string.

> I also really like the variant design. It has a few important advantages
IMO:
> 
> 1) No pointers or heap allocations. These make code more bug prone and can
needlessly hurt optimizers. (although the latter is hopefully not relevant in
this specific case)
> 
> 2) No "error" state in the file object. Either you have a fully
constructed object or you don't.I do agree that either you successfully parsed the file and have a full File
object or you failed and have some sort of error.  That means they are mutually
exclusive, so having every File object carry an error_code is a waste and
confusing.   But, that also make the case for the FileError subclass...
> 4) Doesn't require polymorphic types (but they can be supported if
needed). Essentially, you can wrap a *specific* file type, rather than being
forced to use some generic file type.Yes, the template lets you do that.  But in the specific case of the linker, it
will always be using the generic type.  You can pass the linker paths to object
files, shared libraries, or static libraries.  The linker passes those paths
onto the readFile() method which therefore must return the generic File type.

> 3) An explicit API to check and instrument whether an error has occured.
For example, it is harder to "forget" to check the error, and carry on
manipulating the file. The implementation can assert in debug builds to catch
this early, etc.The same assert functionality can be added to the FileError design too.  About
the only thing you can do with a File is iterator its Atoms.  The FileError
class can implement its iterators to assert.  Also, it is already the case that
the linker handles libraries and object files differently, yet they both come
back from readFile() as a generic File object.  The linker already has to sort
all the File objects by kind, so noticing a FileError in the mix not a problem.

I've noticed resistance to the FileError design.  It seems natural to me
having recently done some work on the Darwin linker to support threaded and
pipelined linking.  The idea there is that you create an object immediately that
will eventually become fully populated with Atoms.   But the object is first in
an incomplete state until 1) the compiler is done producing the .o file, and 2)
the Reader has parsed the .o file into atoms.   The linker main thread creates
these incomplete objects and then waits for them to transition to the complete
state (with atoms) or to an error state (if some parsing problem happened).

Instead of thinking of FileError as a degenerate File with an error_code slapped
on.  Think of FileError as an object encapsulating the attempt to read and parse
the file.  There could be a rich interface to query it about what went wrong. 
The error_code design attempts to capture everything about a failure in an
'int'.  The FileError class defines a rich object interface to the
error.  Perhaps it could print out a range of bytes from the file and highlight
which bits were bad.  Or perhaps it could dump some context showing all the
sections or all the symbols and then why some bits are invalid within that
context.  In other words, the error reporting is open ended with a FileError
class.

-Nick

David Blaikie

2012-Jun-20 20:07 UTC

head link

[LLVMdev] object construction patterns and unique_ptr

On Wed, Jun 20, 2012 at 12:36 PM, Nick Kledzik <kledzik at apple.com>
wrote:> On Jun 20, 2012, at 12:28 AM, Chandler Carruth wrote:
>> On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at
google.com> wrote:
>> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at
apple.com> wrote:
>> >
>> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote:
>> > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote:
>> > >
>> > >> In lld we have a Reader class which is a factory that
takes .o file(s) and produces an in memory lld::File object(s).  But in doing
so, there could be I/O errors (file not readable) or file may be malformed.  We
are also using C++11 in lld, so we use std::unique_ptr for managing object
ownership.
>> > >>
>> > >> The Reader class currently has an interface that can be
simplified down to:
>> > >>
>> > >>   virtual error_code readFile(StringRef path,
std::unique_ptr<lld::File> &result);
>> > >>
>> > >> But this interface has become awkward to use. There are
two "return" values.  This method is a unique_ptr "source"
but the use of a by-refernce parameter means you have to start with an
uninitialized unique_ptr and the readFile() has to std::move() into it.  
unique_ptr is much nicer to use as a return value.
>> > >
>> > > Not a c++'11 expert here, but does a returning an
std::tuple work?
>> > I talked to Howard about that.  It does work, but it is
cumbersome.  Either:
>> >
>> > std::tuple<std::unique_ptr<lld::File>, error_code>
 result = reader.readFile(path);
>> >
>>
>> I suggest an error_or<T> class that can be combined with any
result type, not just File. It could be implemented with a fully generic variant
class, but the interface would likely be better with some specialization to
errors.
>
> I too think that error_or<T> is better than std::pair<error_code,
T>:
> * more compact
> * can name fields (e.g. error, object instead of first, second)
> * can embed asserts that object field is not accessed if there is an error.
>
> But there are some drawbacks:
> * lld uses C++11, so we thought we'd add explicit ownership by using
std::unique_ptr.  That means we'd need to layer the templates (e.g.
error_or<std::unique_ptr<lld::File>>), or forgo using unique_ptr, or
cause error_or to always internally use unique_ptr.
Could you explain this a little further? It doesn't quite make sense
to me. In my (C++11) experience I'd only use a unique_ptr for a
polymorphic type that needs single ownership. For a non-polymorphic
single ownership without any valid copy semantics I would use a
move-only type rather than std::unique_ptr.

- David
> * We still have not figured out how to encapsulate errors for object file
parsing.  The last thread was an enum *and* and a string.
>
>
>> I also really like the variant design. It has a few important
advantages IMO:
>>
>> 1) No pointers or heap allocations. These make code more bug prone and
can needlessly hurt optimizers. (although the latter is hopefully not relevant
in this specific case)
>>
>> 2) No "error" state in the file object. Either you have a
fully constructed object or you don't.
> I do agree that either you successfully parsed the file and have a full
File object or you failed and have some sort of error.  That means they are
mutually exclusive, so having every File object carry an error_code is a waste
and confusing.   But, that also make the case for the FileError subclass...
>
>> 4) Doesn't require polymorphic types (but they can be supported if
needed). Essentially, you can wrap a *specific* file type, rather than being
forced to use some generic file type.
> Yes, the template lets you do that.  But in the specific case of the
linker, it will always be using the generic type.  You can pass the linker paths
to object files, shared libraries, or static libraries.  The linker passes those
paths onto the readFile() method which therefore must return the generic File
type.
>
>
>> 3) An explicit API to check and instrument whether an error has
occured. For example, it is harder to "forget" to check the error, and
carry on manipulating the file. The implementation can assert in debug builds to
catch this early, etc.
> The same assert functionality can be added to the FileError design too.
 About the only thing you can do with a File is iterator its Atoms.  The
FileError class can implement its iterators to assert.  Also, it is already the
case that the linker handles libraries and object files differently, yet they
both come back from readFile() as a generic File object.  The linker already has
to sort all the File objects by kind, so noticing a FileError in the mix not a
problem.
>
>
> I've noticed resistance to the FileError design.  It seems natural to
me having recently done some work on the Darwin linker to support threaded and
pipelined linking.  The idea there is that you create an object immediately that
will eventually become fully populated with Atoms.   But the object is first in
an incomplete state until 1) the compiler is done producing the .o file, and 2)
the Reader has parsed the .o file into atoms.   The linker main thread creates
these incomplete objects and then waits for them to transition to the complete
state (with atoms) or to an error state (if some parsing problem happened).
>
>
> Instead of thinking of FileError as a degenerate File with an error_code
slapped on.  Think of FileError as an object encapsulating the attempt to read
and parse the file.  There could be a rich interface to query it about what went
wrong.  The error_code design attempts to capture everything about a failure in
an 'int'.  The FileError class defines a rich object interface to the
error.  Perhaps it could print out a range of bytes from the file and highlight
which bits were bad.  Or perhaps it could dump some context showing all the
sections or all the symbols and then why some bits are invalid within that
context.  In other words, the error reporting is open ended with a FileError
class.
>
> -Nick
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Jeffrey Yasskin

2012-Jun-20 20:15 UTC

head link

[LLVMdev] object construction patterns and unique_ptr

On Wed, Jun 20, 2012 at 12:36 PM, Nick Kledzik <kledzik at apple.com>
wrote:> On Jun 20, 2012, at 12:28 AM, Chandler Carruth wrote:
>> On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at
google.com> wrote:
>> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at
apple.com> wrote:
>> >
>> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote:
>> > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote:
>> > >
>> > >> In lld we have a Reader class which is a factory that
takes .o file(s) and produces an in memory lld::File object(s).  But in doing
so, there could be I/O errors (file not readable) or file may be malformed.  We
are also using C++11 in lld, so we use std::unique_ptr for managing object
ownership.
>> > >>
>> > >> The Reader class currently has an interface that can be
simplified down to:
>> > >>
>> > >>   virtual error_code readFile(StringRef path,
std::unique_ptr<lld::File> &result);
>> > >>
>> > >> But this interface has become awkward to use. There are
two "return" values.  This method is a unique_ptr "source"
but the use of a by-refernce parameter means you have to start with an
uninitialized unique_ptr and the readFile() has to std::move() into it.  
unique_ptr is much nicer to use as a return value.
>> > >
>> > > Not a c++'11 expert here, but does a returning an
std::tuple work?
>> > I talked to Howard about that.  It does work, but it is
cumbersome.  Either:
>> >
>> > std::tuple<std::unique_ptr<lld::File>, error_code>
 result = reader.readFile(path);
>> >
>>
>> I suggest an error_or<T> class that can be combined with any
result type, not just File. It could be implemented with a fully generic variant
class, but the interface would likely be better with some specialization to
errors.
>
> I too think that error_or<T> is better than std::pair<error_code,
T>:
> * more compact
> * can name fields (e.g. error, object instead of first, second)
> * can embed asserts that object field is not accessed if there is an error.
>
> But there are some drawbacks:
> * lld uses C++11, so we thought we'd add explicit ownership by using
std::unique_ptr.  That means we'd need to layer the templates (e.g.
error_or<std::unique_ptr<lld::File>>), or forgo using unique_ptr, or
cause error_or to always internally use unique_ptr.
Consider *all* the C++11 features, not just unique_ptr. As Howard
suggested at http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-June/050867.html,
in C++11, I would probably make lld::File a move-only type rather than
asking users to manage its lifetime by wrapping it in unique_ptr.
error_or<lld::File> should work fine in that model.

If lld::File is an abstract base class, then you need the unique_ptr
... but if File is used enough you might win by writing a
type-erasing, move-only wrapper. Talk to Howard if you need help doing
that.

Even if the type-erasing wrapper isn't worth doing, layered templates
aren't the end of the world. They have the benefit that users can see
exactly what they're getting. If you prefer to hide the details of the
type, a typedef may be enough.
> * We still have not figured out how to encapsulate errors for object file
parsing.  The last thread was an enum *and* and a string.
Yes, you may need to iterate on exactly what fields error_or<T>
contains when it's representing an error. I wouldn't assume it can
just be a std::error_code.
> Instead of thinking of FileError as a degenerate File with an error_code
slapped on.  Think of FileError as an object encapsulating the attempt to read
and parse the file.
Names are important and worth spending time on. If you want people to
think of a class as a degenerate File with an error_code slapped on,
name it "FileError" (actually, "FileError" just sounds like
the
error_code part, without the File). If you want people to think of it
as the result of an attempt to read and parse a file, name it
something related to reading and parsing a file.

HTH,
Jeffrey

Apparently Analagous Threads

Search for more seemingly similar threads

llvm dev - Jun 2012 - [LLVMdev] object construction patterns and unique_ptr

[LLVMdev] object construction patterns and unique_ptr

[LLVMdev] object construction patterns and unique_ptr

[LLVMdev] object construction patterns and unique_ptr

[LLVMdev] object construction patterns and unique_ptr

Apparently Analagous Threads