Chandler Carruth
2012-Jun-20 07:28 UTC
[LLVMdev] object construction patterns and unique_ptr
On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at google.com>wrote:> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at apple.com> wrote: > > > > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: > > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: > > > > > >> In lld we have a Reader class which is a factory that takes .o > file(s) and produces an in memory lld::File object(s). But in doing so, > there could be I/O errors (file not readable) or file may be malformed. We > are also using C++11 in lld, so we use std::unique_ptr for managing object > ownership. > > >> > > >> The Reader class currently has an interface that can be simplified > down to: > > >> > > >> virtual error_code readFile(StringRef path, > std::unique_ptr<lld::File> &result); > > >> > > >> But this interface has become awkward to use. There are two "return" > values. This method is a unique_ptr "source" but the use of a by-refernce > parameter means you have to start with an uninitialized unique_ptr and the > readFile() has to std::move() into it. unique_ptr is much nicer to use as > a return value. > > > > > > Not a c++'11 expert here, but does a returning an std::tuple work? > > I talked to Howard about that. It does work, but it is cumbersome. > Either: > > > > std::tuple<std::unique_ptr<lld::File>, error_code> result > reader.readFile(path); > > > > I suggest an error_or<T> class that can be combined with any result type, > not just File. It could be implemented with a fully generic variant class, > but the interface would likely be better with some specialization to errors. >I also really like the variant design. It has a few important advantages IMO: 1) No pointers or heap allocations. These make code more bug prone and can needlessly hurt optimizers. (although the latter is hopefully not relevant in this specific case) 2) No "error" state in the file object. Either you have a fully constructed object or you don't. 3) An explicit API to check and instrument whether an error has occured. For example, it is harder to "forget" to check the error, and carry on manipulating the file. The implementation can assert in debug builds to catch this early, etc. 4) Doesn't require polymorphic types (but they can be supported if needed). Essentially, you can wrap a *specific* file type, rather than being forced to use some generic file type. One interesting possibility is that you can grow the same pattern to different levels of complexity as needed (and thus keep it as simple as possible as long as possible): error_or<T> -- wraps the existing error_code with a nice interface linker_error_or<T> -- wraps both an error_code and some linker specific error handling logic file_error_or<T> -- provides completely custom error information specific to file-loading-errors. I'm not saying which (if any) of these would be appropriate, just pointing out that this pattern can be followed to greater and lesser degrees of specificity. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120620/ffc3b999/attachment.html>
On Jun 20, 2012, at 12:28 AM, Chandler Carruth wrote:> On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at google.com> wrote: > On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at apple.com> wrote: > > > > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: > > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: > > > > > >> In lld we have a Reader class which is a factory that takes .o file(s) and produces an in memory lld::File object(s). But in doing so, there could be I/O errors (file not readable) or file may be malformed. We are also using C++11 in lld, so we use std::unique_ptr for managing object ownership. > > >> > > >> The Reader class currently has an interface that can be simplified down to: > > >> > > >> virtual error_code readFile(StringRef path, std::unique_ptr<lld::File> &result); > > >> > > >> But this interface has become awkward to use. There are two "return" values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value. > > > > > > Not a c++'11 expert here, but does a returning an std::tuple work? > > I talked to Howard about that. It does work, but it is cumbersome. Either: > > > > std::tuple<std::unique_ptr<lld::File>, error_code> result = reader.readFile(path); > > > > I suggest an error_or<T> class that can be combined with any result type, not just File. It could be implemented with a fully generic variant class, but the interface would likely be better with some specialization to errors.I too think that error_or<T> is better than std::pair<error_code, T>: * more compact * can name fields (e.g. error, object instead of first, second) * can embed asserts that object field is not accessed if there is an error. But there are some drawbacks: * lld uses C++11, so we thought we'd add explicit ownership by using std::unique_ptr. That means we'd need to layer the templates (e.g. error_or<std::unique_ptr<lld::File>>), or forgo using unique_ptr, or cause error_or to always internally use unique_ptr. * We still have not figured out how to encapsulate errors for object file parsing. The last thread was an enum *and* and a string.> I also really like the variant design. It has a few important advantages IMO: > > 1) No pointers or heap allocations. These make code more bug prone and can needlessly hurt optimizers. (although the latter is hopefully not relevant in this specific case) > > 2) No "error" state in the file object. Either you have a fully constructed object or you don't.I do agree that either you successfully parsed the file and have a full File object or you failed and have some sort of error. That means they are mutually exclusive, so having every File object carry an error_code is a waste and confusing. But, that also make the case for the FileError subclass...> 4) Doesn't require polymorphic types (but they can be supported if needed). Essentially, you can wrap a *specific* file type, rather than being forced to use some generic file type.Yes, the template lets you do that. But in the specific case of the linker, it will always be using the generic type. You can pass the linker paths to object files, shared libraries, or static libraries. The linker passes those paths onto the readFile() method which therefore must return the generic File type.> 3) An explicit API to check and instrument whether an error has occured. For example, it is harder to "forget" to check the error, and carry on manipulating the file. The implementation can assert in debug builds to catch this early, etc.The same assert functionality can be added to the FileError design too. About the only thing you can do with a File is iterator its Atoms. The FileError class can implement its iterators to assert. Also, it is already the case that the linker handles libraries and object files differently, yet they both come back from readFile() as a generic File object. The linker already has to sort all the File objects by kind, so noticing a FileError in the mix not a problem. I've noticed resistance to the FileError design. It seems natural to me having recently done some work on the Darwin linker to support threaded and pipelined linking. The idea there is that you create an object immediately that will eventually become fully populated with Atoms. But the object is first in an incomplete state until 1) the compiler is done producing the .o file, and 2) the Reader has parsed the .o file into atoms. The linker main thread creates these incomplete objects and then waits for them to transition to the complete state (with atoms) or to an error state (if some parsing problem happened). Instead of thinking of FileError as a degenerate File with an error_code slapped on. Think of FileError as an object encapsulating the attempt to read and parse the file. There could be a rich interface to query it about what went wrong. The error_code design attempts to capture everything about a failure in an 'int'. The FileError class defines a rich object interface to the error. Perhaps it could print out a range of bytes from the file and highlight which bits were bad. Or perhaps it could dump some context showing all the sections or all the symbols and then why some bits are invalid within that context. In other words, the error reporting is open ended with a FileError class. -Nick
On Wed, Jun 20, 2012 at 12:36 PM, Nick Kledzik <kledzik at apple.com> wrote:> On Jun 20, 2012, at 12:28 AM, Chandler Carruth wrote: >> On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at google.com> wrote: >> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at apple.com> wrote: >> > >> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: >> > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: >> > > >> > >> In lld we have a Reader class which is a factory that takes .o file(s) and produces an in memory lld::File object(s). But in doing so, there could be I/O errors (file not readable) or file may be malformed. We are also using C++11 in lld, so we use std::unique_ptr for managing object ownership. >> > >> >> > >> The Reader class currently has an interface that can be simplified down to: >> > >> >> > >> virtual error_code readFile(StringRef path, std::unique_ptr<lld::File> &result); >> > >> >> > >> But this interface has become awkward to use. There are two "return" values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value. >> > > >> > > Not a c++'11 expert here, but does a returning an std::tuple work? >> > I talked to Howard about that. It does work, but it is cumbersome. Either: >> > >> > std::tuple<std::unique_ptr<lld::File>, error_code> result = reader.readFile(path); >> > >> >> I suggest an error_or<T> class that can be combined with any result type, not just File. It could be implemented with a fully generic variant class, but the interface would likely be better with some specialization to errors. > > I too think that error_or<T> is better than std::pair<error_code, T>: > * more compact > * can name fields (e.g. error, object instead of first, second) > * can embed asserts that object field is not accessed if there is an error. > > But there are some drawbacks: > * lld uses C++11, so we thought we'd add explicit ownership by using std::unique_ptr. That means we'd need to layer the templates (e.g. error_or<std::unique_ptr<lld::File>>), or forgo using unique_ptr, or cause error_or to always internally use unique_ptr.Could you explain this a little further? It doesn't quite make sense to me. In my (C++11) experience I'd only use a unique_ptr for a polymorphic type that needs single ownership. For a non-polymorphic single ownership without any valid copy semantics I would use a move-only type rather than std::unique_ptr. - David> * We still have not figured out how to encapsulate errors for object file parsing. The last thread was an enum *and* and a string. > > >> I also really like the variant design. It has a few important advantages IMO: >> >> 1) No pointers or heap allocations. These make code more bug prone and can needlessly hurt optimizers. (although the latter is hopefully not relevant in this specific case) >> >> 2) No "error" state in the file object. Either you have a fully constructed object or you don't. > I do agree that either you successfully parsed the file and have a full File object or you failed and have some sort of error. That means they are mutually exclusive, so having every File object carry an error_code is a waste and confusing. But, that also make the case for the FileError subclass... > >> 4) Doesn't require polymorphic types (but they can be supported if needed). Essentially, you can wrap a *specific* file type, rather than being forced to use some generic file type. > Yes, the template lets you do that. But in the specific case of the linker, it will always be using the generic type. You can pass the linker paths to object files, shared libraries, or static libraries. The linker passes those paths onto the readFile() method which therefore must return the generic File type. > > >> 3) An explicit API to check and instrument whether an error has occured. For example, it is harder to "forget" to check the error, and carry on manipulating the file. The implementation can assert in debug builds to catch this early, etc. > The same assert functionality can be added to the FileError design too. About the only thing you can do with a File is iterator its Atoms. The FileError class can implement its iterators to assert. Also, it is already the case that the linker handles libraries and object files differently, yet they both come back from readFile() as a generic File object. The linker already has to sort all the File objects by kind, so noticing a FileError in the mix not a problem. > > > I've noticed resistance to the FileError design. It seems natural to me having recently done some work on the Darwin linker to support threaded and pipelined linking. The idea there is that you create an object immediately that will eventually become fully populated with Atoms. But the object is first in an incomplete state until 1) the compiler is done producing the .o file, and 2) the Reader has parsed the .o file into atoms. The linker main thread creates these incomplete objects and then waits for them to transition to the complete state (with atoms) or to an error state (if some parsing problem happened). > > > Instead of thinking of FileError as a degenerate File with an error_code slapped on. Think of FileError as an object encapsulating the attempt to read and parse the file. There could be a rich interface to query it about what went wrong. The error_code design attempts to capture everything about a failure in an 'int'. The FileError class defines a rich object interface to the error. Perhaps it could print out a range of bytes from the file and highlight which bits were bad. Or perhaps it could dump some context showing all the sections or all the symbols and then why some bits are invalid within that context. In other words, the error reporting is open ended with a FileError class. > > -Nick > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Jeffrey Yasskin
2012-Jun-20 20:15 UTC
[LLVMdev] object construction patterns and unique_ptr
On Wed, Jun 20, 2012 at 12:36 PM, Nick Kledzik <kledzik at apple.com> wrote:> On Jun 20, 2012, at 12:28 AM, Chandler Carruth wrote: >> On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at google.com> wrote: >> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at apple.com> wrote: >> > >> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: >> > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: >> > > >> > >> In lld we have a Reader class which is a factory that takes .o file(s) and produces an in memory lld::File object(s). But in doing so, there could be I/O errors (file not readable) or file may be malformed. We are also using C++11 in lld, so we use std::unique_ptr for managing object ownership. >> > >> >> > >> The Reader class currently has an interface that can be simplified down to: >> > >> >> > >> virtual error_code readFile(StringRef path, std::unique_ptr<lld::File> &result); >> > >> >> > >> But this interface has become awkward to use. There are two "return" values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value. >> > > >> > > Not a c++'11 expert here, but does a returning an std::tuple work? >> > I talked to Howard about that. It does work, but it is cumbersome. Either: >> > >> > std::tuple<std::unique_ptr<lld::File>, error_code> result = reader.readFile(path); >> > >> >> I suggest an error_or<T> class that can be combined with any result type, not just File. It could be implemented with a fully generic variant class, but the interface would likely be better with some specialization to errors. > > I too think that error_or<T> is better than std::pair<error_code, T>: > * more compact > * can name fields (e.g. error, object instead of first, second) > * can embed asserts that object field is not accessed if there is an error. > > But there are some drawbacks: > * lld uses C++11, so we thought we'd add explicit ownership by using std::unique_ptr. That means we'd need to layer the templates (e.g. error_or<std::unique_ptr<lld::File>>), or forgo using unique_ptr, or cause error_or to always internally use unique_ptr.Consider *all* the C++11 features, not just unique_ptr. As Howard suggested at http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-June/050867.html, in C++11, I would probably make lld::File a move-only type rather than asking users to manage its lifetime by wrapping it in unique_ptr. error_or<lld::File> should work fine in that model. If lld::File is an abstract base class, then you need the unique_ptr ... but if File is used enough you might win by writing a type-erasing, move-only wrapper. Talk to Howard if you need help doing that. Even if the type-erasing wrapper isn't worth doing, layered templates aren't the end of the world. They have the benefit that users can see exactly what they're getting. If you prefer to hide the details of the type, a typedef may be enough.> * We still have not figured out how to encapsulate errors for object file parsing. The last thread was an enum *and* and a string.Yes, you may need to iterate on exactly what fields error_or<T> contains when it's representing an error. I wouldn't assume it can just be a std::error_code.> Instead of thinking of FileError as a degenerate File with an error_code slapped on. Think of FileError as an object encapsulating the attempt to read and parse the file.Names are important and worth spending time on. If you want people to think of a class as a degenerate File with an error_code slapped on, name it "FileError" (actually, "FileError" just sounds like the error_code part, without the File). If you want people to think of it as the result of an attempt to read and parse a file, name it something related to reading and parsing a file. HTH, Jeffrey
Maybe Matching Threads
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr