On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote:> On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: > >> In lld we have a Reader class which is a factory that takes .o file(s) and produces an in memory lld::File object(s). But in doing so, there could be I/O errors (file not readable) or file may be malformed. We are also using C++11 in lld, so we use std::unique_ptr for managing object ownership. >> >> The Reader class currently has an interface that can be simplified down to: >> >> virtual error_code readFile(StringRef path, std::unique_ptr<lld::File> &result); >> >> But this interface has become awkward to use. There are two "return" values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value. > > Not a c++'11 expert here, but does a returning an std::tuple work?I talked to Howard about that. It does work, but it is cumbersome. Either: std::tuple<std::unique_ptr<lld::File>, error_code> result = reader.readFile(path); Or you use std::tie() but then you still have to declare the two local variables and tie them together. You could define a wrapper class for the tuple, but then you realize it would be simpler to push the error_code into the File and get rid of the wrapper class.>> An other issue is that since llvm::error_code was designed to return a fixed set of strings, there is no way to return a dynamic error message (e.g. "relocation 27 is invalid"). > > I'm pretty sure that the linker can specify a custom error_category with whatever information you want, and then return an error_code put with that. Michael would know for sure, but it is definitely in the design of error_code for it to be extensible.>From what I can tell error_code is about wrapping out OS specific "int" error codes, and how to map those to static strings. I suppose you could abuse it by having a error_category which has a static vector or recently dynamically generated strings and the int "error code" is an index into that vector.Clang has this great Diagnostics infrastructure for showing where the error occurred in the source file. Does developing something like that make sense for object files whose source is not human readable?>> In asking around for clean models on how to do this better, I've heard two interesting ideas: >> >> 1) Incorporate the error_code and string message into the File class. That is the factory method always returns a File object. But there is a method on File objects to query if they are valid or not. If they are not, there is a method to get the error_code and/or error string. >> >> 2) A variation on this is to have a new subclass of File called FileError. If the Reader cannot construct a real FileFoo object because of some error, it instead constructs a FileError object and returns that. The FileError object has methods to get the error_code and/or error string. >> >> With both these ideas the Reader method is: >> >> virtual std::unique_ptr<lld::File> readFile(StringRef path); >> >> which is much easier to use. >> >> >> Any recommendations on a good way to design this? > > Would an API like: > > virtual std::unique_ptr<lld::File> readFile(StringRef path, error_code &ErrorInfo);That is a common pattern, used for instance in the bit code reader. If we can unify the OS level error and the syntax/semantic errors from parsing object files, this pattern might work. One other thought that leans me towards integrating the error code/msg into the File object is that it works cleaner when you start adding multithreading to the linker. When you have multiple threads going, each reading an object file and one of them returns an error, you need to cleaning shut down the other threads, and return the error. With the error status integrated into the File object, all you do is have each thread append it resulting File object onto a list, then after all threads are joined, run through the list once to see if any say they encountered an error. If the error_code is returned separately from the File, you have to create some data structure to bind together the error_code with which file path could not be parsed. -Nick
Michael Spencer
2012-Jun-18 23:36 UTC
[LLVMdev] object construction patterns and unique_ptr
On Mon, Jun 18, 2012 at 1:22 PM, Nick Kledzik <kledzik at apple.com> wrote:> On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: >> On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: >> >>> In lld we have a Reader class which is a factory that takes .o file(s) and produces an in memory lld::File object(s). But in doing so, there could be I/O errors (file not readable) or file may be malformed. We are also using C++11 in lld, so we use std::unique_ptr for managing object ownership. >>> >>> The Reader class currently has an interface that can be simplified down to: >>> >>> virtual error_code readFile(StringRef path, std::unique_ptr<lld::File> &result); >>> >>> But this interface has become awkward to use. There are two "return" values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value. >> >> Not a c++'11 expert here, but does a returning an std::tuple work? > I talked to Howard about that. It does work, but it is cumbersome. Either: > > std::tuple<std::unique_ptr<lld::File>, error_code> result = reader.readFile(path); > > Or you use std::tie() but then you still have to declare the two local variables and tie them together. You could define a wrapper class for the tuple, but then you realize it would be simpler to push the error_code into the File and get rid of the wrapper class. > >>> An other issue is that since llvm::error_code was designed to return a fixed set of strings, there is no way to return a dynamic error message (e.g. "relocation 27 is invalid"). >> >> I'm pretty sure that the linker can specify a custom error_category with whatever information you want, and then return an error_code put with that. Michael would know for sure, but it is definitely in the design of error_code for it to be extensible. > > >From what I can tell error_code is about wrapping out OS specific "int" error codes, and how to map those to static strings. I suppose you could abuse it by having a error_category which has a static vector or recently dynamically generated strings and the int "error code" is an index into that vector. > > Clang has this great Diagnostics infrastructure for showing where the error occurred in the source file. Does developing something like that make sense for object files whose source is not human readable?The chance that a user will get an invalid object file is incredibly low, and if they do get one there's generally little they can do about it. However when this does happen we need to be able to figure out what's going on. So we do not need clang style diagnostic printing for object file parsing, but we do still need to get useful diagnostics.>>> In asking around for clean models on how to do this better, I've heard two interesting ideas: >>> >>> 1) Incorporate the error_code and string message into the File class. That is the factory method always returns a File object. But there is a method on File objects to query if they are valid or not. If they are not, there is a method to get the error_code and/or error string.I like this with some modifications.>>> 2) A variation on this is to have a new subclass of File called FileError. If the Reader cannot construct a real FileFoo object because of some error, it instead constructs a FileError object and returns that. The FileError object has methods to get the error_code and/or error string.I dislike this variant because a FileError is not a File.>>> With both these ideas the Reader method is: >>> >>> virtual std::unique_ptr<lld::File> readFile(StringRef path); >>> >>> which is much easier to use. >>> >>> >>> Any recommendations on a good way to design this? >> >> Would an API like: >> >> virtual std::unique_ptr<lld::File> readFile(StringRef path, error_code &ErrorInfo); > > That is a common pattern, used for instance in the bit code reader. If we can unify the OS level error and the syntax/semantic errors from parsing object files, this pattern might work. > > One other thought that leans me towards integrating the error code/msg into the File object is that it works cleaner when you start adding multithreading to the linker. When you have multiple threads going, each reading an object file and one of them returns an error, you need to cleaning shut down the other threads, and return the error. With the error status integrated into the File object, all you do is have each thread append it resulting File object onto a list, then after all threads are joined, run through the list once to see if any say they encountered an error. If the error_code is returned separately from the File, you have to create some data structure to bind together the error_code with which file path could not be parsed. > > -NickI believe that specifically for object file parsing errors a pair<error_code, vector<string> values> (or something similar) as a member of File would work. This would be used as: // In the error_category stuff class blah_reader_error_category : public llvm::_do_message { public: virtual std::string message(int ev) const { ... case uknown_reloc: return "uknown relocation % at %"; ... } }; // In the reader. setError(make_pair(blah_error::unkown_reloc, vector<string>{relocationumer, loc})); // loc could be calculated by backtracking, or just the current file address. (we could even pass the address as a stop address to obj2yaml). // Someplace else. if (File->error()) o << File->getErrorMessage(); Which would do the string substitution. This lets us not spend any time building up a representation for diagnostics when none will be printed, and still get decent diagnostics with well defined enum values when needed. So: virtual std::unique_ptr<lld::File> readFile(StringRef path); class File { public: bool error(); error_code getError(); string getErrorMessage(); protected: void setError(pair<error_code, vector<string>>); private: pair<error_code, vector<string>> Error; } I believe we may also want to tablegen the different errors so it's trivial to add a new one. Note that for proper linker errors (undefined reference etc...) I believe we need something more extensive. - Michael Spencer
On Jun 18, 2012, at 1:22 PM, Nick Kledzik wrote:>>> But this interface has become awkward to use. There are two "return" values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value. >> >> Not a c++'11 expert here, but does a returning an std::tuple work? > I talked to Howard about that. It does work, but it is cumbersome. Either: > > std::tuple<std::unique_ptr<lld::File>, error_code> result = reader.readFile(path); > > Or you use std::tie() but then you still have to declare the two local variables and tie them together. You could define a wrapper class for the tuple, but then you realize it would be simpler to push the error_code into the File and get rid of the wrapper class.Ok, yeah, that is annoying.>>> An other issue is that since llvm::error_code was designed to return a fixed set of strings, there is no way to return a dynamic error message (e.g. "relocation 27 is invalid"). >> >> I'm pretty sure that the linker can specify a custom error_category with whatever information you want, and then return an error_code put with that. Michael would know for sure, but it is definitely in the design of error_code for it to be extensible. > > From what I can tell error_code is about wrapping out OS specific "int" error codes, and how to map those to static strings. I suppose you could abuse it by having a error_category which has a static vector or recently dynamically generated strings and the int "error code" is an index into that vector.I'm pretty sure that error_code is designed to be "simple for errno's" but "general to other errors". Michael?> Clang has this great Diagnostics infrastructure for showing where the error occurred in the source file. Does developing something like that make sense for object files whose source is not human readable?I don't think this makes sense for the linker. The key aspect of Clang's diagnostics (and llvm::SourceMgr, which is similar but simpler) is that it points back into source code. This is the source of most of the complexity of its diagnostics stuff, and doesn't make a lot of sense for the linker. That said, one crazy hack that I've been wanting to do for some time in clang would apply equally well to the linker: all the command line options could be plopped into a text buffer, and diagnostics related to command line options could refer to *that* location. Imagine getting an error like this from the linker: zsh> ld foo.o bar.o baz.a -o a.out command-line:1:3: error: foo.o is not a valid object file ld foo.o bar.o baz.a -o a.out ^~~~~ zsh> ld foo.o bar.o baz.a -o a.out command-line:1:32: error: xyz.o member of baz.a is not a valid object file ld foo.o bar.o baz.a -o a.out ^~~~~ zsh> ld foo.o bar.o baz.a -o a.out command-line:1:46: error: can not open output file a.out: permission error ld foo.o bar.o baz.a -o a.out ^~~~~ or whatever.>From an API level, this would mean that the object file parsing logic would pass back up an error_code, and then the higher level driver logic would combine the string together with the location of the command line option for printing through llvm::SourceMgr.> One other thought that leans me towards integrating the error code/msg into the File object is that it works cleaner when you start adding multithreading to the linker. When you have multiple threads going, each reading an object file and one of them returns an error, you need to cleaning shut down the other threads, and return the error. With the error status integrated into the File object, all you do is have each thread append it resulting File object onto a list, then after all threads are joined, run through the list once to see if any say they encountered an error. If the error_code is returned separately from the File, you have to create some data structure to bind together the error_code with which file path could not be parsed.As with the other guys, I'd really prefer to avoid merging the error status *into* the file. In principle, it is the "open file" operation that can fail, not the file itself. -Chris
Nathan Jeffords
2012-Jun-19 17:49 UTC
[LLVMdev] object construction patterns and unique_ptr
snip Forgive me if I am way off base, but if this is a "C++11" library why can't exceptions be used to signal the error? Nathan -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120619/b0c724b4/attachment.html>
Jeffrey Yasskin
2012-Jun-19 18:18 UTC
[LLVMdev] object construction patterns and unique_ptr
On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at apple.com> wrote:> > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: > > > >> In lld we have a Reader class which is a factory that takes .o file(s)and produces an in memory lld::File object(s). But in doing so, there could be I/O errors (file not readable) or file may be malformed. We are also using C++11 in lld, so we use std::unique_ptr for managing object ownership.> >> > >> The Reader class currently has an interface that can be simplifieddown to:> >> > >> virtual error_code readFile(StringRef path,std::unique_ptr<lld::File> &result);> >> > >> But this interface has become awkward to use. There are two "return"values. This method is a unique_ptr "source" but the use of a by-refernce parameter means you have to start with an uninitialized unique_ptr and the readFile() has to std::move() into it. unique_ptr is much nicer to use as a return value.> > > > Not a c++'11 expert here, but does a returning an std::tuple work? > I talked to Howard about that. It does work, but it is cumbersome.Either:> > std::tuple<std::unique_ptr<lld::File>, error_code> result reader.readFile(path); >I suggest an error_or<T> class that can be combined with any result type, not just File. It could be implemented with a fully generic variant class, but the interface would likely be better with some specialization to errors.> Or you use std::tie() but then you still have to declare the two localvariables and tie them together. You could define a wrapper class for the tuple, but then you realize it would be simpler to push the error_code into the File and get rid of the wrapper class.> > >> An other issue is that since llvm::error_code was designed to return afixed set of strings, there is no way to return a dynamic error message (e.g. "relocation 27 is invalid").> > > > I'm pretty sure that the linker can specify a custom error_categorywith whatever information you want, and then return an error_code put with that. Michael would know for sure, but it is definitely in the design of error_code for it to be extensible.> > >From what I can tell error_code is about wrapping out OS specific "int"error codes, and how to map those to static strings. I suppose you could abuse it by having a error_category which has a static vector or recently dynamically generated strings and the int "error code" is an index into that vector.> > Clang has this great Diagnostics infrastructure for showing where theerror occurred in the source file. Does developing something like that make sense for object files whose source is not human readable?> > >> In asking around for clean models on how to do this better, I've heardtwo interesting ideas:> >> > >> 1) Incorporate the error_code and string message into the File class.That is the factory method always returns a File object. But there is a method on File objects to query if they are valid or not. If they are not, there is a method to get the error_code and/or error string.> >> > >> 2) A variation on this is to have a new subclass of File calledFileError. If the Reader cannot construct a real FileFoo object because of some error, it instead constructs a FileError object and returns that. The FileError object has methods to get the error_code and/or error string.> >> > >> With both these ideas the Reader method is: > >> > >> virtual std::unique_ptr<lld::File> readFile(StringRef path); > >> > >> which is much easier to use. > >> > >> > >> Any recommendations on a good way to design this? > > > > Would an API like: > > > > virtual std::unique_ptr<lld::File> readFile(StringRef path,error_code &ErrorInfo);> > That is a common pattern, used for instance in the bit code reader. Ifwe can unify the OS level error and the syntax/semantic errors from parsing object files, this pattern might work.> > One other thought that leans me towards integrating the error code/msginto the File object is that it works cleaner when you start adding multithreading to the linker. When you have multiple threads going, each reading an object file and one of them returns an error, you need to cleaning shut down the other threads, and return the error. With the error status integrated into the File object, all you do is have each thread append it resulting File object onto a list, then after all threads are joined, run through the list once to see if any say they encountered an error. If the error_code is returned separately from the File, you have to create some data structure to bind together the error_code with which file path could not be parsed.> > -Nick > > > > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120619/6d263d3f/attachment.html>
On Jun 18, 2012, at 4:36 PM, Michael Spencer wrote:>>>> >>>> In asking around for clean models on how to do this better, I've heard two interesting ideas: >>>> >>>> 1) Incorporate the error_code and string message into the File class. That is the factory method always returns a File object. But there is a method on File objects to query if they are valid or not. If they are not, there is a method to get the error_code and/or error string. > > I like this with some modifications. > >>>> 2) A variation on this is to have a new subclass of File called FileError. If the Reader cannot construct a real FileFoo object because of some error, it instead constructs a FileError object and returns that. The FileError object has methods to get the error_code and/or error string. >My intuition is the opposite here. If you add the error information to the lld::File base class then leads you to write the constructor for FileFoo() which does all the parsing, and if it runs into a problem, just sets the error ivar and returns (your FileCOFF is an example of that). The problem this leads to is half constructed objects. That can trip up a client which might accidentally call methods on the half constructed object other than the error getting methods. It also means the destructor has to be careful when tearing down the half constructed object. On the other hand making a new subclass of lld::File called lld::FileError leads you to a different style of programming. You have to open and parse the file past the point that any error can occur before you construct the FileFoo object. If anything goes wrong up to the point, you instead construct a FileError object and return that.> I dislike this variant because a FileError is not a File.I'm not sure what you mean by "not a File". It is a subclass of lld::File. Its path() method returns the file system path that tried to be used. Think of it as a proxy. The real file could not parsed. It represents that state.> >>>> With both these ideas the Reader method is: >>>> >>>> virtual std::unique_ptr<lld::File> readFile(StringRef path); >>>> >>>> which is much easier to use. >>>> >>>> >>>> Any recommendations on a good way to design this? >>> >>> Would an API like: >>> >>> virtual std::unique_ptr<lld::File> readFile(StringRef path, error_code &ErrorInfo); >> >> That is a common pattern, used for instance in the bit code reader. If we can unify the OS level error and the syntax/semantic errors from parsing object files, this pattern might work. >> >> One other thought that leans me towards integrating the error code/msg into the File object is that it works cleaner when you start adding multithreading to the linker. When you have multiple threads going, each reading an object file and one of them returns an error, you need to cleaning shut down the other threads, and return the error. With the error status integrated into the File object, all you do is have each thread append it resulting File object onto a list, then after all threads are joined, run through the list once to see if any say they encountered an error. If the error_code is returned separately from the File, you have to create some data structure to bind together the error_code with which file path could not be parsed. >> >> -Nick > > I believe that specifically for object file parsing errors a > pair<error_code, vector<string> values> (or something similar) as a > member of File would work. This would be used as: > > // In the error_category stuff > class blah_reader_error_category : public llvm::_do_message { > public: > virtual std::string message(int ev) const { > ... > case uknown_reloc: > return "uknown relocation % at %"; > ... > } > }; > > // In the reader. > > setError(make_pair(blah_error::unkown_reloc, > vector<string>{relocationumer, loc})); // loc could be calculated by > backtracking, or just the current file address. (we could even pass > the address as a stop address to obj2yaml). > > // Someplace else. > > if (File->error()) > o << File->getErrorMessage(); > > Which would do the string substitution.What is the advantage of this delayed substitution? The site creating the error needs to convert all values to strings. Your example has two numbers (reloc value and address) which would need to be converted to strings. It would be simpler all around to just generate the complete string right there at the error site. Are you trying to support localization? or error numbers? Also, are these enum values per file format (i.e. does ELF and mach-o each define their own set)? Or is there one big set of enums for use by all Readers?> > This lets us not spend any time building up a representation for > diagnostics when none will be printed, and still get decent > diagnostics with well defined enum values when needed.What is the rule of thumb of when a enum value is needed (as opposed to success/failure)? My thinking is that an enum only makes sense if the client can do something about a specific enum value. For instance in some future IDE, maybe file_not_found error can prod the IDE to recompile something to regenerate the object file. Another reason for enums is if you want a way to localize error strings to different languages. The enums are an index into a per-localization string table. Given that these error (e.g. unknown relocation) are a bug in the compiler or linker and not a problem the user of the tools can remedy, I'm not sure they need localizing. -Nick
Chandler Carruth
2012-Jun-20 07:28 UTC
[LLVMdev] object construction patterns and unique_ptr
On Tue, Jun 19, 2012 at 11:18 AM, Jeffrey Yasskin <jyasskin at google.com>wrote:> On Jun 18, 2012 1:24 PM, "Nick Kledzik" <kledzik at apple.com> wrote: > > > > On Jun 16, 2012, at 3:51 PM, Chris Lattner wrote: > > > On Jun 15, 2012, at 3:48 PM, Nick Kledzik wrote: > > > > > >> In lld we have a Reader class which is a factory that takes .o > file(s) and produces an in memory lld::File object(s). But in doing so, > there could be I/O errors (file not readable) or file may be malformed. We > are also using C++11 in lld, so we use std::unique_ptr for managing object > ownership. > > >> > > >> The Reader class currently has an interface that can be simplified > down to: > > >> > > >> virtual error_code readFile(StringRef path, > std::unique_ptr<lld::File> &result); > > >> > > >> But this interface has become awkward to use. There are two "return" > values. This method is a unique_ptr "source" but the use of a by-refernce > parameter means you have to start with an uninitialized unique_ptr and the > readFile() has to std::move() into it. unique_ptr is much nicer to use as > a return value. > > > > > > Not a c++'11 expert here, but does a returning an std::tuple work? > > I talked to Howard about that. It does work, but it is cumbersome. > Either: > > > > std::tuple<std::unique_ptr<lld::File>, error_code> result > reader.readFile(path); > > > > I suggest an error_or<T> class that can be combined with any result type, > not just File. It could be implemented with a fully generic variant class, > but the interface would likely be better with some specialization to errors. >I also really like the variant design. It has a few important advantages IMO: 1) No pointers or heap allocations. These make code more bug prone and can needlessly hurt optimizers. (although the latter is hopefully not relevant in this specific case) 2) No "error" state in the file object. Either you have a fully constructed object or you don't. 3) An explicit API to check and instrument whether an error has occured. For example, it is harder to "forget" to check the error, and carry on manipulating the file. The implementation can assert in debug builds to catch this early, etc. 4) Doesn't require polymorphic types (but they can be supported if needed). Essentially, you can wrap a *specific* file type, rather than being forced to use some generic file type. One interesting possibility is that you can grow the same pattern to different levels of complexity as needed (and thus keep it as simple as possible as long as possible): error_or<T> -- wraps the existing error_code with a nice interface linker_error_or<T> -- wraps both an error_code and some linker specific error handling logic file_error_or<T> -- provides completely custom error information specific to file-loading-errors. I'm not saying which (if any) of these would be appropriate, just pointing out that this pattern can be followed to greater and lesser degrees of specificity. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120620/ffc3b999/attachment.html>
Possibly Parallel Threads
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr
- [LLVMdev] object construction patterns and unique_ptr