On Thu, May 3, 2012 at 6:41 PM, Nick Kledzik <kledzik at apple.com> wrote:
> I'd like to do some more work on the darwin executable writer. But most of the work is mach-o variations that are driven by command line options. So I wanted to work out with you how command line processing will work.
>
> Looking at the Darwin man page for ld(1), it seems most options are darwin specific. I imagine other platform linkers are similar. So, to me it does not make sense to have one big interface that supports all options on all linkers. Instead, Core just needs to model common options. Then all platform specific options are a "private" interface within the platform.
>
> In other words, I see:
> 1) llvm/Support providing a class to make command line parsing easy.
> 2) lld/Core providing utilities for managing search paths (for finding input files).
> 3) Each platform has its own main() which:
>    3a) uses the generic command line utilities to parse the options
>    3b) uses the options to set up the ResolverOptions
>    3c) calls Core/Resolver
>    3d) calls appropriate Passes (may depend on command line options)
>    3e) calls its platform executable writer, using private interface to specify platform specific options
>
> Thoughts?
>
> -Nick

The problem I see with this is embeddability and code duplication. Programmatically setting up a link with common options for the target platform should be trivial; this complicates that by requiring manual setup of the process.

I was discussing this problem with Chandler, in addition to how to share option parsing code with Clang. The general idea is to have a tablegen file for canonical options (e.g. cc1 options in Clang). Then each specific (ld64, gnu-ld, link.exe) driver would get a td file that mapped each option to one or more canonical options either by a common mapping operation, or calling a custom function.

- Michael Spencer
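To make the per-driver mapping idea concrete, here is a minimal hand-written C++ sketch of what such a table could boil down to once generated. Everything in it is an assumption for illustration: the `OptionMapping` and `DriverTable` types, the `translate` function, and the canonical spellings `gc-sections` and `output=` are hypothetical, and no tablegen backend is involved. Only the driver-side flags (`-dead_strip` for ld64, `/OPT:REF` and `/OUT:` for link.exe) are real spellings, and treating the first two as equivalent is a simplification.

```cpp
#include <functional>
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// One row of a per-driver table. Hypothetical stand-in for what a
// generated .td mapping might contain.
struct OptionMapping {
  std::string spelling; // driver-specific spelling, e.g. "-dead_strip"
  bool joined;          // true if a value follows directly, e.g. "/OUT:a.exe"
  Args canonical;       // common mapping: a fixed canonical expansion
  std::function<Args(const std::string &)> custom; // optional custom mapping
};

using DriverTable = std::vector<OptionMapping>;

// Rewrite driver-specific arguments into canonical ones. Arguments with no
// table entry (such as input files) are passed through unchanged.
Args translate(const DriverTable &table, const Args &argv) {
  Args out;
  for (const std::string &arg : argv) {
    const OptionMapping *match = nullptr;
    for (const OptionMapping &m : table) {
      bool hit = m.joined
                     ? arg.compare(0, m.spelling.size(), m.spelling) == 0
                     : arg == m.spelling;
      if (hit) {
        match = &m;
        break;
      }
    }
    if (!match) {
      out.push_back(arg);
      continue;
    }
    // The custom function receives whatever follows the spelling.
    Args mapped = match->custom
                      ? match->custom(arg.substr(match->spelling.size()))
                      : match->canonical;
    out.insert(out.end(), mapped.begin(), mapped.end());
  }
  return out;
}

// Illustrative table for the ld64 driver.
DriverTable ld64Table() {
  return {{"-dead_strip", false, {"gc-sections"}, nullptr}};
}

// Illustrative table for the link.exe driver, including one custom mapping.
DriverTable linkExeTable() {
  return {
      {"/OPT:REF", false, {"gc-sections"}, nullptr},
      {"/OUT:", true, {}, [](const std::string &value) {
         return Args{"output=" + value};
       }},
  };
}
```

With these tables, `translate(ld64Table(), {"-dead_strip", "main.o"})` and `translate(linkExeTable(), {"/OPT:REF", "main.obj"})` both produce the same canonical `gc-sections` entry, which is the property that lets the rest of the linker ignore which driver was used.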
On May 4, 2012, at 3:07 PM, Michael Spencer wrote:
> The problem I see with this is embeddability and code duplication.
> Programmatically setting up a link with common options for the target
> platform should be trivial; this complicates that by requiring manual
> setup of the process.
>
> I was discussing this problem with Chandler, in addition to how to
> share option parsing code with Clang. The general idea is to have a
> tablegen file for canonical options (e.g. cc1 options in Clang). Then
> each specific (ld64, gnu-ld, link.exe) driver would get a td file that
> mapped each option to one or more canonical options either by a common
> mapping operation, or calling a custom function.

Ok.
We need the non-command-line-parse part of main() to be available for use by clients that are embedding linker functionality into other programs, and tablegen could be a nice way to manage the options. Then my modified proposal is:

1) llvm provides utilities to make command line parsing easy, possibly tablegen based. For embedded linking, it should be easy to programmatically set these options and the defaults should all be reasonable.
2) lld/Core providing utilities for managing search paths (for finding input files).
3) Each platform has a link() function which:
   3a) is called by the static linker's main() after command line options are parsed
   3b) uses the options to set up the ResolverOptions
   3c) calls Core/Resolver
   3d) calls appropriate Passes (may depend on command line options)
   3e) calls its platform executable writer, which can access the parsed command line options.

One design point I think is important is to encapsulate the parsed command line options into some class/struct/namespace. It should not be a bunch of global variables (like llvm::cl::opt<> leads to). If you want to embed a compiler and linker into one program, you don't want conflicting global variables.

-Nick
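The shape of the modified proposal can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: `DarwinLinkOptions`, `LinkResult`, `darwinLink` and `parseArgs` are hypothetical names, the `ResolverOptions` here is a placeholder rather than the real lld class, and the resolver, passes and writer are reduced to a trace of step names so the control flow is visible. The entry point is called `darwinLink` rather than `link` only to avoid clashing with the POSIX `link()` function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Placeholder for the options the core resolver would consume.
struct ResolverOptions {
  bool deadStrip = false;
};

// All parsed options live in one value instead of global variables, so a
// compiler and a linker can coexist in the same process.
struct DarwinLinkOptions {
  std::vector<std::string> inputFiles;
  std::vector<std::string> searchPaths;
  std::string outputPath = "a.out"; // reasonable default for embedders
  bool deadStrip = false;
  bool dylib = false; // platform-specific: produce a dylib
};

struct LinkResult {
  bool ok = false;
  std::vector<std::string> steps; // trace of the stages that ran
};

// Step 3: the per-platform entry point. Embedding clients call this
// directly; the static linker's main() calls it after parsing (3a).
LinkResult darwinLink(const DarwinLinkOptions &opts) {
  LinkResult result;
  if (opts.inputFiles.empty())
    return result; // nothing to link

  ResolverOptions resolverOpts; // 3b: derive ResolverOptions
  resolverOpts.deadStrip = opts.deadStrip;

  result.steps.push_back("resolve"); // 3c: stand-in for Core/Resolver
  if (resolverOpts.deadStrip)
    result.steps.push_back("pass:dead-strip"); // 3d: option-dependent pass
  // 3e: the writer reads platform-specific options from the same struct.
  result.steps.push_back(opts.dylib ? "write:dylib" : "write:executable");
  result.ok = true;
  return result;
}

// What main() would do first: turn argv into the options struct.
DarwinLinkOptions parseArgs(const std::vector<std::string> &args) {
  DarwinLinkOptions opts;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string &arg = args[i];
    if (arg == "-o" && i + 1 < args.size())
      opts.outputPath = args[++i];
    else if (arg == "-dead_strip")
      opts.deadStrip = true;
    else if (arg == "-dylib")
      opts.dylib = true;
    else if (arg.compare(0, 2, "-L") == 0)
      opts.searchPaths.push_back(arg.substr(2));
    else
      opts.inputFiles.push_back(arg);
  }
  return opts;
}
```

A command-line tool would call `darwinLink(parseArgs(args))`, while an embedding client would fill in a `DarwinLinkOptions` by hand, relying on the defaults for anything it does not set.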
On May 4, 2012, at 5:36 PM, Nick Kledzik <kledzik at apple.com> wrote:
> One design point I think is important is to encapsulate the parsed command line options into some class/struct/namespace. It should not be a bunch of global variables (like llvm::cl::opt<> leads to). If you want to embed a compiler and linker into one program, you don't want conflicting global variables.

+1 to this last point in particular. It's important to be able to serialize that state. For example, we'll want to be able to dump out a human-readable version to help debug problems where such a combined program fails to link and we want to reproduce the problem manually w/ a standalone linker. Being able to dump the intermediate files and a state description for the linker should allow us to generate a recipe for "run the linker w/ these options on these files and you'll see the same behavior the combined program is getting."

-Jim
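One way to picture the serialization point is a function that renders the encapsulated option state back into a command line. The C++ sketch below is an assumption-laden illustration, not a proposal for the real format: the `LinkOptions` struct and `toCommandLine` function are hypothetical, the standalone linker is assumed to be invoked as `ld`, and paths are emitted without any shell quoting.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical option state for an embedded link; fields are illustrative.
struct LinkOptions {
  std::vector<std::string> inputFiles; // e.g. dumped intermediate files
  std::vector<std::string> searchPaths;
  std::string outputPath = "a.out";
  bool deadStrip = false;
};

// Render the state as a human-readable recipe for a standalone linker.
// Note: no quoting is done, so paths containing spaces would need care.
std::string toCommandLine(const LinkOptions &opts) {
  std::ostringstream os;
  os << "ld -o " << opts.outputPath;
  if (opts.deadStrip)
    os << " -dead_strip";
  for (const std::string &path : opts.searchPaths)
    os << " -L" << path;
  for (const std::string &file : opts.inputFiles)
    os << ' ' << file;
  return os.str();
}
```

Because the options are a plain value rather than scattered globals, the combined program can call something like this at the moment a link fails and log the result next to the dumped intermediate files.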