Mehdi AMINI via llvm-dev
2021-Jul-02 18:57 UTC
[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)
On Fri, Jul 2, 2021 at 11:41 AM David Blaikie via llvm-dev < llvm-dev at lists.llvm.org> wrote:> On Fri, Jul 2, 2021 at 11:27 AM Fāng-ruì Sòng <maskray at google.com> wrote: > >> On Fri, Jul 2, 2021 at 11:17 AM David Blaikie <dblaikie at gmail.com> wrote: >> >>> On Fri, Jul 2, 2021 at 10:15 AM Fāng-ruì Sòng via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> llvm/tools/ include some binary utilities used as replacement for GNU >>>> binutils, e.g. llvm-objcopy, llvm-symbolizer, llvm-nm. >>>> In some old threads people discussed some drawbacks of using cl::opt >>>> for user-facing utilities (I cannot find them now). >>>> >>> >>> Would be good to describe some of the known drawbacks/expected benefits. >>> >> >> The summary is the list of benefits:) >> > > Ah, it looks more like a list of changes, not necessarily benefits - > removing certain syntaxes seems generally like a cost to me (potential to > break existing users), rather than an outright benefit. >Indeed: it isn't clear to me that these are outright "benefits".> > The API benefits sound nice, though presumably some could be retrofitted > to cl::opt if that was the only goal. Side benefits in addition to removing > global ctors are nice to have. > > >> The drawback is some initial boilerplate (e.g. llvm-tblgen >> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code). >> The handling of comma separated options -arch=x86_64,arm64 doesn't have >> direct OptTable support. llvm::SplitString is needed (just search for >> SplitString in https://reviews.llvm.org/D105330) >> But this doesn't tend to increase complexity because the >> cl::list<std::string> will need per-value verification anyway. >> >> >>> One potential one (though I don't recall it being discussed recently) >>> would be that maybe this addresses the issue of global ctors in cl::opt? >>> Does OptTable avoid/not use global constructors? That would be nice - it's >>> an ongoing issue that LLVM library users pay for command line argument >>> support they have no need for in the form of global ctor execution time. >>> >> >> OptTable is used as a local variable. So yes, it avoids global >> constructors, >> > > Nice :) >Note that MLIR is using cl::opt without global ctor (we build with `-Werror=global-constructors`). The pattern we use to write a tool with cl::opt and avoid global ctor (and can be used to avoid collision) looks like: https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83 The tool that wants to expose the MLIRContext options to the command line calls registerMLIRContextCLOptions() before parsing the command line. Wouldn't this translate directly to LLVM tools as well with some minor refactoring? The same applies to all of the infrastructure in MLIR, passes are registered explicitly, etc. This decouples the "is this code linked in" from "options are loaded" annoying part of the global constructors. -- Mehdi> > >> thus avoiding cl::opt option name collision. >> "If we decide to support binary utility multiplexing" below mentioned >> this point. >> >> Switching to OptTable is an appealing solution. I have prepared two >>>> patches for two binary utilities: llvm-nm and llvm-strings. >>>> >>>> * llvm-strings https://reviews.llvm.org/D104889 >>>> * llvm-nm https://reviews.llvm.org/D105330 >>>> >>>> llvm-symbolizer was switched last year. llvm-objdump was switched by >>>> thakis earlier this year. >>>> >>>> The switch can fix some corners with lib/Support/CommandLine.cpp. Here >>>> is a summary: >>>> >>>> * -t=d is removed (equal sign after a short option). Use -t d instead. >>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit the >>>> option or use --no-demangle instead. >>>> * To support boolean options (e.g. --demangle --no-demangle), we don't >>>> need to compare their positions (if (NoDemangle.getPosition() > >>>> Demangle.getPosition()) , see llvm-nm.cpp) >>>> * grouped short options can be specified with one line >>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every short >>>> options. >>>> * We don't need to add cl::cat to every option and call >>>> `HideUnrelatedOptions` to hide unrelated options from --help. The issue >>>> would happen with cl::opt tools if linker garbage collection is disabled or >>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363) >>>> * If we decide to support binary utility multiplexting ( >>>> https://reviews.llvm.org/D104686), we will not get conflicting >>>> options. An option may have different meanings in different utilities >>>> (especially for one-letter options). >>>> >>>> *I expect that most users will not observe any difference.* >>>> >>>> There is a related topic whether we should disallow the single-dash >>>> `-long-option` form. >>>> >>> (Discussed in 2019: >>>> https://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html >>>> Accept --long-option but not -long-option for llvm binary utilities) >>>> *I'd like to disallow -long-option but may want to do this in a >>>> separate change.* >>>> >>> >>> I'd say definitely do this as a separate change. I expect there'd be a >>> long tail of users after this change ships in an LLVM release, etc, such >>> that we may want to undo some amount of it a long time after the change is >>> made. >>> >> >> Thanks for chiming in. >> >> >>> >>>> The main point is that (1) grouped short options have syntax conflict >>>> with one-dash long options. (2) the GNU getopt_long style two-dash long >>>> option is much more popular. >>>> >>>> I can think of potential pushback for some Mach-O specific options, >>>> e.g. nm -arch >>>> http://www.manpagez.com/man/1/nm/osx-10.12.6.php says `-arch` has one >>>> dash. >>>> If such options may have problems, we can keep supporting one dash >>>> forms. >>>> With OptTable, allowing one-dash forms for a specific option is easy. >>>> >>>> _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/3c5555c8/attachment.html>
Alexandre Ganea via llvm-dev
2021-Jul-02 19:49 UTC
[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)
The API benefits sound nice, though presumably some could be retrofitted to cl::opt if that was the only goal. Side benefits in addition to removing global ctors are nice to have. The drawback is some initial boilerplate (e.g. llvm-tblgen -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code). The handling of comma separated options -arch=x86_64,arm64 doesn't have direct OptTable support. llvm::SplitString is needed (just search for SplitString in https://reviews.llvm.org/D105330) But this doesn't tend to increase complexity because the cl::list<std::string> will need per-value verification anyway. One potential one (though I don't recall it being discussed recently) would be that maybe this addresses the issue of global ctors in cl::opt? Does OptTable avoid/not use global constructors? That would be nice - it's an ongoing issue that LLVM library users pay for command line argument support they have no need for in the form of global ctor execution time. OptTable is used as a local variable. So yes, it avoids global constructors, Nice :) Note that MLIR is using cl::opt without global ctor (we build with `-Werror=global-constructors`). The pattern we use to write a tool with cl::opt and avoid global ctor (and can be used to avoid collision) looks like: https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83 The tool that wants to expose the MLIRContext options to the command line calls registerMLIRContextCLOptions() before parsing the command line. Wouldn't this translate directly to LLVM tools as well with some minor refactoring? The same applies to all of the infrastructure in MLIR, passes are registered explicitly, etc. This decouples the "is this code linked in" from "options are loaded" annoying part of the global constructors. -- Mehdi [Alexandre Ganea] I think one other issue with cl::opt is that it aggregates the “command-line argument definition” and the “runtime parameter” de facto in a single object (unless cl::location is manually specified to every cl::opt). What MLIR does solves the issue mentioned by David, the fact that every tool pulls/initializes every cl::opt out there. However OptTable solves both problems, and makes the entry point thread-safe. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/3960ea12/attachment.html>