Fāng-ruì Sòng via llvm-dev
2021-Jul-02 23:07 UTC
[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)
On Fri, Jul 2, 2021 at 3:26 PM Fāng-ruì Sòng <maskray at google.com> wrote:> On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote: > >> >> >> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com> wrote: >> >>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com> wrote: >>> >>>> >>>> >>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at google.com> >>>> wrote: >>>> >>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at gmail.com> >>>>> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea < >>>>>> alexandre.ganea at ubisoft.com> wrote: >>>>>> >>>>>>> The API benefits sound nice, though presumably some could be >>>>>>> retrofitted to cl::opt if that was the only goal. Side benefits in addition >>>>>>> to removing global ctors are nice to have. >>>>>>> >>>>>>> >>>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen >>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code). >>>>>>> >>>>>>> The handling of comma separated options -arch=x86_64,arm64 doesn't >>>>>>> have direct OptTable support. llvm::SplitString is needed (just search for >>>>>>> SplitString in https://reviews.llvm.org/D105330) >>>>>>> >>>>>>> But this doesn't tend to increase complexity because the >>>>>>> cl::list<std::string> will need per-value verification anyway. >>>>>>> >>>>>>> >>>>>>> >>>>>>> One potential one (though I don't recall it being discussed >>>>>>> recently) would be that maybe this addresses the issue of global ctors in >>>>>>> cl::opt? Does OptTable avoid/not use global constructors? That would be >>>>>>> nice - it's an ongoing issue that LLVM library users pay for command line >>>>>>> argument support they have no need for in the form of global ctor execution >>>>>>> time. >>>>>>> >>>>>>> >>>>>>> >>>>>>> OptTable is used as a local variable. So yes, it avoids global >>>>>>> constructors, >>>>>>> >>>>>>> >>>>>>> >>>>>>> Nice :) >>>>>>> >>>>>>> >>>>>>> >>>>>>> Note that MLIR is using cl::opt without global ctor (we build with >>>>>>> `-Werror=global-constructors`). >>>>>>> >>>>>>> >>>>>>> >>>>>>> The pattern we use to write a tool with cl::opt and avoid global >>>>>>> ctor (and can be used to avoid collision) looks like: >>>>>>> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83 >>>>>>> >>>>>>> >>>>>>> >>>>>>> The tool that wants to expose the MLIRContext options to the command >>>>>>> line calls registerMLIRContextCLOptions() before parsing the command line. >>>>>>> >>>>>>> Wouldn't this translate directly to LLVM tools as well with some >>>>>>> minor refactoring? >>>>>>> >>>>>>> >>>>>>> >>>>>>> The same applies to all of the infrastructure in MLIR, passes are >>>>>>> registered explicitly, etc. This decouples the "is this code linked in" >>>>>>> from "options are loaded" annoying part of the global constructors. >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Mehdi >>>>>>> >>>>>>> >>>>>>> >>>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is that it >>>>>>> aggregates the “command-line argument definition” and the “runtime >>>>>>> parameter” *de facto* in a single object (unless cl::location is >>>>>>> manually specified to every cl::opt). What MLIR does solves the issue >>>>>>> mentioned by David, the fact that every tool pulls/initializes every >>>>>>> cl::opt out there. However OptTable solves both problems, and makes the >>>>>>> entry point thread-safe. >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> I agree that removing the global state would be great! >>>>>> Right now what I see proposed with OptTable (like >>>>>> https://reviews.llvm.org/D104889) seems to just address the >>>>>> tools-specific options, and the value isn't clear to me for these cases, >>>>>> since these options aren't exposed through library entry points. >>>>>> I don't quite get right now how OptTable would compose at the LLVM >>>>>> scale? Are there examples of libraries exposing pluggable hooks for a tool >>>>>> to aggregate multiple libraries' options and expose them on the command >>>>>> line? >>>>>> >>>>>> >>>>> The first message listed: >>>>> >>>>> * -t=d is removed (equal sign after a short option). Use -t d instead. >>>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit the >>>>> option or use --no-demangle instead. >>>>> * To support boolean options (e.g. --demangle --no-demangle), we don't >>>>> need to compare their positions (if (NoDemangle.getPosition() > >>>>> Demangle.getPosition()) , see llvm-nm.cpp) >>>>> * grouped short options can be specified with one line >>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every short >>>>> options. >>>>> * We don't need to add cl::cat to every option and call >>>>> `HideUnrelatedOptions` to hide unrelated options from --help. The issue >>>>> would happen with cl::opt tools if linker garbage collection is disabled or >>>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363) >>>>> >>>>> To me *these points are all usability issues of cl::opt*. I care >>>>> about not exposing unnecessary interfaces so cl::opt accepting the weird >>>>> -t=d looks a downside to me. >>>>> >>>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt >>>>> options this way, so we cannot just remove this usage. As a workaround, we >>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0. >>>>> We would end with more customization for one option, cl::cat (for >>>>> hiding unrelated options), cl::foobar_toggle (for disallowing =0), and >>>>> potentially others for other ad-hoc tasks. >>>>> >>>>> I can highlight another thing about the global state of cl::opt => *library >>>>> cl::opt and binary utility cl::opt share the same namespace*. >>>>> So cl::opt options (usually for debugging or testing) in library code >>>>> can end up in a tool's list of command line options. >>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in >>>>> https://reviews.llvm.org/D100433). >>>>> People may not notice this if they always >>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage collection. >>>>> >>>> >>>> You're not answering my question here, are you? Are you answering to >>>> what I mentioned 3 emails before in an answer to David when I wrote >>>> "Indeed: it isn't clear to me that these are outright "benefits"."? >>>> >>> >>> OptTable doesn't have the listed usability issues. Not having the issues >>> is a large benefit to me. >>> >> >> Maybe, these are subjective though. >> (And the confusion above came from the fact that you just answered the >> wrong email) >> >> >>> >>> >>>> Because I still don't see clearly how to build something like `opt` >>>> with all the pass and the options with OptTable, how does it all compose? >>>> >>> >>> The proposed changes are specific to binary utilities: >>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}. >>> 'opt', 'llc', etc are not in the scope. >>> (I guess I should have named the utilities more specifically to not >>> cause confusion.) >>> >>> For >>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}, we >>> don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to be >>> accidentally specifiable on the command line. >>> (In the rare debugging case where such functionality is needed, it needs >>> a -mllvm prefix like what ld.lld does.) >>> >> >> Have you looked at what I mentioned with MLIR on how to use cl::opt >> without global constructor? This has exactly the behavior you're looking >> for. >> >> > I have read > https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83 > > Unless it does something more specially, I don't think it can avoid the > issue I mentioned in a previous message: > > > I can highlight another thing about the global state of cl::opt => *library > cl::opt and binary utility cl::opt share the same namespace*. > > This can be worked around with linker garbage collection by discarding > unreferenced cl::opt. >I re-read these messages. I think you probably meant something more generic - how to design decentralized command line option registry where every library can register some options. The binary utilities command line parsing issue I intended to address is fairly different: We want a single registry of all options and don't want to inherit options from llvm/lib/A just because the tool happens to depend on llvm/lib/A directly or indirectly. E.g. llvm-nm needs to depend on bitcode reader because it needs to handle LLVM bitcode files, however, I don't want a random cl::opt in bitcode reader to appear in llvm-nm's command line option list. So I just built mlir-opt and inspected its --help output. It has exactly the problem I called out in my first message: * We don't need to add cl::cat to every option and call `HideUnrelatedOptions` to hide unrelated options from --help. The issue would happen with cl::opt tools if linker garbage collection is disabled or libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363) There are many unrelated options like --amdgpu-bypass-slow-div which is from llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/9aa8866a/attachment.html>
Mehdi AMINI via llvm-dev
2021-Jul-02 23:13 UTC
[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)
Looks like our email crossed :) On Fri, Jul 2, 2021 at 4:07 PM Fāng-ruì Sòng <maskray at google.com> wrote:> On Fri, Jul 2, 2021 at 3:26 PM Fāng-ruì Sòng <maskray at google.com> wrote: > >> On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote: >> >>> >>> >>> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com> wrote: >>> >>>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at google.com> >>>>> wrote: >>>>> >>>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea < >>>>>>> alexandre.ganea at ubisoft.com> wrote: >>>>>>> >>>>>>>> The API benefits sound nice, though presumably some could be >>>>>>>> retrofitted to cl::opt if that was the only goal. Side benefits in addition >>>>>>>> to removing global ctors are nice to have. >>>>>>>> >>>>>>>> >>>>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen >>>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in code). >>>>>>>> >>>>>>>> The handling of comma separated options -arch=x86_64,arm64 doesn't >>>>>>>> have direct OptTable support. llvm::SplitString is needed (just search for >>>>>>>> SplitString in https://reviews.llvm.org/D105330) >>>>>>>> >>>>>>>> But this doesn't tend to increase complexity because the >>>>>>>> cl::list<std::string> will need per-value verification anyway. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> One potential one (though I don't recall it being discussed >>>>>>>> recently) would be that maybe this addresses the issue of global ctors in >>>>>>>> cl::opt? Does OptTable avoid/not use global constructors? That would be >>>>>>>> nice - it's an ongoing issue that LLVM library users pay for command line >>>>>>>> argument support they have no need for in the form of global ctor execution >>>>>>>> time. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> OptTable is used as a local variable. So yes, it avoids global >>>>>>>> constructors, >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Nice :) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Note that MLIR is using cl::opt without global ctor (we build with >>>>>>>> `-Werror=global-constructors`). >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The pattern we use to write a tool with cl::opt and avoid global >>>>>>>> ctor (and can be used to avoid collision) looks like: >>>>>>>> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The tool that wants to expose the MLIRContext options to the >>>>>>>> command line calls registerMLIRContextCLOptions() before parsing the >>>>>>>> command line. >>>>>>>> >>>>>>>> Wouldn't this translate directly to LLVM tools as well with some >>>>>>>> minor refactoring? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The same applies to all of the infrastructure in MLIR, passes are >>>>>>>> registered explicitly, etc. This decouples the "is this code linked in" >>>>>>>> from "options are loaded" annoying part of the global constructors. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Mehdi >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is that >>>>>>>> it aggregates the “command-line argument definition” and the “runtime >>>>>>>> parameter” *de facto* in a single object (unless cl::location is >>>>>>>> manually specified to every cl::opt). What MLIR does solves the issue >>>>>>>> mentioned by David, the fact that every tool pulls/initializes every >>>>>>>> cl::opt out there. However OptTable solves both problems, and makes the >>>>>>>> entry point thread-safe. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> I agree that removing the global state would be great! >>>>>>> Right now what I see proposed with OptTable (like >>>>>>> https://reviews.llvm.org/D104889) seems to just address the >>>>>>> tools-specific options, and the value isn't clear to me for these cases, >>>>>>> since these options aren't exposed through library entry points. >>>>>>> I don't quite get right now how OptTable would compose at the LLVM >>>>>>> scale? Are there examples of libraries exposing pluggable hooks for a tool >>>>>>> to aggregate multiple libraries' options and expose them on the command >>>>>>> line? >>>>>>> >>>>>>> >>>>>> The first message listed: >>>>>> >>>>>> * -t=d is removed (equal sign after a short option). Use -t d instead. >>>>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit the >>>>>> option or use --no-demangle instead. >>>>>> * To support boolean options (e.g. --demangle --no-demangle), we >>>>>> don't need to compare their positions (if (NoDemangle.getPosition() > >>>>>> Demangle.getPosition()) , see llvm-nm.cpp) >>>>>> * grouped short options can be specified with one line >>>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every short >>>>>> options. >>>>>> * We don't need to add cl::cat to every option and call >>>>>> `HideUnrelatedOptions` to hide unrelated options from --help. The issue >>>>>> would happen with cl::opt tools if linker garbage collection is disabled or >>>>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363) >>>>>> >>>>>> To me *these points are all usability issues of cl::opt*. I care >>>>>> about not exposing unnecessary interfaces so cl::opt accepting the weird >>>>>> -t=d looks a downside to me. >>>>>> >>>>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt >>>>>> options this way, so we cannot just remove this usage. As a workaround, we >>>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0. >>>>>> We would end with more customization for one option, cl::cat (for >>>>>> hiding unrelated options), cl::foobar_toggle (for disallowing =0), and >>>>>> potentially others for other ad-hoc tasks. >>>>>> >>>>>> I can highlight another thing about the global state of cl::opt => *library >>>>>> cl::opt and binary utility cl::opt share the same namespace*. >>>>>> So cl::opt options (usually for debugging or testing) in library code >>>>>> can end up in a tool's list of command line options. >>>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in >>>>>> https://reviews.llvm.org/D100433). >>>>>> People may not notice this if they always >>>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage collection. >>>>>> >>>>> >>>>> You're not answering my question here, are you? Are you answering to >>>>> what I mentioned 3 emails before in an answer to David when I wrote >>>>> "Indeed: it isn't clear to me that these are outright "benefits"."? >>>>> >>>> >>>> OptTable doesn't have the listed usability issues. Not having the >>>> issues is a large benefit to me. >>>> >>> >>> Maybe, these are subjective though. >>> (And the confusion above came from the fact that you just answered the >>> wrong email) >>> >>> >>>> >>>> >>>>> Because I still don't see clearly how to build something like `opt` >>>>> with all the pass and the options with OptTable, how does it all compose? >>>>> >>>> >>>> The proposed changes are specific to binary utilities: >>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}. >>>> 'opt', 'llc', etc are not in the scope. >>>> (I guess I should have named the utilities more specifically to not >>>> cause confusion.) >>>> >>>> For >>>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}, we >>>> don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to be >>>> accidentally specifiable on the command line. >>>> (In the rare debugging case where such functionality is needed, it >>>> needs a -mllvm prefix like what ld.lld does.) >>>> >>> >>> Have you looked at what I mentioned with MLIR on how to use cl::opt >>> without global constructor? This has exactly the behavior you're looking >>> for. >>> >>> >> I have read >> https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83 >> >> Unless it does something more specially, I don't think it can avoid the >> issue I mentioned in a previous message: >> >> > I can highlight another thing about the global state of cl::opt => *library >> cl::opt and binary utility cl::opt share the same namespace*. >> >> This can be worked around with linker garbage collection by discarding >> unreferenced cl::opt. >> > > I re-read these messages. I think you probably meant something more > generic - how to design decentralized command line option registry where > every library can register some options. >Yes indeed! I'm not against using OptTable for the sole purpose of binary tools, but that does not seem to provide us with a path forward. So I am afraid that this is a local optima that is short sighted.> The binary utilities command line parsing issue I intended to address is > fairly different: > We want a single registry of all options and don't want to inherit options > from llvm/lib/A just because the tool happens to depend on llvm/lib/A > directly or indirectly. >I agree, what I pointed at in MLIR was an attempt to achieve this, the tool has to explicitly call a function at runtime to make the options available.> E.g. llvm-nm needs to depend on bitcode reader because it needs to handle > LLVM bitcode files, however, I don't want a random cl::opt in bitcode > reader to appear in llvm-nm's command line option list. > > So I just built mlir-opt and inspected its --help output. It has exactly > the problem I called out in my first message: >> * We don't need to add cl::cat to every option and call > > `HideUnrelatedOptions` to hide unrelated options from --help. The issue > would happen with cl::opt tools if linker garbage collection is disabled or > libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363) > > There are many unrelated options like --amdgpu-bypass-slow-div which is from llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp > >I think you're missing the point here: mlir-opt does not have this problem *for mlir components.* It pays the price of the LLVM components using global constructors, but I pointed to the solution we use for mlir components that aren't using global constructors and avoid this problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/9ca63876/attachment.html>