thr3ads.net - llvm dev - [llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?) [Jul 2021]

If this information is useful, please help other people find it:
Share via:

Mehdi AMINI via llvm-dev

2021-Jul-02 22:21 UTC

[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com>
wrote:
> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at gmail.com>
wrote:
>
>>
>>
>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at
google.com> wrote:
>>
>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at
gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea <
>>>> alexandre.ganea at ubisoft.com> wrote:
>>>>
>>>>> The API benefits sound nice, though presumably some could
be
>>>>> retrofitted to cl::opt if that was the only goal. Side
benefits in addition
>>>>> to removing global ctors are nice to have.
>>>>>
>>>>>
>>>>> The drawback is some initial boilerplate (e.g. llvm-tblgen
>>>>> -gen-opt-parser-defs in CMakeLists.txt, class NmOptTable in
code).
>>>>>
>>>>> The handling of comma separated options -arch=x86_64,arm64
doesn't
>>>>> have direct OptTable support. llvm::SplitString is needed
(just search for
>>>>> SplitString in https://reviews.llvm.org/D105330)
>>>>>
>>>>> But this doesn't tend to increase complexity because
the
>>>>> cl::list<std::string> will need per-value
verification anyway.
>>>>>
>>>>>
>>>>>
>>>>> One potential one (though I don't recall it being
discussed recently)
>>>>> would be that maybe this addresses the issue of global
ctors in cl::opt?
>>>>> Does OptTable avoid/not use global constructors? That would
be nice - it's
>>>>> an ongoing issue that LLVM library users pay for command
line argument
>>>>> support they have no need for in the form of global ctor
execution time.
>>>>>
>>>>>
>>>>>
>>>>> OptTable is used as a local variable. So yes, it avoids
global
>>>>> constructors,
>>>>>
>>>>>
>>>>>
>>>>> Nice :)
>>>>>
>>>>>
>>>>>
>>>>> Note that MLIR is using cl::opt without global ctor (we
build with
>>>>> `-Werror=global-constructors`).
>>>>>
>>>>>
>>>>>
>>>>> The pattern we use to write a tool with cl::opt and avoid
global ctor
>>>>> (and can be used to avoid collision) looks like:
>>>>>
https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>>>>>
>>>>>
>>>>>
>>>>> The tool that wants to expose the MLIRContext options to
the command
>>>>> line calls registerMLIRContextCLOptions() before parsing
the command line.
>>>>>
>>>>> Wouldn't this translate directly to LLVM tools as well
with some minor
>>>>> refactoring?
>>>>>
>>>>>
>>>>>
>>>>> The same applies to all of the infrastructure in MLIR,
passes are
>>>>> registered explicitly, etc. This decouples the "is
this code linked in"
>>>>> from "options are loaded" annoying part of the
global constructors.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Mehdi
>>>>>
>>>>>
>>>>>
>>>>> *[Alexandre Ganea] *I think one other issue with cl::opt is
that it
>>>>> aggregates the “command-line argument definition” and the
“runtime
>>>>> parameter” *de facto* in a single object (unless
cl::location is
>>>>> manually specified to every cl::opt). What MLIR does solves
the issue
>>>>> mentioned by David, the fact that every tool
pulls/initializes every
>>>>> cl::opt out there. However OptTable solves both problems,
and makes the
>>>>> entry point thread-safe.
>>>>>
>>>>>
>>>>>
>>>>
>>>> I agree that removing the global state would be great!
>>>> Right now what I see proposed with OptTable (like
>>>> https://reviews.llvm.org/D104889) seems to just address the
>>>> tools-specific options, and the value isn't clear to me for
these cases,
>>>> since these options aren't exposed through library entry
points.
>>>> I don't quite get right now how OptTable would compose at
the LLVM
>>>> scale? Are there examples of libraries exposing pluggable hooks
for a tool
>>>> to aggregate multiple libraries' options and expose them on
the command
>>>> line?
>>>>
>>>>
>>> The first message listed:
>>>
>>> * -t=d is removed (equal sign after a short option). Use -t d
instead.
>>> * --demangle=0 (=0 to disable a boolean option) is removed. Omit
the
>>> option or use --no-demangle instead.
>>> * To support boolean options (e.g. --demangle --no-demangle), we
don't
>>> need to compare their positions (if (NoDemangle.getPosition() >
>>> Demangle.getPosition()) , see llvm-nm.cpp)
>>> * grouped short options can be specified with one line
>>> `setGroupedShortOptions`, instead of adding cl::Grouping to every
short
>>> options.
>>> * We don't need to add cl::cat to every option and call
>>> `HideUnrelatedOptions` to hide unrelated options from --help. The
issue
>>> would happen with cl::opt tools if linker garbage collection is
disabled or
>>> libLLVM-13git.so is used. (See https://reviews.llvm.org/D104363)
>>>
>>> To me *these points are all usability issues of cl::opt*. I care
about
>>> not exposing unnecessary interfaces so cl::opt accepting the weird
-t=d
>>> looks a downside to me.
>>>
>>> --demangle=0 is weird and some llvm/llvm/test tests do use cl::opt
>>> options this way, so we cannot just remove this usage. As a
workaround, we
>>> could add a cl::foobar_toggle to a cl::opt to disallow =0.
>>> We would end with more customization for one option, cl::cat (for
hiding
>>> unrelated options), cl::foobar_toggle (for disallowing =0), and
potentially
>>> others for other ad-hoc tasks.
>>>
>>> I can highlight another thing about the global state of cl::opt
=> *library
>>> cl::opt and binary utility cl::opt share the same namespace*.
>>> So cl::opt options (usually for debugging or testing) in library
code
>>> can end up in a tool's list of command line options.
>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax in
>>> https://reviews.llvm.org/D100433).
>>> People may not notice this if they always use
-DLLVM_LINK_LLVM_DYLIB=off
>>> and don't use linker garbage collection.
>>>
>>
>> You're not answering my question here, are you? Are you answering
to what
>> I mentioned 3 emails before in an answer to David when I wrote
"Indeed: it
>> isn't clear to me that these are outright
"benefits"."?
>>
>
> OptTable doesn't have the listed usability issues. Not having the
issues
> is a large benefit to me.
>
Maybe, these are subjective though.
(And the confusion above came from the fact that  you just answered the
wrong email)

>
>
>> Because I still don't see clearly how to build something like `opt`
with
>> all the pass and the options with OptTable, how does it all compose?
>>
>
> The proposed changes are specific to binary utilities:
> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}.
> 'opt', 'llc', etc are not in the scope.
> (I guess I should have named the utilities more specifically to not cause
> confusion.)
>
> For llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer},
> we don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO to
be
> accidentally specifiable on the command line.
> (In the rare debugging case where such functionality is needed, it needs a
> -mllvm prefix like what ld.lld does.)
>
Have you looked at what I mentioned with MLIR on how to use cl::opt without
global constructor? This has exactly the behavior you're looking for.


-- 
Mehdi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/ab37a411/attachment.html>

Fāng-ruì Sòng via llvm-dev

2021-Jul-02 22:26 UTC

head link

[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

On Fri, Jul 2, 2021 at 3:21 PM Mehdi AMINI <joker.eph at gmail.com> wrote:
>
>
> On Fri, Jul 2, 2021 at 2:09 PM Fāng-ruì Sòng <maskray at google.com>
wrote:
>
>> On Fri, Jul 2, 2021 at 2:03 PM Mehdi AMINI <joker.eph at
gmail.com> wrote:
>>
>>>
>>>
>>> On Fri, Jul 2, 2021 at 1:27 PM Fāng-ruì Sòng <maskray at
google.com> wrote:
>>>
>>>> On Fri, Jul 2, 2021 at 1:10 PM Mehdi AMINI <joker.eph at
gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 2, 2021 at 12:49 PM Alexandre Ganea <
>>>>> alexandre.ganea at ubisoft.com> wrote:
>>>>>
>>>>>> The API benefits sound nice, though presumably some
could be
>>>>>> retrofitted to cl::opt if that was the only goal. Side
benefits in addition
>>>>>> to removing global ctors are nice to have.
>>>>>>
>>>>>>
>>>>>> The drawback is some initial boilerplate (e.g.
llvm-tblgen
>>>>>> -gen-opt-parser-defs in CMakeLists.txt, class
NmOptTable in code).
>>>>>>
>>>>>> The handling of comma separated options
-arch=x86_64,arm64 doesn't
>>>>>> have direct OptTable support. llvm::SplitString is
needed (just search for
>>>>>> SplitString in https://reviews.llvm.org/D105330)
>>>>>>
>>>>>> But this doesn't tend to increase complexity
because the
>>>>>> cl::list<std::string> will need per-value
verification anyway.
>>>>>>
>>>>>>
>>>>>>
>>>>>> One potential one (though I don't recall it being
discussed recently)
>>>>>> would be that maybe this addresses the issue of global
ctors in cl::opt?
>>>>>> Does OptTable avoid/not use global constructors? That
would be nice - it's
>>>>>> an ongoing issue that LLVM library users pay for
command line argument
>>>>>> support they have no need for in the form of global
ctor execution time.
>>>>>>
>>>>>>
>>>>>>
>>>>>> OptTable is used as a local variable. So yes, it avoids
global
>>>>>> constructors,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nice :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Note that MLIR is using cl::opt without global ctor (we
build with
>>>>>> `-Werror=global-constructors`).
>>>>>>
>>>>>>
>>>>>>
>>>>>> The pattern we use to write a tool with cl::opt and
avoid global ctor
>>>>>> (and can be used to avoid collision) looks like:
>>>>>>
https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83
>>>>>>
>>>>>>
>>>>>>
>>>>>> The tool that wants to expose the MLIRContext options
to the command
>>>>>> line calls registerMLIRContextCLOptions() before
parsing the command line.
>>>>>>
>>>>>> Wouldn't this translate directly to LLVM tools as
well with some
>>>>>> minor refactoring?
>>>>>>
>>>>>>
>>>>>>
>>>>>> The same applies to all of the infrastructure in MLIR,
passes are
>>>>>> registered explicitly, etc. This decouples the "is
this code linked in"
>>>>>> from "options are loaded" annoying part of
the global constructors.
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Mehdi
>>>>>>
>>>>>>
>>>>>>
>>>>>> *[Alexandre Ganea] *I think one other issue with
cl::opt is that it
>>>>>> aggregates the “command-line argument definition” and
the “runtime
>>>>>> parameter” *de facto* in a single object (unless
cl::location is
>>>>>> manually specified to every cl::opt). What MLIR does
solves the issue
>>>>>> mentioned by David, the fact that every tool
pulls/initializes every
>>>>>> cl::opt out there. However OptTable solves both
problems, and makes the
>>>>>> entry point thread-safe.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> I agree that removing the global state would be great!
>>>>> Right now what I see proposed with OptTable (like
>>>>> https://reviews.llvm.org/D104889) seems to just address the
>>>>> tools-specific options, and the value isn't clear to me
for these cases,
>>>>> since these options aren't exposed through library
entry points.
>>>>> I don't quite get right now how OptTable would compose
at the LLVM
>>>>> scale? Are there examples of libraries exposing pluggable
hooks for a tool
>>>>> to aggregate multiple libraries' options and expose
them on the command
>>>>> line?
>>>>>
>>>>>
>>>> The first message listed:
>>>>
>>>> * -t=d is removed (equal sign after a short option). Use -t d
instead.
>>>> * --demangle=0 (=0 to disable a boolean option) is removed.
Omit the
>>>> option or use --no-demangle instead.
>>>> * To support boolean options (e.g. --demangle --no-demangle),
we don't
>>>> need to compare their positions (if (NoDemangle.getPosition()
>
>>>> Demangle.getPosition()) , see llvm-nm.cpp)
>>>> * grouped short options can be specified with one line
>>>> `setGroupedShortOptions`, instead of adding cl::Grouping to
every short
>>>> options.
>>>> * We don't need to add cl::cat to every option and call
>>>> `HideUnrelatedOptions` to hide unrelated options from --help.
The issue
>>>> would happen with cl::opt tools if linker garbage collection is
disabled or
>>>> libLLVM-13git.so is used. (See
https://reviews.llvm.org/D104363)
>>>>
>>>> To me *these points are all usability issues of cl::opt*. I
care about
>>>> not exposing unnecessary interfaces so cl::opt accepting the
weird -t=d
>>>> looks a downside to me.
>>>>
>>>> --demangle=0 is weird and some llvm/llvm/test tests do use
cl::opt
>>>> options this way, so we cannot just remove this usage. As a
workaround, we
>>>> could add a cl::foobar_toggle to a cl::opt to disallow =0.
>>>> We would end with more customization for one option, cl::cat
(for
>>>> hiding unrelated options), cl::foobar_toggle (for disallowing
=0), and
>>>> potentially others for other ad-hoc tasks.
>>>>
>>>> I can highlight another thing about the global state of cl::opt
=> *library
>>>> cl::opt and binary utility cl::opt share the same namespace*.
>>>> So cl::opt options (usually for debugging or testing) in
library code
>>>> can end up in a tool's list of command line options.
>>>> This is usually undesired (e.g. llvm-objdump --x86-asm-syntax
in
>>>> https://reviews.llvm.org/D100433).
>>>> People may not notice this if they always
>>>> use -DLLVM_LINK_LLVM_DYLIB=off and don't use linker garbage
collection.
>>>>
>>>
>>> You're not answering my question here, are you? Are you
answering to
>>> what I mentioned 3 emails before in an answer to David when I wrote
>>> "Indeed: it isn't clear to me that these are outright
"benefits"."?
>>>
>>
>> OptTable doesn't have the listed usability issues. Not having the
issues
>> is a large benefit to me.
>>
>
> Maybe, these are subjective though.
> (And the confusion above came from the fact that  you just answered the
> wrong email)
>
>
>>
>>
>>> Because I still don't see clearly how to build something like
`opt` with
>>> all the pass and the options with OptTable, how does it all
compose?
>>>
>>
>> The proposed changes are specific to binary utilities:
>> llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer}.
>> 'opt', 'llc', etc are not in the scope.
>> (I guess I should have named the utilities more specifically to not
cause
>> confusion.)
>>
>> For
llvm-{ar,cxxfilt,nm,objcopy,objdump,readobj,size,strings,symbolizer},
>> we don't want a random cl::opt in lib/Transforms, lib/MC, orlib/LTO
to be
>> accidentally specifiable on the command line.
>> (In the rare debugging case where such functionality is needed, it
needs
>> a -mllvm prefix like what ld.lld does.)
>>
>
> Have you looked at what I mentioned with MLIR on how to use cl::opt
> without global constructor? This has exactly the behavior you're
looking
> for.
>
>I have read
https://github.com/llvm/llvm-project/blob/main/mlir/lib/IR/MLIRContext.cpp#L57-L83

Unless it does something more specially, I don't think it can avoid the
issue I mentioned in a previous message:
> I can highlight another thing about the global state of cl::opt =>
*librarycl::opt and binary utility cl::opt share the same namespace*.

This can be worked around with linker garbage collection by discarding
unreferenced cl::opt.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210702/b62b9d0f/attachment-0001.html>

llvm dev - Jul 2021 - Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)

[llvm-dev] Binary utilities: switch command line parsing from llvm::cl to OptTable (byproduct: drop -long-option?)