thr3ads.net - llvm dev - [LLVMdev] [RFC] Removing static initializers for command line options [Aug 2014]

Chris Bieneman

2014-Aug-18 18:49 UTC

[LLVMdev] [RFC] Removing static initializers for command line options

Today command line arguments in LLVM are global variables. An example argument
from Scalarizer.cpp is:

static cl::opt<bool> ScalarizeLoadStore
  ("scalarize-load-store", cl::Hidden, cl::init(false),
   cl::desc("Allow the scalarizer pass to scalarize loads and store"));

This poses a problem for clients of LLVM that aren’t traditional compilers (e.g.
WebKit and Mesa). My proposal is to take a phased approach to addressing this
issue.

The first phase is to move the ownership of command line options to a singleton,
OptionRegistry. The OptionRegistry can be made to work with the existing global
command line definitions so that the changes to migrate options can be done in
small batches. The primary purpose of this change is to move the ownership of
the command line options out of the global scope, and to provide a vehicle for
threading them through the compiler. At the completion of this phase, all the
command line arguments will be constructed during LLVM initialization and
registered under the OptionRegistry. This will replace the hundreds of statically
initialized cl::opt objects with a single statically initialized OptionRegistry.

With this change options can be constructed during initialization. For the
example option above the pass initialization would get a line like:

cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
"scalarize-load-store", cl::Hidden, cl::init(false),
cl::desc("Allow the scalarizer pass to scalarize loads and store"));

Also the pass would add a boolean member to store the value of the option which
would be initialized in the pass’s constructor like this:

ScalarizeLoadStore =
cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");

These two operations need to occur at separate times due to object lifespans. At
the time when command lines are parsed passes have been initialized, but not
constructed. That means making options live in passes doesn’t really work, but
since we want the data to be part of the pass we need to initialize it during
construction.
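
A minimal sketch of the two-step lifecycle described above (register at pass initialization, read at pass construction). Everything here is a hypothetical stand-in and not LLVM's actual API: the `instance()` accessor, the `setFromFlag` hook that plays the role of the command line parser, the `std::variant` value type, and the `ScalarizerSketch` struct are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Hypothetical stand-in for the proposed registry; not LLVM's actual API.
class OptionRegistry {
public:
  using Value = std::variant<bool, int, std::string>;

  // The single remaining piece of static state: a function-local singleton.
  static OptionRegistry &instance() {
    static OptionRegistry Registry;
    return Registry;
  }

  // Step 1 (pass initialization): register the option under a key, with the
  // command line flag that sets it and its default value.
  template <typename T>
  void createOption(const std::string &Key, const std::string &Flag,
                    T Default) {
    Flags[Flag] = Key;
    Values[Key] = Value(Default);
  }

  // Step 2 (command line parsing): store a parsed value for a known flag.
  // Returns false if no option was registered for the flag.
  bool setFromFlag(const std::string &Flag, const Value &V) {
    auto It = Flags.find(Flag);
    if (It == Flags.end())
      return false;
    Values[It->second] = V;
    return true;
  }

  // Step 3 (pass construction): read the current value by key.
  template <typename T> T getValue(const std::string &Key) const {
    return std::get<T>(Values.at(Key));
  }

private:
  std::map<std::string, std::string> Flags; // flag name -> option key
  std::map<std::string, Value> Values;      // option key -> current value
};

// Runs during pass initialization, before any command line is parsed.
inline void initializeScalarizerOptions() {
  OptionRegistry::instance().createOption<bool>(
      "ScalarizeLoadStore", "scalarize-load-store", false);
}

// Constructed after parsing; copies the option value into a member.
struct ScalarizerSketch {
  bool ScalarizeLoadStore;
  ScalarizerSketch()
      : ScalarizeLoadStore(OptionRegistry::instance().getValue<bool>(
            "ScalarizeLoadStore")) {}
};
```

In this sketch a pass constructed before parsing sees the default, and a pass constructed afterwards sees the parsed value, which is why registration and reading have to be separate steps.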

A large part of this phase will be finding appropriate places for all the
command line options to be initialized, and identifying all the places where the
option data will need to be threaded through the compiler. One of the goals here
is to get rid of all global state in the compiler to (eventually) enable better
multi-threading by clients like WebKit.

The second phase is to split the OptionRegistry into two pieces. The first piece
is the parsing logic, and the second piece is the Option data store. The main
goal of this phase is to make the OptionRegistry represent everything needed to
parse command line options and to define a new second object, OptionStore, that
is populated with values by parsing the command line. The OptionRegistry will be
responsible for initializing “blank” option stores which can then be populated
by either the command line parser, or API calls.

The OptionRegistry should remain a singleton so that the parsing logic for all
options remains universally available. This is required to continue supporting
plugin loadable options.

The OptionStore should be created when a command line is parsed, or by an API
call in libraries, and can be passed through the pass manager and targets to
populate option data. The OptionStore should have a lifetime independent of
contexts and pass managers, because it can be re-used indiscriminately.

The core principle in this design is that the objects involved in parsing
options only need to exist once, but you need a full list of all options in
order to parse a command line. You should be able to have multiple copies of the
actual stored option data. Having multiple copies of the data store is one step
toward enabling two instances of LLVM in the same process to use optimization
passes with different options.
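
The registry/store split can be sketched as follows. This is only an illustration under stated assumptions: the method names (`registerFlag`, `createBlankStore`, `parse`), the restriction to boolean flags of the form `-name`, and the non-singleton registry (used here so the example is easy to test) are all hypothetical and not part of the proposal's concrete API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical value container: holds option data only, no parsing logic.
// Many of these can exist at once, one per compilation or client.
class OptionStore {
public:
  bool get(const std::string &Key) const { return Values.at(Key); }
  void set(const std::string &Key, bool V) { Values[Key] = V; }

private:
  std::map<std::string, bool> Values;
};

// Hypothetical parsing side: knows every flag and its default, owns no
// per-compilation values.
class OptionRegistry {
  struct Info {
    std::string Key;
    bool Default;
  };
  std::map<std::string, Info> Flags; // flag name -> key and default

public:
  void registerFlag(const std::string &Flag, const std::string &Key,
                    bool Default) {
    Flags[Flag] = Info{Key, Default};
  }

  // Creates a "blank" store in which every known option holds its default.
  OptionStore createBlankStore() const {
    OptionStore Store;
    for (const auto &Entry : Flags)
      Store.set(Entry.second.Key, Entry.second.Default);
    return Store;
  }

  // Fills Out from boolean flags of the form "-name".
  // Returns false on the first malformed or unknown argument.
  bool parse(const std::vector<std::string> &Args, OptionStore &Out) const {
    for (const std::string &Arg : Args) {
      if (Arg.empty() || Arg[0] != '-')
        return false;
      auto It = Flags.find(Arg.substr(1));
      if (It == Flags.end())
        return false;
      Out.set(It->second.Key, true);
    }
    return true;
  }
};
```

Because the store carries no parsing state, two stores built from the same registry can hold different values at the same time, which is the property the design relies on.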

I haven’t come up with a specific implementation proposal for this yet, but I do
have some rough ideas. The basic flow that I’m thinking of is for command line
parsing to create an object that maps option names to their values without any
of the parsing data involved. This would allow for either parsing multiple
command lines, or generally just constructing multiple option data stores.
**Here is where things get foggy because I haven’t yet looked too deep.** Once
you construct a data store it will get passed into the pass manager (and
everywhere else that needs it), and it will be used to initialize all the option
values.

There has been discussion about making the option store reside within the
context, but this doesn’t feel right because the biggest consumer of option data
is the passes, and you can use a single pass manager with multiple contexts.

Thanks,
-Chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cl_opt.diff
Type: application/octet-stream
Size: 12299 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140818/8693106d/attachment.obj>

Rafael Espíndola

2014-Aug-18 21:42 UTC

[LLVMdev] [RFC] Removing static initializers for command line options

On 18 August 2014 14:49, Chris Bieneman <beanz at apple.com> wrote:
> Today command line arguments in LLVM are global variables. An example
> argument from Scalarizer.cpp is:
>
> static cl::opt<bool> ScalarizeLoadStore
>   ("scalarize-load-store", cl::Hidden, cl::init(false),
>    cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>
>
> This poses a problem for clients of LLVM that aren’t traditional compilers
> (i.e. WebKit, and Mesa). My proposal is to take a phased approach at
> addressing this issue.
>
> The first phase is to move the ownership of command line options to a
> singleton, OptionRegistry. The OptionRegistry can be made to work with the
> existing global command line definitions so that the changes to migrate
> options can be done in small batches. The primary purpose of this change is
> to move the ownership of the command line options out of the global scope,
> and to provide a vehicle for threading them through the compiler. At the
> completion of this phase, all the command line arguments will be constructed
> during LLVM initialization and registered under the OptionRegistry. This
> will replace the 100’s of static initialized cl::opt objects with a single
> static initialized OptionRegistry.
>
> With this change options can be constructed during initialization. For the
> example option above the pass initialization would get a line like:
>
>
> cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
>   "scalarize-load-store", cl::Hidden, cl::init(false),
>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>
>
> Also the pass would add a boolean member to store the value of the option
> which would be initialized in the pass’s constructor like this:
>
> ScalarizeLoadStore =
>   cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");
>
For the first step it might be better to keep the option value as a
global. That way we only switch to using something like


static bool ScalarizeLoadStore;
cl::OptionRegistry::CreateOption<bool>(&ScalarizeLoadStore,
   "ScalarizeLoadStore",
   "scalarize-load-store", cl::Hidden, cl::init(false),
   cl::desc("Allow the scalarizer pass to scalarize loads and store"));

and everything else remains as is.
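
A minimal sketch of this external-storage variant, assuming a hypothetical registry that writes through a pointer into caller-owned storage. The `createOption` and `setFromFlag` names and signatures below are illustrative assumptions, not the proposed API. (The existing llvm::cl library offers a comparable mechanism through `cl::location` with external storage.)

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry that owns the option definition but not the value.
class OptionRegistry {
public:
  // Registers a boolean flag whose value lives in caller-owned storage and
  // writes the default into that storage immediately.
  void createOption(bool *Storage, const std::string &Flag, bool Default) {
    *Storage = Default;
    Locations[Flag] = Storage;
  }

  // Stands in for the command line parser: writes a parsed value through
  // the registered pointer. Returns false for an unknown flag.
  bool setFromFlag(const std::string &Flag, bool V) {
    auto It = Locations.find(Flag);
    if (It == Locations.end())
      return false;
    *It->second = V;
    return true;
  }

private:
  std::map<std::string, bool *> Locations; // flag name -> external storage
};

// The value itself stays global, so existing readers need no changes.
static bool ScalarizeLoadStore;
```

The trade-off is visible in the last line: code that reads `ScalarizeLoadStore` is untouched, but the value is still process-wide state shared by every client.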
> These two operations need to occur at separate times due to object
> lifespans. At the time when command lines are parsed passes have been
> initialized, but not constructed. That means making options live in passes
> doesn’t really work, but since we want the data to be part of the pass we
> need to initialize it during construction.
>
> A large part of this phase will be finding appropriate places for all the
> command line options to be initialized, and identifying all the places where
> the option data will need to be threaded through the compiler. One of the
> goals here is to get rid of all global state in the compiler to (eventually)
> enable better multi-threading by clients like WebKit.
>
> The second phase is to split the OptionRegistry into two pieces. The first
> piece is the parsing logic, and the second piece is the Option data store.
> The main goal of this phase is to make the OptionRegistry represent
> everything needed to parse command line options and to define a new second
> object, OptionStore, that is populated with values by parsing the command
> line. The OptionRegistry will be responsible for initializing “blank” option
> stores which can then be populated by either the command line parser, or API
> calls.
>
> The OptionRegistry should remain a singleton so that the parsing logic for
> all options remains universally available. This is required to continue
> supporting plugin loadable options.
>
> The OptionStore should be created when a command line is parsed, or by an
> API call in libraries, and can be passed through the pass manager and
> targets to populate option data. The OptionStore should have a lifetime
> independent of contexts, and pass managers because it can be re-used
> indiscriminately.
>
> The core principle in this design is that the objects involved in parsing
> options only need to exist once, but you need a full list of all options in
> order to parse a command line. You should be able to have multiple copies of
> the actual stored option data. Having multiple copies of the data store is
> one step toward enabling two instances of LLVM in the same process to use
> optimization passes with different options.
>
> I haven’t come up with a specific implementation proposal for this yet, but
> I do have some rough ideas. The basic flow that I’m thinking of is for
> command line parsing to create an object that maps option names to their
> values without any of the parsing data involved. This would allow for either
> parsing multiple command lines, or generally just constructing multiple
> option data stores. **Here is where things get foggy because I haven’t yet
> looked too deep.** Once you construct a data store it will get passed into
> the pass manager (and everywhere else that needs it), and it will be used to
> initialize all the option values.
>
> There has been discussion about making the option store reside within the
> context, but this doesn’t feel right because the biggest consumer of option
> data is the passes, and you can use a single pass manager with multiple
> contexts.
>
Some passes take options directly in the constructor. For example

Inliner::Inliner(char &ID, int Threshold, bool InsertLifetime)

Maybe we could just say that there are two different types of options:
the ones we want to expose to users and the ones we use for
testing LLVM itself. The options we want to expose should just be
constructor arguments. With that distinction we should be able to just
not use the options added by cl::OptionRegistry::CreateOption unless
cl::ParseCommandLineOptions is called. WebKit-like clients would just
not call cl::ParseCommandLineOptions. Would that work?

Cheers,
Rafael

Chris Bieneman

2014-Aug-18 21:56 UTC

[LLVMdev] [RFC] Removing static initializers for command line options

> On Aug 18, 2014, at 2:42 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>
> On 18 August 2014 14:49, Chris Bieneman <beanz at apple.com> wrote:
>> Today command line arguments in LLVM are global variables. An example
>> argument from Scalarizer.cpp is:
>>
>> static cl::opt<bool> ScalarizeLoadStore
>>  ("scalarize-load-store", cl::Hidden, cl::init(false),
>>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>>
>> This poses a problem for clients of LLVM that aren’t traditional compilers
>> (i.e. WebKit, and Mesa). My proposal is to take a phased approach at
>> addressing this issue.
>>
>> The first phase is to move the ownership of command line options to a
>> singleton, OptionRegistry. The OptionRegistry can be made to work with the
>> existing global command line definitions so that the changes to migrate
>> options can be done in small batches. The primary purpose of this change is
>> to move the ownership of the command line options out of the global scope,
>> and to provide a vehicle for threading them through the compiler. At the
>> completion of this phase, all the command line arguments will be constructed
>> during LLVM initialization and registered under the OptionRegistry. This
>> will replace the 100’s of static initialized cl::opt objects with a single
>> static initialized OptionRegistry.
>>
>> With this change options can be constructed during initialization. For the
>> example option above the pass initialization would get a line like:
>>
>> cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
>>  "scalarize-load-store", cl::Hidden, cl::init(false),
>>  cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>>
>> Also the pass would add a boolean member to store the value of the option
>> which would be initialized in the pass’s constructor like this:
>>
>> ScalarizeLoadStore =
>>  cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");
>>
>
> For the first step it might be better to keep the option value as a
> global. That way we only switch to using something like
>
> static bool ScalarizeLoadStore;
> cl::OptionRegistry::CreateOption<bool>(&ScalarizeLoadStore,
>   "ScalarizeLoadStore",
>   "scalarize-load-store", cl::Hidden, cl::init(false),
>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>
> and everything else remains as is.

I’d prefer to remove the global storage in the same step. The migration can
still proceed one pass at a time, but moving ownership and storage together
means I don’t need to revisit each pass as many times. If I instead do this in
two rounds (where the option first becomes owned by the registry, but the
storage remains global), I’ll need to revisit each pass again to get rid of the
global storage.

Keep in mind that one of the ultimate goals this is building toward is allowing
a single process to compile multiple programs at the same time with different
options.
> 
> Some passes take options directly in the constructor. For example
> 
> Inliner::Inliner(char &ID, int Threshold, bool InsertLifetime)
> 
> Maybe we could just say that there are two different types of options.
> The ones we want to expose to users and the ones which we use for
> testing llvm itself. The options we want to expose should be just
> constructor arguments. With that distinction we should be able to just
> not use the options added by  cl::OptionRegistry::CreateOption unless
> cl::ParseCommandLineOptions is called. WebKit like clients would just
> not call cl::ParseCommandLineOptions. Would that work?

This is actually how some of our internal clients already work. There are a few
caveats with this approach:

(1) You can’t allow the pass manager to allocate your passes for you, because
passes allocated that way only read from cl::opts.
(2) Not all of our passes have constructors for overriding their cl::opts (the
legacy Scalarizer is one that does not).

I think it would in general be cleaner to provide a way for library clients to
use cl::opts without being forced to parse a command line.

Thanks,
-Chris
> 
> Cheers,
> Rafael

Keno Fischer

2014-Aug-19 06:14 UTC

[LLVMdev] [RFC] Removing static initializers for command line options

I've recently experienced a similar pain with multiple
ExecutionEngines generating code (multiple LLVM-based JITs in the same
process). I don't have anything specific to add at this point, but I
wanted to express my wholehearted +1 on the project.


Sean Silva

2014-Aug-20 04:43 UTC

[LLVMdev] [RFC] Removing static initializers for command line options

One interesting issue with moving away from the current system of static
initializers for cl::opt is that we will no longer have the automatic
registration of all the options so that -help will print everything
available and generally we will not be able to issue an error for an
"unknown command line option" (without calling into any other code).

The auto-registration is fundamentally tied with the globalness and the
static initializers; pondering this has led me down an interesting path
that has made me understand better my suggestion in the other thread. As I
see it, there are two very different sorts of uses of llvm::cl in LLVM:

1. For regular command line processing. E.g. if a tool accepts an output
file, then we need something that will parse the argument from the command
line.

2. As a way to easily set up a conduit from A to B, where A is the command
line and B is some place "deep" inside the LLVM library code that will do
something in response to the command line.

(and, pending discussion, someday point A might include a proper
programmatic interface (i.e. in a way other than hijacking the command line
processing))

llvm::cl does a decent job for #1 and that is what it was designed for
AFAICT; these uses of llvm::cl live outside of library code and everything
is pretty happy, despite them being global and having static initializers.

The problem is that llvm::cl is not very well-suited to #2, yet it is used
for #2, and that is the real problem. We need a solution to problem #2
which does not use llvm::cl. Thus, I don't think that the solution you
propose here is the right direction.

The first step is to clearly differentiate between #1 and #2. I will say
"command line options" for #1 and "configuration/tweak points" for #2.
(maybe "library options" is better for #2; neither is perfect terminology)

The strawman I suggested in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a
stab at #2. There is no way to dodge being stringly typed since command
lines are stringly typed, so really it is just a question of how long a
solution stays stringly typed.

My thought process for staying stringly typed "the whole time" (possibly
with some caching) comes from these two desires:
- adding a c/t point should require adding just one call into the c/t
machinery (this is both for convenience and for DRY/SPOT), and
- this change should be localized to the code being configured/tweaked

This is the thought process:

Note that llvm::cl is stringly typed until it parses the options. llvm::cl
gives the appearance of a typed interface because it uses static
initialization as a backdoor to globally transport the knowledge of the
expected type to the option parsing machinery (very early in the program
lifetime). Without this backdoor, we need to stay stringly typed longer, at
least until we reach the "localized" place where the single call into the
c/t machinery is made; this single call is the only place that has the type
information needed for the c/t value to become properly typed. But there is
no way to know how long it will be until we reach that point (or even *if*
we reach that point; consider passes that are not run on this invocation).

Hence my suggestion of just putting a stringly typed key-value store (or
whatever) in an easily accessible place (like LLVMContext), and just
translating any unrecognized command line options (ones that are not for
#1) into that stringly typed storage.
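
A minimal sketch of such a stringly typed store, where the value only becomes typed at the single call site that reads it. The `TweakStore` class, its `getBool`/`getInt` accessors, and the `stashUnrecognized` helper are hypothetical names invented for illustration; the sketch also assumes arguments of the form `-key` or `-key=value`, and the key names are only examples.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stringly typed store, e.g. reachable from a context object.
class TweakStore {
public:
  void set(const std::string &Key, const std::string &Raw) {
    RawValues[Key] = Raw;
  }

  // The type is supplied here, at the point of use, together with the
  // default. A bare flag (empty raw value) counts as true.
  bool getBool(const std::string &Key, bool Default) const {
    auto It = RawValues.find(Key);
    if (It == RawValues.end())
      return Default;
    return It->second.empty() || It->second == "true" || It->second == "1";
  }

  // std::stoi throws if the stored text is not a number.
  int getInt(const std::string &Key, int Default) const {
    auto It = RawValues.find(Key);
    if (It == RawValues.end())
      return Default;
    return std::stoi(It->second);
  }

private:
  std::map<std::string, std::string> RawValues; // key -> unparsed text
};

// Stores an argument the tool itself did not recognize, as raw text.
// Accepts "-key" and "-key=value"; leading dashes are stripped.
inline void stashUnrecognized(const std::string &Arg, TweakStore &Store) {
  std::size_t Start = Arg.find_first_not_of('-');
  if (Start == std::string::npos)
    return; // nothing but dashes
  std::size_t Eq = Arg.find('=', Start);
  if (Eq == std::string::npos)
    Store.set(Arg.substr(Start), "");
  else
    Store.set(Arg.substr(Start, Eq - Start), Arg.substr(Eq + 1));
}
```

Note that this sketch cannot reject a misspelled key: an unknown name is stored silently and never read, which mirrors the loss of "unknown command line option" errors mentioned at the top of this message.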

I agree with Rafael that "constructor arguments to passes" are not c/t
points. In the future, there might be some way to integrate the two (from
the referenced post, you can probably tell that I kind of like the idea of
doing so), but for now, I think the clear incremental step is to attack #2
and solve it without llvm::cl. I have suggested a way to do this that I
think makes sense.

-- Sean Silva

On Mon, Aug 18, 2014 at 11:49 AM, Chris Bieneman <beanz at apple.com>
wrote:
> Today command line arguments in LLVM are global variables. An example
> argument from Scalarizer.cpp is:
>
> static cl::opt<bool> ScalarizeLoadStore
>   ("scalarize-load-store", cl::Hidden, cl::init(false),
>    cl::desc("Allow the scalarizer pass to scalarize loads and
store"));
>
>
> This poses a problem for clients of LLVM that aren’t traditional compilers
> (i.e. WebKit, and Mesa). My proposal is to take a phased approach at
> addressing this issue.
>
> The first phase is to move the ownership of command line options to a
> singleton, OptionRegistry. The OptionRegistry can be made to work with the
> existing global command line definitions so that the changes to migrate
> options can be done in small batches. The primary purpose of this change is
> to move the ownership of the command line options out of the global scope,
> and to provide a vehicle for threading them through the compiler. At the
> completion of this phase, all the command line arguments will be
> constructed during LLVM initialization and registered under the
> OptionRegistry. This will replace the 100’s of static initialized cl::opt
> objects with a single static initialized OptionRegistry.
>
> With this change options can be constructed during initialization. For the
> example option above the pass initialization would get a line like:
>
>
cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
>   "scalarize-load-store", cl::Hidden, cl::init(false),
>   cl::desc("Allow the scalarizer pass to scalarize loads and
store"));
>
>
> Also the pass would add a boolean member to store the value of the option
> which would be initialized in the pass’s constructor like this:
>
> ScalarizeLoadStore >
cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");
>
>
> These two operations need to occur at separate times due to object
> lifespans. At the time when command lines are parsed passes have been
> initialized, but not constructed. That means making options live in passes
> doesn’t really work, but since we want the data to be part of the pass we
> need to initialize it during construction.
>
> A large part of this phase will be finding appropriate places for all the
> command line options to be initialized, and identifying all the places
> where the option data will need to be threaded through the compiler. One of
> the goals here is to get rid of all global state in the compiler to
> (eventually) enable better multi-threading by clients like WebKit.
>
> The second phase is to split the OptionRegistry into two pieces. The first
> piece is the parsing logic, and the second piece is the Option data store.
> The main goal of this phase is to make the OptionRegistry represent
> everything needed to parse command line options and to define a new second
> object, OptionStore, that is populated with values by parsing the command
> line. The OptionRegistry will be responsible for initializing “blank”
> option stores which can then be populated by either the command line
> parser, or API calls.
>
> The OptionRegistry should remain a singleton so that the parsing logic for
> all options remains universally available. This is required to continue
> supporting plugin loadable options.
>
> The OptionStore should be created when a command line is parsed, or by an
> API call in libraries, and can be passed through the pass manager and
> targets to populate option data. The OptionStore should have a lifetime
> independent of contexts, and pass managers because it can be re-used
> indiscriminately.
>
> The core principle in this design is that the objects involved in parsing
> options only need to exist once, but you need a full list of all options in
> order to parse a command line. You should be able to have multiple copies
> of the actual stored option data. Having multiple copies of the data store
> is one step toward enabling two instances of LLVM in the same process to
> use optimization passes with different options.
>
> I haven’t come up with a specific implementation proposal for this yet,
> but I do have some rough ideas. The basic flow that I’m thinking of is for
> command line parsing to create an object that maps option names to their
> values without any of the parsing data involved. This would allow for
> either parsing multiple command lines, or generally just constructing
> multiple option data stores. **Here is where things get foggy because I
> haven’t yet looked too deep.** Once you construct a data store it will get
> passed into the pass manager (and everywhere else that needs it), and it
> will be used to initialize all the option values.
>
> There has been discussion about making the option store reside within the
> context, but this doesn’t feel right because the biggest consumer of option
> data is the passes, and you can use a single pass manager with multiple
> contexts.
>
> Thanks,
> -Chris
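
A minimal sketch of the two steps quoted above: create the option during pass initialization, then read it during pass construction. The OptionRegistry below is a hypothetical, bool-only stand-in written for illustration. It is not an existing LLVM class, and its method names and signatures are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, bool-only stand-in for the proposed cl::OptionRegistry.
class OptionRegistry {
public:
  // The single remaining static object: a function-local singleton.
  static OptionRegistry &instance() {
    static OptionRegistry R;
    return R;
  }

  // Step 1, pass initialization: declare the option and its default.
  void createOption(const std::string &Key, const std::string &Flag,
                    bool Default) {
    KeyForFlag[Flag] = Key;
    Values[Key] = Default;
  }

  // Step 2, command line parsing: returns false for an unregistered flag.
  bool set(const std::string &Flag, bool Value) {
    auto It = KeyForFlag.find(Flag);
    if (It == KeyForFlag.end())
      return false;
    Values[It->second] = Value;
    return true;
  }

  // Step 3, pass construction: read the value that parsing left behind.
  bool getValue(const std::string &Key) const {
    auto It = Values.find(Key);
    assert(It != Values.end() && "option was never created");
    return It->second;
  }

private:
  std::map<std::string, std::string> KeyForFlag;
  std::map<std::string, bool> Values;
};

// Runs during pass initialization, before the command line is parsed.
void initializeScalarizerOptions() {
  OptionRegistry::instance().createOption("ScalarizeLoadStore",
                                          "scalarize-load-store", false);
}

// The pass copies the option into a member when it is constructed.
struct Scalarizer {
  bool ScalarizeLoadStore;
  Scalarizer()
      : ScalarizeLoadStore(
            OptionRegistry::instance().getValue("ScalarizeLoadStore")) {}
};
```

The ordering is the point of the sketch: initializeScalarizerOptions() must run before parsing, and parsing must finish before any Scalarizer is constructed, otherwise the pass would read a default or trip the assertion.
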

Sean Silva

2014-Aug-20 04:52 UTC


[LLVMdev] [RFC] Removing static initializers for command line options

To be clear: I agree with Rafael that we need to tread very carefully about
how we expose this machinery in the C API, if we expose it at all. My
suggestion is completely orthogonal to this though; all I'm talking about
is how to avoid the static constructors and global state caused by the
cl::opt's in library code, which as I understand it is the motivation for
the OP.

-- Sean Silva


On Tue, Aug 19, 2014 at 9:43 PM, Sean Silva <chisophugis at gmail.com> wrote:
> One interesting issue with moving away from the current system of static
> initializers for cl::opt is that we will no longer have the automatic
> registration of all the options so that -help will print everything
> available and generally we will not be able to issue an error for an
> "unknown command line option" (without calling into any other code).
>
> The auto-registration is fundamentally tied with the globalness and the
> static initializers; pondering this has led me down an interesting path
> that has made me understand better my suggestion in the other thread. As I
> see it, there are two very different sorts of uses of llvm::cl in LLVM:
>
> 1. For regular command line processing. E.g. if a tool accepts an output
> file, then we need something that will parse the argument from the command
> line.
>
> 2. As a way to easily set up a conduit from A to B, where A is the command
> line and B is some place "deep" inside the LLVM library code that will do
> something in response to the command line.
>
> (and, pending discussion, someday point A might include a proper
> programmatic interface (i.e. in a way other than hijacking the command line
> processing))
>
> llvm::cl does a decent job for #1 and that is what it was designed for
> AFAICT; these uses of llvm::cl live outside of library code and everything
> is pretty happy, despite them being global and having static initializers.
>
> The problem is that llvm::cl is not very well-suited to #2, yet it is used
> for #2, and that is the real problem. We need a solution to problem #2
> which does not use llvm::cl. Thus, I don't think that the solution you
> propose here is the right direction.
>
> The first step is to clearly differentiate between #1 and #2. I will say
> "command line options" for #1 and "configuration/tweak points" for #2.
> (maybe "library options" is better for #2; neither is perfect terminology)
>
> The strawman I suggested in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a
> stab at #2. There is no way to dodge being stringly typed since command
> lines are stringly typed, so really it is just a question of how long a
> solution stays stringly typed.
>
> My thought process for staying stringly typed "the whole time" (possibly
> with some caching) comes from these two desires:
> - adding a c/t point should require adding just one call into the c/t
> machinery (this is both for convenience and for DRY/SPOT), and
> - this change should be localized to the code being configured/tweaked
> This is the thought process:
>
> Note that llvm::cl is stringly typed until it parses the options. llvm::cl
> gives the appearance of a typed interface because it uses static
> initialization as a backdoor to globally transport the knowledge of the
> expected type to the option parsing machinery (very early in the program
> lifetime). Without this backdoor, we need to stay stringly typed longer, at
> least until we reach the "localized" place where the single call into the
> c/t machinery is made; this single call is the only place that has the type
> information needed for the c/t value to become properly typed. But there is
> no way to know how long it will be until we reach that point (or even *if*
> we reach that point; consider passes that are not run on this invocation).
>
> Hence my suggestion of just putting a stringly typed key-value store (or
> whatever) in an easily accessible place (like LLVMContext), and just
> translating any unrecognized command line options (ones that are not for
> #1) into that stringly typed storage.
>
> I agree with Rafael that "constructor arguments to passes" are not c/t
> points. In the future, there might be some way to integrate the two (from
> the referenced post, you can probably tell that I kind of like the idea of
> doing so), but for now, I think the clear incremental step is to attack #2
> and solve it without llvm::cl. I have suggested a way to do this that I
> think makes sense.
>
> -- Sean Silva

Renato Golin

2014-Aug-20 05:28 UTC


[LLVMdev] [RFC] Removing static initializers for command line options

On 20 August 2014 05:43, Sean Silva <chisophugis at gmail.com> wrote:
> 2. As a way to easily set up a conduit from A to B, where A is the command
> line and B is some place "deep" inside the LLVM library code that will do
> something in response to the command line.

I never liked this as a solution to #2. Heck, I never liked that we do
have #2 in the first place.

> Hence my suggestion of just putting a stringly typed key-value store (or
> whatever) in an easily accessible place (like LLVMContext), and just
> translating any unrecognized command line options (ones that are not for
> #1) into that stringly typed storage.

I fully support this idea, and it is in line with my strawman proposal
for FPU/CPU parsing shared by all tools (beyond the boundaries of
LLVM).

String parsing is common and, even when target specific or pass
specific, it is still string parsing and should be identical for all
users of the *same* feature.

I also believe a more sane command line option scheme is in order.
Today we have a zillion options that are completely disconnected,
documented by accident in the initializer, without any bigger context
whatsoever. This is in part to follow what GCC has always done, and
we'll probably still need to support GCC's and our own legacy for
decades, but all that can also live in this commoned-up parser.

Specifically to #2, my idea was something like:

--vectorizer-opts="foo,bar,baz=10" --tbaa-opts="...", etc.

That could use a common parser all the way down to parsing "foo", which
would be left to the vectorizer's parser back-end to deal with and set up
the right flags in the right structure. That back-end is used because
"vectorizer" in "vectorizer-opts" tells the factory to return the
vectorizer parser's back-end.
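
One possible shape for the common part of such a parser, as a sketch: split "--<group>-opts=<list>" into a group name and a payload, then split the payload into key/value pairs for the group's back-end to interpret. The function names, and the choice to store a bare name such as "foo" with an empty value, are assumptions for illustration and not part of any existing LLVM interface.

```cpp
#include <map>
#include <sstream>
#include <string>

// Recognizes "--<group>-opts=<payload>" and extracts both parts.
// Returns false if the argument does not have that shape.
bool splitGroupFlag(const std::string &Arg, std::string &Group,
                    std::string &Payload) {
  const std::string Prefix = "--";
  const std::string Suffix = "-opts=";
  std::string::size_type Pos = Arg.find(Suffix);
  if (Arg.compare(0, Prefix.size(), Prefix) != 0 ||
      Pos == std::string::npos || Pos <= Prefix.size())
    return false;
  Group = Arg.substr(Prefix.size(), Pos - Prefix.size());
  Payload = Arg.substr(Pos + Suffix.size());
  return true;
}

// Splits "foo,bar,baz=10" into {foo: "", bar: "", baz: "10"}.
// A bare name (no '=') is stored with an empty value, meaning "present".
std::map<std::string, std::string> parseGroupedOpts(const std::string &Text) {
  std::map<std::string, std::string> Result;
  std::istringstream IS(Text);
  std::string Item;
  while (std::getline(IS, Item, ',')) {
    if (Item.empty())
      continue; // tolerate "a,,b" and trailing commas
    std::string::size_type Eq = Item.find('=');
    if (Eq == std::string::npos)
      Result[Item] = "";
    else
      Result[Item.substr(0, Eq)] = Item.substr(Eq + 1);
  }
  return Result;
}
```

The group name ("vectorizer" here) is what a factory would use to pick the back-end. The resulting map is all that back-end would need to see.
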

The FPU/CPU parsing (which has to parse command line options and
assembly directives, which happen to have the same syntax) would
have a similar structure.

Such flags in LLVMContext (or whatever) would have to be structured
like a tree, and each pass should receive its own tree root, which most
of the time would have just a list of key/values, but sometimes would
have a more nested structure.
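
As an illustration of that tree shape, here is a small sketch in which each node carries a flat key/value list plus optional named children, and a pass is handed only its own subtree. The OptionNode type and its members are hypothetical names chosen for this example.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical option tree: flat key/values at each node, plus named
// children for the cases that need a more nested structure.
struct OptionNode {
  std::map<std::string, std::string> Values;
  std::map<std::string, std::unique_ptr<OptionNode>> Children;

  // Returns the child with this name, creating it on first use.
  OptionNode &child(const std::string &Name) {
    std::unique_ptr<OptionNode> &Slot = Children[Name];
    if (!Slot)
      Slot.reset(new OptionNode());
    return *Slot;
  }
};
```

With a root node owned by the context (or wherever the store ends up living), Root.child("vectorizer") would be the subtree handed to the vectorizer, which never sees the keys of other passes.
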

cheers,
--renato

Pete Cooper

2014-Aug-20 06:12 UTC


[LLVMdev] [RFC] Removing static initializers for command line options

> On Aug 19, 2014, at 9:43 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> One interesting issue with moving away from the current system of static
> initializers for cl::opt is that we will no longer have the automatic
> registration of all the options so that -help will print everything
> available and generally we will not be able to issue an error for an
> "unknown command line option" (without calling into any other code).

Not automatic, no, but in the proposal Chris puts the addOption call inside
the pass initializer, which is called before ParseCommandLineOptions.  This
means you’ll still get options listed as you currently do, so long as you
continue to call the pass initializers before parsing (something you have to
do anyway to get the pass name visible to the command line).

> The auto-registration is fundamentally tied with the globalness and the
> static initializers; pondering this has led me down an interesting path
> that has made me understand better my suggestion in the other thread. As I
> see it, there are two very different sorts of uses of llvm::cl in LLVM:
>
> 1. For regular command line processing. E.g. if a tool accepts an output
> file, then we need something that will parse the argument from the command
> line.
>
> 2. As a way to easily set up a conduit from A to B, where A is the command
> line and B is some place "deep" inside the LLVM library code that will do
> something in response to the command line.
>
> (and, pending discussion, someday point A might include a proper
> programmatic interface (i.e. in a way other than hijacking the command line
> processing))

That would be nice.  I just suggested in another thread that we expose
ParseCommandLineOptions to the C API to hack around this, but a nice clean
interface would of course be better.

> llvm::cl does a decent job for #1 and that is what it was designed for
> AFAICT; these uses of llvm::cl live outside of library code and everything
> is pretty happy, despite them being global and having static initializers.
>
> The problem is that llvm::cl is not very well-suited to #2, yet it is used
> for #2, and that is the real problem. We need a solution to problem #2
> which does not use llvm::cl. Thus, I don't think that the solution you
> propose here is the right direction.
>
> The first step is to clearly differentiate between #1 and #2. I will say
> "command line options" for #1 and "configuration/tweak points" for #2.
> (maybe "library options" is better for #2; neither is perfect terminology)
>
> The strawman I suggested in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a
> stab at #2. There is no way to dodge being stringly typed since command
> lines are stringly typed, so really it is just a question of how long a
> solution stays stringly typed.
>
> My thought process for staying stringly typed "the whole time" (possibly
> with some caching) comes from these two desires:
> - adding a c/t point should require adding just one call into the c/t
> machinery (this is both for convenience and for DRY/SPOT), and

Right.  The current point is in the pass initializer.

There is another point in the pass constructor to read the option value.  This
is the only point at which something will change from being a string to its
actual type and value.

> - this change should be localized to the code being configured/tweaked
> This is the thought process:
>
> Note that llvm::cl is stringly typed until it parses the options. llvm::cl
> gives the appearance of a typed interface because it uses static
> initialization as a backdoor to globally transport the knowledge of the
> expected type to the option parsing machinery (very early in the program
> lifetime). Without this backdoor, we need to stay stringly typed longer, at
> least until we reach the "localized" place where the single call into the
> c/t machinery is made; this single call is the only place that has the type
> information needed for the c/t value to become properly typed. But there is
> no way to know how long it will be until we reach that point (or even *if*
> we reach that point; consider passes that are not run on this invocation).

The current proposal exposes the type in addOption (as well as later when we
get the option), so the type continues to be known to the command line parser.
Whether you want to actually type check in the command line is a point I’m
open to discussing.  Personally, I want a command line option to be type
checked because it was registered, even if no one actually gets the value of
the option later.

> Hence my suggestion of just putting a stringly typed key-value store (or
> whatever) in an easily accessible place (like LLVMContext), and just
> translating any unrecognized command line options (ones that are not for
> #1) into that stringly typed storage.

I’m against it being in the context because you may want to set up and reuse
passes multiple times with the same options, and use that configuration to
compile multiple LLVMContexts.  But I do agree that having a store with some
lifetime is useful.

I think the current proposal is to have the store be a singleton, but there’s
nothing to stop further work to have the storage for options be one per thread,
for example.  If you wanted to have one pass manager per thread with its own set
of passes, configured (currently) via their own call to ParseCommandLineOptions,
then that would be possible with little work beyond the current proposal.

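
A rough sketch of how the stored values could be separated from the option declarations, so that each pass manager (or thread) holds its own store. Both classes are hypothetical and keep values as plain strings for brevity. The thread had not settled on an actual design at this point, so this only illustrates the ownership split.

```cpp
#include <map>
#include <string>

// Hypothetical: values only. Cheap to copy, one per pass manager or thread.
class OptionStore {
public:
  void set(const std::string &Key, const std::string &Value) {
    Values[Key] = Value;
  }

  // Returns the stored text, or an empty string for an unknown key.
  std::string get(const std::string &Key) const {
    auto It = Values.find(Key);
    return It == Values.end() ? std::string() : It->second;
  }

private:
  std::map<std::string, std::string> Values;
};

// Hypothetical: declarations only (names and defaults), shared by everyone.
class OptionRegistry {
public:
  void declare(const std::string &Key, const std::string &Default) {
    Defaults[Key] = Default;
  }

  // A "blank" store starts out holding every declared default.
  OptionStore makeBlankStore() const {
    OptionStore S;
    for (const auto &KV : Defaults)
      S.set(KV.first, KV.second);
    return S;
  }

private:
  std::map<std::string, std::string> Defaults;
};
```

Two stores made from the same registry can then diverge: setting a key in one leaves the other at its default, which is the property needed for two pass managers to run with different options in one process.
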

> I agree with Rafael that "constructor arguments to passes" are not c/t
> points. In the future, there might be some way to integrate the two (from
> the referenced post, you can probably tell that I kind of like the idea of
> doing so), but for now, I think the clear incremental step is to attack #2
> and solve it without llvm::cl. I have suggested a way to do this that I
> think makes sense.

If you change the current proposal so that it doesn’t read cl::opt, then this
reads to me like what is being proposed now.  Really it’s creating a
string->string map with addOption, and getting the values with getOption.
The passes don’t care (or know) whether the options are set via the command line
or any other API.  I hope I’ve understood your proposal correctly here.  Please
correct me otherwise.
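
To make the string->string reading concrete, here is a minimal sketch of a store that stays stringly typed until the single call at the point of use, which supplies both the expected type and the default. TweakStore, its methods, and the keys used with it are hypothetical. The fallback-to-default behaviour on a failed parse is one possible policy, not something the thread agreed on.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical stringly typed store for configuration/tweak points.
class TweakStore {
public:
  // Command line parsing (or any other API) records raw text here.
  void set(const std::string &Key, const std::string &Text) {
    Raw[Key] = Text;
  }

  // The one call at the point of use: converts the text to T.
  // Returns Default if the key is absent or the text does not parse as T.
  // Booleans are spelled "true" / "false".
  template <typename T>
  T get(const std::string &Key, T Default) const {
    auto It = Raw.find(Key);
    if (It == Raw.end())
      return Default;
    std::istringstream IS(It->second);
    T Parsed{};
    IS >> std::boolalpha >> Parsed;
    return IS.fail() ? Default : Parsed;
  }

private:
  std::map<std::string, std::string> Raw;
};
```

A pass constructor would call something like Store.get<bool>("scalarize-load-store", false). Type checking only happens if that call is reached, which is the trade-off discussed above: a registry that knows the types up front can reject bad values at parse time, while this store cannot.
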

Thanks,
Pete
