Justin Holewinski
2013-Feb-13 19:37 UTC
[LLVMdev] Overhauling the command-line processing/codegen options code
Is anyone currently working on overhauling the command-line processing code? We're currently having some design issues with this component, and I'd like to start a larger conversation on it. For context, I am talking from an "LLVM as a library" perspective instead of an "LLVM as a set of tools" perspective.

In a nut-shell, the problems we are facing are as follows:

1. As per bug 11944, the command-line processing logic accounts for a non-trivial amount of global constructors and leads to memory that is allocated when the containing library is loaded, and not freed until the containing library is unloaded. Granted, this is not a lot of data...

2. Command-line options are currently used for codegen options in inherently non-thread-safe ways.

(1) is mostly self-explanatory and has a bug about it, but (2) requires a bit more explanation. Take for example the "-unroll-threshold" option in LoopUnroll. If I'm setting up a pass pipeline programmatically, I add the LoopUnroll pass to the PassManager, but I cannot set the unroll threshold without calling cl::ParseCommandLineOptions() [or perhaps some ugly hacks that involve getting at and modifying the global state]. In addition to being awkward, this is not thread safe! I cannot run two PassManager pipelines concurrently with different unroll threshold values. I am singling out the LoopUnroll pass here, but this design is very prevalent within the LLVM analysis/transformation/codegen infrastructure. It has no effect on users of opt/llc as tools, but library users can be greatly affected by it.

Ideally, what I would like to see is a separation between the definition of analysis/transformation/codegen options and their value storage. To get the conversation started, I would like to propose the following:

Requirements

- Make it easy for passes to declare arbitrary options, just like they do now
- Let different pass pipelines have different sets of option values
- Keep the option registry dynamic, so plugins loaded with "-load" can register new options
- Let option values be parseable from the command-line (for opt, llc, ...)

Implementation

As a first design draft, I propose that cl::opt and friends be extended to support argument storage in an LLVMContext. Instead of storing the value directly in a cl::opt instance or specifying a global variable, this new storage would use a StringMap kept in the LLVMContext. As a consequence, parsing would be delayed until the option is read.

* Command Line Parsing

The cl::ParseCommandLineOptions() call would take an additional parameter, an LLVMContext reference. Global options would be written directly to their cl::opt instances as they are now, but per-context options would be copied into the LLVMContext instance. Tools would use something like:

    LLVMContext &Ctx = getGlobalContext();
    cl::ParseCommandLineOptions(argc, argv, "my tool", Ctx);

In addition, library users could use a new API:

    LLVMContext MyCtx;
    cl::ParseContextOptions(ArgList, MyCtx);

which would only parse per-context options (and not overwrite any global state).

* Reading Option Values

For per-context options, an LLVMContext would be required when reading back the option values. For example, in the LoopUnroll case, you could write:

    unsigned Threshold = UnrollThreshold(Mod.getContext());

This would fetch the "-unroll-threshold" option value from the context of the current module. Parsing of the value would be delayed until it is read, since the value would need to be stored as a string in the context.
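To make the declaration side of this concrete, here is a rough sketch of how a per-context option might be declared and set programmatically. The cl::PerContext modifier and cl::setContextOption are spellings made up purely for illustration (as is the default value shown), not existing or formally proposed API:

    // Declared in LoopUnroll.cpp as today, but flagged as per-context, so
    // parsed values would land in a StringMap on the LLVMContext rather
    // than in this global object.
    static cl::opt<unsigned> UnrollThreshold(
        "unroll-threshold", cl::PerContext, cl::init(150),
        cl::desc("The cut-off point for automatic loop unrolling"));

    // A library user could then configure two contexts differently without
    // touching any global state, and run the two pipelines concurrently:
    LLVMContext CtxA, CtxB;
    cl::setContextOption(CtxA, "unroll-threshold", "64");
    cl::setContextOption(CtxB, "unroll-threshold", "512");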
Global options can be used as they currently are. It would be a run-time error to read a per-context option without a context parameter.

Open Questions

- Can we just make all options per-context, and assign tool options to the global context? May require special handling for "-debug", "-time-passes", and friends.
- Alternatively, we could try to just eliminate the codegen options altogether and rely on per-module data, like the new attributes functionality. But this seems like it may be less flexible and would require the Module to know what options to use. Supporting command-line options would then require changes to the Module instance.

This is just a first-pass idea for making options more scalable to library users. Please let me know if you have any other ideas, or if someone is already working on this.

--
Thanks,

Justin Holewinski
Chris Lattner
2013-Feb-13 20:53 UTC
[LLVMdev] Overhauling the command-line processing/codegen options code
On Feb 13, 2013, at 11:37 AM, Justin Holewinski <justin.holewinski at gmail.com> wrote:

> Is anyone currently working on overhauling the command-line processing code?

Bill's attributes work is quite related; it will hopefully define away a number of the TargetOptions concepts, replacing them with per-function attributes. It should also subsume a number of target-independent concepts like "omit frame pointers" and other codegen flags.

> We're currently having some design issues with this component, and I'd like to start a larger conversation on it. For context, I am talking from an "LLVM as a library" perspective

My favorite perspective :)

> In a nut-shell, the problems we are facing are as follows:
>
> 1. As per bug 11944, the command-line processing logic accounts for a non-trivial amount of global constructors and leads to memory that is allocated when the containing library is loaded, and not freed until the containing library is unloaded. Granted, this is not a lot of data…

Yep, this sucks. It has been a long-known problem. In my ideal world, the only cl::opt's that would exist would be in leaf tools (e.g. in tools/opt, but not in optimization passes themselves). There is some major design work required to make this happen though, and I'm not aware of anyone tackling it.

> 2. Command-line options are currently used for codegen options in inherently non-thread-safe ways.

Hopefully Bill's work will help address the majority of this.

> (1) is mostly self-explanatory and has a bug about it, but (2) requires a bit more explanation. Take for example the "-unroll-threshold" option in LoopUnroll. If I'm setting up a pass pipeline programmatically, I add the LoopUnroll pass to the PassManager, but I cannot set the unroll threshold without calling cl::ParseCommandLineOptions() [or perhaps some ugly hacks that involve getting at and modifying the global state]. In addition to being awkward, this is not thread safe! I cannot run two PassManager pipelines concurrently with different unroll threshold values. In this case, I am singling out the LoopUnroll pass, but this design is very prevalent within the LLVM analysis/transformation/codegen infrastructure. This has no effect on users of opt/llc as tools, but library users can be greatly affected by this.

This is a very interesting one that is orthogonal to Bill's work. The preferred approach for this sort of thing is to change the LoopUnroll pass to take the unroll threshold as a constructor argument or a struct that wraps up all of the configuration settings. When the pass is *default* constructed (e.g. from opt -loop-unroll), it is acceptable to have the default ctor read cl::opt variables, but the optimization pipeline should not depend on cl::opts to configure the pass.

There is currently some gray area here for debug settings, and for things that are being staged in but are not on by default, but the default optimization pipeline should not be looking to cl::opt's for their settings (in an ideal world).

> Requirements
>
> - Make it easy for passes to declare arbitrary options, just like they do now
> - Let different pass pipelines have different sets of option values
> - Keep the option registry dynamic, so plugins loaded with "-load" can register new options
> - Let option values be parseable from the command-line (for opt, llc, …)

All good requirements.

> Implementation
>
> As a first design draft, I propose that cl::opt and friends be extended to support argument storage in an LLVMContext. Instead of storing the value directly in a cl::opt instance or specifying a global variable, this new storage would utilize a StringMap stored in an LLVMContext. As a consequence, parsing would be delayed until the option is read.

I'd really like to avoid going here. With the direction sketched out above, is enough of your problem solved? Taking loop unroll as an example, I'd much rather you refactor the code to have its constructor take a configuration struct setting its various settings, than have the various settings pulled out to LLVMContext.

-Chris
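For illustration, a minimal sketch of the constructor-based configuration described above, assuming a hypothetical UnrollConfig struct and a matching createLoopUnrollPass() overload; neither exists today, and the field names and default values are made up:

    // All of the tunables live in a plain struct instead of cl::opt globals.
    struct UnrollConfig {
      unsigned Threshold;
      unsigned Count;        // 0 = let the pass decide
      bool AllowPartial;
      UnrollConfig() : Threshold(150), Count(0), AllowPartial(false) {}
    };

    class LoopUnroll : public LoopPass {
      UnrollConfig Config;
    public:
      static char ID;
      // Library users configure the pass explicitly at construction time...
      explicit LoopUnroll(const UnrollConfig &C) : LoopPass(ID), Config(C) {}
      // ...while the default constructor (what 'opt -loop-unroll' uses) is
      // still free to initialize Config from the cl::opt values.
      LoopUnroll();
      virtual bool runOnLoop(Loop *L, LPPassManager &LPM);
    };

    // Two pipelines can then run concurrently with different thresholds,
    // because nothing is read from global state while the pass runs.
    Pass *createLoopUnrollPass(const UnrollConfig &C);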
Justin Holewinski
2013-Feb-13 22:32 UTC
[LLVMdev] Fwd: Overhauling the command-line processing/codegen options code
Reply to list as I had originally intended...

---------- Forwarded message ----------
From: Justin Holewinski <justin.holewinski at gmail.com>
Date: Wed, Feb 13, 2013 at 4:23 PM
Subject: Re: [LLVMdev] Overhauling the command-line processing/codegen options code
To: Chris Lattner <clattner at apple.com>

On Wed, Feb 13, 2013 at 3:53 PM, Chris Lattner <clattner at apple.com> wrote:

> On Feb 13, 2013, at 11:37 AM, Justin Holewinski <justin.holewinski at gmail.com> wrote:
>
> > Is anyone currently working on overhauling the command-line processing code?
>
> Bill's attributes work is quite related; it will hopefully define away a number of the TargetOptions concepts, replacing them with per-function attributes. It should also subsume a number of target-independent concepts like "omit frame pointers" and other codegen flags.
>
> > We're currently having some design issues with this component, and I'd like to start a larger conversation on it. For context, I am talking from an "LLVM as a library" perspective
>
> My favorite perspective :)
>
> > In a nut-shell, the problems we are facing are as follows:
> >
> > 1. As per bug 11944, the command-line processing logic accounts for a non-trivial amount of global constructors and leads to memory that is allocated when the containing library is loaded, and not freed until the containing library is unloaded. Granted, this is not a lot of data…
>
> Yep, this sucks. It has been a long-known problem. In my ideal world, the only cl::opt's that would exist would be in leaf tools (e.g. in tools/opt, but not in optimization passes themselves). There is some major design work required to make this happen though, and I'm not aware of anyone tackling it.

Right, the sheer amount of work here gives me night terrors... but fortunately my current problem isn't really related to static data as much as it is point (2) below. :) Though on the topic of static data, ManagedStatic<> cleanup is another topic that bites me, but that is for another day.

> > 2. Command-line options are currently used for codegen options in inherently non-thread-safe ways.
>
> Hopefully Bill's work will help address the majority of this.

Bill's work does seem very relevant here, but I'm unsure if module attributes are really the right approach. To me, it makes sense to keep code generation options separate from the IR itself. If we use attributes, that means the tools would actually modify the IR (in the form of adding attributes) before running it through the pass pipeline.

> > (1) is mostly self-explanatory and has a bug about it, but (2) requires a bit more explanation. Take for example the "-unroll-threshold" option in LoopUnroll. If I'm setting up a pass pipeline programmatically, I add the LoopUnroll pass to the PassManager, but I cannot set the unroll threshold without calling cl::ParseCommandLineOptions() [or perhaps some ugly hacks that involve getting at and modifying the global state]. In addition to being awkward, this is not thread safe! I cannot run two PassManager pipelines concurrently with different unroll threshold values. In this case, I am singling out the LoopUnroll pass, but this design is very prevalent within the LLVM analysis/transformation/codegen infrastructure. This has no effect on users of opt/llc as tools, but library users can be greatly affected by this.
>
> This is a very interesting one that is orthogonal to Bill's work. The preferred approach for this sort of thing is to change the LoopUnroll pass to take the unroll threshold as a constructor argument or a struct that wraps up all of the configuration settings. When the pass is *default* constructed (e.g. from opt -loop-unroll), it is acceptable to have the default ctor read cl::opt variables, but the optimization pipeline should not depend on cl::opts to configure the pass.
>
> There is currently some gray area here for debug settings, and for things that are being staged in but are not on by default, but the default optimization pipeline should not be looking to cl::opt's for their settings (in an ideal world).

I agree that this is a much cleaner solution for statically-linked passes, but how do you handle options for passes loaded at run-time? One of the advantages of the current command-line option approach is that I can define options and have opt/llc accept them from modules linked in with "-load". With struct-based options, I would have no way of having my library dynamically load another library containing passes and set options on those passes (unless I used some common base class that allowed me to set options). Or perhaps a trivial solution would be to just add a new virtual method to all passes that allows clients to pass arbitrary options, like a "virtual void setOption(StringRef Option, StringRef Value) {}". Though this use-case seems very rare...

> > Requirements
> >
> > - Make it easy for passes to declare arbitrary options, just like they do now
> > - Let different pass pipelines have different sets of option values
> > - Keep the option registry dynamic, so plugins loaded with "-load" can register new options
> > - Let option values be parseable from the command-line (for opt, llc, …)
>
> All good requirements.
>
> > Implementation
> >
> > As a first design draft, I propose that cl::opt and friends be extended to support argument storage in an LLVMContext. Instead of storing the value directly in a cl::opt instance or specifying a global variable, this new storage would utilize a StringMap stored in an LLVMContext. As a consequence, parsing would be delayed until the option is read.
>
> I'd really like to avoid going here. With the direction sketched out above, is enough of your problem solved? Taking loop unroll as an example, I'd much rather you refactor the code to have its constructor take a configuration struct setting its various settings, than have the various settings pulled out to LLVMContext.

I believe it would solve the issues I have, yes. That would mean having two versions of every createXXXPass() function, one for default parameters and one taking an options struct. If we ignore the library-loading-library case, I could see this working.

> -Chris

--
Thanks,

Justin Holewinski
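A minimal sketch of what the string-based escape hatch for dynamically loaded passes floated above might look like; the setOption() hook is hypothetical and not part of the Pass interface today, and the Threshold member name is illustrative:

    // Hypothetical hook on the Pass base class, so a host application can
    // configure a plugin pass it knows nothing about at compile time:
    //   virtual void setOption(StringRef Option, StringRef Value) {}

    // A pass would override it to parse the options it recognizes and
    // silently ignore the rest.
    void LoopUnroll::setOption(StringRef Option, StringRef Value) {
      if (Option == "unroll-threshold")
        Value.getAsInteger(10, Threshold);
    }

    // Host side, once the plugin pass object P has been created by name:
    P->setOption("unroll-threshold", "256");
    PM.add(P);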
Andrew Trick
2013-Feb-17 21:15 UTC
[LLVMdev] Overhauling the command-line processing/codegen options code
On Feb 13, 2013, at 2:32 PM, Justin Holewinski <justin.holewinski at gmail.com> wrote:

> > This is a very interesting one that is orthogonal to Bill's work. The preferred approach for this sort of thing is to change the LoopUnroll pass to take the unroll threshold as a constructor argument or a struct that wraps up all of the configuration settings. When the pass is *default* constructed (e.g. from opt -loop-unroll), it is acceptable to have the default ctor read cl::opt variables, but the optimization pipeline should not depend on cl::opts to configure the pass.
> >
> > There is currently some gray area here for debug settings, and for things that are being staged in but are not on by default, but the default optimization pipeline should not be looking to cl::opt's for their settings (in an ideal world).
>
> I agree that this is a much cleaner solution for statically-linked passes, but how do you handle options for passes loaded at run-time? One of the advantages of the current command-line option approach is that I can define options and have opt/llc accept them from modules linked in with "-load". With struct-based options, I would have no way of having my library dynamically load another library containing passes and set options on those passes (unless I used some common base class that allowed me to set options). Or perhaps a trivial solution would be to just add a new virtual method to all passes that allows clients to pass arbitrary options, like a "virtual void setOption(StringRef Option, StringRef Value) {}". Though this use-case seems very rare…

It is generally convenient to decouple pass configuration from construction. CodeGen does this using the TargetPassConfig analysis. It allows pass instantiation to be configured using only Pass IDs. Right now it's only used for subtarget options, but it would be nice to unify all codegen options for transparency and discoverability. For example, I need to know why clang/opt/llc is not generating the same code for some test case; -print-options should tell me everything that might have affected code generation.

-Andy
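If it helps to picture that separation, here is a toy sketch of a pipeline-owned option table with a -print-options style dump. This is illustrative only, under the assumption of a simple string-keyed store; it is not TargetPassConfig's actual interface:

    #include "llvm/ADT/StringMap.h"
    #include "llvm/ADT/StringRef.h"
    #include "llvm/Support/raw_ostream.h"

    // One of these per pipeline: pass construction stays simple, and all
    // settings that could affect code generation live in a single queryable
    // place instead of scattered globals.
    class PipelineConfig {
      llvm::StringMap<std::string> Values;   // e.g. "unroll-threshold" -> "256"
    public:
      void set(llvm::StringRef Name, llvm::StringRef Val) {
        Values[Name] = Val.str();
      }
      llvm::StringRef get(llvm::StringRef Name) const {
        llvm::StringMap<std::string>::const_iterator I = Values.find(Name);
        return I == Values.end() ? llvm::StringRef()
                                 : llvm::StringRef(I->getValue());
      }
      // -print-options style dump: everything that might have affected code
      // generation for *this* pipeline, in one place.
      void print(llvm::raw_ostream &OS) const {
        for (llvm::StringMap<std::string>::const_iterator I = Values.begin(),
             E = Values.end(); I != E; ++I)
          OS << I->getKey() << " = " << I->getValue() << "\n";
      }
    };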