thr3ads.net - llvm dev - [LLVMdev] [RFC] Removing static initializers for command line options [Aug 2014]

If this information is useful, please help other people find it:
Share via:

Chandler Carruth

2014-Aug-26 17:41 UTC

[LLVMdev] [RFC] Removing static initializers for command line options

A bunch of IRC discussions etc have taken place, and I wanted to surface
some of the points from them here. Also, I wanted to give a more fully
thought-through set of my own perspectives.


First, I'd like to re-state the goals as I think those have gotten muddied.

1) Remove most of LLVM's static initializers (that is, dynamic
initialization code for static storage duration objects) by moving away
from global objects with non-trivial constructors of 'llvm::cl::opt'
types
in the LLVM *libraries* (they remain fine in the tools of course).

2) Support separate LLVM users in a single address space having different
values for the debugging knobs currently managed through the
'llvm::cl::opt' command line flags mechanism.

3) (related to #2) Support some library API for setting the debugging knobs
currently managed through the 'llvm::cl::opt' command line flags
mechanism.


Now, I want to change my position on one point because after talking to
Chris, Jim, and others I understand the problem significantly better. I am
now convinced that it is impossible to fix even #1 without making a change
for every single option in LLVM. Why? Because we have to *register* all of
the flags, and this registration inherently cannot be done from a global's
constructor without introducing the undesirable static initializers[1].

I also think that in order to address #2 we would need to make some change
to essentially every option in LLVM as we would have to connect it to
non-global storage.

Given that, I think it does make sense to try to address #1 and #2 at the
same time -- trying to limit the change to just one or the other would very
likely result in a significantly different API for creating, registering,
and using these options, and I would rather not make all of the LLVM
developers re-learn how to do this 2 times if we can instead do it 1 time.


However, doing #2 only really makes sense if we also do #3 to let people
*use* this new freedom. =] Owen convinced me that there really is a good
reason to support #3 as part of the LLVM libraries, however I wanted to
raise my primary concern about this on the mailing list because it hasn't
really been discussed.

These "debugging knobs" or library level options which *aren't*
exposed in
the library's APIs are very risky to expose in a side-channel API: they are
often completely unsupported or unsupportable options that were never
intended for "production" usage. Once we expose these options via some
API
to library users, we run a very serious risk of getting locked into
supporting behaviors or debugging misbehavior that were never expected or
planned for...

However, the debugging use case is real and important. So I want to propose
that we guard many of the paths to setting these options with NDEBUG so
that in non-asserts builds, library users can't override these options.
This would mean the C++ API for setting this in LLVMContext (and the C API
if we ever expose it there) would only support setting these in asserts
builds. We could even sink the checking into the registry. The name should
probably have "debug" in it somewhere. It would then have some
InternalUnsupportedScaryBad API which is not actually in the header files
but is used in LLVM's tools to maintain their current behavior without
documenting a public API. It's not bulletproof, but it seemed that it would
be sufficient for the debugging use cases and is a strong statement of
"unsupported" IMO.


The only other observation I would like to make is that if we're going to
do this, we really should take the time to try to sort out a really *good*
API. Hopefully we'll be slightly less wrong this time, and maybe the next
transition can be more incremental.


Ok, on to the more low-level parts of this email, here some specific
thoughts for how to design the API for further discussion:

- I really like the idea of using a 'void*' derived from some static tag
as
a way to identify uniquely each option. I also think we can use some
minimal "reflection" techniques to significantly simplify this.

- The need for a global registry is... distasteful, but I agree inevitable.

- I'd like to avoid a separate "option store". My current
suggestion is to
use the LLVMContext, and to add a context to the layers which don't have an
LLVMContext and where these kinds of options make sense.

- I would actually use a simple 'void*' -> string mapping in the
"store" of
the context. (Maybe we can actually use StringMap?) I would still provide
the type to the registry and use that to validate options when parsing.
When getting the value out of the "store" in the context, we can parse
it
into the data-type and assert fail on parse errors. I don't feel strongly
about this, but without "std::any" it seems simpler than rolling our
own
with a void* or a union.

- I would provide a simple API on the LLVMContext to get an "option"
string
for a given key. This API could be exposed by any context, and we can write
the option value extraction library to accept a generic T which supports
such a query method. That way there is *very* loose coupling between these
and it is easy to support any context object at any layer.

- The registry should essentially provide an interface to parse an argv
string list into such a map from 'void*' -> string. The LLVMContext
(or any
other context) can then be populated by the result of this parse. Each
library can parse whatever argv it wants.

- I don't think we should keep piling on the interface design of the
existing cl::opt stuff. I would make a clean break to a new header /
library, and just build some code to map these into the cl::opt parsing
code from the registry (we already do stuff like this for pass names). This
will also help the two systems co-exist but stay separate during the
(inevitably long) transition period.

- As a specific example of what I would change from the cl::opt stuff: I
think we should make the required arguments just be required and
positional, and the optional arguments use a fluent-style interface (yea, I
know, but I think it works well here) rather than the weird variadic-ish
thing...


Once there is a patch that seems like a reasonable start at this, I'd
actually like to get the new library added and just one pass (the one
you're using seems great) converted. I'd then like to try a few
experiments
to see if we can improve some of the nitty-gritty details of the API before
we roll it out widely.

Seem reasonable? I'm actually happy to help with some of the API
experiments and the rollout here.
-Chandler


[1]: I'd actually like to understand why on some platforms the explicit
global initialization routines are so much better than global
constructors... But that's probably a topic for on off-list or IRC
conversation. I don't want to derail this thread with that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140826/189c99c2/attachment.html>

Rafael Espíndola

2014-Aug-26 18:52 UTC

head link

[LLVMdev] [RFC] Removing static initializers for command line options

> These "debugging knobs" or library level options which
*aren't* exposed in
> the library's APIs are very risky to expose in a side-channel API: they
are
> often completely unsupported or unsupportable options that were never
> intended for "production" usage. Once we expose these options via
some API
> to library users, we run a very serious risk of getting locked into
> supporting behaviors or debugging misbehavior that were never expected or
> planned for...
Agreed. This is my only real concern in here.
> However, the debugging use case is real and important. So I want to propose
> that we guard many of the paths to setting these options with NDEBUG so
that
> in non-asserts builds, library users can't override these options. This
> would mean the C++ API for setting this in LLVMContext (and the C API if we
> ever expose it there) would only support setting these in asserts builds.
We
> could even sink the checking into the registry. The name should probably
> have "debug" in it somewhere. It would then have some
> InternalUnsupportedScaryBad API which is not actually in the header files
> but is used in LLVM's tools to maintain their current behavior without
> documenting a public API. It's not bulletproof, but it seemed that it
would
> be sufficient for the debugging use cases and is a strong statement of
> "unsupported" IMO.
Just to be clear, the idea is that in a release build the only way of
setting options is via the command line? That they can still be passed
to opt or llc, but they would not be considered part of the API.
> - I really like the idea of using a 'void*' derived from some
static tag as
> a way to identify uniquely each option. I also think we can use some
minimal
> "reflection" techniques to significantly simplify this.
I like it too. The reason I do is that it is one more layer in a
defense in depth against making our internal options part of the API.
The key is not just a well known value (like a string).
> - I'd like to avoid a separate "option store". My current
suggestion is to
> use the LLVMContext, and to add a context to the layers which don't
have an
> LLVMContext and where these kinds of options make sense.
I agree. There are few places in LLVM without a LLVMContext. For
example, there is one cl::opt in lib/MC and Joerg already has a patch
to turn it into a proper API.

Cheers,
Rafael

Chandler Carruth

2014-Aug-26 19:49 UTC

head link

[LLVMdev] [RFC] Removing static initializers for command line options

On Tue, Aug 26, 2014 at 11:52 AM, Rafael Espíndola <
rafael.espindola at gmail.com> wrote:
> Just to be clear, the idea is that in a release build the only way of
> setting options is via the command line? That they can still be passed
> to opt or llc, but they would not be considered part of the API.
>
Yep. The path to being part of the API is, well, to put it in the API --
constructors, methods, whatever.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140826/bc4e22cb/attachment.html>

Robinson, Paul

2014-Aug-27 01:18 UTC

head link

[LLVMdev] [RFC] Removing static initializers for command line options

I count 15 occurrences of "-backend-option" in
clang/lib/Driver/Tools.cpp and most (all? didn't look at them real hard) of
those aren't debugging options.  Presumably those can all be solved by
adding more plumbing, but it would be *really* nice to make that plumbing easier
to do.
--paulr

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Chandler Carruth
Sent: Tuesday, August 26, 2014 10:42 AM
To: Chris Bieneman
Cc: LLVM Dev
Subject: Re: [LLVMdev] [RFC] Removing static initializers for command line
options

A bunch of IRC discussions etc have taken place, and I wanted to surface some of
the points from them here. Also, I wanted to give a more fully thought-through
set of my own perspectives.


First, I'd like to re-state the goals as I think those have gotten muddied.

1) Remove most of LLVM's static initializers (that is, dynamic
initialization code for static storage duration objects) by moving away from
global objects with non-trivial constructors of 'llvm::cl::opt' types in
the LLVM *libraries* (they remain fine in the tools of course).

2) Support separate LLVM users in a single address space having different values
for the debugging knobs currently managed through the 'llvm::cl::opt'
command line flags mechanism.

3) (related to #2) Support some library API for setting the debugging knobs
currently managed through the 'llvm::cl::opt' command line flags
mechanism.


Now, I want to change my position on one point because after talking to Chris,
Jim, and others I understand the problem significantly better. I am now
convinced that it is impossible to fix even #1 without making a change for every
single option in LLVM. Why? Because we have to *register* all of the flags, and
this registration inherently cannot be done from a global's constructor
without introducing the undesirable static initializers[1].

I also think that in order to address #2 we would need to make some change to
essentially every option in LLVM as we would have to connect it to non-global
storage.

Given that, I think it does make sense to try to address #1 and #2 at the same
time -- trying to limit the change to just one or the other would very likely
result in a significantly different API for creating, registering, and using
these options, and I would rather not make all of the LLVM developers re-learn
how to do this 2 times if we can instead do it 1 time.


However, doing #2 only really makes sense if we also do #3 to let people *use*
this new freedom. =] Owen convinced me that there really is a good reason to
support #3 as part of the LLVM libraries, however I wanted to raise my primary
concern about this on the mailing list because it hasn't really been
discussed.

These "debugging knobs" or library level options which *aren't*
exposed in the library's APIs are very risky to expose in a side-channel
API: they are often completely unsupported or unsupportable options that were
never intended for "production" usage. Once we expose these options
via some API to library users, we run a very serious risk of getting locked into
supporting behaviors or debugging misbehavior that were never expected or
planned for...

However, the debugging use case is real and important. So I want to propose that
we guard many of the paths to setting these options with NDEBUG so that in
non-asserts builds, library users can't override these options. This would
mean the C++ API for setting this in LLVMContext (and the C API if we ever
expose it there) would only support setting these in asserts builds. We could
even sink the checking into the registry. The name should probably have
"debug" in it somewhere. It would then have some
InternalUnsupportedScaryBad API which is not actually in the header files but is
used in LLVM's tools to maintain their current behavior without documenting
a public API. It's not bulletproof, but it seemed that it would be
sufficient for the debugging use cases and is a strong statement of
"unsupported" IMO.


The only other observation I would like to make is that if we're going to do
this, we really should take the time to try to sort out a really *good* API.
Hopefully we'll be slightly less wrong this time, and maybe the next
transition can be more incremental.


Ok, on to the more low-level parts of this email, here some specific thoughts
for how to design the API for further discussion:

- I really like the idea of using a 'void*' derived from some static tag
as a way to identify uniquely each option. I also think we can use some minimal
"reflection" techniques to significantly simplify this.

- The need for a global registry is... distasteful, but I agree inevitable.

- I'd like to avoid a separate "option store". My current
suggestion is to use the LLVMContext, and to add a context to the layers which
don't have an LLVMContext and where these kinds of options make sense.

- I would actually use a simple 'void*' -> string mapping in the
"store" of the context. (Maybe we can actually use StringMap?) I would
still provide the type to the registry and use that to validate options when
parsing. When getting the value out of the "store" in the context, we
can parse it into the data-type and assert fail on parse errors. I don't
feel strongly about this, but without "std::any" it seems simpler than
rolling our own with a void* or a union.

- I would provide a simple API on the LLVMContext to get an "option"
string for a given key. This API could be exposed by any context, and we can
write the option value extraction library to accept a generic T which supports
such a query method. That way there is *very* loose coupling between these and
it is easy to support any context object at any layer.

- The registry should essentially provide an interface to parse an argv string
list into such a map from 'void*' -> string. The LLVMContext (or any
other context) can then be populated by the result of this parse. Each library
can parse whatever argv it wants.

- I don't think we should keep piling on the interface design of the
existing cl::opt stuff. I would make a clean break to a new header / library,
and just build some code to map these into the cl::opt parsing code from the
registry (we already do stuff like this for pass names). This will also help the
two systems co-exist but stay separate during the (inevitably long) transition
period.

- As a specific example of what I would change from the cl::opt stuff: I think
we should make the required arguments just be required and positional, and the
optional arguments use a fluent-style interface (yea, I know, but I think it
works well here) rather than the weird variadic-ish thing...


Once there is a patch that seems like a reasonable start at this, I'd
actually like to get the new library added and just one pass (the one you're
using seems great) converted. I'd then like to try a few experiments to see
if we can improve some of the nitty-gritty details of the API before we roll it
out widely.

Seem reasonable? I'm actually happy to help with some of the API experiments
and the rollout here.
-Chandler


[1]: I'd actually like to understand why on some platforms the explicit
global initialization routines are so much better than global constructors...
But that's probably a topic for on off-list or IRC conversation. I don't
want to derail this thread with that.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140827/5c9234e4/attachment.html>

llvm dev - Aug 2014 - [LLVMdev] [RFC] Removing static initializers for command line options

[LLVMdev] [RFC] Removing static initializers for command line options

[LLVMdev] [RFC] Removing static initializers for command line options

[LLVMdev] [RFC] Removing static initializers for command line options

[LLVMdev] [RFC] Removing static initializers for command line options