thr3ads.net - llvm dev - [llvm-dev] [RFC] Heterogeneous LLVM-IR Modules [Jul 2020]


Johannes Doerfert via llvm-dev

2020-Jul-30 15:56 UTC

[llvm-dev] [RFC] Heterogeneous LLVM-IR Modules

On 7/30/20 10:01 AM, Renato Golin wrote:
 > On Thu, 30 Jul 2020 at 15:09, Johannes Doerfert
 > <johannesdoerfert at gmail.com> wrote:
 >> At this point I ask myself if it wouldn't be better to make the target
 >> cpu, features, and other "hidden parameters" explicit in the module
 >> itself. (I suggested part of that recently anyway[0].) That way we could
 >> create the proper target info from the IR, which seems to me like
 >> something valuable even in the current single-target setting.
 >
 > This is still not enough. Other driver flags exist, which have to do
 > with OS and environment issues (incl. user flags) that are not part of
 > the target description and can affect optimisation, codegen and even
 > ABI.
 >
 > Some of those options apply to some targets and not others. If they
 > apply to all targets you have, the user might want to apply to some
 > but not all, and then how will this work at cmdline side?

I can see that we want different command line options per target in the
module. Given that we probably want to allow one pass pipeline per
target, maybe we keep the options but introduce something like a
`--device=N` flag which will apply all following options to the "N'th"
pipeline. That way you could specify things like:
   ` ... --inline-threshold=1234 --device=2 --inline-threshold=5678`

For TTI and such, the driver would create the appropriate version for
each target and put it in the respective pipeline, as it does now, just
that there are multiple pipelines.
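
A minimal sketch of the `--device=N` idea above: it groups a flat option list into one option list per pipeline. The function name, and the choice to send options that appear before any `--device=N` to pipeline 0, are assumptions made for illustration; `--device=N` is only a suggestion in this thread and the proposal does not specify either detail.

```python
def split_options_by_device(argv, default_device=0):
    """Group options by the pipeline selected with a hypothetical --device=N flag.

    Assumption: options seen before the first --device=N belong to
    `default_device`; every --device=N switches the target of all
    options that follow it.
    """
    per_device = {default_device: []}
    current = default_device
    for arg in argv:
        if arg.startswith("--device="):
            # Switch the pipeline that receives the following options.
            current = int(arg.split("=", 1)[1])
            per_device.setdefault(current, [])
        else:
            per_device[current].append(arg)
    return per_device


opts = split_options_by_device(
    ["--inline-threshold=1234", "--device=2", "--inline-threshold=5678"]
)
for device, options in sorted(opts.items()):
    print(device, options)
```

Each per-device list could then be handed to the option parser of the corresponding pipeline, so the same flag can carry a different value per target.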

My idea in the last email was to put the relevant driver options
(optionally) into the IR such that you can generate TTI and friends from
the IR alone. As far as I know, this is not possible right now. Note
that this is somewhat unrelated to heterogeneous modules but would
potentially be helpful there. If we would manifest the options though,
you could ask the driver to emit IR with target options embedded, then
use `opt` and friends to work on the result (w/o repeating the flags)
while still being able to create the same TTI the driver would have
created for you in an "end-to-end" run. (I might not express this idea
properly.)

 > I don't know the extent of what you can combine from all of the
 > existing global options into IR annotations, but my wild guess is that
 > it would explode the number of attributes, which is not a good thing.

I mean, you can put the command line string that set the options in
the first place, right? That is as long as it initially was, or maybe I
am missing something.

To recap things that might "differ" from the original proposal:
   - We          want multiple target triples.
   - We probably want multiple data layouts.
   - We probably want multiple pass pipelines, with different (cmd
     line) options and such.
   - We might want to make modules self contained wrt. target options
     such that you can create TTI and friends w/o repeating driver
     options.

~ Johannes

 > --renato

Renato Golin via llvm-dev

2020-Jul-30 16:11 UTC


On Thu, 30 Jul 2020 at 16:58, Johannes Doerfert
<johannesdoerfert at gmail.com> wrote:
> I mean, you can put the command line string that set the options in
> the first place, right? That is as long as it initially was, or maybe I
> am missing something.

Options change with time, and this would make the IR incompatible
across releases without intentionally doing so.

> To recap things that might "differ" from the original proposal:
>    - We          want multiple target triples.
>    - We probably want multiple data layouts.
>    - We probably want multiple pass pipelines, with different (cmd
>      line) options and such.
>    - We might want to make modules self contained wrt. target options
>      such that you can create TTI and friends w/o repeating driver
>      options.

The extent of the separation is what made me suggest that it might be
easier, in the end, to carry multiple modules, from different
front-ends, through multiple pipelines but interacting with each
other.

I guess this is why David made a parallel with LTO, as this ends up
being a multi-device LTO in a sense. I think that will be easier and
much less intrusive than rewriting the global context, target flags,
IR annotation, data layout assumptions, target triple parsing, target
options bundling, etc.

--renato

Johannes Doerfert via llvm-dev

2020-Jul-30 16:44 UTC


On 7/30/20 11:11 AM, Renato Golin wrote:
 > On Thu, 30 Jul 2020 at 16:58, Johannes Doerfert
 > <johannesdoerfert at gmail.com> wrote:
 >> I mean, you can put the command line string that set the options in
 >> the first place, right? That is as long as it initially was, or maybe I
 >> am missing something.
 >
 > Options change with time, and this would make the IR incompatible
 > across releases without intentionally doing so.

You could arguably be forgiving when parsing these, so you might lose
some if you mix IR across releases, but right now you cannot express
this at all. I mean, IR looks as if it captures the entire state, but it
does not quite. As a use case, the question of how to reproduce
`clang -O3` with opt comes up every month or so on the list. Let's table
this for now as it seems unrelated to this proposal.
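
To make "forgiving" parsing concrete, here is a rough sketch, not an existing LLVM mechanism: options embedded as a command line string are parsed one by one, and anything the current release does not recognize is dropped and reported instead of rejecting the IR. The table `KNOWN_OPTIONS`, the function name, and the flag `--some-removed-flag` are hypothetical.

```python
# Hypothetical table of options this release understands,
# mapping each option name to a value parser.
KNOWN_OPTIONS = {"inline-threshold": int}


def parse_embedded_options(option_string):
    """Parse a recorded option string, dropping what is not understood.

    Returns (accepted, dropped): accepted maps option names to parsed
    values, dropped lists the tokens that were ignored.
    """
    accepted, dropped = {}, []
    for token in option_string.split():
        name, sep, value = token.lstrip("-").partition("=")
        parser = KNOWN_OPTIONS.get(name)
        if parser is None or not sep:
            # Unknown (e.g. removed in a later release) or malformed option.
            dropped.append(token)
            continue
        try:
            accepted[name] = parser(value)
        except ValueError:
            dropped.append(token)
    return accepted, dropped


accepted, dropped = parse_embedded_options(
    "--inline-threshold=1234 --some-removed-flag=1"
)
print(accepted)
print(dropped)
```

The trade-off is the one raised in the reply: dropped options mean the reconstructed state may differ silently across releases, so a real design would likely need to report what was dropped.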

 >> To recap things that might "differ" from the original proposal:
 >>    - We          want multiple target triples.
 >>    - We probably want multiple data layouts.
 >>    - We probably want multiple pass pipelines, with different (cmd
 >>      line) options and such.
 >>    - We might want to make modules self contained wrt. target options
 >>      such that you can create TTI and friends w/o repeating driver
 >>      options.
 >
 > The extent of the separation is what made me suggest that it might be
 > easier, in the end, to carry multiple modules, from different
 > front-ends, through multiple pipelines but interacting with each
 > other.
 >
 > I guess this is why David made a parallel with LTO, as this ends up
 > being a multi-device LTO in a sense. I think that will be easier and
 > much less intrusive than rewriting the global context, target flags,
 > IR annotation, data layout assumptions, target triple parsing, target
 > options bundling, etc.

It is definitely multi-device (link time) optimization. The link
time part is somewhat optional and might be misleading given the
popularity of single source programming models for accelerators. The
"thinLTO" idea would also not be sufficient for everything we hope to
do; the two-module approach would be, though.

What if we don't rewrite these things but still merge the modules?
Let me explain ;)

(I use `opt` invocations below as a placeholder, for lack of a better
  term, knowing that it is not (only) the `opt` tool we are talking about.)

The problem is that the `opt` invocation is primed for a single target;
everything (=pipeline, TTI, flags, ...) exists only once, right?
I imagine the two module approach to run two `opt` invocations, one for
each module, which we would synchronize at some point to do cross-module
optimizations. Given that we can run two `opt` invocations and we assume
a pass can work with two modules, that is, two sets of everything, why do
we need the separation? From a tooling perspective I think it makes
things easier to have a single module. That said, it should not preclude
us from running two separate `opt` invocations on it. So we don't rewrite
everything but instead "just" need to duplicate all the information in
the IR such that each `opt` invocation can extract its respective set
of values and run on the respective set of global symbols. This would
reduce the new stuff to more or less what we started with: device triple
& DL, and a way to link a global symbol to a device triple & DL. It is the
two module approach but with "co-located" modules ;)
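
The "co-located" modules idea can be sketched as a small data model: one container holds several (triple, data layout) pairs plus a map from each global symbol to one of them, and each pipeline extracts only its own set. This is an illustrative sketch, not LLVM's `Module` API; the class names, method names, symbol names, and the placeholder data layout strings are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DeviceInfo:
    triple: str       # target triple for this device
    data_layout: str  # data layout (DL) string for this device


@dataclass
class CoLocatedModule:
    """One module holding per-device target info and a symbol-to-device map."""
    devices: list = field(default_factory=list)         # index -> DeviceInfo
    symbol_device: dict = field(default_factory=dict)   # symbol -> device index

    def add_device(self, triple, data_layout):
        self.devices.append(DeviceInfo(triple, data_layout))
        return len(self.devices) - 1

    def add_symbol(self, name, device_index):
        if not 0 <= device_index < len(self.devices):
            raise IndexError(f"unknown device index {device_index}")
        self.symbol_device[name] = device_index

    def symbols_for(self, device_index):
        """Global symbols one per-device pipeline would run on."""
        return sorted(
            name for name, dev in self.symbol_device.items()
            if dev == device_index
        )


module = CoLocatedModule()
# Data layout strings are placeholders, not real layouts.
host = module.add_device("x86_64-unknown-linux-gnu", "<host data layout>")
gpu = module.add_device("nvptx64-nvidia-cuda", "<device data layout>")
module.add_symbol("main", host)
module.add_symbol("kernel_a", gpu)
module.add_symbol("kernel_b", gpu)

for index, info in enumerate(module.devices):
    print(index, info.triple, module.symbols_for(index))
```

Each per-device pipeline would read `devices[i]` to build its own target info and restrict itself to `symbols_for(i)`, while a cross-device pass could still see every symbol in the one container.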

WDYT?

~ Johannes

P.S. This is really helpful but I won't give up so easily on the idea.
      If I do, I have to implement cross module optimizations and I would
      rather not ;)
