thr3ads.net - llvm dev - [llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses [Mar 2021]

David Blaikie via llvm-dev

2021-Feb-11 00:37 UTC

[llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses

Owen - perhaps you've got some relevant perspective here given the GPU
context/background, and general middle end optimization design
questions. Could you take a look?
(& perhaps Johannes?)

On Mon, Feb 1, 2021 at 11:36 AM David Blaikie <dblaikie at gmail.com> wrote:
>
> Thanks for sending this out!
>
> +Mehdi - if you have a chance to look, I'd appreciate your thoughts on this, partly as a general LLVM contributor, but also with an eye to abstractions over multiple IRs given your work on MLIR
> +Lang - you mentioned maybe some colleagues of yours might be able to take a look at the design here?
>
> I'd like to try writing up maybe alternatives to a couple of the patches in the series to see how they compare, but haven't done that as yet.
>
> My brief take is that I think the opaque handles make a fair bit of sense/seem high-value/low-cost (if it turned out the handle for one of these IR types needed to become a bit heavier, I don't think it'd be super difficult to revisit this code - make the generic handle large enough to accommodate whatever the largest concrete handle happens to be, etc). Though I'm still a bit less certain about the full runtime polymorphic abstractions.
>
> On Thu, Dec 17, 2020 at 11:17 AM Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi LLVM community,
>>
>> Earlier this year I first proposed a new way of writing analyses that can be applied to multiple IRs, for example, applying the same analysis to both LLVM IR and MachineIR.
>>
>> LLVM already has some analyses like that; for example, dominator tree construction and loop info. However, they're all limited to looking at the control flow graph: basic blocks and lists of predecessors and successors. We want to push the envelope with a divergence analysis that is also aware of instructions and values, and ran into severe limitations in what we could do with the techniques that are commonly used in LLVM today. Some limitations are around which concepts are exposed generically at all, though the bulk of limitations revolves around readability and maintainability of the resulting generic code.
>>
>> After more evolution of the ideas and many discussions over the last few months, I want to raise this proposal once more -- this time with an extensive document:
>> https://docs.google.com/document/d/1sbeGw5uNGFV0ZPVk6h8Q5_dRhk4qFnKHa-uZ-O3c4UY/edit?usp=sharing
>>
>> Feel free to comment on the document, though high-level discussion is probably best kept in this email thread.
>>
>> The concrete proposal is to enable 4 tools for use by generic analyses:
>>
>> - type erasure
>> - an SsaContext context class concept with a fairly small surface area
>> - dynamic polymorphism via per-analysis adapters
>> - dynamic polymorphism via an LLVM-wide adapter of SsaContext
>>
>> The document goes to some length to explain what precisely is meant by each of those bullets, including code examples, as well as describing a few other options that we _don't_ propose, based on their relative merits.
>>
>> There are concrete patches that go along with the proposal and that you can refer to for additional context. In logical sequence, they are:
>> - https://reviews.llvm.org/D92924: Introduce opaque handles for type erasure
>> - https://reviews.llvm.org/D83089: Based on the handle infrastructure, refactor the dominator tree with type-erased base classes that can be used by generic algorithms
>> - https://reviews.llvm.org/D92925: Introduce an SsaContext context class concept for static polymorphism
>> - https://reviews.llvm.org/D92926: Introduce an ISsaContext “global” interface class for dynamic polymorphism built on top of SsaContext and opaque handles
>> - https://reviews.llvm.org/D83094: A new analysis (cycle info) written generically as non-template code using opaque handles, ISsaContext, and analysis-specific dynamic polymorphism via the ICycleInfoSsaContext interface added in the patch
>>
>> I would like us to get to general agreement on this thread that this is a direction we want to go in and that we can proceed with the proposed code changes.
>>
>> Thanks,
>> Nicolai
>> --
>> Learn what the world is really like,
>> but never forget how it ought to be.

Sameer Sahasrabuddhe via llvm-dev

2021-Mar-30 16:20 UTC

[llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses

---- On Thu, 11 Feb 2021 06:07:17 +0530 David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote ----

 > Owen - perhaps you've got some relevant perspective here given the GPU
 > context/background, and general middle end optimization design
 > questions. Could you take a look?
 > (& perhaps Johannes?)
 >
 > On Mon, Feb 1, 2021 at 11:36 AM David Blaikie <dblaikie at gmail.com> wrote:
 > >
 > > Thanks for sending this out!
 > >
 > > +Mehdi - if you have a chance to look, I'd appreciate your thoughts on this, partly as a general LLVM contributor, but also with an eye to abstractions over multiple IRs given your work on MLIR
 > > +Lang - you mentioned maybe some colleagues of yours might be able to take a look at the design here?
 > >
 > > I'd like to try writing up maybe alternatives to a couple of the patches in the series to see how they compare, but haven't done that as yet.
 > >
 > > My brief take is that I think the opaque handles make a fair bit of sense/seem high-value/low-cost (if it turned out the handle for one of these IR types needed to become a bit heavier, I don't think it'd be super difficult to revisit this code - make the generic handle large enough to accommodate whatever the largest concrete handle happens to be, etc). Though I'm still a bit less certain about the full runtime polymorphic abstractions.

For context, this is about the options laid out in the following document from
Nicolai:

https://docs.google.com/document/d/1sbeGw5uNGFV0ZPVk6h8Q5_dRhk4qFnKHa-uZ-O3c4UY/edit?usp=sharing

The above document was first introduced in the following email, which is the
starting point of the current thread:

https://lists.llvm.org/pipermail/llvm-dev/2020-December/147433.html

The main issue is the need for an abstraction that greatly improves the
experience of writing analyses that work on LLVM IR and MIR (and potentially
also on MLIR). The way I understand it, Nicolai proposed an abstraction that
involves dynamic polymorphism to abstract out the details of the underlying IR.
David was more in favour of first unifying the IRs to the point that the dynamic
polymorphism is limited to only occasional corner cases instead.
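
To make the "dynamic polymorphism over opaque handles" idea a bit more concrete, here is a minimal sketch. It is not taken from Nicolai's patches: BlockHandle, IToyContext, ToyContextAdapter and the two toy block types are names made up for illustration, and the real SsaContext/ISsaContext interfaces may look quite different.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for two unrelated IR block types (not LLVM classes).
struct ToyIRBlock {
  std::string Name;
  std::vector<ToyIRBlock *> Succs;
};
struct ToyMachineBlock {
  int Number;
  std::vector<ToyMachineBlock *> Succs;
};

// Opaque handle: generic code may copy and compare it, but cannot look inside.
class BlockHandle {
  void *Ptr = nullptr;

public:
  BlockHandle() = default;
  explicit BlockHandle(void *P) : Ptr(P) {}
  void *get() const { return Ptr; }
  bool operator==(const BlockHandle &O) const { return Ptr == O.Ptr; }
};

// Dynamic interface that a non-template analysis is written against.
class IToyContext {
public:
  virtual ~IToyContext() = default;
  virtual std::vector<BlockHandle> successors(BlockHandle B) const = 0;
};

// One adapter per IR; it is the only place that knows the concrete block type.
template <typename BlockT> class ToyContextAdapter final : public IToyContext {
public:
  static BlockHandle wrap(BlockT *B) { return BlockHandle(B); }
  static BlockT *unwrap(BlockHandle H) {
    assert(H.get() && "null handle");
    return static_cast<BlockT *>(H.get());
  }
  std::vector<BlockHandle> successors(BlockHandle B) const override {
    std::vector<BlockHandle> Out;
    for (BlockT *S : unwrap(B)->Succs)
      Out.push_back(wrap(S));
    return Out;
  }
};

// Generic, non-template "analysis": number of blocks reachable from Entry.
std::size_t countReachable(const IToyContext &Ctx, BlockHandle Entry) {
  std::vector<BlockHandle> Seen{Entry}, Work{Entry};
  while (!Work.empty()) {
    BlockHandle B = Work.back();
    Work.pop_back();
    for (BlockHandle S : Ctx.successors(B)) {
      if (std::find(Seen.begin(), Seen.end(), S) == Seen.end()) {
        Seen.push_back(S);
        Work.push_back(S);
      }
    }
  }
  return Seen.size();
}
```

The point of the sketch is only that countReachable is compiled once and pays a virtual call per query, while the adapter template is instantiated per IR.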

I am currently involved in the activity towards decoupling AMDGPU's
dependency on the abstractions being discussed here, and the one thing that
immediately jumps out is the ability to traverse the predecessors and successors
of a basic block. From all the emails so far and the resulting proposal, it
seems that both David and Nicolai think that having a common non-abstract base
class for basic blocks is a good idea. Such a base class will provide a vector
of predecessors and successors that can be used to traverse the CFG independent
of the actual IR.
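
A rough sketch of what such a base class could look like, under my own assumptions: CfgBlockBase and the two derived toy types are hypothetical names, not existing LLVM classes, and how edges would really be kept in sync with terminators is left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical common, non-abstract base class for basic blocks.
class CfgBlockBase {
  std::vector<CfgBlockBase *> Preds;
  std::vector<CfgBlockBase *> Succs;

public:
  const std::vector<CfgBlockBase *> &predecessors() const { return Preds; }
  const std::vector<CfgBlockBase *> &successors() const { return Succs; }

  // Records the edge on both ends so the two vectors stay consistent.
  void addSuccessor(CfgBlockBase *To) {
    assert(To && "edge target must not be null");
    Succs.push_back(To);
    To->Preds.push_back(this);
  }
};

// Toy stand-ins for IR-specific block classes; each adds its own data.
struct ToyIRBlock : CfgBlockBase {
  const char *Name = "";
};
struct ToyMachineBlock : CfgBlockBase {
  int Number = 0;
};

// IR-independent traversal with no templates and no virtual calls:
// counts reachable blocks that have more than one predecessor.
std::size_t countJoinBlocks(const CfgBlockBase &Entry) {
  std::vector<const CfgBlockBase *> Seen{&Entry}, Work{&Entry};
  std::size_t Joins = 0;
  while (!Work.empty()) {
    const CfgBlockBase *B = Work.back();
    Work.pop_back();
    if (B->predecessors().size() > 1)
      ++Joins;
    for (const CfgBlockBase *S : B->successors()) {
      if (std::find(Seen.begin(), Seen.end(), S) == Seen.end()) {
        Seen.push_back(S);
        Work.push_back(S);
      }
    }
  }
  return Joins;
}
```

The two vectors in the base class are exactly the per-block memory cost discussed further down.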

The current situation is that the basic blocks in LLVM IR and MLIR do not
explicitly track their preds and succs. Preds are determined from the uses of
the block, while succs are determined from the terminator instruction. MIR has
explicit pred and succ vectors, but the succ pointers are duplicated by their
presence as operands to the terminator instructions. The MachineBasicBlock is
not a value, so there is no notion of traversing its uses for predecessors.
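
As a toy model of the LLVM IR / MLIR side of this (a sketch only; ToyBlock, ToyInst and the helper functions are invented here and do not mirror the real class layouts), successors can be read off the terminator's operands and predecessors recovered by walking the block's use list and keeping only terminator users:

```cpp
#include <cassert>
#include <vector>

struct ToyBlock;

// Hypothetical instruction that may name blocks as operands.
struct ToyInst {
  ToyBlock *Parent = nullptr;
  bool IsTerminator = false;
  std::vector<ToyBlock *> BlockOperands;
};

// Hypothetical block that is also a "value": it keeps a use list, but no
// explicit predecessor or successor vectors.
struct ToyBlock {
  ToyInst Term;                 // the block's terminator
  std::vector<ToyInst *> Users; // instructions using this block as an operand
};

// Installs a terminator on B and registers a use on every target block.
void setTerminator(ToyBlock &B, std::vector<ToyBlock *> Targets) {
  B.Term.Parent = &B;
  B.Term.IsTerminator = true;
  B.Term.BlockOperands = Targets;
  for (ToyBlock *T : Targets) {
    assert(T && "branch target must not be null");
    T->Users.push_back(&B.Term);
  }
}

// Successors come straight from the terminator's operands.
const std::vector<ToyBlock *> &successors(const ToyBlock &B) {
  return B.Term.BlockOperands;
}

// Predecessors are derived from the use list; non-terminator users such as
// PHI-like instructions are skipped.
std::vector<ToyBlock *> predecessors(const ToyBlock &B) {
  std::vector<ToyBlock *> Out;
  for (const ToyInst *U : B.Users)
    if (U->IsTerminator)
      Out.push_back(U->Parent);
  return Out;
}
```

In this model a block stores no edge vectors of its own, which is why adding explicit pred and succ vectors shows up as extra memory.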

So if we create a common non-abstract base class for basic blocks, then MLIR and
LLVM IR will incur the overhead of two vectors, one each for the preds and
succs. Nicolai had reported a 3.1% to 4.1% increase in the size of LLVM IR in
some typical programs:

https://docs.google.com/spreadsheets/d/1cwRy2K4XjWCjwfK53MuCqws1TsQ0S6FtRjbwoUdGBfo/edit?usp=sharing

Is this overhead the main issue in deciding whether we should introduce the
common base class for traversing a CFG? Do note that MIR already duplicates
successors. One could argue that duplicating the successor pointers in LLVM IR
and MLIR merely brings them to the same level as MIR. If that argument sticks,
then the "real" overhead is only half of what is reported, namely the part that
accounts for the duplicate predecessor pointers.

The duplication of predecessor pointers stems from the fact that basic blocks in
LLVM IR and MLIR are also values that are (primarily) used as operands to the
terminator and PHI instructions. We could redefine these instructions to use
non-value basic blocks as operands. CFG edges are not the typical use-def
relation that other values represent, and it's reasonable to say that
PHINodes and terminators are special in this one way.  I have not managed to see
how far that rabbit hole goes, but I am not really convinced that this is
attractive in any way. Is this even an option?

Sameer.
