David Blaikie via llvm-dev
2021-Feb-11 00:37 UTC
[llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses
Owen - perhaps you've got some relevant perspective here given the GPU context/background, and general middle end optimization design questions. Could you take a look? (& perhaps Johannes?)

On Mon, Feb 1, 2021 at 11:36 AM David Blaikie <dblaikie at gmail.com> wrote:
>
> Thanks for sending this out!
>
> +Mehdi - if you have a chance to look, I'd appreciate your thoughts on this, partly as a general LLVM contributor, but also with an eye to abstractions over multiple IRs given your work on MLIR
> +Lang - you mentioned maybe some colleagues of yours might be able to take a look at the design here?
>
> I'd like to try writing up maybe alternatives to a couple of the patches in the series to see how they compare, but haven't done that as yet.
>
> My brief take is that I think the opaque handles make a fair bit of sense/seem high-value/low-cost (if it turned out the handle for one of these IR types needed to become a bit heavier, I don't think it'd be super difficult to revisit this code - make the generic handle large enough to accommodate whatever the largest concrete handle happens to be, etc). Though I'm still a bit less certain about the full runtime polymorphic abstractions.
>
> On Thu, Dec 17, 2020 at 11:17 AM Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi LLVM community,
>>
>> Earlier this year I first proposed a new way of writing analyses that can be applied to multiple IRs, for example, applying the same analysis to both LLVM IR and MachineIR.
>>
>> LLVM already has some analyses like that; for example, dominator tree construction and loop info. However, they're all limited to looking at the control flow graph: basic blocks and lists of predecessors and successors. We want to push the envelope with a divergence analysis that is also aware of instructions and values, and ran into severe limitations in what we could do with the techniques that are commonly used in LLVM today.
>> Some limitations are around which concepts are exposed generically at all, though the bulk of limitations revolves around readability and maintainability of the resulting generic code.
>>
>> After more evolution of the ideas and many discussions over the last few months, I want to raise this proposal once more -- this time with an extensive document:
>> https://docs.google.com/document/d/1sbeGw5uNGFV0ZPVk6h8Q5_dRhk4qFnKHa-uZ-O3c4UY/edit?usp=sharing
>>
>> Feel free to comment on the document, though high-level discussion is probably best kept in this email thread.
>>
>> The concrete proposal is to enable 4 tools for use by generic analyses:
>>
>> - type erasure
>> - an SsaContext context class concept with a fairly small surface area
>> - dynamic polymorphism via per-analysis adapters
>> - dynamic polymorphism via an LLVM-wide adapter of SsaContext
>>
>> The document goes to some length to explain what precisely is meant by each of those bullets, including code examples, as well as describing a few other options that we _don't_ propose, based on their relative merits.
>>
>> There are concrete patches that go along with the proposal that you can refer to for additional context.
>> In logical sequence, they are:
>> - https://reviews.llvm.org/D92924: Introduce opaque handles for type erasure
>> - https://reviews.llvm.org/D83089: Based on the handle infrastructure, refactor the dominator tree with type-erased base classes that can be used by generic algorithms
>> - https://reviews.llvm.org/D92925: Introduce an SsaContext context class concept for static polymorphism
>> - https://reviews.llvm.org/D92926: Introduce an ISsaContext “global” interface class for dynamic polymorphism built on top of SsaContext and opaque handles
>> - https://reviews.llvm.org/D83094: A new analysis (cycle info) written generically as non-template code using opaque handles, ISsaContext, and analysis-specific dynamic polymorphism via the ICycleInfoSsaContext interface added in the patch
>>
>> I would like us to get to general agreement on this thread that this is a direction we want to go in and that we can proceed with the proposed code changes.
>>
>> Thanks,
>> Nicolai
>> --
>> Learn what the world is really like,
>> but never forget how it ought to be.
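
To make the "opaque handle" idea discussed in this message a little more concrete, here is a minimal sketch of type erasure through a pointer-sized handle. It is an illustration only and is not taken from D92924: the names `OpaqueHandle`, `BlockTag`, `BlockHandle`, `wrapBlock` and `unwrapBlock` are hypothetical, and `IrBlock` and `MachineBlock` are stand-ins for real IR block types. It assumes every concrete handle fits in a single pointer.

```cpp
#include <cassert>

// Hypothetical opaque handle: a pointer-sized, type-erased reference to an
// IR entity. The Tag parameter only keeps different handle kinds apart.
template <typename Tag> class OpaqueHandle {
public:
  OpaqueHandle() = default;
  explicit OpaqueHandle(void *Pointer) : Ptr(Pointer) {}

  void *get() const { return Ptr; }
  explicit operator bool() const { return Ptr != nullptr; }

  friend bool operator==(OpaqueHandle Lhs, OpaqueHandle Rhs) {
    return Lhs.Ptr == Rhs.Ptr;
  }

private:
  void *Ptr = nullptr;
};

struct BlockTag {};
using BlockHandle = OpaqueHandle<BlockTag>;

// Stand-ins for the block types of two different IRs (assumptions).
struct IrBlock { int Id; };
struct MachineBlock { int Number; };

// Erases the concrete block type. Generic code only ever sees BlockHandle.
template <typename BlockT> BlockHandle wrapBlock(BlockT *Block) {
  return BlockHandle(static_cast<void *>(Block));
}

// Recovers the concrete type. The caller must name the same type that was
// used when wrapping; nothing in this sketch checks that.
template <typename BlockT> BlockT *unwrapBlock(BlockHandle Handle) {
  return static_cast<BlockT *>(Handle.get());
}
```

If one IR ever needed a heavier handle, the storage inside the generic handle could be widened to fit the largest concrete handle without changing code that only passes handles around, which is the kind of low-cost revision described above.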
Sameer Sahasrabuddhe via llvm-dev
2021-Mar-30 16:20 UTC
[llvm-dev] [RFC] Abstracting over SSA form IRs to implement generic analyses
---- On Thu, 11 Feb 2021 06:07:17 +0530 David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote ----

> Owen - perhaps you've got some relevant perspective here given the GPU
> context/background, and general middle end optimization design
> questions. Could you take a look?
> (& perhaps Johannes?)
>
> On Mon, Feb 1, 2021 at 11:36 AM David Blaikie <dblaikie at gmail.com> wrote:
> >
> > Thanks for sending this out!
> >
> > +Mehdi - if you have a chance to look, I'd appreciate your thoughts on this, partly as a general LLVM contributor, but also with an eye to abstractions over multiple IRs given your work on MLIR
> > +Lang - you mentioned maybe some colleagues of yours might be able to take a look at the design here?
> >
> > I'd like to try writing up maybe alternatives to a couple of the patches in the series to see how they compare, but haven't done that as yet.
> >
> > My brief take is that I think the opaque handles make a fair bit of sense/seem high-value/low-cost (if it turned out the handle for one of these IR types needed to become a bit heavier, I don't think it'd be super difficult to revisit this code - make the generic handle large enough to accommodate whatever the largest concrete handle happens to be, etc). Though I'm still a bit less certain about the full runtime polymorphic abstractions.

For context, this is about the options laid out in the following document from Nicolai:
https://docs.google.com/document/d/1sbeGw5uNGFV0ZPVk6h8Q5_dRhk4qFnKHa-uZ-O3c4UY/edit?usp=sharing

The above document was first introduced in the following email, which is the starting point of the current thread:
https://lists.llvm.org/pipermail/llvm-dev/2020-December/147433.html

The main issue is the need for an abstraction that greatly improves the experience of writing analyses that work on LLVM IR and MIR (and potentially also on MLIR).
The way I understand it, Nicolai proposed an abstraction that involves dynamic polymorphism to abstract out the details of the underlying IR. David was more in favour of first unifying the IRs to the point that the dynamic polymorphism is limited to only occasional corner cases instead.

I am currently involved in the effort to decouple AMDGPU's dependency on the abstractions being discussed here, and the one thing that immediately jumps out is the ability to traverse the predecessors and successors of a basic block. From all the emails so far and the resulting proposal, it seems that both David and Nicolai think that having a common non-abstract base class for basic blocks is a good idea. Such a base class will provide a vector of predecessors and a vector of successors that can be used to traverse the CFG independent of the actual IR.

The current situation is that the basic blocks in LLVM IR and MLIR do not explicitly track their preds and succs. Preds are determined from the uses of the block, while succs are determined from the terminator instruction. MIR has explicit pred and succ vectors, but the succ pointers are duplicated by their presence as operands to the terminator instructions. The MachineBasicBlock is not a value, so there is no notion of traversing its uses for predecessors.

So if we create a common non-abstract base class for basic blocks, then MLIR and LLVM IR will incur the overhead of two vectors, one each for the preds and succs. Nicolai had reported a 3.1% to 4.1% increase in the size of LLVM IR in some typical programs:
https://docs.google.com/spreadsheets/d/1cwRy2K4XjWCjwfK53MuCqws1TsQ0S6FtRjbwoUdGBfo/edit?usp=sharing

Is this overhead the main issue in deciding whether we should introduce the common base class for traversing a CFG? Do note that MIR already duplicates successors. One could argue that duplicating the successor pointers in LLVM IR and MLIR merely brings them to the same level as MIR.
If that argument sticks, then the "real" overhead is only half of what is reported, namely the part that accounts for the duplicate predecessor pointers.

The duplication of predecessor pointers stems from the fact that basic blocks in LLVM IR and MLIR are also values that are (primarily) used as operands to the terminator and PHI instructions. We could redefine these instructions to use non-value basic blocks as operands. CFG edges are not the typical use-def relation that other values represent, and it's reasonable to say that PHINodes and terminators are special in this one way. I have not managed to see how far that rabbit hole goes, but I am not really convinced that this is attractive in any way. Is this even an option?

Sameer.
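
As a rough illustration of the common non-abstract base class discussed above, the following sketch stores explicit predecessor and successor vectors in a shared base and runs one non-template traversal over it. Everything here is an assumption made for illustration: `CfgBlockBase`, `addEdge`, `IrBlock`, `MachineBlock` and `countReachable` are hypothetical names, not existing LLVM or MLIR classes, and no attempt is made to keep the vectors in sync with terminator operands.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical common, non-abstract base class for basic blocks. Each block
// pays for two vectors, which is the overhead discussed in the thread.
class CfgBlockBase {
public:
  const std::vector<CfgBlockBase *> &predecessors() const { return Preds; }
  const std::vector<CfgBlockBase *> &successors() const { return Succs; }

  // Records the CFG edge From -> To in both blocks' vectors.
  static void addEdge(CfgBlockBase &From, CfgBlockBase &To) {
    From.Succs.push_back(&To);
    To.Preds.push_back(&From);
  }

private:
  std::vector<CfgBlockBase *> Preds;
  std::vector<CfgBlockBase *> Succs;
};

// Stand-ins for IR-specific block types (assumptions, not real classes).
class IrBlock : public CfgBlockBase {};
class MachineBlock : public CfgBlockBase {};

// Generic, non-template traversal: counts the blocks reachable from Entry
// (including Entry itself) using only the base-class successor vectors.
std::size_t countReachable(const CfgBlockBase &Entry) {
  std::unordered_set<const CfgBlockBase *> Visited{&Entry};
  std::vector<const CfgBlockBase *> Worklist{&Entry};
  while (!Worklist.empty()) {
    const CfgBlockBase *Block = Worklist.back();
    Worklist.pop_back();
    for (const CfgBlockBase *Succ : Block->successors())
      if (Visited.insert(Succ).second)
        Worklist.push_back(Succ);
  }
  return Visited.size();
}
```

The same `countReachable` works whether the blocks are `IrBlock` or `MachineBlock` objects, with no virtual calls and no templates. The cost is the per-block storage for the two vectors, which is the trade-off the questions above are about.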