Renato Golin via llvm-dev
2021-Mar-03 11:24 UTC
[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
On Wed, 3 Mar 2021 at 10:11, Trifunovic, Konrad <konrad.trifunovic at intel.com> wrote:
>
> With that said, I understand that software development has many reality
> concerns (like existing codebase, familiarity with different components,
> etc.) and we have many different use cases, which may mean that different
> paths make sense. So please don't take this as negative feedback in
> general. It's just that to me it's unclear how we can unify here right now.
> Even when the time arrives for unification, I'd believe going through MLIR
> is better to have general SPIR-V support. :)
>
> A very good discussion! I seem to have been overly optimistic in the first
> place about unifying those two approaches. Now I believe that we actually
> should have two paths, for the reasons you have just explained and to
> support 'legacy' paths/compilers that rely on the classical, years-old
> approach: Front-End -> LLVM-IR (opt) -> backend (llc). For that legacy
> path, a plain old 'backend' approach is still (in my view) the way to go.
> On the other hand, when MLIR evolves and gets wider adoption, it will be
> the way to go. From a semantic point of view, MLIR is much better suited
> to representing the structured and extensible nature of SPIR-V. But for
> the MLIR approach to be adopted, new languages/front-ends need to be aware
> of that structure, so as to take full advantage of it. If Clang C/C++
> starts to use MLIR as its native generation format, that would be a big
> case for the MLIR approach, but until that happens, we need some
> intermediate solution.

I think there are two points here:

1. How many SPIRV end-points we have

This is mostly about software engineering concerns of duplication, maintenance, etc. But it's also about IR support, with MLIR having the upper hand here because of the existing implementation and its inherent flexibility with dialects.

It's perfectly fine to have two back-ends for a while, but since we moved MLIR to the monorepo, we need to treat it as part of the LLVM family, not a side project.

LLVM IR has some "flexibility" through intrinsics, which we could use to translate MLIR concepts that can't be represented in LLVM IR, for the purpose of lowering only. Optimisations on these intrinsics would bring the usual problems.

2. Where do the optimisations happen when lowering code to SPIRV

I think Ronan's points are a good basis for keeping that in MLIR, at least for the current code. Now, if that precludes optimising in LLVM IR, then this could be a conflict with this proposal.

Whether the code passes through MLIR or not will be a decision of the toolchain, which will pick the best path for each workload. This allows us to have concurrent approaches in tree, but it also creates corner cases that are hard to test.

So, while I appreciate that this is a large proposal, which will likely take a year or more to get into shape, I think the ultimate goal (after the current proposal) should be that we end up with one back-end.

I'm a big fan of MLIR, and I think we should keep developing the SPIRV dialect; possibly this could be the entry point of all SPIRV toolchains. While Clang will take a long time (if ever) to generate MLIR for C/C++, it could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc.), which is then optimised, compiled into LLVM IR and linked to the main module (or not, for multi-targets) after high-level optimisations.
This would answer both questions above and create a pipeline that is consistent, easier to test, and cheaper to maintain overall.

cheers,
--renato
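A rough sketch of the intrinsics-style escape hatch described above: a SPIR-V-only concept is encoded as an opaque call that only a SPIR-V lowering would recognise, while ordinary LLVM passes treat it as an unknown external call. The function name "spirv.decorate.volatile" and the helper are hypothetical illustrations, not existing LLVM intrinsics or an agreed convention.

#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Emit an opaque marker call carrying a SPIR-V-specific concept (here, a
// decoration on a pointer value) so that a SPIR-V backend can pattern-match
// and lower it; the callee name is purely illustrative.
static void emitSpirvDecorationMarker(Module &M, IRBuilder<> &B, Value *Ptr) {
  FunctionCallee Marker = M.getOrInsertFunction(
      "spirv.decorate.volatile",
      FunctionType::get(B.getVoidTy(), {Ptr->getType()}, /*isVarArg=*/false));
  B.CreateCall(Marker, {Ptr});
}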
Trifunovic, Konrad via llvm-dev
2021-Mar-03 17:48 UTC
[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
Answering Renato's points:

> I think there are two points here:
>
> 1. How many SPIRV end-points we have

I would rather call this 'two entry points', as in two entry points for accessing SPIR-V: either through LLVM-IR with augmentation (metadata/intrinsics), or through a proper MLIR 'spv' dialect.

> This is mostly about software engineering concerns of duplication,
> maintenance, etc. But it's also about IR support, with MLIR having the
> upper hand here because of the existing implementation and its inherent
> flexibility with dialects.
>
> It's perfectly fine to have two back-ends for a while, but since we moved
> MLIR to the monorepo, we need to treat it as part of the LLVM family, not
> a side project.

Agreed. We are not treating MLIR as a side project 😊

> LLVM IR has some "flexibility" through intrinsics, which we could use to
> translate MLIR concepts that can't be represented in LLVM IR, for the
> purpose of lowering only. Optimisations on these intrinsics would bring
> the usual problems.
>
> 2. Where do the optimisations happen when lowering code to SPIRV
>
> I think Ronan's points are a good basis for keeping that in MLIR, at least
> for the current code. Now, if that precludes optimising in LLVM IR, then
> this could be a conflict with this proposal.

I think you are referring to points made by Lei, not by Ronan 😉 I believe the idea of the MLIR path is to completely skip the LLVM-IR optimization passes, so having just the MLIR 'entry point' would preclude that possibility. (Though it would be possible to do FE -> LLVM-IR (optimize here) -> LLVM-IR to MLIR -> MLIR to 'spv' MLIR dialect -> SPIR-V, that seems like overkill...)

> Whether the code passes through MLIR or not will be a decision of the
> toolchain, which will pick the best path for each workload. This allows us
> to have concurrent approaches in tree, but it also creates corner cases
> that are hard to test.

If we provide two 'entry points' for accessing SPIR-V, it is up to the toolchain to decide the most convenient way. I'm not sure whether this would be a runtime decision, though. I believe that all future front-ends would like to target MLIR directly (and skip LLVM-IR altogether).

> So, while I appreciate that this is a large proposal, which will likely
> take a year or more to get into shape, I think the ultimate goal (after
> the current proposal) should be that we end up with one back-end.

Agreed, though I would say: one back-end, but two 'entry points'. As I wrote in a reply to Mehdi, it seems that having the 'LLVM IR backend' produce the 'spv' MLIR dialect would be a good way to ultimately unify the implementation. That is a longer-term goal and needs some research, though, since at the moment I'm not sure how to tackle it (how to have GlobalISel produce MLIR as an output).

> I'm a big fan of MLIR, and I think we should keep developing the SPIRV
> dialect; possibly this could be the entry point of all SPIRV toolchains.

MLIR should be an entry point for all future SPIRV toolchains - that is the future. There will still be legacy toolchains that cannot be rewritten to use MLIR.

> While Clang will take a long time (if ever) to generate MLIR for C/C++, it
> could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc.),
> which is then optimised, compiled into LLVM IR and linked to the main
> module (or not, for multi-targets) after high-level optimisations.

I'm not sure about Clang OpenCL support. I believe that OpenCL C/C++ cannot produce MLIR directly.
For OpenMP, I know that flang (Fortran) does have an MLIR-based 'codegen'. Not sure about SYCL either; someone from Intel should know: does the clang-based SYCL compiler have a path to produce MLIR?

> This would answer both questions above and create a pipeline that is
> consistent, easier to test, and cheaper to maintain overall.

We should definitely aim for SPIR-V support to be less fragmented if possible (at the moment, we also have the SPIR-V <-> LLVM bidirectional translator, which is an external project).

> cheers,
> --renato
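A similarly rough sketch of the metadata half of the "LLVM-IR with augmentation (metadata/intrinsics)" entry point: SPIR-V-specific information rides along as instruction metadata that a SPIR-V lowering can read, while regular passes simply carry or drop it. The metadata kind "spirv.decorations" is a made-up example, not an established convention.

#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

// Tag an instruction with a SPIR-V decoration name as string metadata; only
// a SPIR-V lowering would interpret the (hypothetical) "spirv.decorations"
// kind, everything else treats it as ordinary opaque metadata.
static void tagWithSpirvDecoration(Instruction &I, StringRef Decoration) {
  LLVMContext &Ctx = I.getContext();
  MDNode *Node = MDNode::get(Ctx, {MDString::get(Ctx, Decoration)});
  I.setMetadata("spirv.decorations", Node);
}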
Ronan KERYELL via llvm-dev
2021-Mar-03 18:41 UTC
[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
>>>>> On Wed, 3 Mar 2021 11:24:45 +0000, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> said:

Renato> While Clang will take a long time (if ever) to generate MLIR
Renato> for C/C++, it could very well generate MLIR for non-C++
Renato> (OpenCL, OpenMP, SYCL, etc.), which is then optimised,
Renato> compiled into LLVM IR and linked to the main module (or not,
Renato> for multi-targets) after high-level optimisations.

Actually, SYCL is pure C++, just with a few special C++ classes, similar to other special things like std::thread or setjmp()/longjmp(). OpenMP, when used with C++, is also pure C++. In your list, OpenCL is a language based on C/C++ for programming accelerators, while SYCL & OpenMP are single-source frameworks for programming full applications that use a host and some accelerators, with both parts in the same source program in a seamless and type-safe way.

So the MLIR approach is quite compelling, with its "Multi-Level" representation of both the host and the device code enabling multi-level inter-procedural or inter-module optimizations. These cannot be done today when compiling single-source OpenMP/SYCL/CUDA/HIP/OpenACC, because most implementations use early outlining of the device code, which makes it very hard to do inter-module optimization later without a multi-level view.

As you and most other people said, it looks like we are stuck with plain LLVM for a while. But perhaps you were considering in your sentence the case where, with OpenMP/SYCL/CUDA/HIP, you generate LLVM IR for the host code part and MLIR just for the hardware accelerator parts? While that would obviously make it easier to reuse the MLIR SPIR-V generator, it would still require somehow generating MLIR from the C++ accelerator parts. At least the C++ code allowed in accelerator parts is often restricted, so it is easier to handle than full-fledged host-side C++, and there are actually a few hacks trying to do this (for example leveraging Polly, PPCG...). But it seems we are far from a production-quality infrastructure yet.

So it looks like, while we do not have a robust C++-to-MLIR path, we need an LLVM IR-to-SPIR-V path somehow. At least, as others like Mehdi said, let's do good software engineering and factor out as much as we can between the LLVM IR and MLIR paths.
--
Ronan KERYELL
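To make the single-source point concrete, here is a minimal SYCL sketch, assuming a SYCL 2020 toolchain (the <sycl/sycl.hpp> header and the unnamed kernel lambda are SYCL 2020 features; the example is illustrative and not tied to any particular implementation). Host and device code sit in one standard C++ translation unit; the kernel is an ordinary lambda that the compiler outlines for the device.

#include <sycl/sycl.hpp>  // SYCL 2020; older toolchains use <CL/sycl.hpp>
#include <vector>

int main() {
  std::vector<int> data(1024, 1);
  {
    sycl::queue q;  // default device selection (CPU, GPU, ...)
    sycl::buffer<int, 1> buf(data.data(), sycl::range<1>(data.size()));
    q.submit([&](sycl::handler &cgh) {
      auto acc = buf.get_access<sycl::access::mode::read_write>(cgh);
      // The lambda body below is the device code; today's implementations
      // outline it early for the accelerator target.
      cgh.parallel_for(sycl::range<1>(data.size()),
                       [=](sycl::id<1> i) { acc[i] *= 2; });
    });
  }  // buffer destruction waits for the kernel and copies data back
  return data[0] == 2 ? 0 : 1;
}

An MLIR-based flow could, in principle, keep the outlined kernel and the surrounding host code visible in one multi-level module, which is exactly the inter-module optimisation opportunity described above.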