Renato Golin via llvm-dev
2021-Mar-03 11:24 UTC
[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
On Wed, 3 Mar 2021 at 10:11, Trifunovic, Konrad <konrad.trifunovic at intel.com> wrote:
>
> With that said, I understand that software development has many reality
> concerns (like existing codebase, familiarity with different components,
> etc.) and we have many different use cases, which may mean that different
> paths make sense. So please don't take this as negative feedback in
> general. It's just that to me it's unclear how we can unify here right now.
> Even when the time arrives for unification, I'd believe going through MLIR
> is better to have general SPIR-V support. :)
>
> A very good discussion! I seem to have been overly optimistic in the first
> place about unifying those two approaches. Now I believe that we actually
> should have two paths, for the reasons you have just explained and to
> support 'legacy' paths/compilers that rely on the classical, years-old
> approach: Front-End -> LLVM-IR (opt) -> backend (llc). For that legacy
> path, a plain old 'backend' approach is still (in my view) the way to go.
> On the other hand, when MLIR evolves and gets wider adoption, it will be
> the way to go. From a semantic point of view, MLIR is much better suited
> to representing the structured and extensible nature of SPIR-V. But for
> the MLIR approach to be adopted, new languages/front-ends need to be aware
> of that structure, so as to take full advantage of it. If Clang C/C++
> starts to use MLIR as its native generation format, that would be a big
> case for the MLIR approach, but until that happens, we need some
> intermediate solution.

I think there are two points here:

1. How many SPIRV end-points we have

This is mostly about software engineering concerns of duplication, maintenance, etc. But it's also about IR support, with MLIR having the upper hand here because of the existing implementation and its inherent flexibility with dialects.

It's perfectly fine to have two back-ends for a while, but since we moved MLIR to the monorepo, we need to treat it as part of the LLVM family, not a side project.

LLVM IR has some "flexibility" through intrinsics, which we could use to translate MLIR concepts that can't be represented in LLVM IR, for the purpose of lowering only. Optimisations on these intrinsics would bring the usual problems.

2. Where do the optimisations happen when lowering code to SPIRV

I think Ronan's points are a good basis for keeping that in MLIR, at least for the current code. Now, if that precludes optimising in LLVM IR, then this could be a conflict with this proposal.

Whether the code passes through MLIR or not will be a decision of the toolchain, which will pick the best path for each workload. This allows us to have concurrent approaches in tree, but it also creates corner cases that are hard to test.

So, while I appreciate that this is a large proposal, which will likely take a year or more to get into shape, I think the ultimate goal (after the current proposal) should be that we end up with one back-end.

I'm a big fan of MLIR, and I think we should keep developing the SPIRV dialect; possibly this could be the entry point of all SPIRV toolchains. While Clang will take a long time (if ever) to generate MLIR for C/C++, it could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc.), which is then optimised, compiled into LLVM IR and linked to the main module (or not, for multi-targets) after high-level optimisations.
This would answer both questions above and create a pipeline that is consistent, easier to test, and cheaper to maintain overall.

cheers,
--renato
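A rough sketch of the intrinsics-style escape hatch described above: a SPIR-V-only concept is encoded as an opaque call that only a SPIR-V lowering would recognise, while ordinary LLVM passes treat it as an unknown external call. The function name "spirv.decorate.volatile" and the helper are hypothetical illustrations, not existing LLVM intrinsics or an agreed convention.

#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Emit an opaque marker call carrying a SPIR-V-specific concept (here, a
// decoration on a pointer value) so that a SPIR-V backend can pattern-match
// and lower it; the callee name is purely illustrative.
static void emitSpirvDecorationMarker(Module &M, IRBuilder<> &B, Value *Ptr) {
  FunctionCallee Marker = M.getOrInsertFunction(
      "spirv.decorate.volatile",
      FunctionType::get(B.getVoidTy(), {Ptr->getType()}, /*isVarArg=*/false));
  B.CreateCall(Marker, {Ptr});
}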
Trifunovic, Konrad via llvm-dev
2021-Mar-03 17:48 UTC
[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
Answering Renato's points:

> I think there are two points here:
>
> 1. How many SPIRV end-points we have

I would rather call this 'two entry points', as in two entry points for accessing SPIR-V: either through LLVM-IR with augmentation (metadata/intrinsics), or through a proper MLIR 'spv' dialect.

> This is mostly about software engineering concerns of duplication,
> maintenance, etc. But it's also about IR support, with MLIR having the
> upper hand here because of the existing implementation and its inherent
> flexibility with dialects.
>
> It's perfectly fine to have two back-ends for a while, but since we moved
> MLIR to the monorepo, we need to treat it as part of the LLVM family, not
> a side project.

Agreed. We are not treating MLIR as a side project 😊

> LLVM IR has some "flexibility" through intrinsics, which we could use to
> translate MLIR concepts that can't be represented in LLVM IR, for the
> purpose of lowering only. Optimisations on these intrinsics would bring
> the usual problems.
>
> 2. Where do the optimisations happen when lowering code to SPIRV
>
> I think Ronan's points are a good basis for keeping that in MLIR, at least
> for the current code. Now, if that precludes optimising in LLVM IR, then
> this could be a conflict with this proposal.

I think you are referring to points made by Lei, not by Ronan 😉 I believe the idea of the MLIR path is to completely skip the LLVM-IR optimization passes, so having just the MLIR 'entry point' would preclude that possibility. (Though it would be possible to do FE -> LLVM-IR (optimize here) -> LLVM-IR to MLIR -> MLIR to 'spv' MLIR dialect -> SPIR-V, that seems like overkill...)

> Whether the code passes through MLIR or not will be a decision of the
> toolchain, which will pick the best path for each workload. This allows us
> to have concurrent approaches in tree, but it also creates corner cases
> that are hard to test.

If we provide two 'entry points' for accessing SPIR-V, it is up to the toolchain to decide the most convenient way. I'm not sure whether this would be a runtime decision, though. I believe that all future front-ends would like to target MLIR directly (and skip LLVM-IR altogether).

> So, while I appreciate that this is a large proposal, which will likely
> take a year or more to get into shape, I think the ultimate goal (after
> the current proposal) should be that we end up with one back-end.

Agreed, though I would say: one back-end, but two 'entry points'. As I wrote in a reply to Mehdi, it seems that having the 'LLVM IR backend' produce the 'spv' MLIR dialect would be a good way to ultimately unify the implementation. That is a longer-term goal and needs some research, though, since at the moment I'm not sure how to tackle it (how to have GlobalISel produce MLIR as an output).

> I'm a big fan of MLIR, and I think we should keep developing the SPIRV
> dialect; possibly this could be the entry point of all SPIRV toolchains.

MLIR should be an entry point for all future SPIRV toolchains - that is the future. There will still be legacy toolchains that cannot be rewritten to use MLIR.

> While Clang will take a long time (if ever) to generate MLIR for C/C++, it
> could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc.),
> which is then optimised, compiled into LLVM IR and linked to the main
> module (or not, for multi-targets) after high-level optimisations.

I'm not sure about Clang OpenCL support. I believe that OpenCL C/C++ cannot produce MLIR directly.
For OpenMP, I know that flang (Fortran) does have an MLIR-based 'codegen'. Not sure about SYCL either; someone from Intel should know: does the clang-based SYCL compiler have a path to produce MLIR?

> This would answer both questions above and create a pipeline that is
> consistent, easier to test, and cheaper to maintain overall.

We should definitely aim for SPIR-V support to be less fragmented if possible (at the moment, we also have the SPIR-V <-> LLVM bidirectional translator, which is an external project).

> cheers,
> --renato
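A similarly rough sketch of the metadata half of the "LLVM-IR with augmentation (metadata/intrinsics)" entry point: SPIR-V-specific information rides along as instruction metadata that a SPIR-V lowering can read, while regular passes simply carry or drop it. The metadata kind "spirv.decorations" is a made-up example, not an established convention.

#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

// Tag an instruction with a SPIR-V decoration name as string metadata; only
// a SPIR-V lowering would interpret the (hypothetical) "spirv.decorations"
// kind, everything else treats it as ordinary opaque metadata.
static void tagWithSpirvDecoration(Instruction &I, StringRef Decoration) {
  LLVMContext &Ctx = I.getContext();
  MDNode *Node = MDNode::get(Ctx, {MDString::get(Ctx, Decoration)});
  I.setMetadata("spirv.decorations", Node);
}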
Ronan KERYELL via llvm-dev
2021-Mar-03 18:41 UTC
[llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
>>>>> On Wed, 3 Mar 2021 11:24:45 +0000, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> said:

Renato> While Clang will take a long time (if ever) to generate MLIR
Renato> for C/C++, it could very well generate MLIR for non-C++
Renato> (OpenCL, OpenMP, SYCL, etc.), which is then optimised,
Renato> compiled into LLVM IR and linked to the main module (or not,
Renato> for multi-targets) after high-level optimisations.

Actually, SYCL is pure C++, just with a few special C++ classes, similar to other special things like std::thread or setjmp()/longjmp(). OpenMP, when used with C++, is also pure C++. In your list, OpenCL is a language based on C/C++ for programming accelerators, while SYCL & OpenMP are single-source frameworks for programming full applications that use a host and some accelerators, with both parts in the same source program in a seamless and type-safe way.

So the MLIR approach is quite compelling, with its "Multi-Level" representation of both the host and the device code enabling multi-level inter-procedural or inter-module optimizations. These cannot be done today when compiling single-source OpenMP/SYCL/CUDA/HIP/OpenACC, because most implementations use early outlining of the device code, which makes it very hard to do inter-module optimization later without a multi-level view.

As you and most other people said, it looks like we are stuck with plain LLVM for a while. But perhaps you were considering in your sentence the case where, with OpenMP/SYCL/CUDA/HIP, you generate LLVM IR for the host code part and MLIR just for the hardware accelerator parts? While that would obviously make it easier to reuse the MLIR SPIR-V generator, it would still require somehow generating MLIR from the C++ accelerator parts. At least the C++ code allowed in accelerator parts is often restricted, so it is easier to handle than full-fledged host-side C++, and there are actually a few hacks trying to do this (for example leveraging Polly, PPCG...). But it seems we are far from a production-quality infrastructure yet.

So it looks like, while we do not have a robust C++-to-MLIR path, we need an LLVM IR-to-SPIR-V path somehow. At least, as others like Mehdi said, let's do good software engineering and factor out as much as we can between the LLVM IR and MLIR paths.
--
Ronan KERYELL
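To make the single-source point concrete, here is a minimal SYCL sketch, assuming a SYCL 2020 toolchain (the <sycl/sycl.hpp> header and the unnamed kernel lambda are SYCL 2020 features; the example is illustrative and not tied to any particular implementation). Host and device code sit in one standard C++ translation unit; the kernel is an ordinary lambda that the compiler outlines for the device.

#include <sycl/sycl.hpp>  // SYCL 2020; older toolchains use <CL/sycl.hpp>
#include <vector>

int main() {
  std::vector<int> data(1024, 1);
  {
    sycl::queue q;  // default device selection (CPU, GPU, ...)
    sycl::buffer<int, 1> buf(data.data(), sycl::range<1>(data.size()));
    q.submit([&](sycl::handler &cgh) {
      auto acc = buf.get_access<sycl::access::mode::read_write>(cgh);
      // The lambda body below is the device code; today's implementations
      // outline it early for the accelerator target.
      cgh.parallel_for(sycl::range<1>(data.size()),
                       [=](sycl::id<1> i) { acc[i] *= 2; });
    });
  }  // buffer destruction waits for the kernel and copies data back
  return data[0] == 2 ? 0 : 1;
}

An MLIR-based flow could, in principle, keep the outlined kernel and the surrounding host code visible in one multi-level module, which is exactly the inter-module optimisation opportunity described above.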