thr3ads.net - llvm dev - [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V [Aug 2021]

If this information is useful, please help other people find it:
Share via:

Pekka Jääskeläinen via llvm-dev

2021-Aug-25 06:56 UTC

[llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V

Hi Konrad,

OK, thanks for the status update. We discussed this internally, and
our current understanding is that the change from using llvm-spriv for
the SPIR-V emission requires a) to change the target (in triple) from
spir64 to spirv64 b) calling the llc that invokes the LLVM backend
(BE) instead of the llvm-spriv translator tool.

Thus, to make the transition and testing while developing the BE
easier, we could contribute a patch to add a (hidden) command line
option that switches between these modes: 1) The initial default: emit
spir64 LLVM IR and call llvm-spirv. 2) Emit spirv64 LLVM IR instead
and call a user-specified llc binary.

When the LLVM BE lands upstream and is considered at least as
useful/stable as llvm-spriv, we can then switch the default of that
option to (2) instead of (1). This should allow easy testing of the
BE-based toolchain even before it lands to the upstream repo if we
allow redefining the llc binary location to the externally built LLVM
BE. How does that sound?

BR,
Pekka and Henry

On Tue, Aug 24, 2021 at 4:27 PM Trifunovic, Konrad via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
> Hi,
>
> Regarding the timeline for SPIR-V backend upstream, there is a tentative
plan to land the base patches until end of this year (2021). Nevertheless, at
that moment, SPIR-V backend will be in experimental state. After that, we need
to keep on adding missing functionality, fix bugs etc. so that we can pass a
Khronos CTS test-suite with using a SPIR-V backend (instead of LLVM-SPIRV
translator). At that moment, we might claim it is a production quality and not
anymore experimental. That might take next several months.
>
> I'm wondering - with the approach You have proposed - how much code
that You plan to contribute as the intermediate step will become obsolete once
there is a full switch to SPIR-V backend target?
>
> konrad
>
> > -----Original Message-----
> > From: Henry Linjamäki <henry.linjamaki at parmance.com>
> > Sent: Monday, August 23, 2021 12:59 PM
> > To: Anastasia Stulova <Anastasia.Stulova at arm.com>
> > Cc: cfe-dev at lists.llvm.org; llvm-dev at lists.llvm.org; yaxun.liu
at amd.com;
> > Trifunovic, Konrad <konrad.trifunovic at intel.com>
> > Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as
SPIR-V
> >
> > On Mon, 23 Aug 2021 at 12:53, Anastasia Stulova
> > <Anastasia.Stulova at arm.com> wrote:
> > >
> > > > Our planned HIP-SPIR-V tool chain calls the command-line
tool
> > > > ‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is
then
> > > > embedded in the HIP binary as a byte array. The call to the
> > > > llvm-spirv tool is used as a temporary solution until the
SPIR-V backend lands
> > in the?
> > > > LLVM code base in the future. When the LLVM SPIR-V backend
is usable
> > > > in the upstream repo, we plan to simply switch to calling
the LLVM’s
> > > > internal ‘llc’ code generator tool for generating the SPIR-V
instead of llvm-
> > spirv.
> > > > Can you immediately spot problems with that approach?
> > >
> > > Since the translator is not part of the LLVM project do you plan
to
> > > add any command-line options to set its location or any other
> > > interface for this and if so would this be advertised as
temporary
> > > functionality? I imagine once you switch to the backend such
> > > functionality would not be needed any longer so I am wondering
how do you
> > see exactly the transition path...
> >
> > We believe it might be enough to locate the tool (llvm-spirv) in PATH
for now: A
> > CLI option would be a nice addition, but perhaps not worth it for a
temporary
> > solution - we expect the SPIR-V backend (and thus
> > llc) become usable sooner than later.
> >
> > >
> > > And another aspect to clarify is the migration path i.e. what do
you
> > > see as criteria for changing to the backend and how soon could
this happen?
> >
> > Calling the llvm-spirv in the PATH is not ideal: The tool’s version
might be too old
> > or new with respect to the calling LLVM’s version which might cause
issues if
> > there are incompatibilities with the LLVM IR. So, we don’t want to
keep using
> > the tool longer than necessary, and prefer to switch to the backend
shipped with
> > the LLVM installation as soon as it lands upstream. I think the
criterion for
> > switching to the BE instead of the llvm-spriv tool is when our
internal HIP test
> > suite (which will be published soon) passes on the upstream SPIR-V BE.
> >
> > >
> > > Thanks,
> > > Anastasia
> > >
> > > ________________________________
> > > From: Henry Linjamäki <henry.linjamaki at parmance.com>
> > > Sent: 23 August 2021 07:32
> > > To: Anastasia Stulova <Anastasia.Stulova at arm.com>
> > > Cc: cfe-dev at lists.llvm.org <cfe-dev at lists.llvm.org>;
> > > llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>;
yaxun.liu at amd.com
> > > <yaxun.liu at amd.com>; Trifunovic, Konrad
<konrad.trifunovic at intel.com>
> > > Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as
> > > SPIR-V
> > >
> > > Hi Anastasia,
> > >
> > > On Fri, 20 Aug 2021 at 12:44, Anastasia Stulova
> > > <Anastasia.Stulova at arm.com> wrote:
> > > >
> > > > Hi Henry,
> > > >
> > > > Just to provide some background - we had a discussion about
the
> > > > integration of SPIRV-LLVM Translator some years back  [1]
and here
> > > > is the design of our user interface that has been proposed
at that
> > > > time [2]. I appreciate we might not be able to unify the
interfaces
> > > > completely but it would make sense to provide common
mechanisms for
> > > > different languages to use SPIR-V even if it might not be
achievable straight
> > away we should aim for it as an end goal.
> > > >
> > > > Considering that there seems to be a lot of interest in this
from
> > > > different language communities, can we clarify the full
plan? My
> > > > understanding is that you are proposing to add the
translator
> > > > temporarily and it will be replaced by the backend in the
future.
> > > > How do you envision this transition? Do you plan to provide
> > > > command-line options for the translator to be used that
would be
> > > > deprecated at some point later or would they be added as
temporary from
> > the start?
> > >
> > > Our planned HIP-SPIR-V tool chain calls the command-line tool
> > > ‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is then
embedded
> > > in the HIP binary as a byte array. The call to the llvm-spirv
tool is
> > > used as a temporary solution until the SPIR-V backend lands in
the
> > > LLVM code base in the future. When the LLVM SPIR-V backend is
usable
> > > in the upstream repo, we plan to simply switch to calling the
LLVM’s
> > > internal ‘llc’ code generator tool for generating the SPIR-V
instead
> > > of llvm-spirv. Can you immediately spot problems with that
approach?
> > >
> > > We believe this is the best solution to integrate with the
toolchain
> > > infrastructure. Another would be to call the code generator at
LLVM
> > > API level, but it seems out of place for the toolchain framework:
The
> > > SPIR-V code generation path is not exposed so that clang
frontends
> > > could use it to emit SPIR-V for themselves.
> > >
> > > >
> > > > Another consideration is that we have invested quite a lot
of effort
> > > > in the alternative approach i.e using the SPIR-V backend
because
> > > > this was highlighted as the best viable approach for SPIR-V
support
> > > > in Clang and LLVM when we had our discussion some years
back. The
> > > > situation is likely different now and your proposal isnï¿½t
> > > > identical, also we havenï¿½t made a lot of progress with the
backend
> > > > yet. However, the integration of alternative SPIR-V
translation
> > > > might negatively impact the adoption of the backend. It
might also
> > > > result in either suboptimal design flow or code duplication
in
> > > > Clang. For example, we might need to redesign the OpenCL
builtins
> > representation and mapping to SPIR-V instructions.
> > > >
> > > > Considering that we might not be too far from integrating
the
> > > > backend into the LLVM, would it be reasonable to synchronize
with
> > > > the backend developers and see if the backend could be used
straight
> > > > away? I am looping in Konrad here who has been discussing
the
> > > > backend integration earlier this year [3]. Perhaps he can
provide some
> > insights about the backend work and the timeline for it.
> > > > Maybe you could start working on some parts that are not
related to
> > > > IR consumption first and then add the SPIR-V emission later
on and
> > > > hopefully, the timing can align with the backend work too.
> > > >
> > > > [1]
> > > >
https://lists.llvm.org/pipermail/llvm-dev/2018-September/125948.html
> > > > [2]
> > > >
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/wiki/SPIRV-Too
> > > > lchain-for-Clang [3]
> > > >
https://lists.llvm.org/pipermail/llvm-dev/2021-March/148905.html
> > > >
> > > > Cheers,
> > > > Anastasia
> > > > ________________________________
> > > > From: Henry Linjamï¿½ki <henry.linjamaki at
parmance.com>
> > > > Sent: 17 August 2021 17:16
> > > > To: Anastasia Stulova <Anastasia.Stulova at arm.com>
> > > > Cc: cfe-dev at lists.llvm.org <cfe-dev at
lists.llvm.org>;
> > > > llvm-dev at lists.llvm.org <llvm-dev at
lists.llvm.org>; yaxun.liu at amd.com
> > > > <yaxun.liu at amd.com>
> > > > Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device
code as
> > > > SPIR-V
> > > >
> > > >
> > > >
> > > > Hi Anastasia,
> > > >
> > > > On Tue, 17 Aug 2021 at 12:53, Anastasia Stulova
> > > > <Anastasia.Stulova at arm.com> wrote:
> > > > >
> > > > > Hi Henry,
> > > > >
> > > > > > Since the SPIR-V BE might not land in LLVM soon,
we will set up
> > > > > > the compilation flow  to produce the SPIR-V binary
by using the
> > > > > > LLVM-SPIR-V translator [7]  which is used in our
experimental branch.
> > > > >
> > > > > Can you provide more details regarding this? Do you
plan to
> > > > > integrate the translator as an external tool?
> > > > >
> > > > The intention is to use the SPIRV-LLVM translator as a tool
outside
> > > > LLVM: either the tool is found in PATH or an error is
emitted instead.
> > > > Since Iï¿½m assuming that the new SPIR-V BE will eventually
land on
> > > > LLVM and supersede the translator, I donï¿½t see much
motivation for
> > > > cleaner integration of it to the LLVM project.
> > > >
> > > > > Overall, there seem to be a huge overlap with what we
need for
> > > > > OpenCL so it would be good to make sure we are aligned
and the new
> > > > > functionality is reusable for OpenCL too.
> > > > >
> > > > Sure. Iï¿½m not so familiar with the OpenCL infra in the
LLVM
> > > > currently, so can you elaborate on any major overlap that
OpenCL and
> > HIPSPV have?
> > > > We are planning to start submitting patches for review,
perhaps
> > > > itï¿½s easier to point the overlaps on per-patch basis in
the review
> > > > system then?
> > > >
> > > > > Cheers,
> > > > > Anastasia
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: llvm-dev <llvm-dev-bounces at
lists.llvm.org> on behalf of
> > > > > Henry Linjamï¿½ki via llvm-dev <llvm-dev at
lists.llvm.org>
> > > > > Sent: 09 August 2021 07:57
> > > > > To: cfe-dev at lists.llvm.org <cfe-dev at
lists.llvm.org>
> > > > > Cc: llvm-dev at lists.llvm.org <llvm-dev at
lists.llvm.org>;
> > > > > yaxun.liu at amd.com <yaxun.liu at amd.com>
> > > > > Subject: [llvm-dev] [RFC][HIPSPV] Emitting HIP device
code as
> > > > > SPIR-V
> > > > >
> > > > > Hi all,
> > > > >
> > > > > HIP is a C++ Runtime API and kernel language that
allows
> > > > > developers to create portable applications for AMD and
NVIDIA GPUs
> > > > > from a single source code [0]. There are also projects
for running
> > > > > HIP code on Intel GPU platforms via the Intel Level
Zero API [1]
> > > > > called HIPLZ [3] and HIPCL [2], which runs HIP programs
in OpenCL
> > > > > devices with certain advanced features supported. Both
of these
> > > > > backends consume SPIR-V binaries.
> > > > >
> > > > > We are proposing a patch set to be upstreamed that
enables SPIR-V
> > > > > emission through the HIP code path. The end goal of the
patches to
> > > > > be submitted is to emit SPIR-V binaries from HIP device
code so it
> > > > > can be embedded into executables for OpenCL-like
environments (at
> > > > > least for starters). Our current focus is on the two
> > > > > above-mentioned projects, HIPCL and HIPLZ which are
both
> > > > > work-in-progress HIP implementations. They itself do
not consume
> > > > > SPIR-V, but the device binaries are handed over to the
OpenCL and
> > > > > Intel Level Zero APIs, respectively.
> > > > >
> > > > > Coarsely, the current process of translating the HIP
code to
> > > > > SPIR-V in LLVM/Clang involves:
> > > > >
> > > > > * Retargeting HIP device code generation to the SPIR-V
target.
> > > > > * Mapping address spaces in HIP to corresponding ones
in SPIR-V.
> > > > > * Expanding HIP features, which can not be directly
modeled in SPIR-V
> > > > >   (e.g. dynamic shared memory).
> > > > >
> > > > > The HIPSPIRV experimental branch is available at [4].
Note that it
> > > > > is not yet in a state we intend to propose for
upstreaming, but
> > > > > shaping up the patches is a work in progress. Before
proceeding to
> > > > > shape up and submit the patches, we would like to get
feedback for
> > > > > the plans we have for upstreaming. In the following
sections, we
> > > > > open up the above points further and sketch our plans
for changes
> > > > > to LLVM (mostly to the Clang tool) to achieve the goal.
> > > > >
> > > > > Retargeting device codegen
> > > > > =========================> > > > >
> > > > > For making the HIP toolchain to emit and embed SPIR-V
we are
> > > > > tentatively planning the following changes to the
LLVM/Clang:
> > > > >
> > > > > * Introduce, at minimum, a 'spirv64'
architecture type in Triple. This
> > > > >   is what the SPIR-V backend [5] (SPIR-V BE) effort is
planning to
> > > > >   upstream. We would like to upstream this change in
advance to
> > > > >   specify the HIP SPIR-V device code target,
potentially before the
> > > > >   SPIR-V BE work lands.
> > > > >
> > > > > * Implement a new SPIRVTargetInfo and fill it with
necessary
> > > > >   information. For HIPCL/-LZ we are planning to adjust
the address
> > > > >   space mapping in a way which is discussed later in
the ï¿½address
> > > > >   space mappingï¿½ section.
> > > > >
> > > > > * Introduce a clang option to override the HIP device
code target. We
> > > > >   are interested in the option
ï¿½--offload=<target>ï¿½ discussed in the
> > > > >   'Unified offload option for
CUDA/HIP/OpenMP'-thread [6]. This option
> > > > >   would suit this use case well. As far as we know, the
subject has
> > > > >   not advanced further from the discussion - is anyone
working on it?
> > > > >
> > > > > * Compilation driver:
> > > > >
> > > > >   HIP offload builder is changed to retrieve the
offload device target
> > > > >   from the --offload option. If it is not present, it
can fall back to
> > > > >   AMD's default target for avoiding changing the
current default HIP
> > > > >   compilation behavior.
> > > > >
> > > > >   Temporarily change Driver to force clang to emit LLVM
bitcode for
> > > > >   SPIR-V targets in the backend compilation phase.
Otherwise, the
> > > > >   compilation will fail due to the lack of the real
SPIR-V BE in many
> > > > >   parts of the code. Reworked HIPToolChain takes care
of translating
> > > > >   the bitcode to SPIR-V during the linking phase. When
the SPIR-V BE
> > > > >   lands in LLVM, we can revert this change.
> > > > >
> > > > > * Introduce ï¿½hipspvï¿½ as an OS or environment type
in Triple. The
> > > > >   primary and the current use of the type is to select
device offload
> > > > >   toolchain for HIPCL/-LZ.
> > > > >
> > > > > * Implement a new toolchain class
'HIPSPVToolChain' in clang which is
> > > > >   selected when the HIP device target is specified to
be
> > > > >   ï¿½spirv64-unknown-hipspvï¿½ with the --offload
option. Since the SPIR-V
> > > > >   BE might not land in LLVM soon, we will set up the
compilation flow
> > > > >   to produce the SPIR-V binary by using the LLVM-SPIR-V
translator [7]
> > > > >   which is used in our experimental branch.
> > > > >
> > > > >   One important thing the toolchain does is to run one
or several LLVM
> > > > >   IR passes, which are needed by the HIPCL/LZ runtime,
on the final
> > > > >   fully linked device bitcode. The passes are required
to be run
> > > > >   during link time - all user specified device code and
HIPCL/LZ
> > > > >   device library routines have to be visible when the
passes are
> > > > >   run. The reason for the requirement is explained in
the 'HIP code
> > > > >   expansion' section. HIPSPVToolChain will use the
opt tool for
> > > > >   running the passes at link time.
> > > > >
> > > > > * Currently, HIPToolChain is derived from ROCmToolchain
and its long
> > > > >   chain of super classes (AMDGPUToolChain, Generic_ELF
and
> > > > >   Generic_GCC). The new upstreamed target would not
logically belong
> > > > >   under the AMDGPU/ROCm family so it does not make
sense to derive the
> > > > >   HIPCL toolchain from the HIP toolchain. Therefore, we
propose to:
> > > > >
> > > > >   - Create a new base HIP tool chain,
'BaseHIPToolChain' or just
> > > > >     'HIPToolChain', derived directly from
ToolChain and put any
> > > > >     HIP-related code that is common or that can be
reused in the
> > > > >     derived toolchains there.
> > > > >
> > > > >   - Derive a new HIPSPVToolChain from HIPToolChain.
> > > > >
> > > > >   - Rebase the HIPToolChain under the HIPToolChain and
rename it to
> > > > >     HIPAMDToolChain. Since the current HIPToolChain
depends on methods
> > > > >     in the super classes (e.g. AMDGPUToolChainï¿½s
getParsedTargetID)
> > > > >     the rebased class is planned to be a proxy class to
avoid code
> > > > >     duplication and to reduce the amount of changes.
Another option to
> > > > >     refactor the current HIPToolChain would be to use
multiple
> > > > >     heritance but that leads to dreaded diamond class
structure which
> > > > >     probably is not a great choice.
> > > > >
> > > > >   With the current plan, HIPToolChain is not going to
have much code
> > > > >   to be shared with the derived classes - so far only a
bit of the
> > > > >   ï¿½fat binaryï¿½ construction code is in sight for
sharing, so the
> > > > >   immediate gains for the effort seems small. However,
The TCï¿½s layout
> > > > >   is more logical and it may spark more HIP
implementations, as well
> > > > >   as help refactoring when going forward.
> > > > >
> > > > >
> > > > > Address space mapping
> > > > > ====================> > > > >
> > > > > Translating HIP device code to valid SPIR-V binary
requires tweaks
> > > > > on
> > > > > pointers:
> > > > >
> > > > > Pointers without address space (AS) qualification in
HIP programs
> > > > > are considered ï¿½flatï¿½ pointers - they can point to
function
> > > > > local, __device__, __shared__ and __constant__ memory
space
> > > > > dynamically, which matches the idea of ï¿½genericï¿½
pointers
> > > > > introduced in OpenCL 2.0. Therefore, the logical choice
for the
> > > > > flat pointers is to map them to generic pointers of
SPIR-Vï¿½s
> > > > > OpenCL environment. HIPCLï¿½s and HIPLZï¿½s SPIR-V
environment
> > > > > mandates that the kernel pointer parameters must point
to
> > > > > __global, __local or __constant memory (these are named
> > > > > differently in SPIR-V; using OpenCL names as they are
more
> > > > > familiar). So HIP pointer parameters in the HIP kernel
> > > > > (__global__) functions would be mapped to global
pointers. Otherwise, HIP
> > pointers with AS qualifiers are mapped to SPIR-V equivalent, if
suitable.
> > > > >
> > > > > Now, there are significant differences between HIPï¿½s
> > > > > __constant__ and SPIR-V/OpenCLï¿½s constant address
space:
> > > > >
> > > > > * In HIP, __constant__ globals can be altered on the
host side with
> > > > >   the hipMemcpyToSymbol() API function. In the
OpenCLï¿½s host API you
> > > > >   cannot do this.
> > > > >
> > > > >   (Side-note: OpenCL host API does not have an
equivalent method for
> > > > >   hipMemcpyToSymbol but HIPCL currently supports
hipMemcpyToSymbol
> > for
> > > > >   the global __global variables via Intelï¿½s
> > > > >   clGetDeviceGlobalVariablePointerINTEL API extension,
but we are
> > > > >   planning to inject shadow kernel commands that access
the global
> > > > >   variables instead for portability.)
> > > > >
> > > > > * In HIP flat pointers can point to __constant__
memory. In OpenCL
> > > > >   this is not the case with __generic pointers, which
means __constant
> > > > >   pointers cannot be casted to __generic pointers and
vice versa.
> > > > >
> > > > > There are a couple ways to deal with constants:
> > > > >
> > > > > * Map __constant__ to __global space in SPIR-V. That
way we can
> > > > >   generate code that works and is simple to implement.
Of course, we
> > > > >   lose the optimization/placing benefits of constant
memory.
> > > > >
> > > > > * Transform the code after clang codegen (by an LLVM
pass) by
> > > > >   converting the __constant objects to kernel
arguments. This covers
> > > > >   the hipMemcpyToSymbol() case. There is still the
constant-to-generic
> > > > >   cast issue, so we would have to use the previous
point as the
> > > > >   fallback.
> > > > >
> > > > > We plan to start by upstreaming the first option, and
time
> > > > > permitting, improve by implementing the second option.
> > > > >
> > > > > The planned changes to Clang to achieve the
aforementioned AS
> > > > > mapping are as follows:
> > > > >
> > > > > * Define address space mapping in the new,
aforementioned
> > > > >   SPIRVTargetInfo to map CUDA address spaces (which the
HIP reuses) to
> > > > >   do the mapping mentioned earlier. Default AS (0) used
for the flat
> > > > >   pointers are mapped to the SPIR-Vï¿½s ï¿½genericï¿½.
We intend this
> > > > >   mapping being enabled when the language mode is HIP.
> > > > >
> > > > > * Change SPIRABIInfo to coerce kernel AS-unqualified
pointer arguments
> > > > >   to __global ones. Pointer arguments in regular device
functions
> > > > >   receive the __generic AS qualifier via the address
space mapping
> > > > >   defined in SPIRVTargetInfo in the above point.
> > > > >
> > > > >
> > > > > HIP code expansion
> > > > > =================> > > > >
> > > > > There are features in HIP language which do not have
direct
> > > > > counterparts in SPIR-Vï¿½s OpenCL environment and those
features
> > > > > need to be rewritten before translation to SPIR-V (in
the future,
> > > > > lowering to SPIR-V machine code through the new BE).
The
> > > > > non-exhaustive list of features that need to be
expanded includes:
> > > > >
> > > > > * Dynamic shared memory allocation (DSM): It is an
array which is
> > > > >   declared globally in LLVM IR and its actual size
determined at
> > > > >   kernel launch. OpTypeRuntimeArray in SPIR-V is the
closest thing to
> > > > >   model this object, alas, it requires shader
capability.
> > > > >
> > > > > * abort() builtin: No counterpart in SPIR-V/OpenCL.
> > > > >   (Note: the behavior is not well specified in the HIP
spec
> > > > >   either. Assuming it terminates the whole grid if any
work item
> > > > >   reaches it. AMDï¿½s abort definition calls
__builtin_trap).
> > > > >
> > > > > * printf(): OpenCLï¿½s printf takes the format string
as ï¿½__constant__
> > > > >   char*ï¿½ while in HIP the format string does not have
to reside in
> > > > >   constant memory.
> > > > >
> > > > > * Texture objects. These roughly correspond to image
and sampler
> > > > >   objects of OpenCL combined. Also, texture objects
carry more
> > > > >   information for the texture functions than
image+sampler objects do.
> > > > >
> > > > > * Texture references. Same as above but these are
program global
> > > > >   objects. In OpenCL, image objects cannot reside in
the program
> > > > >   global space.
> > > > >
> > > > > HIPCL/-LZï¿½s solution to the DSM allocation case is
that the
> > > > > runtime allocates a shared buffer and passes it to the
kernel as
> > > > > an additional argument (which is hidden from the user).
The device
> > > > > code is modified so that the DSM object is replaced
with the new
> > > > > kernel argument. Various other cases listed will be
handled similarly:
> > > > >
> > > > > * For the printf case we tentatively replace the printf
calls with a
> > > > >   function that packs their arguments to an additional
buffer passed
> > > > >   as additional kernel argument and do the printing on
the host side.
> > > > >
> > > > > * Texture objects will be tentatively split to image
and sampler
> > > > >   objects and possibly auxiliary struct to carry
texture
> > > > >   settings. This means at least that the kernel
parameter listing
> > > > >   needs to be rewritten for the Texture objects.
> > > > >
> > > > > * For the texture reference we tentatively planned
replacing the
> > > > >   global texture objects also with a number of
additional kernel
> > > > >   arguments.
> > > > >
> > > > > For this and other HIP features we need to apply LLVM
IR passes to
> > > > > perform modifications on the device code. In many cases
the passes
> > > > > should be run when the device code (as LLVM bitcode) is
fully
> > > > > linked. This is simply achieved as the HIP offload
mechanism
> > > > > already emits device code as LLVM bitcode in RDC mode
(-fgpu-rdc),
> > > > > so during linking we do receive the device code as LLVM
bitcode
> > > > > where to apply these expansions with full view of the
device code.
> > > > >
> > > > > The current plan for implementing this is to make the
> > > > > HIPSPVToolChain to build a linker that uses llvm-link
for linking
> > > > > device code, opt for running the IR passes needed and
the external
> > > > > llvm-spirv tool (llc in the future when the SPIR-V BE
lands) for
> > > > > emitting the SPIR-V binary. We load the passes from a
path the
> > > > > user provides via --hip-link-pass-path (name pending)
or
> > > > > automatically from HIP runtimeï¿½s installation
location by using
> > > > > the search logic provided by ROCmInstallationDetector.
> > > > >
> > > > > There is interest in upstreaming the HIPCL/-LZ passes
from the
> > > > > HIPCL/-LZ repositories in the future for reduced
maintenance
> > > > > burden. However, we are not attempting to upstream them
initially,
> > > > > as they are not yet completed and are subject to rapid
changes.
> > > > > Question
> > > > > is: Where should the passes eventually be put in within
the LLVM
> > > > > project tree? Could it be OK to add a new directory
under Clang
> > > > > for tool chain passes?
> > > > >
> > > > >
> > > > > Testing
> > > > > ======> > > > >
> > > > > We will provide llvm-lit tests for our toolchain in the
upstream.
> > > > > We also want to add tests to make sure clang who will
run the
> > > > > HIPCL/-LZ runtime passes get run at device code link
time. For
> > > > > this we need a dummy pass plugin that the clang loads
during the test.
> > > > >
> > > > > When the new LLVM SPIR-V BE work lands on LLVM, we will
add SPIR-V
> > > > > assembly checks that are relevant for HIPSPV.
> > > > >
> > > > >
> > > > > References
> > > > > =========> > > > >
> > > > > [0]:
> > > > >
> > https://rocmdocs.amd.com/en/latest/Programming_Guides/Programming-
> > > > > Guides.html
> > > > > [1]:
https://spec.oneapi.com/level-zero/latest/index.html
> > > > > [2]: https://github.com/cpc/hipcl
> > > > > [3]: https://github.com/jz10/anl-gt-gpu
> > > > > [4]:
https://github.com/parmance/llvm-project/tree/hip2spirv-v5
> > > > > [5]: https://github.com/KhronosGroup/LLVM-SPIRV-Backend
> > > > > [6]:
> > > > >
https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html
> > > > > [7]:
https://github.com/KhronosGroup/SPIRV-LLVM-Translator
> > > > > _______________________________________________
> > > > > LLVM Developers mailing list
> > > > > llvm-dev at lists.llvm.org
> > > > >
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Henry Linjamï¿½ki
> > > >
> > >
> > > BR,
> > > Henry and Pekka
> >
> > BR,
> > Henry and Pekka
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Anastasia Stulova via llvm-dev

2021-Aug-25 10:13 UTC

head link

[llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V

Hi Pekka,

Can you also clarify how do you plan to test the SPIR-V generation?

Will you be adding IR only tests or also SPIR-V assembly/binary tests?
If the latter one, does it mean that some LLVM test bots will need an
installation and invocation of the translator?

In general, it feels like at least we will likely be ending up with duplicate
testing up until the transition phase is finalized.

Cheers,
Anastasia

________________________________
From: Pekka Jääskeläinen <pekka at parmance.com>
Sent: 25 August 2021 07:56
To: Trifunovic, Konrad <konrad.trifunovic at intel.com>
Cc: Henry Linjamäki <henry.linjamaki at parmance.com>; Anastasia Stulova
<Anastasia.Stulova at arm.com>; llvm-dev at lists.llvm.org <llvm-dev at
lists.llvm.org>; yaxun.liu at amd.com <yaxun.liu at amd.com>; cfe-dev
at lists.llvm.org <cfe-dev at lists.llvm.org>; Videau, Brice <bvideau
at anl.gov>; Harms, Kevin <harms at alcf.anl.gov>
Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V

Hi Konrad,

OK, thanks for the status update. We discussed this internally, and
our current understanding is that the change from using llvm-spriv for
the SPIR-V emission requires a) to change the target (in triple) from
spir64 to spirv64 b) calling the llc that invokes the LLVM backend
(BE) instead of the llvm-spriv translator tool.

Thus, to make the transition and testing while developing the BE
easier, we could contribute a patch to add a (hidden) command line
option that switches between these modes: 1) The initial default: emit
spir64 LLVM IR and call llvm-spirv. 2) Emit spirv64 LLVM IR instead
and call a user-specified llc binary.

When the LLVM BE lands upstream and is considered at least as
useful/stable as llvm-spriv, we can then switch the default of that
option to (2) instead of (1). This should allow easy testing of the
BE-based toolchain even before it lands to the upstream repo if we
allow redefining the llc binary location to the externally built LLVM
BE. How does that sound?

BR,
Pekka and Henry

On Tue, Aug 24, 2021 at 4:27 PM Trifunovic, Konrad via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>
> Hi,
>
> Regarding the timeline for SPIR-V backend upstream, there is a tentative
plan to land the base patches until end of this year (2021). Nevertheless, at
that moment, SPIR-V backend will be in experimental state. After that, we need
to keep on adding missing functionality, fix bugs etc. so that we can pass a
Khronos CTS test-suite with using a SPIR-V backend (instead of LLVM-SPIRV
translator). At that moment, we might claim it is a production quality and not
anymore experimental. That might take next several months.
>
> I'm wondering - with the approach You have proposed - how much code
that You plan to contribute as the intermediate step will become obsolete once
there is a full switch to SPIR-V backend target?
>
> konrad
>
> > -----Original Message-----
> > From: Henry Linjamäki <henry.linjamaki at parmance.com>
> > Sent: Monday, August 23, 2021 12:59 PM
> > To: Anastasia Stulova <Anastasia.Stulova at arm.com>
> > Cc: cfe-dev at lists.llvm.org; llvm-dev at lists.llvm.org; yaxun.liu
at amd.com;
> > Trifunovic, Konrad <konrad.trifunovic at intel.com>
> > Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as
SPIR-V
> >
> > On Mon, 23 Aug 2021 at 12:53, Anastasia Stulova
> > <Anastasia.Stulova at arm.com> wrote:
> > >
> > > > Our planned HIP-SPIR-V tool chain calls the command-line
tool
> > > > ‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is
then
> > > > embedded in the HIP binary as a byte array. The call to the
> > > > llvm-spirv tool is used as a temporary solution until the
SPIR-V backend lands
> > in the?
> > > > LLVM code base in the future. When the LLVM SPIR-V backend
is usable
> > > > in the upstream repo, we plan to simply switch to calling
the LLVM’s
> > > > internal ‘llc’ code generator tool for generating the SPIR-V
instead of llvm-
> > spirv.
> > > > Can you immediately spot problems with that approach?
> > >
> > > Since the translator is not part of the LLVM project do you plan
to
> > > add any command-line options to set its location or any other
> > > interface for this and if so would this be advertised as
temporary
> > > functionality? I imagine once you switch to the backend such
> > > functionality would not be needed any longer so I am wondering
how do you
> > see exactly the transition path...
> >
> > We believe it might be enough to locate the tool (llvm-spirv) in PATH
for now: A
> > CLI option would be a nice addition, but perhaps not worth it for a
temporary
> > solution - we expect the SPIR-V backend (and thus
> > llc) become usable sooner than later.
> >
> > >
> > > And another aspect to clarify is the migration path i.e. what do
you
> > > see as criteria for changing to the backend and how soon could
this happen?
> >
> > Calling the llvm-spirv in the PATH is not ideal: The tool’s version
might be too old
> > or new with respect to the calling LLVM’s version which might cause
issues if
> > there are incompatibilities with the LLVM IR. So, we don’t want to
keep using
> > the tool longer than necessary, and prefer to switch to the backend
shipped with
> > the LLVM installation as soon as it lands upstream. I think the
criterion for
> > switching to the BE instead of the llvm-spriv tool is when our
internal HIP test
> > suite (which will be published soon) passes on the upstream SPIR-V BE.
> >
> > >
> > > Thanks,
> > > Anastasia
> > >
> > > ________________________________
> > > From: Henry Linjamäki <henry.linjamaki at parmance.com>
> > > Sent: 23 August 2021 07:32
> > > To: Anastasia Stulova <Anastasia.Stulova at arm.com>
> > > Cc: cfe-dev at lists.llvm.org <cfe-dev at lists.llvm.org>;
> > > llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>;
yaxun.liu at amd.com
> > > <yaxun.liu at amd.com>; Trifunovic, Konrad
<konrad.trifunovic at intel.com>
> > > Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as
> > > SPIR-V
> > >
> > > Hi Anastasia,
> > >
> > > On Fri, 20 Aug 2021 at 12:44, Anastasia Stulova
> > > <Anastasia.Stulova at arm.com> wrote:
> > > >
> > > > Hi Henry,
> > > >
> > > > Just to provide some background - we had a discussion about
the
> > > > integration of SPIRV-LLVM Translator some years back  [1]
and here
> > > > is the design of our user interface that has been proposed
at that
> > > > time [2]. I appreciate we might not be able to unify the
interfaces
> > > > completely but it would make sense to provide common
mechanisms for
> > > > different languages to use SPIR-V even if it might not be
achievable straight
> > away we should aim for it as an end goal.
> > > >
> > > > Considering that there seems to be a lot of interest in this
from
> > > > different language communities, can we clarify the full
plan? My
> > > > understanding is that you are proposing to add the
translator
> > > > temporarily and it will be replaced by the backend in the
future.
> > > > How do you envision this transition? Do you plan to provide
> > > > command-line options for the translator to be used that
would be
> > > > deprecated at some point later or would they be added as
temporary from
> > the start?
> > >
> > > Our planned HIP-SPIR-V tool chain calls the command-line tool
> > > ‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is then
embedded
> > > in the HIP binary as a byte array. The call to the llvm-spirv
tool is
> > > used as a temporary solution until the SPIR-V backend lands in
the
> > > LLVM code base in the future. When the LLVM SPIR-V backend is
usable
> > > in the upstream repo, we plan to simply switch to calling the
LLVM’s
> > > internal ‘llc’ code generator tool for generating the SPIR-V
instead
> > > of llvm-spirv. Can you immediately spot problems with that
approach?
> > >
> > > We believe this is the best solution to integrate with the
toolchain
> > > infrastructure. Another would be to call the code generator at
LLVM
> > > API level, but it seems out of place for the toolchain framework:
The
> > > SPIR-V code generation path is not exposed so that clang
frontends
> > > could use it to emit SPIR-V for themselves.
> > >
> > > >
> > > > Another consideration is that we have invested quite a lot
of effort
> > > > in the alternative approach i.e using the SPIR-V backend
because
> > > > this was highlighted as the best viable approach for SPIR-V
support
> > > > in Clang and LLVM when we had our discussion some years
back. The
> > > > situation is likely different now and your proposal isnï¿½t
> > > > identical, also we havenï¿½t made a lot of progress with the
backend
> > > > yet. However, the integration of alternative SPIR-V
translation
> > > > might negatively impact the adoption of the backend. It
might also
> > > > result in either suboptimal design flow or code duplication
in
> > > > Clang. For example, we might need to redesign the OpenCL
builtins
> > representation and mapping to SPIR-V instructions.
> > > >
> > > > Considering that we might not be too far from integrating
the
> > > > backend into the LLVM, would it be reasonable to synchronize
with
> > > > the backend developers and see if the backend could be used
straight
> > > > away? I am looping in Konrad here who has been discussing
the
> > > > backend integration earlier this year [3]. Perhaps he can
provide some
> > insights about the backend work and the timeline for it.
> > > > Maybe you could start working on some parts that are not
related to
> > > > IR consumption first and then add the SPIR-V emission later
on and
> > > > hopefully, the timing can align with the backend work too.
> > > >
> > > > [1]
> > > >
https://lists.llvm.org/pipermail/llvm-dev/2018-September/125948.html
> > > > [2]
> > > >
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/wiki/SPIRV-Too
> > > > lchain-for-Clang [3]
> > > >
https://lists.llvm.org/pipermail/llvm-dev/2021-March/148905.html
> > > >
> > > > Cheers,
> > > > Anastasia
> > > > ________________________________
> > > > From: Henry Linjamï¿½ki <henry.linjamaki at
parmance.com>
> > > > Sent: 17 August 2021 17:16
> > > > To: Anastasia Stulova <Anastasia.Stulova at arm.com>
> > > > Cc: cfe-dev at lists.llvm.org <cfe-dev at
lists.llvm.org>;
> > > > llvm-dev at lists.llvm.org <llvm-dev at
lists.llvm.org>; yaxun.liu at amd.com
> > > > <yaxun.liu at amd.com>
> > > > Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device
code as
> > > > SPIR-V
> > > >
> > > >
> > > >
> > > > Hi Anastasia,
> > > >
> > > > On Tue, 17 Aug 2021 at 12:53, Anastasia Stulova
> > > > <Anastasia.Stulova at arm.com> wrote:
> > > > >
> > > > > Hi Henry,
> > > > >
> > > > > > Since the SPIR-V BE might not land in LLVM soon,
we will set up
> > > > > > the compilation flow  to produce the SPIR-V binary
by using the
> > > > > > LLVM-SPIR-V translator [7]  which is used in our
experimental branch.
> > > > >
> > > > > Can you provide more details regarding this? Do you
plan to
> > > > > integrate the translator as an external tool?
> > > > >
> > > > The intention is to use the SPIRV-LLVM translator as a tool
outside
> > > > LLVM: either the tool is found in PATH or an error is
emitted instead.
> > > > Since Iï¿½m assuming that the new SPIR-V BE will eventually
land on
> > > > LLVM and supersede the translator, I donï¿½t see much
motivation for
> > > > cleaner integration of it to the LLVM project.
> > > >
> > > > > Overall, there seem to be a huge overlap with what we
need for
> > > > > OpenCL so it would be good to make sure we are aligned
and the new
> > > > > functionality is reusable for OpenCL too.
> > > > >
> > > > Sure. Iï¿½m not so familiar with the OpenCL infra in the
LLVM
> > > > currently, so can you elaborate on any major overlap that
OpenCL and
> > HIPSPV have?
> > > > We are planning to start submitting patches for review,
perhaps
> > > > itï¿½s easier to point the overlaps on per-patch basis in
the review
> > > > system then?
> > > >
> > > > > Cheers,
> > > > > Anastasia
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: llvm-dev <llvm-dev-bounces at
lists.llvm.org> on behalf of
> > > > > Henry Linjamï¿½ki via llvm-dev <llvm-dev at
lists.llvm.org>
> > > > > Sent: 09 August 2021 07:57
> > > > > To: cfe-dev at lists.llvm.org <cfe-dev at
lists.llvm.org>
> > > > > Cc: llvm-dev at lists.llvm.org <llvm-dev at
lists.llvm.org>;
> > > > > yaxun.liu at amd.com <yaxun.liu at amd.com>
> > > > > Subject: [llvm-dev] [RFC][HIPSPV] Emitting HIP device
code as
> > > > > SPIR-V
> > > > >
> > > > > Hi all,
> > > > >
> > > > > HIP is a C++ Runtime API and kernel language that
allows
> > > > > developers to create portable applications for AMD and
NVIDIA GPUs
> > > > > from a single source code [0]. There are also projects
for running
> > > > > HIP code on Intel GPU platforms via the Intel Level
Zero API [1]
> > > > > called HIPLZ [3] and HIPCL [2], which runs HIP programs
in OpenCL
> > > > > devices with certain advanced features supported. Both
of these
> > > > > backends consume SPIR-V binaries.
> > > > >
> > > > > We are proposing a patch set to be upstreamed that
enables SPIR-V
> > > > > emission through the HIP code path. The end goal of the
patches to
> > > > > be submitted is to emit SPIR-V binaries from HIP device
code so it
> > > > > can be embedded into executables for OpenCL-like
environments (at
> > > > > least for starters). Our current focus is on the two
> > > > > above-mentioned projects, HIPCL and HIPLZ which are
both
> > > > > work-in-progress HIP implementations. They itself do
not consume
> > > > > SPIR-V, but the device binaries are handed over to the
OpenCL and
> > > > > Intel Level Zero APIs, respectively.
> > > > >
> > > > > Coarsely, the current process of translating the HIP
code to
> > > > > SPIR-V in LLVM/Clang involves:
> > > > >
> > > > > * Retargeting HIP device code generation to the SPIR-V
target.
> > > > > * Mapping address spaces in HIP to corresponding ones
in SPIR-V.
> > > > > * Expanding HIP features, which can not be directly
modeled in SPIR-V
> > > > >   (e.g. dynamic shared memory).
> > > > >
> > > > > The HIPSPIRV experimental branch is available at [4].
Note that it
> > > > > is not yet in a state we intend to propose for
upstreaming, but
> > > > > shaping up the patches is a work in progress. Before
proceeding to
> > > > > shape up and submit the patches, we would like to get
feedback for
> > > > > the plans we have for upstreaming. In the following
sections, we
> > > > > open up the above points further and sketch our plans
for changes
> > > > > to LLVM (mostly to the Clang tool) to achieve the goal.
> > > > >
> > > > > Retargeting device codegen
> > > > > =========================> > > > >
> > > > > For making the HIP toolchain to emit and embed SPIR-V
we are
> > > > > tentatively planning the following changes to the
LLVM/Clang:
> > > > >
> > > > > * Introduce, at minimum, a 'spirv64'
architecture type in Triple. This
> > > > >   is what the SPIR-V backend [5] (SPIR-V BE) effort is
planning to
> > > > >   upstream. We would like to upstream this change in
advance to
> > > > >   specify the HIP SPIR-V device code target,
potentially before the
> > > > >   SPIR-V BE work lands.
> > > > >
> > > > > * Implement a new SPIRVTargetInfo and fill it with
necessary
> > > > >   information. For HIPCL/-LZ we are planning to adjust
the address
> > > > >   space mapping in a way which is discussed later in
the ï¿½address
> > > > >   space mappingï¿½ section.
> > > > >
> > > > > * Introduce a clang option to override the HIP device
code target. We
> > > > >   are interested in the option
ï¿½--offload=<target>ï¿½ discussed in the
> > > > >   'Unified offload option for
CUDA/HIP/OpenMP'-thread [6]. This option
> > > > >   would suit this use case well. As far as we know, the
subject has
> > > > >   not advanced further from the discussion - is anyone
working on it?
> > > > >
> > > > > * Compilation driver:
> > > > >
> > > > >   HIP offload builder is changed to retrieve the
offload device target
> > > > >   from the --offload option. If it is not present, it
can fall back to
> > > > >   AMD's default target for avoiding changing the
current default HIP
> > > > >   compilation behavior.
> > > > >
> > > > >   Temporarily change Driver to force clang to emit LLVM
bitcode for
> > > > >   SPIR-V targets in the backend compilation phase.
Otherwise, the
> > > > >   compilation will fail due to the lack of the real
SPIR-V BE in many
> > > > >   parts of the code. Reworked HIPToolChain takes care
of translating
> > > > >   the bitcode to SPIR-V during the linking phase. When
the SPIR-V BE
> > > > >   lands in LLVM, we can revert this change.
> > > > >
> > > > > * Introduce ï¿½hipspvï¿½ as an OS or environment type
in Triple. The
> > > > >   primary and the current use of the type is to select
device offload
> > > > >   toolchain for HIPCL/-LZ.
> > > > >
> > > > > * Implement a new toolchain class
'HIPSPVToolChain' in clang which is
> > > > >   selected when the HIP device target is specified to
be
> > > > >   ï¿½spirv64-unknown-hipspvï¿½ with the --offload
option. Since the SPIR-V
> > > > >   BE might not land in LLVM soon, we will set up the
compilation flow
> > > > >   to produce the SPIR-V binary by using the LLVM-SPIR-V
translator [7]
> > > > >   which is used in our experimental branch.
> > > > >
> > > > >   One important thing the toolchain does is to run one
or several LLVM
> > > > >   IR passes, which are needed by the HIPCL/LZ runtime,
on the final
> > > > >   fully linked device bitcode. The passes are required
to be run
> > > > >   during link time - all user specified device code and
HIPCL/LZ
> > > > >   device library routines have to be visible when the
passes are
> > > > >   run. The reason for the requirement is explained in
the 'HIP code
> > > > >   expansion' section. HIPSPVToolChain will use the
opt tool for
> > > > >   running the passes at link time.
> > > > >
> > > > > * Currently, HIPToolChain is derived from ROCmToolchain
and its long
> > > > >   chain of super classes (AMDGPUToolChain, Generic_ELF
and
> > > > >   Generic_GCC). The new upstreamed target would not
logically belong
> > > > >   under the AMDGPU/ROCm family so it does not make
sense to derive the
> > > > >   HIPCL toolchain from the HIP toolchain. Therefore, we
propose to:
> > > > >
> > > > >   - Create a new base HIP tool chain,
'BaseHIPToolChain' or just
> > > > >     'HIPToolChain', derived directly from
ToolChain and put any
> > > > >     HIP-related code that is common or that can be
reused in the
> > > > >     derived toolchains there.
> > > > >
> > > > >   - Derive a new HIPSPVToolChain from HIPToolChain.
> > > > >
> > > > >   - Rebase the HIPToolChain under the HIPToolChain and
rename it to
> > > > >     HIPAMDToolChain. Since the current HIPToolChain
depends on methods
> > > > >     in the super classes (e.g. AMDGPUToolChainï¿½s
getParsedTargetID)
> > > > >     the rebased class is planned to be a proxy class to
avoid code
> > > > >     duplication and to reduce the amount of changes.
Another option to
> > > > >     refactor the current HIPToolChain would be to use
multiple
> > > > >     heritance but that leads to dreaded diamond class
structure which
> > > > >     probably is not a great choice.
> > > > >
> > > > >   With the current plan, HIPToolChain is not going to
have much code
> > > > >   to be shared with the derived classes - so far only a
bit of the
> > > > >   ï¿½fat binaryï¿½ construction code is in sight for
sharing, so the
> > > > >   immediate gains for the effort seems small. However,
The TCï¿½s layout
> > > > >   is more logical and it may spark more HIP
implementations, as well
> > > > >   as help refactoring when going forward.
> > > > >
> > > > >
> > > > > Address space mapping
> > > > > ====================> > > > >
> > > > > Translating HIP device code to valid SPIR-V binary
requires tweaks
> > > > > on
> > > > > pointers:
> > > > >
> > > > > Pointers without address space (AS) qualification in
HIP programs
> > > > > are considered ï¿½flatï¿½ pointers - they can point to
function
> > > > > local, __device__, __shared__ and __constant__ memory
space
> > > > > dynamically, which matches the idea of ï¿½genericï¿½
pointers
> > > > > introduced in OpenCL 2.0. Therefore, the logical choice
for the
> > > > > flat pointers is to map them to generic pointers of
SPIR-Vï¿½s
> > > > > OpenCL environment. HIPCLï¿½s and HIPLZï¿½s SPIR-V
environment
> > > > > mandates that the kernel pointer parameters must point
to
> > > > > __global, __local or __constant memory (these are named
> > > > > differently in SPIR-V; using OpenCL names as they are
more
> > > > > familiar). So HIP pointer parameters in the HIP kernel
> > > > > (__global__) functions would be mapped to global
pointers. Otherwise, HIP
> > pointers with AS qualifiers are mapped to SPIR-V equivalent, if
suitable.
> > > > >
> > > > > Now, there are significant differences between HIPï¿½s
> > > > > __constant__ and SPIR-V/OpenCLï¿½s constant address
space:
> > > > >
> > > > > * In HIP, __constant__ globals can be altered on the
host side with
> > > > >   the hipMemcpyToSymbol() API function. In the
OpenCLï¿½s host API you
> > > > >   cannot do this.
> > > > >
> > > > >   (Side-note: OpenCL host API does not have an
equivalent method for
> > > > >   hipMemcpyToSymbol but HIPCL currently supports
hipMemcpyToSymbol
> > for
> > > > >   the global __global variables via Intelï¿½s
> > > > >   clGetDeviceGlobalVariablePointerINTEL API extension,
but we are
> > > > >   planning to inject shadow kernel commands that access
the global
> > > > >   variables instead for portability.)
> > > > >
> > > > > * In HIP flat pointers can point to __constant__
memory. In OpenCL
> > > > >   this is not the case with __generic pointers, which
means __constant
> > > > >   pointers cannot be casted to __generic pointers and
vice versa.
> > > > >
> > > > > There are a couple ways to deal with constants:
> > > > >
> > > > > * Map __constant__ to __global space in SPIR-V. That
way we can
> > > > >   generate code that works and is simple to implement.
Of course, we
> > > > >   lose the optimization/placing benefits of constant
memory.
> > > > >
> > > > > * Transform the code after clang codegen (by an LLVM
pass) by
> > > > >   converting the __constant objects to kernel
arguments. This covers
> > > > >   the hipMemcpyToSymbol() case. There is still the
constant-to-generic
> > > > >   cast issue, so we would have to use the previous
point as the
> > > > >   fallback.
> > > > >
> > > > > We plan to start by upstreaming the first option, and
time
> > > > > permitting, improve by implementing the second option.
> > > > >
> > > > > The planned changes to Clang to achieve the
aforementioned AS
> > > > > mapping are as follows:
> > > > >
> > > > > * Define address space mapping in the new,
aforementioned
> > > > >   SPIRVTargetInfo to map CUDA address spaces (which the
HIP reuses) to
> > > > >   do the mapping mentioned earlier. Default AS (0) used
for the flat
> > > > >   pointers are mapped to the SPIR-Vï¿½s ï¿½genericï¿½.
We intend this
> > > > >   mapping being enabled when the language mode is HIP.
> > > > >
> > > > > * Change SPIRABIInfo to coerce kernel AS-unqualified
pointer arguments
> > > > >   to __global ones. Pointer arguments in regular device
functions
> > > > >   receive the __generic AS qualifier via the address
space mapping
> > > > >   defined in SPIRVTargetInfo in the above point.
> > > > >
> > > > >
> > > > > HIP code expansion
> > > > > =================> > > > >
> > > > > There are features in HIP language which do not have
direct
> > > > > counterparts in SPIR-Vï¿½s OpenCL environment and those
features
> > > > > need to be rewritten before translation to SPIR-V (in
the future,
> > > > > lowering to SPIR-V machine code through the new BE).
The
> > > > > non-exhaustive list of features that need to be
expanded includes:
> > > > >
> > > > > * Dynamic shared memory allocation (DSM): It is an
array which is
> > > > >   declared globally in LLVM IR and its actual size
determined at
> > > > >   kernel launch. OpTypeRuntimeArray in SPIR-V is the
closest thing to
> > > > >   model this object, alas, it requires shader
capability.
> > > > >
> > > > > * abort() builtin: No counterpart in SPIR-V/OpenCL.
> > > > >   (Note: the behavior is not well specified in the HIP
spec
> > > > >   either. Assuming it terminates the whole grid if any
work item
> > > > >   reaches it. AMDï¿½s abort definition calls
__builtin_trap).
> > > > >
> > > > > * printf(): OpenCLï¿½s printf takes the format string
as ï¿½__constant__
> > > > >   char*ï¿½ while in HIP the format string does not have
to reside in
> > > > >   constant memory.
> > > > >
> > > > > * Texture objects. These roughly correspond to image
and sampler
> > > > >   objects of OpenCL combined. Also, texture objects
carry more
> > > > >   information for the texture functions than
image+sampler objects do.
> > > > >
> > > > > * Texture references. Same as above but these are
program global
> > > > >   objects. In OpenCL, image objects cannot reside in
the program
> > > > >   global space.
> > > > >
> > > > > HIPCL/-LZï¿½s solution to the DSM allocation case is
that the
> > > > > runtime allocates a shared buffer and passes it to the
kernel as
> > > > > an additional argument (which is hidden from the user).
The device
> > > > > code is modified so that the DSM object is replaced
with the new
> > > > > kernel argument. Various other cases listed will be
handled similarly:
> > > > >
> > > > > * For the printf case we tentatively replace the printf
calls with a
> > > > >   function that packs their arguments to an additional
buffer passed
> > > > >   as additional kernel argument and do the printing on
the host side.
> > > > >
> > > > > * Texture objects will be tentatively split to image
and sampler
> > > > >   objects and possibly auxiliary struct to carry
texture
> > > > >   settings. This means at least that the kernel
parameter listing
> > > > >   needs to be rewritten for the Texture objects.
> > > > >
> > > > > * For the texture reference we tentatively planned
replacing the
> > > > >   global texture objects also with a number of
additional kernel
> > > > >   arguments.
> > > > >
> > > > > For this and other HIP features we need to apply LLVM
IR passes to
> > > > > perform modifications on the device code. In many cases
the passes
> > > > > should be run when the device code (as LLVM bitcode) is
fully
> > > > > linked. This is simply achieved as the HIP offload
mechanism
> > > > > already emits device code as LLVM bitcode in RDC mode
(-fgpu-rdc),
> > > > > so during linking we do receive the device code as LLVM
bitcode
> > > > > where to apply these expansions with full view of the
device code.
> > > > >
> > > > > The current plan for implementing this is to make the
> > > > > HIPSPVToolChain to build a linker that uses llvm-link
for linking
> > > > > device code, opt for running the IR passes needed and
the external
> > > > > llvm-spirv tool (llc in the future when the SPIR-V BE
lands) for
> > > > > emitting the SPIR-V binary. We load the passes from a
path the
> > > > > user provides via --hip-link-pass-path (name pending)
or
> > > > > automatically from HIP runtimeï¿½s installation
location by using
> > > > > the search logic provided by ROCmInstallationDetector.
> > > > >
> > > > > There is interest in upstreaming the HIPCL/-LZ passes
from the
> > > > > HIPCL/-LZ repositories in the future for reduced
maintenance
> > > > > burden. However, we are not attempting to upstream them
initially,
> > > > > as they are not yet completed and are subject to rapid
changes.
> > > > > Question
> > > > > is: Where should the passes eventually be put in within
the LLVM
> > > > > project tree? Could it be OK to add a new directory
under Clang
> > > > > for tool chain passes?
> > > > >
> > > > >
> > > > > Testing
> > > > > ======> > > > >
> > > > > We will provide llvm-lit tests for our toolchain in the
upstream.
> > > > > We also want to add tests to make sure clang who will
run the
> > > > > HIPCL/-LZ runtime passes get run at device code link
time. For
> > > > > this we need a dummy pass plugin that the clang loads
during the test.
> > > > >
> > > > > When the new LLVM SPIR-V BE work lands on LLVM, we will
add SPIR-V
> > > > > assembly checks that are relevant for HIPSPV.
> > > > >
> > > > >
> > > > > References
> > > > > =========> > > > >
> > > > > [0]:
> > > > >
> > https://rocmdocs.amd.com/en/latest/Programming_Guides/Programming-
> > > > > Guides.html
> > > > > [1]:
https://spec.oneapi.com/level-zero/latest/index.html
> > > > > [2]: https://github.com/cpc/hipcl
> > > > > [3]: https://github.com/jz10/anl-gt-gpu
> > > > > [4]:
https://github.com/parmance/llvm-project/tree/hip2spirv-v5
> > > > > [5]: https://github.com/KhronosGroup/LLVM-SPIRV-Backend
> > > > > [6]:
> > > > >
https://lists.llvm.org/pipermail/cfe-dev/2020-December/067362.html
> > > > > [7]:
https://github.com/KhronosGroup/SPIRV-LLVM-Translator
> > > > > _______________________________________________
> > > > > LLVM Developers mailing list
> > > > > llvm-dev at lists.llvm.org
> > > > >
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Henry Linjamï¿½ki
> > > >
> > >
> > > BR,
> > > Henry and Pekka
> >
> > BR,
> > Henry and Pekka
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210825/4d49ce38/attachment-0001.html>

llvm dev - Aug 2021 - [RFC][HIPSPV] Emitting HIP device code as SPIR-V

[llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V

[llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V