thr3ads.net - llvm dev - [llvm-dev] AMDGPU workgroup size as metadata [Apr 2020]

Frank Winter via llvm-dev

2020-Apr-30 19:09 UTC

[llvm-dev] AMDGPU workgroup size as metadata

From LLVM IR, how can you get the 'workgroup size' value?
It seems to be set by the AMDGPU backend as metadata since in 
AMDGPUMetadata.h there are things defined like

constexpr char ReqdWorkGroupSize[] = "ReqdWorkGroupSize";

and

struct Metadata final {
   /// 'reqd_work_group_size' attribute. Optional.
   std::vector<uint32_t> mReqdWorkGroupSize = std::vector<uint32_t>();
   ...
}

Is this metadata set to the kernel function or to the module?

What IR instructions would give access to the value of, say, the 
workgroup size in dimension x?


Frank

Matt Arsenault via llvm-dev

2020-May-01 00:32 UTC

[llvm-dev] AMDGPU workgroup size as metadata

> On Apr 30, 2020, at 15:09, Frank Winter via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> From LLVM IR, how can you get the 'workgroup size' value?
> It seems to be set by the AMDGPU backend as metadata since in
> AMDGPUMetadata.h there are things defined like
> 
> constexpr char ReqdWorkGroupSize[] = "ReqdWorkGroupSize";
> 
> and
> 
> struct Metadata final {
>   /// 'reqd_work_group_size' attribute. Optional.
>   std::vector<uint32_t> mReqdWorkGroupSize = std::vector<uint32_t>();
>   ...
> }
> 
> Is this metadata set to the kernel function or to the module?
> 
> What IR instructions would give access to the value of, say, the workgroup
> size in dimension x?
> 
> 
> Frank
> 

The code object metadata only covers statically known workgroup size
information. The metadata you found here corresponds to !reqd_work_group_size,
which in turn corresponds to the OpenCL attribute of the same name. We have a
variety of other static attributes related to workgroup sizes, documented here:
https://llvm.org/docs/AMDGPUUsage.html#llvm-ir-attributes
The "uniform-work-group-size" attribute (corresponding to the OpenCL flag
-cl-uniform-work-group-size) may also be of interest.

Dynamically, there isn't a single instruction to get the group size; how it is
implemented depends on the runtime/driver. You need to get a pointer to
somewhere and load from it. For HSA/ROCm, these values are loaded from an ABI
struct pointed to by a special kernel input SGPR. Recently the core
implementation was moved into a clang builtin so we can annotate the load with
!range metadata:
https://github.com/llvm/llvm-project/blob/a1bd5cd539f9e2fd34e522b848e751342985e882/clang/lib/CodeGen/CGBuiltin.cpp#L13985
You can see how these are used here:
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/amd-stg-open/ockl/src/workitem.cl

-Matt
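
To make the "get a pointer and load from it" step concrete, here is a minimal host-side sketch in C++. It is not the clang or device-library implementation. It assumes the ABI struct is the HSA kernel dispatch packet, whose leading fields are mirrored below from the HSA runtime's hsa_kernel_dispatch_packet_t, with the 16-bit workgroup sizes starting at byte offset 4. DispatchPacketPrefix and load_workgroup_size are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption: mirrors only the leading fields of hsa_kernel_dispatch_packet_t.
// The real packet continues with further fields that are omitted here.
struct DispatchPacketPrefix {
    std::uint16_t header;
    std::uint16_t setup;
    std::uint16_t workgroup_size_x;  // byte offset 4
    std::uint16_t workgroup_size_y;  // byte offset 6
    std::uint16_t workgroup_size_z;  // byte offset 8
    std::uint16_t reserved0;
    std::uint32_t grid_size_x;       // byte offset 12
    std::uint32_t grid_size_y;
    std::uint32_t grid_size_z;
};

static_assert(offsetof(DispatchPacketPrefix, workgroup_size_x) == 4,
              "workgroup sizes are expected to start at byte 4");
static_assert(offsetof(DispatchPacketPrefix, grid_size_x) == 12,
              "grid sizes are expected to start at byte 12");

// Hypothetical helper: read the 16-bit workgroup size for dimension
// dim (0 = x, 1 = y, 2 = z) from a pointer to the dispatch packet.
// This is the same shape as "pointer + constant offset, then a 16-bit load".
inline std::uint16_t load_workgroup_size(const void* dispatch_ptr, unsigned dim) {
    assert(dim < 3 && "dimension must be 0, 1 or 2");
    const auto* bytes = static_cast<const unsigned char*>(dispatch_ptr);
    std::uint16_t size;
    std::memcpy(&size, bytes + 4 + 2 * dim, sizeof size);
    return size;
}
```

On the device, the pointer would come from the special kernel input SGPR rather than from host memory (clang exposes it as __builtin_amdgcn_dispatch_ptr()), and the !range annotation on the load has no equivalent in this host sketch.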