Nicholas Wilson via llvm-dev
2021-Oct-11 00:31 UTC
[llvm-dev] NVPTX i8 surface intrinsics/instructions are actually i16?
Hi all,
I’ve been looking into adding support for NVPTX’s texture and surface intrinsics
for our frontend. Running our builtins generator revealed that the intrinsics
corresponding to 8-bit integer surface instructions, e.g.
"llvm.nvvm.suld.3d.i8.zero", return a 16-bit integer, whereas the rest of
the intrinsics in the overload set, i.e.
"llvm.nvvm.suld.3d.{i16,i32,i64}.zero", all return the corresponding type.
Looking at llvm/include/llvm/IR/IntrinsicsNVVM.td confirms this, as does
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td.
My question: is this intentional?
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#surface-instructions-suld
seems to suggest that the corresponding assembly does support 8-bit operations.
It also says that the "size of the data transfer matches the size of destination
operand d", which seems to me to mean that the result should be i8 for an i8
instruction.