Level Zero's SPIR-V barrier function is not working on my Intel(R) UHD Graphics P630 [0x9bf6]

The kernel I'm looking at implements a reduction (sum) within a workgroup (local). It sums over the elements of an array of floats and stores the intermediate results in local memory. The required synchronization within the workgroup is implemented via

    call void @_Z7barrierj(i32 3)

    declare spir_func void @_Z7barrierj(i32) #0

    attributes #0 = { readnone }

However, the kernel doesn't work correctly. Its result is slightly off from the correct value, which is a typical symptom of a barrier that is not working correctly.

The kernel gets translated to SPIR-V and is then loaded and launched with Intel Level Zero/compute-runtime. That is when the barrier fails to work. I don't know at what stage it gets messed up.

I have tried passing 1 and 2 to the barrier function, and also not issuing the call at all; the outcome is exactly the same in every case. This suggests that the barrier is not being executed at all.

This is what Clang generates for the following statement:

    barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);

    tail call spir_func void @_Z7barrierj(i32 3) #4

    declare spir_func void @_Z7barrierj(i32) local_unnamed_addr #2

    attributes #2 = { convergent "frame-pointer"="none" "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
    attributes #4 = { convergent nounwind }

I don't think the additional attributes matter here.

Any ideas?

Thanks,
Frank
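
For reference, the workgroup reduction described in the post can be emulated on the host in plain C to show where the barrier is needed. This is only a sketch under assumptions: the actual kernel is not shown in the post, so the tree-shaped reduction, the power-of-two workgroup size, and the helper name `emulate_workgroup_sum` are all hypothetical. Each iteration of the outer loop stands for one phase that the kernel would separate with a barrier.

```c
#include <stddef.h>

/* Host-side emulation of a workgroup tree reduction (hypothetical helper,
 * not the kernel from the post). `scratch` plays the role of local memory;
 * `wg_size` is assumed to be a power of two. */
float emulate_workgroup_sum(float *scratch, size_t wg_size)
{
    for (size_t stride = wg_size / 2; stride > 0; stride /= 2) {
        /* In the kernel, every work-item with local id < stride would
         * execute this body concurrently. */
        for (size_t lid = 0; lid < stride; ++lid)
            scratch[lid] += scratch[lid + stride];

        /* In the kernel, barrier(CLK_LOCAL_MEM_FENCE) belongs here: the
         * next phase reads values that other work-items wrote in this
         * phase. */
    }
    return scratch[0];
}
```

In this sequential emulation each phase finishes before the next one starts, which is exactly the ordering the barrier has to enforce on the device. If the barrier is dropped there, a work-item can read a partial sum that another work-item has not written yet, which would fit a result that is only slightly wrong.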