Hilloulin Damien
2014-Aug-22 00:13 UTC
[LLVMdev] [AMDGPU][PATCH 2/3] Stubs implementation of the new intrinsics on Evergreen
This patch adds stubs that provide a first implementation of the barrier and
memory fence intrinsics on Evergreen (EG).

barrier.nofence() is the only intrinsic that this patch is certain to
implement correctly. barrier.local() can perhaps be considered correct as it
stands, since LDS memory is atomic. The other intrinsics need to use WAIT_ACK
in some way, and the surrounding memory operations need to be modified to use
ACK.

Signed-off-by: Damien Hilloulin <damien.hilloulin at supelec.fr>
---
 lib/Target/R600/EvergreenInstructions.td | 69 +++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/lib/Target/R600/EvergreenInstructions.td b/lib/Target/R600/EvergreenInstructions.td
index a83567a..c8f90ce 100644
--- a/lib/Target/R600/EvergreenInstructions.td
+++ b/lib/Target/R600/EvergreenInstructions.td
@@ -358,8 +358,12 @@ def FLT_TO_UINT_eg : FLT_TO_UINT_Common<0x9A> {
 
 def UINT_TO_FLT_eg : UINT_TO_FLT_Common<0x9C>;
 
+//===----------------------------------------------------------------------===//
+// Synchronization instructions
+//===----------------------------------------------------------------------===//
+
 def GROUP_BARRIER : InstR600 <
-    (outs), (ins), " GROUP_BARRIER", [(int_AMDGPU_barrier_local), (int_AMDGPU_barrier_global)], AnyALU>,
+    (outs), (ins), " GROUP_BARRIER", [], AnyALU>,
     R600ALU_Word0,
     R600ALU_Word1_OP2 <0x54> {
 
@@ -389,10 +393,73 @@ def GROUP_BARRIER : InstR600 <
 }
 
 def : Pat <
+  (int_AMDGPU_barrier_nofence),
+  (GROUP_BARRIER)
+>;
+
+// XXX: the following patterns in the section are stubs.
+// We should take care of inserting WAIT_ACK and modifying the
+// read/write instructions before the barrier and in the loop.
+def : Pat <
+  (int_AMDGPU_barrier_local),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
   (int_AMDGPU_barrier_global),
   (GROUP_BARRIER)
 >;
 
+def : Pat <
+  (int_AMDGPU_barrier_localglobal),
+  (GROUP_BARRIER)
+>;
+
+
+def : Pat <
+  (int_AMDGPU_mem_fence_local),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_mem_fence_global),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_mem_fence_localglobal),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_read_mem_fence_local),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_read_mem_fence_global),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_read_mem_fence_localglobal),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_write_mem_fence_local),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_write_mem_fence_global),
+  (GROUP_BARRIER)
+>;
+
+def : Pat <
+  (int_AMDGPU_write_mem_fence_localglobal),
+  (GROUP_BARRIER)
+>;
 //===----------------------------------------------------------------------===//
 // LDS Instructions
 //===----------------------------------------------------------------------===//
-- 
1.9.1
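
The pattern list in this patch is regular: one unfenced barrier, plus four families (barrier, mem_fence, read_mem_fence, write_mem_fence) in three scopes (local, global, localglobal), all selected to GROUP_BARRIER. As a quick cross-check of that list, the sketch below enumerates the record names and prints the corresponding stanzas. It is a hypothetical helper and not part of the patch: SCOPES, FAMILIES and the function names are made up for illustration, the output indentation is an assumption, and only the int_AMDGPU_* names and GROUP_BARRIER come from the diff.

```python
# Hypothetical helper, NOT part of the patch: enumerate the intrinsic records
# that the patch maps to GROUP_BARRIER and print the TableGen stanzas.

SCOPES = ("local", "global", "localglobal")
FAMILIES = ("barrier", "mem_fence", "read_mem_fence", "write_mem_fence")


def stub_intrinsics():
    """Return the 12 scoped record names that the patch marks as stubs."""
    return [f"int_AMDGPU_{family}_{scope}"
            for family in FAMILIES
            for scope in SCOPES]


def all_intrinsics():
    """barrier_nofence (the mapping the patch considers correct), then the stubs."""
    return ["int_AMDGPU_barrier_nofence"] + stub_intrinsics()


def render_pattern(intrinsic):
    """Render one selection pattern in the same shape as the patch uses."""
    return f"def : Pat <\n  ({intrinsic}),\n  (GROUP_BARRIER)\n>;\n"


if __name__ == "__main__":
    print("\n".join(render_pattern(name) for name in all_intrinsics()))
```

Running the script prints 13 stanzas, which matches the number of patterns targeting GROUP_BARRIER once the patch is applied. It says nothing about where WAIT_ACK would have to be inserted; that part remains open, as the XXX comment in the patch notes.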