thr3ads.net - search: "row

Displaying 7 results from an estimated 7 matches for "row_bcast".

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 14

Implementing cross-thread reduction in the AMDGPU backend

...a data hazard >>>>>> v_nop >>>>>> v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 >>>>>> v_nop // Add two independent instructions to avoid a data hazard >>>>>> v_nop >>>>>> v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction 6 >>>>>> v_nop // Add two independent instructions to avoid a data hazard >>>>>> v_nop >>>>>> v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction 7 >>>>>> >>>>>> The pr...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 13

Implementing cross-thread reduction in the AMDGPU backend

...dd two independent instructions to avoid a data hazard >>>> v_nop >>>> v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 >>>> v_nop // Add two independent instructions to avoid a data hazard >>>> v_nop >>>> v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction 6 >>>> v_nop // Add two independent instructions to avoid a data hazard >>>> v_nop >>>> v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction 7 >>>> >>>> The problem is that the way these instructions...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 14

Implementing cross-thread reduction in the AMDGPU backend

...data hazard >>>>>> v_nop >>>>>> v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 >>>>>> v_nop // Add two independent instructions to avoid a data hazard >>>>>> v_nop >>>>>> v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction 6 >>>>>> v_nop // Add two independent instructions to avoid a data hazard >>>>>> v_nop >>>>>> v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction 7 >>>>>> >>>>>> The...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 12

Implementing cross-thread reduction in the AMDGPU backend

...hazard v_nop v_foo_f32 v1, v1, v1 row_shr:4 bank_mask:0xe // Instruction 4 v_nop // Add two independent instructions to avoid a data hazard v_nop v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 v_nop // Add two independent instructions to avoid a data hazard v_nop v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction 6 v_nop // Add two independent instructions to avoid a data hazard v_nop v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction 7 The problem is that the way these instructions use the DPP word isn't currently expressible in LLVM. We have the llvm.amdgcn.m...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 12

Implementing cross-thread reduction in the AMDGPU backend

...0xe // Instruction 4 >> v_nop // Add two independent instructions to avoid a data hazard >> v_nop >> v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 >> v_nop // Add two independent instructions to avoid a data hazard >> v_nop >> v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction 6 >> v_nop // Add two independent instructions to avoid a data hazard >> v_nop >> v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction 7 >> >> The problem is that the way these instructions use the DPP word isn't >> cur...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 15

Implementing cross-thread reduction in the AMDGPU backend

...gt;> v_nop >>>>>>>> v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 >>>>>>>> v_nop // Add two independent instructions to avoid a data hazard >>>>>>>> v_nop >>>>>>>> v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction 6 >>>>>>>> v_nop // Add two independent instructions to avoid a data hazard >>>>>>>> v_nop >>>>>>>> v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction 7 >>>>>>>>...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 15

Implementing cross-thread reduction in the AMDGPU backend

...>>>>>>>> v_foo_f32 v1, v1, v1 row_shr:8 bank_mask:0xc // Instruction 5 >>>>>>>>> v_nop // Add two independent instructions to avoid a data >>>>>>>>> hazard v_nop >>>>>>>>> v_foo_f32 v1, v1, v1 row_bcast:15 row_mask:0xa // Instruction >>>>>>>>> 6 v_nop // Add two independent instructions to avoid a data >>>>>>>>> hazard v_nop >>>>>>>>> v_foo_f32 v1, v1, v1 row_bcast:31 row_mask:0xc // Instruction >>>>&gt...

search for: row_bcast