search for: image_sample_lz_v1_v2

Displaying 2 results from an estimated 2 matches for "image_sample_lz_v1_v2".

2019 Sep 09
2
Fwd: MachineScheduler not scheduling for latency
...ne code in cases like the one in the attached debug dump. This is on AMDGPU, an in-order target, and the problem is that the IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in the resulting schedule they are often placed right next to their uses like this: 1784B %140:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V2 %533:vreg_64, %30:sreg_256, %26:sreg_128, 8, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable load 4 from custom TargetCustom8) 1792B %142:vgpr_32 = V_MUL_F32_e32 %44:sreg_32, %140:vgpr_32, implicit $exec ... 1784B %140:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V2 %533:vreg_64, %30:sreg_256, %...
2019 Sep 10
2
MachineScheduler not scheduling for latency
...> > This is on AMDGPU, an in-order target, and the problem is that the > > IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in > > the resulting schedule they are often placed right next to their uses > > like this: > > > > 1784B %140:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V2 %533:vreg_64, > > %30:sreg_256, %26:sreg_128, 8, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec > > :: (dereferenceable load 4 from custom TargetCustom8) > > 1792B %142:vgpr_32 = V_MUL_F32_e32 %44:sreg_32, %140:vgpr_32, implicit $exec > > ... > > 1784B %140:vgpr_32 = I...