Displaying 8 results from an estimated 8 matches for "vgpr_32".
2019 Sep 09
2
LiveInterval error with 2 dead defs
Hi,
I’m hitting a machine verifier error in a trivial testcase which I don’t understand. There are 2 dead defs of the same register:
---
name: multiple_connected_compnents_dead
tracksRegLiveness: true
body: |
  bb.0:
    dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
    dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
...
The live intervals look OK to me with 1 valno per instruction, for the life of the instruction like I would expect. The verifier does not like it however:
$ llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -ve...
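The interval shape described in this message (one value number per instruction, live only for that instruction) can be pictured with a toy model. This is a sketch only: the `Segment` class, the `dead_def_segments` helper, and the instruction-index spacing of 16 are assumptions made for illustration, not LLVM's LiveIntervals API, and the model says nothing about what the verifier reports.

```python
from dataclasses import dataclass

# Assumed spacing between consecutive instruction indexes (illustrative only).
SLOT_STRIDE = 16


@dataclass(frozen=True)
class Segment:
    start: str  # e.g. "0r": register slot of the defining instruction
    end: str    # e.g. "0d": dead slot of the same instruction
    valno: int  # value number owned by this def


def dead_def_segments(num_defs):
    """Give each dead def its own value number and a segment that
    starts and ends within the defining instruction."""
    segments = []
    for valno in range(num_defs):
        index = valno * SLOT_STRIDE
        segments.append(Segment(f"{index}r", f"{index}d", valno))
    return segments


# Two dead defs of the same virtual register, as in the testcase.
segments = dead_def_segments(2)
for seg in segments:
    print(f"[{seg.start},{seg.end}:{seg.valno})")
```

In this model the two segments never touch and share no value number, which is the "1 valno per instruction" picture the message describes.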
2019 Sep 09
2
Fwd: MachineScheduler not scheduling for latency
...traight line code in cases like the one in the attached debug dump.
This is on AMDGPU, an in-order target, and the problem is that the
IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in
the resulting schedule they are often placed right next to their uses
like this:
1784B %140:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V2 %533:vreg_64,
%30:sreg_256, %26:sreg_128, 8, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec
:: (dereferenceable load 4 from custom TargetCustom8)
1792B %142:vgpr_32 = V_MUL_F32_e32 %44:sreg_32, %140:vgpr_32, implicit $exec
...
1784B %140:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V2 %533:...
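To see why placing a use right after a high-latency def is costly on an in-order target, here is a minimal stall model. It is a sketch under stated assumptions: one instruction issues per cycle, unlisted instructions have a latency of 1, and the `stall_cycles` helper and the "filler" instructions are hypothetical. Only the 80-cycle latency and the two opcode names come from the message.

```python
def stall_cycles(schedule, latency):
    """Return the total cycles an in-order machine waits on operands.

    schedule: list of (name, defs, uses) tuples in issue order.
    latency:  dict mapping instruction name to result latency in cycles
              (instructions not listed are assumed to take 1 cycle).
    """
    ready_at = {}  # register -> cycle at which its value becomes available
    cycle = 0
    stalls = 0
    for name, defs, uses in schedule:
        # An in-order machine cannot issue until every operand is ready.
        earliest = max([ready_at.get(reg, 0) for reg in uses], default=0)
        if earliest > cycle:
            stalls += earliest - cycle
            cycle = earliest
        for reg in defs:
            ready_at[reg] = cycle + latency.get(name, 1)
        cycle += 1  # assumed: one issue slot per instruction
    return stalls


latency = {"IMAGE_SAMPLE_LZ_V1_V2": 80}
sample = ("IMAGE_SAMPLE_LZ_V1_V2", ["%140"], ["%533"])
mul = ("V_MUL_F32_e32", ["%142"], ["%44", "%140"])

# Use placed directly after the sample, as in the schedule quoted above.
adjacent = [sample, mul]

# Hypothetical independent work placed between the sample and its use.
fillers = [("filler", [f"%f{i}"], []) for i in range(79)]
spread = [sample] + fillers + [mul]

print(stall_cycles(adjacent, latency), stall_cycles(spread, latency))
```

Under these assumptions the adjacent schedule spends almost the whole sample latency stalled, while the spread schedule hides it behind independent instructions.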
2019 Sep 10
2
MachineScheduler not scheduling for latency
...ug dump.
> > This is on AMDGPU, an in-order target, and the problem is that the
> > IMAGE_SAMPLE instructions have very high (80 cycle) latency, but in
> > the resulting schedule they are often placed right next to their uses
> > like this:
> >
> > 1784B %140:vgpr_32 = IMAGE_SAMPLE_LZ_V1_V2 %533:vreg_64,
> > %30:sreg_256, %26:sreg_128, 8, 0, 0, 0, 0, 0, 0, 0, 0, implicit $exec
> > :: (dereferenceable load 4 from custom TargetCustom8)
> > 1792B %142:vgpr_32 = V_MUL_F32_e32 %44:sreg_32, %140:vgpr_32, implicit $exec
> > ...
> > 17...
2019 Oct 07
2
LiveInterval error with 2 dead defs
...o:Matthew.Arsenault at amd.com>> wrote:
Hi,
I’m hitting a machine verifier error in a trivial testcase which I don’t understand. There are 2 dead defs of the same register:
---
name: multiple_connected_compnents_dead
tracksRegLiveness: true
body: |
  bb.0:
    dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
    dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
...
The live intervals look OK to me with 1 valno per instruction, for the life of the instruction like I would expect. The verifier does not like it however:
$ llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -ve...
2016 Mar 28
0
RFC: atomic operations on SI+
...s.td b/lib/Target/AMDGPU/CIInstructions.td
> index 593300f..d99b013 100644
> --- a/lib/Target/AMDGPU/CIInstructions.td
> +++ b/lib/Target/AMDGPU/CIInstructions.td
> @@ -156,7 +156,7 @@ defm FLAT_ATOMIC_SWAP : FLAT_ATOMIC <
> flat<0x30, 0x40>, "flat_atomic_swap", VGPR_32
> >;
> defm FLAT_ATOMIC_CMPSWAP : FLAT_ATOMIC <
> - flat<0x31, 0x41>, "flat_atomic_cmpswap", VGPR_32, VReg_64
> + flat<0x31, 0x41>, "flat_atomic_cmpswap", VReg_64
> >;
> defm FLAT_ATOMIC_ADD : FLAT_ATOMIC <
> flat<0x32, 0x...
2016 Mar 25
2
RFC: atomic operations on SI+
Hi Tom, Matt,
I'm working on a project that needs a few coherent atomic operations (HSA
mode: load, store, compare-and-swap) for std::atomic_uint in HCC.
The attached patch implements atomic compare and swap for SI+
(untested). I tried to stay within what was available, but there are
a few issues that I was unsure how to address:
1.) it currently uses v2i32 for both input and output. This
2017 May 16
2
Bug in TableGen RegisterBankEmitter
...s would allow us to prevent it from following the subreg indices into the wrong classes but it would also make it harder to define the register banks.
>
I'm a little confused about what the issue is. AMDGPU has 2 64-bit register
classes each with sub0 and sub1 sub-registers:
VReg_64:sub0=VGPR_32
VReg_64:sub1=VGPR_32
SReg_64:sub0=SGPR_32
SReg_64:sub1=SGPR_32
Are you saying that tablegen considers VReg_64:sub0 and SReg_64:sub0 to be
the same sub-register class because they are both called sub0?
-Tom
>> On 10 May 2017, at 21:58, Daniel Sanders via llvm-dev <llvm-dev at lists.ll...
2017 May 10
2
Bug in TableGen RegisterBankEmitter
Hi Tom,
The output:
Added VReg_64(explicit)
Added VS_32(explicit (VS_32) VReg_64 class-with-subregs: VReg_64)
is saying that VS_32 was added because VReg_64 was explicitly specified and that while inspecting VS_32, it noticed that every register in VS_32 was a subregister of a register from VReg_64 using a single common subregister index.
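The check described in the previous sentence can be sketched roughly as follows. This is an illustrative model only, not the RegisterBankEmitter code: the `covered_by_subreg` helper and the toy register names are assumptions, and the toy data is not taken from the AMDGPU register definitions.

```python
def covered_by_subreg(candidate, parent_subregs):
    """Return the single subregister index through which every register in
    `candidate` is a subregister of some register in the parent class,
    or None if no single index covers the whole candidate class.

    parent_subregs: dict mapping parent register -> {subreg index: subregister}.
    """
    by_index = {}
    for subregs in parent_subregs.values():
        for index, reg in subregs.items():
            by_index.setdefault(index, set()).add(reg)
    for index in sorted(by_index):
        if set(candidate) <= by_index[index]:
            return index
    return None


# Hypothetical 64-bit parent class made of overlapping register pairs.
toy_parent = {
    "V0_V1": {"sub0": "V0", "sub1": "V1"},
    "V1_V2": {"sub0": "V1", "sub1": "V2"},
}

print(covered_by_subreg(["V0", "V1"], toy_parent))  # all reachable via sub0
print(covered_by_subreg(["V0", "S0"], toy_parent))  # S0 is not a subregister here
```

In this toy model a class is only picked up when all of its registers are reachable through one common index, which is the condition the tracing output describes.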
I've added some more tracing to my local copy and