thr3ads.net - search: "v_add

Displaying 5 results from an estimated 5 matches for "v_add_i32".

[LLVMdev] How should I update LiveIntervals after removing a use of a register?

2014 Apr 04


Hi, I am working on a simple copy propagation pass for the R600 backend that propagates immediates rather than registers. For example, I want to transform:

    ...
    %vreg1 = V_MOV_B32 1
    %vreg2 = V_ADD_I32 %vreg1, %vreg0
    ...

into:

    %vreg1 = V_MOV_B32 1 ; <- Only delete this if it is dead
    %vreg2 = V_ADD_I32 1, %vreg0

For best results, I am trying to run this pass after the TwoAddressInstruction pass, which means I need to preserve the LiveIntervals analysis. My question is: How do I update the L...
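The transformation described in this snippet can be sketched on a toy instruction list. This is only an illustration in Python, not LLVM code: the `Instr` class, the `propagate_immediates` function, and the `foldable_positions` parameter are all hypothetical names. The sketch assumes each virtual register is defined once, assumes an immediate may only be folded into the first source operand, and treats a move as dead when no remaining instruction in the list reads its destination. It does not address the LiveIntervals update that the post asks about.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple, Union

# An operand is either an immediate (int) or a virtual register name (str).
Operand = Union[int, str]


@dataclass
class Instr:
    """Hypothetical toy instruction: dest = opcode operands..."""
    dest: str
    opcode: str
    operands: Tuple[Operand, ...]


def propagate_immediates(
    instrs: List[Instr],
    foldable_positions: FrozenSet[int] = frozenset({0}),
) -> List[Instr]:
    """Fold V_MOV_B32 immediates into their users, then drop dead moves.

    Assumptions (not taken from the original post): every register is
    defined exactly once, and only operand positions listed in
    foldable_positions may hold an immediate.
    """
    # Registers whose only definition is "V_MOV_B32 <immediate>".
    imm_defs = {
        ins.dest: ins.operands[0]
        for ins in instrs
        if ins.opcode == "V_MOV_B32" and isinstance(ins.operands[0], int)
    }

    # Rewrite each use of such a register where an immediate is allowed.
    rewritten = []
    for ins in instrs:
        new_ops = tuple(
            imm_defs[op]
            if isinstance(op, str) and op in imm_defs and pos in foldable_positions
            else op
            for pos, op in enumerate(ins.operands)
        )
        rewritten.append(Instr(ins.dest, ins.opcode, new_ops))

    # A move is deleted only if nothing reads its destination any more.
    still_used = {
        op for ins in rewritten for op in ins.operands if isinstance(op, str)
    }
    return [
        ins for ins in rewritten
        if ins.dest not in imm_defs or ins.dest in still_used
    ]


program = [
    Instr("%vreg1", "V_MOV_B32", (1,)),
    Instr("%vreg2", "V_ADD_I32", ("%vreg1", "%vreg0")),
]
folded = propagate_immediates(program)
```

With the example from the post, the move becomes dead and is removed, leaving a single `V_ADD_I32 1, %vreg0`. If `%vreg1` instead appeared in the second source position, the sketch would leave both instructions unchanged, which matches the "only delete this if it is dead" comment.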

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

2014 Oct 03


...rify-machineinstrs | FileCheck --check-prefix=SI --check-prefix=FUNC %s
>
> ;FUNC-LABEL: @test1:
> -;EG-CHECK: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
> +;EG: ADD_INT {{[* ]*}}T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
>
> -;SI-CHECK: V_ADD_I32_e32 [[REG:v[0-9]+]], {{v[0-9]+, v[0-9]+}}
> -;SI-CHECK-NOT: [[REG]]
> -;SI-CHECK: BUFFER_STORE_DWORD [[REG]],
> +;SI: V_ADD_I32_e32 [[REG:v[0-9]+]], {{v[0-9]+, v[0-9]+}}
> +;SI-NOT: [[REG]]
> +;SI: BUFFER_STORE_DWORD [[REG]],
> define void @test1(i32 addrspace(1)* %out, i32 addrs...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 15


...fold the
> v_mov_b32 into the operation itself. That is, you'd do:
>
> %swizzled = i32 llvm.amdgcn.update.dpp i32 0, %update, (dpp control)
> %new = i32 add %swizzled, %old
>
> and after coalescing, register allocation, etc. the backend would turn
> that into:
>
> v_add_i32 v1, v0, v1 (dpp control) bound_ctrl:1
>
> which is functionally equivalent to the version without the bound_ctrl.
>
> Otherwise, for operations `op' where a `op' a == a, you can do something like:
>
> %swizzled = f32 llvm.amdgcn.update.dpp %old, %update, (dpp control) ...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 15


...he operation itself. That is, you'd do:
>>
>> %swizzled = i32 llvm.amdgcn.update.dpp i32 0, %update, (dpp control)
>> %new = i32 add %swizzled, %old
>>
>> and after coalescing, register allocation, etc. the backend would
>> turn that into:
>>
>> v_add_i32 v1, v0, v1 (dpp control) bound_ctrl:1
>>
>> which is functionally equivalent to the version without the bound_ctrl.
>>
>> Otherwise, for operations `op' where a `op' a == a, you can do something like:
>>
>> %swizzled = f32 llvm.amdgcn.update.dpp %old, %up...

Implementing cross-thread reduction in the AMDGPU backend

2017 Jun 14


On 06/13/2017 07:33 PM, Matt Arsenault wrote:
>
>> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com> wrote:
>>
>> On 06/12/2017 08:03 PM, Connor Abbott wrote:
>>> On Mon, Jun 12, 2017 at 4:56 PM, Tom Stellard <tstellar at redhat.com> wrote:
>>>> On
