Displaying 3 results from an estimated 3 matches for "add_dpp".
2017 Jun 15
Implementing cross-thread reduction in the AMDGPU backend
...nough, it can also fold it into
> one instruction since the invalid lanes will get their old values
> back. I think that covers all the cases we care about. The question is
> whether it's better to take that route, or whether it's better to just
> add intrinsics like llvm.amdgcn.add_dpp, llvm.amdgcn.fmin_dpp, etc. so
> that:
>
I would only go the route of adding a dpp intrinsic for every operation
if it gives you functionality that you can't get with the llvm.amdgcn.update.dpp.
The main reason for this is that you will lose the generic combines that
LLVM has for add, m...
2017 Jun 15
Implementing cross-thread reduction in the AMDGPU backend
...old it into
>> one instruction since the invalid lanes will get their old values
>> back. I think that covers all the cases we care about. The question
>> is whether it's better to take that route, or whether it's better to
>> just add intrinsics like llvm.amdgcn.add_dpp, llvm.amdgcn.fmin_dpp,
>> etc. so
>> that:
>>
>
> I would only go the route of adding a dpp intrinsic for every
> operation if it gives you functionality that you can't get with the llvm.amdgcn.update.dpp.
> The main reason for this is that you will lose the gene...
2017 Jun 14
Implementing cross-thread reduction in the AMDGPU backend
On 06/13/2017 07:33 PM, Matt Arsenault wrote:
>
>> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com> wrote:
>>
>> On 06/12/2017 08:03 PM, Connor Abbott wrote:
>>> On Mon, Jun 12, 2017 at 4:56 PM, Tom Stellard <tstellar at redhat.com> wrote:
>>>> On