Displaying 3 results from an estimated 3 matches for "mininvocationsinclusivescanamd".

2017 Jun 12
4
Implementing cross-thread reduction in the AMDGPU backend
...ic that returns the first argument with inactive lanes set to the second argument. We'd also need something like WQM to make all the lanes active during the sequence. But that raises some hairy requirements for register allocation. For example, in something like:

  foo = ...
  if (...) {
    bar = minInvocationsInclusiveScanAMD(...)
  } else {
    ... = foo;
  }

we have to make sure that foo isn't allocated to the same register as one of the temporaries used inside minInvocationsInclusiveScanAMD(), though they don't interfere. That's because the implementation of minInvocationsInclusiveScanAMD() will do funny thi...
2017 Jun 12
2
Implementing cross-thread reduction in the AMDGPU backend
...>> the second argument. We'd also need something like WQM to make all the
>> lanes active during the sequence. But that raises some hairy
>> requirements for register allocation. For example, in something like:
>>
>> foo = ...
>> if (...) {
>>   bar = minInvocationsInclusiveScanAMD(...)
>> } else {
>>   ... = foo;
>> }
>>
>> we have to make sure that foo isn't allocated to the same register as
>> one of the temporaries used inside minInvocationsInclusiveScanAMD(),
>> though they don't interfere. That's because the implem...
2017 Jun 13
2
Implementing cross-thread reduction in the AMDGPU backend
...something like WQM to make all the
>>>> lanes active during the sequence. But that raises some hairy
>>>> requirements for register allocation. For example, in something like:
>>>>
>>>> foo = ...
>>>> if (...) {
>>>>   bar = minInvocationsInclusiveScanAMD(...)
>>>> } else {
>>>>   ... = foo;
>>>> }
>>>>
>>>> we have to make sure that foo isn't allocated to the same register as
>>>> one of the temporaries used inside minInvocationsInclusiveScanAMD(),
>>>> though...
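
For readers unfamiliar with the operation discussed in these snippets, here is a rough CPU-side sketch of what an inclusive min-scan over a wavefront computes when inactive lanes are first filled with a neutral value. It is only a simulation for illustration, not the AMDGPU backend's implementation: the names set_inactive, inclusive_min_scan and min_inclusive_scan_active are hypothetical, the 8-lane width is chosen for brevity, and the log-step scan is one common way to do this, assumed here rather than taken from the thread.

```python
import math

def set_inactive(values, active, identity):
    """Hypothetical stand-in for the intrinsic described in the thread:
    returns `values` with inactive lanes replaced by `identity`."""
    return [v if a else identity for v, a in zip(values, active)]

def inclusive_min_scan(values):
    """Log-step inclusive scan with min, run over ALL lanes
    (as if every lane had been made active for the sequence)."""
    lanes = list(values)
    offset = 1
    while offset < len(lanes):
        prev = list(lanes)  # snapshot so each step reads the previous step's values
        for i in range(offset, len(lanes)):
            lanes[i] = min(prev[i], prev[i - offset])
        offset *= 2
    return lanes

def min_inclusive_scan_active(values, active):
    """Inclusive min-scan as seen by the active lanes only."""
    # Temporaries: in this sketch they hold values in every lane,
    # including the lanes that are inactive in the caller.
    padded = set_inactive(values, active, math.inf)
    scanned = inclusive_min_scan(padded)
    # Only active lanes receive the result; inactive lanes keep their input.
    return [s if a else v for s, v, a in zip(scanned, values, active)]

values = [5, 3, 8, 1, 7, 2, 9, 4]
active = [True, False, True, True, False, True, True, True]
result = min_inclusive_scan_active(values, active)
```

Note that in this sketch the temporaries `padded` and `scanned` are written in lanes 1 and 4 even though those lanes are inactive, which is the kind of situation where a value like foo, live only in the other branch, could be affected if it shared a register with a temporary.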