dag at cray.com
2012-Oct-22 17:15 UTC
[LLVMdev] Predication on SIMD architectures and LLVM
Dan Gohman <dan433584 at gmail.com> writes:

> And, in part because a popular trend seems to be to have SIMD units
> which don't trap or raise exception flags on arithmetic and which don't
> go faster when predicated, such that there's no reason to predicate
> anything except stores and occasionally loads. On these architectures,
> simply having intrinsics for stores, and perhaps loads, is basically
> sufficient, and less invasive.

This is going to change. Intel recently released the ISA for Knights Corner, a machine with general predication for SIMD.

http://software.intel.com/en-us/forums/topic/278102

> And, in part because predication is another wrinkle for SIMD
> performance portability. As people start caring more about SIMD
> performance, there will be more pressure to tune SIMD code in
> target-specific ways, and it erodes the benefit of a
> target-independent representation. This is a complex topic though, and
> there are multiple considerations, and not everyone agrees with me
> here.

It's true that a target-independent predicated IR isn't going to translate well to a target that doesn't have predication. However, for targets that do it's a godsend.

> One thing that's initially counter-intuitive is that SIMD predication
> cannot be done in the same way as scalar or VLIW predication, where
> the majority of the compiler works as if it's on a "normal" scalar
> machine and predication happens during codegen, where the optimizer
> doesn't have to think about it. SIMD predication must be applied by
> whatever code is producing SIMD instructions, and in LLVM, that's
> typically in the optimizer or earlier.

Yep. This is why I think IR support is essential.

-David
On Oct 22, 2012, at 10:15 AM, dag at cray.com wrote:

> It's true that a target-independent predicated IR isn't going to
> translate well to a target that doesn't have predication. However, for
> targets that do it's a godsend.

Even for MIC (Xeon Phi), the predicated IR is not necessary. The instructions that really benefit from predication are loads and stores. MIC masks are write masks, but even if they were to help the performance of predicated instructions, there are other ways to do this. One way would be to implement masked load and mask store intrinsics, and to place 'select' instructions in strategic locations: before instructions that may fault, before phi-nodes, etc. A pre-register allocation pass can propagate the masks to all of the instructions that need them. But this is theoretical since only load/store really benefit from predication.

> Yep. This is why I think IR support is essential.

I don't think that we need to change the IR, even for a predicated architecture such as MIC.
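For readers following along, the masked load/store idea above can be sketched as a scalar C++ emulation. This is only an illustration under stated assumptions: `masked_load`, `masked_store`, `add_one_where`, and the 4-lane `Vec`/`Mask` types are hypothetical names invented here, not LLVM intrinsics or any real API. It shows a guarded loop body, `if (m[i]) out[i] = in[i] + 1;`, where only the memory operations consult the mask and the arithmetic runs on all lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane vector and mask types, for illustration only.
constexpr std::size_t kLanes = 4;
using Vec  = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Hypothetical masked load: inactive lanes never touch memory and
// take the corresponding pass-through value instead.
Vec masked_load(const int* ptr, const Mask& m, const Vec& passthru) {
    Vec r = passthru;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (m[i]) r[i] = ptr[i];
    return r;
}

// Hypothetical masked store: inactive lanes leave memory unchanged.
void masked_store(int* ptr, const Mask& m, const Vec& v) {
    for (std::size_t i = 0; i < kLanes; ++i)
        if (m[i]) ptr[i] = v[i];
}

// Vectorized form of: if (m[i]) out[i] = in[i] + 1;
void add_one_where(const int* in, int* out, const Mask& m) {
    assert(in != nullptr && out != nullptr);
    Vec x = masked_load(in, m, Vec{});  // inactive lanes read as 0
    for (int& lane : x) lane += 1;      // plain, unpredicated add on every lane
    masked_store(out, m, x);            // only active lanes are written back
}
```

The add needs no mask here because it cannot fault and its results in inactive lanes are discarded by the masked store.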
dag at cray.com
2012-Oct-24 17:12 UTC
[LLVMdev] Predication on SIMD architectures and LLVM
Nadav Rotem <nrotem at apple.com> writes:

> One way would be to implement masked load and mask store
> intrinsics, and to place 'select' instructions in strategic locations:
> before instructions that may fault, before phi-nodes, etc. A
> pre-register allocation pass can propagate the masks to all of the
> instructions that need them.

How does this work if the load is not conditional but trapping operations that use the loaded values are conditional?

Yes, such propagation can probably be done but it's painful and every predicated target would have to implement it. It's much easier to just select the right operation in isel, I think, and that seems to require IR support.

-David
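The case raised here, an unconditional load feeding a conditional trapping operation, can be sketched in scalar C++. This is an illustration, not anything from LLVM: `select`, `guarded_div`, and the 4-lane `Vec`/`Mask` types are hypothetical names, and integer division stands in for "an operation that may fault". The operands arrive from ordinary unmasked loads, so the mask has to come from the surrounding control flow and be applied at the divide itself, here by a select that swaps a harmless divisor into inactive lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane vector and mask types, for illustration only.
constexpr std::size_t kLanes = 4;
using Vec  = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Lane-wise select: take a[i] where the mask is set, b[i] elsewhere.
Vec select(const Mask& m, const Vec& a, const Vec& b) {
    Vec r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = m[i] ? a[i] : b[i];
    return r;
}

// Vectorized form of: q[i] = m[i] ? num[i] / den[i] : fallback[i];
// num and den are assumed to come from unconditional loads.
Vec guarded_div(const Vec& num, const Vec& den, const Mask& m,
                const Vec& fallback) {
    Vec ones{};
    ones.fill(1);
    // Select placed before the faulting operation: inactive lanes divide by 1.
    Vec safe_den = select(m, den, ones);
    Vec q{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        // Active lanes are assumed to have a nonzero divisor.
        assert(!m[i] || den[i] != 0);
        q[i] = num[i] / safe_den[i];  // full-width, unpredicated divide
    }
    // Select placed at the merge point (where a phi would be).
    return select(m, q, fallback);
}
```

On a target with predicated divide, the first select and the divide would ideally fold into one masked instruction, which is the propagation step being debated in this thread.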