Hi Bjarke,
I am not an expert on the details of the architectures CUDA targets, but here
are some high-level observations regarding pass pipelines for GPUs:
- The set of CFG optimizations you want is likely to be quite different.
JumpThreading in particular is typically not desirable.
- Most modern GPUs are going to want specialized passes (scalarization,
speculative execution) inserted at various points in the pipeline.
- Performance on a lot of GPUs is very sensitive to loop unrolling, which is
used to eliminate dynamic accesses.
From there, your mileage will vary based on whether you’re doing online or
offline compilation. I’m guessing the latter for CUDA.
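
To make the unrolling point concrete, here is a minimal sketch. It assumes that "dynamic accesses" means indexing a small per-thread array with a runtime loop variable, which is only one reading of that bullet. The function names and the tap count are hypothetical, and it is written as plain C++ rather than CUDA so it builds with a host compiler. In a real pipeline the compiler would do the unrolling (for example, prompted by `#pragma unroll` in CUDA) rather than the programmer.

```cpp
#include <cstddef>

// Hypothetical per-thread computation; kTaps is an illustrative size.
constexpr std::size_t kTaps = 4;

// Rolled form: acc[i] is indexed by a loop variable. Until the loops are
// fully unrolled, the compiler must treat acc as an addressable array,
// which on many GPUs means slow per-thread memory instead of registers.
float weighted_sum_rolled(const float* in, const float* w) {
    float acc[kTaps];
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc[i] = in[i] * w[i];
    }
    float sum = 0.0f;
    for (std::size_t i = 0; i < kTaps; ++i) {
        sum += acc[i];
    }
    return sum;
}

// What full unrolling exposes: every index is a compile-time constant,
// so each element can become a plain scalar and the array disappears.
float weighted_sum_unrolled(const float* in, const float* w) {
    const float a0 = in[0] * w[0];
    const float a1 = in[1] * w[1];
    const float a2 = in[2] * w[2];
    const float a3 = in[3] * w[3];
    return a0 + a1 + a2 + a3;
}
```

Both functions compute the same value. The difference is only in what the optimizer can prove about the array accesses, which is why unrolling thresholds tuned for CPUs may be too conservative for a GPU pipeline.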
—Owen
> On May 14, 2015, at 8:05 PM, Bjarke Roune <broune at google.com> wrote:
>
> Hi Owen,
>
> You mentioned at http://reviews.llvm.org/D9360 that the optimization
> pipeline set up in PassManagerBuilder has not worked well for GPUs in your
> experience. So I'd like to try out an alternative to PassManagerBuilder for
> CUDA. Do you have a suggestion for what I might try instead of
> PassManagerBuilder? If you happen to have a replacement for it that I could
> try, that would be great.
>
> Bjarke