On Sat, Jun 6, 2015 at 6:24 PM, Christos Margiolas <chrmargiolas at gmail.com> wrote:
> Hello,
>
> Thank you a lot for the feedback. I believe that the heterogeneous engine
> should be strongly connected with parallelization and vectorization efforts.
> Most of the accelerators are parallel architectures where having efficient
> parallelization and vectorization can be critical for performance.
>
> I am interested in these efforts and I hope that my code can help you
> managing the offloading operations. Your LLVM instruction set extensions may
> require some changes in the analysis code but I think is going to be
> straightforward.
>
> I am planning to push my code on phabricator in the next days.

If you're doing the extraction at the loop and LLVM IR level - why would you need to modify the IR? Wouldn't the target-level lowering happen later?

How are you actually deciding what to offload? Is this tied to directives, or does it use heuristics plus some set of restrictions?

Lastly, are you handling 2 targets in the same module, or do you end up emitting 2 modules and recombining them later?
Eric Christopher
2015-Jun-06 19:22 UTC
[LLVMdev] Supporting heterogeneous computing in llvm.
On Sat, Jun 6, 2015 at 5:02 AM C Bergström <cbergstrom at pathscale.com> wrote:
> If you're doing the extracting at the loop and llvm ir level - why
> would you need to modify the IR? Wouldn't the target level lowering
> happen later?
>
> How are you actually determining to offload? Is this tied to
> directives or using heuristics+some set of restrictions?
>
> Lastly, are you handling 2 targets in the same module or end up
> emitting 2 modules and dealing with recombining things later..

It's not currently possible to do this using the current structure without some significant and, honestly, icky patches.

-eric
On Sun, Jun 7, 2015 at 2:22 AM, Eric Christopher <echristo at gmail.com> wrote:
> On Sat, Jun 6, 2015 at 5:02 AM C Bergström <cbergstrom at pathscale.com> wrote:
>> If you're doing the extracting at the loop and llvm ir level - why
>> would you need to modify the IR? Wouldn't the target level lowering
>> happen later?
>>
>> How are you actually determining to offload? Is this tied to
>> directives or using heuristics+some set of restrictions?
>>
>> Lastly, are you handling 2 targets in the same module or end up
>> emitting 2 modules and dealing with recombining things later..
>
> It's not currently possible to do this using the current structure without
> some significant and, honestly, icky patches.

What's not possible? I agree some of our local patches and design may not make it upstream as-is, but we are offloading to 2+ targets using LLVM IR *today*.

IMHO - you must (re)solve the problem of handling multiple targets concurrently. That means either 2 targets in a single Module or 2 Modules basically glued one after the other.