thr3ads.net - llvm dev - [LLVMdev] Supporting heterogeneous computing in llvm. [Jun 2015]

C Bergström

2015-Jun-06 19:30 UTC

[LLVMdev] Supporting heterogeneous computing in llvm.

On Sun, Jun 7, 2015 at 2:22 AM, Eric Christopher <echristo at gmail.com>
wrote:
>
>
> On Sat, Jun 6, 2015 at 5:02 AM C Bergström <cbergstrom at pathscale.com>
> wrote:
>>
>> On Sat, Jun 6, 2015 at 6:24 PM, Christos Margiolas
>> <chrmargiolas at gmail.com> wrote:
>> > Hello,
>> >
>> > Thank you a lot for the feedback. I believe that the heterogeneous
>> > engine should be strongly connected with parallelization and
>> > vectorization efforts. Most of the accelerators are parallel
>> > architectures where having efficient parallelization and
>> > vectorization can be critical for performance.
>> >
>> > I am interested in these efforts and I hope that my code can help
>> > you managing the offloading operations. Your LLVM instruction set
>> > extensions may require some changes in the analysis code but I
>> > think is going to be straightforward.
>> >
>> > I am planning to push my code on phabricator in the next days.
>>
>> If you're doing the extracting at the loop and llvm ir level - why
>> would you need to modify the IR? Wouldn't the target level lowering
>> happen later?
>>
>> How are you actually determining to offload? Is this tied to
>> directives or using heuristics+some set of restrictions?
>>
>> Lastly, are you handling 2 targets in the same module or end up
>> emitting 2 modules and dealing with recombining things later..
>>
>
> It's not currently possible to do this using the current structure
> without some significant and, honestly, icky patches.

What's not possible? I agree some of our local patches and design may
not make it upstream as-is, but we are offloading to 2+ targets using
llvm ir *today*.

IMHO - you must (re)solve the problem about handling multiple targets
concurrently. That means 2 targets in a single Module or 2 Modules
basically glued one after the other.

Eric Christopher

2015-Jun-06 19:34 UTC

[LLVMdev] Supporting heterogeneous computing in llvm.

On Sat, Jun 6, 2015 at 12:31 PM C Bergström <cbergstrom at pathscale.com>
wrote:
> On Sun, Jun 7, 2015 at 2:22 AM, Eric Christopher <echristo at gmail.com>
> wrote:
> >
> >
> > On Sat, Jun 6, 2015 at 5:02 AM C Bergström <cbergstrom at pathscale.com>
> > wrote:
> >>
> >> On Sat, Jun 6, 2015 at 6:24 PM, Christos Margiolas
> >> <chrmargiolas at gmail.com> wrote:
> >> > Hello,
> >> >
> >> > Thank you a lot for the feedback. I believe that the heterogeneous
> >> > engine should be strongly connected with parallelization and
> >> > vectorization efforts. Most of the accelerators are parallel
> >> > architectures where having efficient parallelization and
> >> > vectorization can be critical for performance.
> >> >
> >> > I am interested in these efforts and I hope that my code can help
> >> > you managing the offloading operations. Your LLVM instruction set
> >> > extensions may require some changes in the analysis code but I
> >> > think is going to be straightforward.
> >> >
> >> > I am planning to push my code on phabricator in the next days.
> >>
> >> If you're doing the extracting at the loop and llvm ir level - why
> >> would you need to modify the IR? Wouldn't the target level lowering
> >> happen later?
> >>
> >> How are you actually determining to offload? Is this tied to
> >> directives or using heuristics+some set of restrictions?
> >>
> >> Lastly, are you handling 2 targets in the same module or end up
> >> emitting 2 modules and dealing with recombining things later..
> >>
> >
> > It's not currently possible to do this using the current structure
> > without some significant and, honestly, icky patches.
>
> What's not possible? I agree some of our local patches and design may
> not make it upstream as-is, but we are offloading to 2+ targets using
> llvm ir *today*.
>

I'm not sure how much more clear I can be. It's not possible, in the
same module, to handle multiple targets at the same time.

> IMHO - you must (re)solve the problem about handling multiple targets
> concurrently. That means 2 targets in a single Module or 2 Modules
> basically glued one after the other.
>

Patches welcome.

-eric

C Bergström

2015-Jun-06 19:43 UTC

[LLVMdev] Supporting heterogeneous computing in llvm.

On Sun, Jun 7, 2015 at 2:34 AM, Eric Christopher <echristo at gmail.com>
wrote:
>
>
> On Sat, Jun 6, 2015 at 12:31 PM C Bergström <cbergstrom at pathscale.com>
> wrote:
>>
>> On Sun, Jun 7, 2015 at 2:22 AM, Eric Christopher <echristo at gmail.com>
>> wrote:
>> >
>> >
>> > On Sat, Jun 6, 2015 at 5:02 AM C Bergström <cbergstrom at pathscale.com>
>> > wrote:
>> >>
>> >> On Sat, Jun 6, 2015 at 6:24 PM, Christos Margiolas
>> >> <chrmargiolas at gmail.com> wrote:
>> >> > Hello,
>> >> >
>> >> > Thank you a lot for the feedback. I believe that the
>> >> > heterogeneous engine should be strongly connected with
>> >> > parallelization and vectorization efforts. Most of the
>> >> > accelerators are parallel architectures where having efficient
>> >> > parallelization and vectorization can be critical for
>> >> > performance.
>> >> >
>> >> > I am interested in these efforts and I hope that my code can
>> >> > help you managing the offloading operations. Your LLVM
>> >> > instruction set extensions may require some changes in the
>> >> > analysis code but I think is going to be straightforward.
>> >> >
>> >> > I am planning to push my code on phabricator in the next days.
>> >>
>> >> If you're doing the extracting at the loop and llvm ir level - why
>> >> would you need to modify the IR? Wouldn't the target level
>> >> lowering happen later?
>> >>
>> >> How are you actually determining to offload? Is this tied to
>> >> directives or using heuristics+some set of restrictions?
>> >>
>> >> Lastly, are you handling 2 targets in the same module or end up
>> >> emitting 2 modules and dealing with recombining things later..
>> >>
>> >
>> > It's not currently possible to do this using the current structure
>> > without some significant and, honestly, icky patches.
>>
>> What's not possible? I agree some of our local patches and design may
>> not make it upstream as-is, but we are offloading to 2+ targets using
>> llvm ir *today*.
>>
>
> I'm not sure how much more clear I can be. It's not possible, in the
> same module, to handle multiple targets at the same time.
>
>>
>> IMHO - you must (re)solve the problem about handling multiple targets
>> concurrently. That means 2 targets in a single Module or 2 Modules
>> basically glued one after the other.
>
>
> Patches welcome.

While I appreciate your taste in music - Canned (troll) replies are
typically a waste of time.

Christos Margiolas

2015-Jun-09 07:17 UTC

[LLVMdev] Supporting heterogeneous computing in llvm.

In fact, I have two modules:
a) the Host one
b) the Accelerator one

Each one gets compiled independently. The runtime takes care of the
offloading operations and loads the accelerator code. Imagine that you want
to compile for amd64 and NVIDIA PTX. You cannot do it in a single module,
and even if you supported it, it would get scary. How are you gonna handle,
in a nice way, the architecture differences that affect the IR? e.g.
pointer size, stack alignment and much more...

--chris
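
To make the point about pointer size and stack alignment concrete: in LLVM IR these properties are recorded in a single module-level `target datalayout` string, so one module describes one target. The following is a minimal C++ sketch under stated assumptions, not code from this thread or from LLVM itself. `LayoutInfo`, `parseLayout`, `kHostLayout` and `kAccelLayout` are hypothetical names, the parser covers only a small subset of the datalayout syntax, and the two example strings are modeled on x86-64 and 32-bit NVPTX layouts and may not match any particular LLVM release.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper type: the two properties named in the message above.
struct LayoutInfo {
  unsigned pointerBits = 64;    // LLVM's default when no "p" spec is present
  unsigned stackAlignBits = 0;  // 0 means "not specified in the string"
};

// Illustrative layout strings (assumptions, see the lead-in).
const std::string kHostLayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128";
const std::string kAccelLayout = "e-p:32:32-i64:64-v16:16-v32:32-n16:32:64";

// Parses a small subset of the datalayout syntax: the pointer spec for the
// default address space ("p:<size>:..." or "p0:<size>:...") and the stack
// alignment spec ("S<size>"). All other specs are skipped.
LayoutInfo parseLayout(const std::string &layout) {
  LayoutInfo info;
  std::istringstream specs(layout);
  std::string spec;
  while (std::getline(specs, spec, '-')) {  // specs are separated by '-'
    if (spec.empty())
      continue;
    if (spec[0] == 'S') {
      info.stackAlignBits = static_cast<unsigned>(std::stoul(spec.substr(1)));
    } else if (spec[0] == 'p') {
      std::size_t colon = spec.find(':');
      if (colon == std::string::npos)
        continue;
      // Text between 'p' and ':' is the address space; empty means 0.
      std::string addrSpace = spec.substr(1, colon - 1);
      if (addrSpace.empty() || addrSpace == "0")
        info.pointerBits =
            static_cast<unsigned>(std::stoul(spec.substr(colon + 1)));
    }
  }
  return info;
}
```

Calling `parseLayout(kHostLayout)` and `parseLayout(kAccelLayout)` yields different pointer widths and stack alignment settings for the two strings. Since a module carries only one such string, a host module and an accelerator module with layouts like these cannot simply share it.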

