Mehdi Amini via llvm-dev
2017-Jan-13 18:11 UTC
[llvm-dev] [RFC] IR-level Region Annotations
> On Jan 13, 2017, at 9:41 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 01/13/2017 12:29 AM, Mehdi Amini wrote:
>>
>>> On Jan 12, 2017, at 5:02 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> On 01/12/2017 06:20 PM, Reid Kleckner via llvm-dev wrote:
>>>
>>>> On Wed, Jan 11, 2017 at 8:13 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>> Can you elaborate why? I’m curious.
>>>>
>>>> The con of proposal c was that many passes would need to learn about many region intrinsics. With tokens, you only need to teach all passes about tokens, which they should already know about because WinEH and other things use them.
>>>>
>>>> With tokens, we can add as many region-introducing intrinsics as makes sense without any additional cost to the middle end. We don't need to make one omnibus region intrinsic set that describes every parallel loop annotation scheme supported by LLVM. Instead we would factor things according to other software design considerations.
>>>
>>> I think that, unless we allow frontends to add their own intrinsics without recompiling LLVM, this severely restricts the usefulness of this feature.
>>
>> I’m not convinced that “building a frontend without recompiling LLVM while injecting custom passes” is a strong compelling use-case, i.e. can you explain why requiring such use-case/frontends to rebuild LLVM is so limiting?
>
> I don't understand your viewpoint. Many frontends either compose their own pass pipelines or use the existing extension-point mechanism. Some frontends, Chapel for example, can insert code using custom address spaces and then insert passes later to turn accesses using pointers to those address spaces into runtime calls. This is the kind of design we'd like to support, without forcing frontends to use custom versions of LLVM, but with annotated regions instead of just with address spaces.

I think we’re talking about two different things here: you originally mentioned “without recompiling LLVM”, which I don’t see as a major blocker, while now you’re clarifying, I think, that you’re more concerned about requiring a *custom* LLVM, as in “it wouldn’t work with the source from a vanilla upstream LLVM”, which I agree is a different story.

That said, it extends the point from the other email (in parallel) about the semantics of the intrinsics: while your solution allows these frontends to reuse the intrinsics, it means that upstream optimizations have to treat such intrinsics as optimization barriers because their semantics are unknown.

--
Mehdi
Hal Finkel via llvm-dev
2017-Jan-20 14:59 UTC
[llvm-dev] [RFC] IR-level Region Annotations
On 01/13/2017 12:11 PM, Mehdi Amini wrote:
>
>> On Jan 13, 2017, at 9:41 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>> On 01/13/2017 12:29 AM, Mehdi Amini wrote:
>>>
>>>> On Jan 12, 2017, at 5:02 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>
>>>> On 01/12/2017 06:20 PM, Reid Kleckner via llvm-dev wrote:
>>>>
>>>>> On Wed, Jan 11, 2017 at 8:13 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>
>>>>> Can you elaborate why? I’m curious.
>>>>>
>>>>> The con of proposal c was that many passes would need to learn about many region intrinsics. With tokens, you only need to teach all passes about tokens, which they should already know about because WinEH and other things use them.
>>>>>
>>>>> With tokens, we can add as many region-introducing intrinsics as makes sense without any additional cost to the middle end. We don't need to make one omnibus region intrinsic set that describes every parallel loop annotation scheme supported by LLVM. Instead we would factor things according to other software design considerations.
>>>>
>>>> I think that, unless we allow frontends to add their own intrinsics without recompiling LLVM, this severely restricts the usefulness of this feature.
>>>
>>> I’m not convinced that “building a frontend without recompiling LLVM while injecting custom passes” is a strong compelling use-case, i.e. can you explain why requiring such use-case/frontends to rebuild LLVM is so limiting?
>>
>> I don't understand your viewpoint. Many frontends either compose their own pass pipelines or use the existing extension-point mechanism. Some frontends, Chapel for example, can insert code using custom address spaces and then insert passes later to turn accesses using pointers to those address spaces into runtime calls. This is the kind of design we'd like to support, without forcing frontends to use custom versions of LLVM, but with annotated regions instead of just with address spaces.
>
> I think we’re talking about two different things here: you originally mentioned “without recompiling LLVM”, which I don’t see as a major blocker, while now you’re clarifying, I think, that you’re more concerned about requiring a *custom* LLVM, as in “it wouldn’t work with the source from a vanilla upstream LLVM”, which I agree is a different story.
>
> That said, it extends the point from the other email (in parallel) about the semantics of the intrinsics: while your solution allows these frontends to reuse the intrinsics, it means that upstream optimizations have to treat such intrinsics as optimization barriers because their semantics are unknown.

I see no reason why this needs to be true (at least so long as you're willing to accept a certain amount of "as if" parallelism). Moreover, if it is true, then we'll lose the benefits of, for example, being able to hoist scalar loads out of parallel loops. We might need to include dependencies on "inaccessible memory", to cover natural runtime dependencies by default (this can be refined with custom AA logic), but that is not a complete code-motion barrier. Memory being explicitly managed will end up as arguments to the region intrinsics, so we'll automatically get more-fine-grained information.

 -Hal

>
> --
> Mehdi

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
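
As a rough illustration of the "not a complete code-motion barrier" argument above, here is a toy Python model. It is a sketch under stated assumptions, not LLVM code and not part of the RFC. It assumes a region intrinsic is known to touch only runtime-private ("inaccessible") memory plus the memory reachable through its pointer arguments, which is roughly the contract LLVM expresses with its `inaccessiblemem_or_argmemonly` function attribute. The names `RegionCall`, `may_clobber`, `can_hoist_load`, `names_alias`, `region.entry`, `region.exit` and `opaque.call` are all hypothetical, and the alias check is deliberately naive.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class RegionCall:
    """A call to a hypothetical region intrinsic (illustrative only)."""
    name: str
    # Pointers passed explicitly to the intrinsic, i.e. the memory it manages.
    pointer_args: FrozenSet[str] = frozenset()
    # True: touches only inaccessible memory plus its pointer arguments.
    # False: memory effects are completely unknown (a full barrier).
    inaccessible_or_arg_mem_only: bool = True


def may_clobber(call: RegionCall, loaded_ptr: str,
                may_alias: Callable[[str, str], bool]) -> bool:
    """Could this call modify the memory read through loaded_ptr?"""
    if not call.inaccessible_or_arg_mem_only:
        return True  # unknown effects: assume the worst
    # Inaccessible memory cannot alias a pointer the program can name,
    # so only the explicit pointer arguments matter here.
    return any(may_alias(loaded_ptr, arg) for arg in call.pointer_args)


def can_hoist_load(loaded_ptr: str, calls_in_loop: Iterable[RegionCall],
                   may_alias: Callable[[str, str], bool]) -> bool:
    """Toy legality check: only calls are modeled, ordinary stores are not."""
    return not any(may_clobber(c, loaded_ptr, may_alias) for c in calls_in_loop)


def names_alias(a: str, b: str) -> bool:
    """Naive assumption: two pointers alias only if they have the same name."""
    return a == b


entry = RegionCall("region.entry", frozenset({"%A"}))
exit_ = RegionCall("region.exit")
opaque = RegionCall("opaque.call", inaccessible_or_arg_mem_only=False)

print(can_hoist_load("%n", [entry, exit_], names_alias))  # True
print(can_hoist_load("%A", [entry, exit_], names_alias))  # False
print(can_hoist_load("%n", [opaque], names_alias))        # False
```

In this model the scalar `%n` can be hoisted past the region markers because it is not among their arguments, while `%A` cannot, and a call with fully unknown effects blocks everything. The sketch covers memory dependencies only; it says nothing about intrinsics whose unknown semantics affect control flow.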
Mehdi Amini via llvm-dev
2017-Jan-20 17:52 UTC
[llvm-dev] [RFC] IR-level Region Annotations
> On Jan 20, 2017, at 6:59 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> On 01/13/2017 12:11 PM, Mehdi Amini wrote:
>>
>>> On Jan 13, 2017, at 9:41 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>
>>> On 01/13/2017 12:29 AM, Mehdi Amini wrote:
>>>>
>>>>> On Jan 12, 2017, at 5:02 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>> On 01/12/2017 06:20 PM, Reid Kleckner via llvm-dev wrote:
>>>>>
>>>>>> On Wed, Jan 11, 2017 at 8:13 PM, Mehdi Amini <mehdi.amini at apple.com> wrote:
>>>>>> Can you elaborate why? I’m curious.
>>>>>>
>>>>>> The con of proposal c was that many passes would need to learn about many region intrinsics. With tokens, you only need to teach all passes about tokens, which they should already know about because WinEH and other things use them.
>>>>>>
>>>>>> With tokens, we can add as many region-introducing intrinsics as makes sense without any additional cost to the middle end. We don't need to make one omnibus region intrinsic set that describes every parallel loop annotation scheme supported by LLVM. Instead we would factor things according to other software design considerations.
>>>>>
>>>>> I think that, unless we allow frontends to add their own intrinsics without recompiling LLVM, this severely restricts the usefulness of this feature.
>>>>
>>>> I’m not convinced that “building a frontend without recompiling LLVM while injecting custom passes” is a strong compelling use-case, i.e. can you explain why requiring such use-case/frontends to rebuild LLVM is so limiting?
>>>
>>> I don't understand your viewpoint. Many frontends either compose their own pass pipelines or use the existing extension-point mechanism. Some frontends, Chapel for example, can insert code using custom address spaces and then insert passes later to turn accesses using pointers to those address spaces into runtime calls. This is the kind of design we'd like to support, without forcing frontends to use custom versions of LLVM, but with annotated regions instead of just with address spaces.
>>
>> I think we’re talking about two different things here: you originally mentioned “without recompiling LLVM”, which I don’t see as a major blocker, while now you’re clarifying, I think, that you’re more concerned about requiring a *custom* LLVM, as in “it wouldn’t work with the source from a vanilla upstream LLVM”, which I agree is a different story.
>>
>> That said, it extends the point from the other email (in parallel) about the semantics of the intrinsics: while your solution allows these frontends to reuse the intrinsics, it means that upstream optimizations have to treat such intrinsics as optimization barriers because their semantics are unknown.
>
> I see no reason why this needs to be true (at least so long as you're willing to accept a certain amount of "as if" parallelism).

Sorry, I didn’t quite get that?

> Moreover, if it is true, then we'll lose the benefits of, for example, being able to hoist scalar loads out of parallel loops. We might need to include dependencies on "inaccessible memory", to cover natural runtime dependencies by default (this can be refined with custom AA logic), but that is not a complete code-motion barrier. Memory being explicitly managed will end up as arguments to the region intrinsics, so we'll automatically get more-fine-grained information.

Sanjoy gave an example of the kind of optimization that can break the semantics: lists.llvm.org/pipermail/llvm-dev/2017-January/109302.html. I haven’t yet seen an explanation of how this is addressed.

I’m not sure how you imagine getting around the optimization barrier that comes with “this intrinsic has unknown semantics that can implicitly affect the control flow of the program”, unless it acts as a “hint” only (but I don’t believe that is the direction?).

--
Mehdi