Owen Anderson via llvm-dev
2015-Sep-04 20:25 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
Hi all,

In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas.

Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may neither be sunk into nor hoisted out of a condition. This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, but our intuition indicates that this dependence was somehow trivial. More concretely, all known uses of convergent are OK with full unrolling, making this semantic undesirable. Related problems arise in loop unswitching as well.

The proposed change is to split the semantics of convergent into two annotations:

  convergent - this operation may not be made control dependent on any additional values (aka may not be sunk into a condition)
  nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?)

Most of today’s convergent operations (barriers, arithmetic gradients) would continue to be marked only as convergent. The new semantics would allow full loop unrolling, and provide clarity on which loop unswitching operations are allowed, examples below.

The one case where nospeculate would also be needed is in the case of texture fetches that compute implicit gradients. Because the computed gradient forms part of the addressing mode, gibberish gradients here can cause invalid memory dereferences.

-Owen

----------------------------------------

Loop Unswitching Examples

ALLOWED:
  for (…) {
    if (c) { convergent(); }
  }

DISALLOWED:
  for (…) {
    if (c) { … }
    convergent();
  }
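The proposed convergent rule can be read as a subset check on control dependences. The following is a minimal sketch under that reading, not LLVM's implementation: the function name `convergent_move_ok` and the representation of a call site as a set of the values it is control dependent on are assumptions made purely for illustration.

```python
# Toy model (illustrative assumption, not an LLVM API): a call site is
# described by the set of values it is control dependent on.

def convergent_move_ok(deps_before, deps_after):
    """Proposed 'convergent' rule: a transform may drop control
    dependences but must not add any new ones."""
    return set(deps_after) <= set(deps_before)

# ALLOWED unswitching:
#   for (...) { if (c) { convergent(); } }
#   ==> if (c) { for (...) { convergent(); } }
allowed = convergent_move_ok({"loop", "c"}, {"c", "loop"})

# DISALLOWED unswitching:
#   for (...) { if (c) { ... } convergent(); }
#   ==> if (c) { for (...) { ...; convergent(); } }
#       else   { for (...) { convergent(); } }
# Both copies of the call are now control dependent on c.
disallowed = convergent_move_ok({"loop"}, {"loop", "c"})

# Full unrolling removes the dependence on the loop counter, adds none.
unrolled = convergent_move_ok({"loop"}, set())

print(allowed, disallowed, unrolled)  # True False True
```

Under this reading, full unrolling is legal because it only removes a dependence, while the disallowed unswitching is rejected because it introduces a dependence on c that the original call did not have.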
Matt Arsenault via llvm-dev
2015-Sep-04 20:50 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
+1
Sanjoy Das via llvm-dev
2015-Sep-05 08:02 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
> The proposed change is to split the semantics of convergent into two
> annotations:
>
> convergent - this operation may not be made control dependent on any
> additional values (aka may not be sunk into a condition)

Does every unknown function need to be conservatively considered containing a convergent operation? IOW, I'd want LLVM to unswitch this:

  for (…) {
    if (c) { … }
    call (*func_ptr)();
  }

but (*func_ptr)() may contain a convergent operation in it (and it may later get devirtualized and inlined).

Also, how about transforms like these:

  if (a & b)
    convergent();

  ==>

  if (a)
    if (b)
      convergent();

?

> nospeculate - this operation may not be added to any program trace
> on which it was not previously executed (same as notrap?)

How is this different from, say, a store to the heap? Or a volatile load? If not, maybe nospeculate operations can just be modeled as writing to the heap?

-- Sanjoy
Owen Anderson via llvm-dev
2015-Sep-08 16:36 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
On Sep 5, 2015, at 1:02 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:

>> The proposed change is to split the semantics of convergent into two
>> annotations:
>>
>> convergent - this operation may not be made control dependent on any
>> additional values (aka may not be sunk into a condition)
>
> Does every unknown function need to be conservatively considered
> containing a convergent operation? IOW, I'd want LLVM to unswitch
> this:
>
>   for (…) {
>     if (c) { … }
>     call (*func_ptr)();
>   }
>
> but (*func_ptr)() may contain a convergent operation in it (and it may
> later get devirtualized and inlined).

I expect SPMD implementations built on top of LLVM will need to conservatively mark external or indirect calls as convergent.

> Also, how about transforms like these:
>
>   if (a & b)
>     convergent();
>
>   ==>
>
>   if (a)
>     if (b)
>       convergent();
>
> ?

This should be allowed, as the call was already control dependent on both a and b.

>> nospeculate - this operation may not be added to any program trace
>> on which it was not previously executed (same as notrap?)
>
> How is this different from, say, a store to the heap? Or a volatile
> load? If not, maybe nospeculate operations can just be modeled as
> writing to the heap?

They’re more akin to integer divisions. They may in fact be purely arithmetic operations (say, computing a cross-thread gradient), and treating them like heap stores will significantly over-pessimize the program.

-Owen
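As a quick sanity check of the `if (a & b)` split discussed above, the sketch below enumerates all inputs and compares the traces of the two forms. It assumes a and b are single-bit (0 or 1) values; the function names and the list-based trace are hypothetical helpers, not anything from LLVM.

```python
from itertools import product

def combined(a, b, trace):
    # if (a & b) convergent();
    if a & b:
        trace.append("convergent")

def nested(a, b, trace):
    # if (a) if (b) convergent();
    if a:
        if b:
            trace.append("convergent")

def traces_match():
    """Return True if both forms execute convergent() on exactly the
    same inputs, for every combination of single-bit a and b."""
    for a, b in product([0, 1], repeat=2):
        t1, t2 = [], []
        combined(a, b, t1)
        nested(a, b, t2)
        if t1 != t2:
            return False
    return True

print(traces_match())  # True
```

The call runs on the same set of traces in both forms, and it depends on a and b both before and after, so no new control dependence is introduced.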
Owen Anderson via llvm-dev
2015-Sep-08 16:52 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
+llvm-commits as well.

-Owen
Owen Anderson via llvm-dev
2015-Sep-12 05:27 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
Ping?

-Owen
Philip Reames via llvm-dev
2015-Sep-14 19:15 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote:

> Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may neither be sunk into nor hoisted out of, a condition. This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, but our intuition indicates that this dependence was somehow trivial. More concretely, all know uses of convergent are OK with full unrolling, making this semantic undesirable. Related problems arise in loop unswitching as well.

I don't understand this point. Loop unrolling specifically won't change which indices actually run. It might result in code duplication with a subset of indices taking one of two paths. Does today's convergent also imply no-duplicate? Is that what you're trying to relax?

> The proposed change is to split the semantics of convergent into two annotations:
> convergent - this operation may not be made control dependent on any additional values (aka may not be sunk into a condition)

To be clear, this is a restriction of current semantics only, right?

> nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?)

Isn't this already true of all instructions? Unless we can *prove* that speculating an instruction can't introduce faults, we can't speculate it ever. An unknown call or intrinsic should already have this property, right?

This part of the proposal doesn't feel mature to me. In particular, I think we need to talk about what other cases we want to handle w.r.t. speculation attributes and what our model is.

One case I want to support is for small functions which only read argument memory (i.e. argmemonly readonly nounwind) which are guaranteed (by the frontend) to fault only if a) the pointer passed in is null, or b) the memory state on entry is different than the one the context should have ensured. (The second part is standard. The first allows speculation in more cases.)

I'd suggest promoting this to its own thread. Once we settle on a workable model for safe speculation attributes, we can revisit how we want to change the convergent attribute.
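The speculation concern can be illustrated with integer division, where hoisting the operation out of its guard adds it to a trace on which it did not previously execute. This is only a sketch: Python's `ZeroDivisionError` stands in for a hardware trap, and `guarded`, `speculated`, and `faults` are hypothetical names.

```python
def guarded(n, d):
    # The division only runs on traces where d != 0.
    if d != 0:
        return n // d
    return 0

def speculated(n, d):
    # The division has been hoisted above the guard, so it now also
    # runs on the d == 0 trace, where it faults.
    q = n // d
    if d != 0:
        return q
    return 0

def faults(fn, n, d):
    """Return True if calling fn(n, d) raises a division fault."""
    try:
        fn(n, d)
        return False
    except ZeroDivisionError:
        return True

print(faults(guarded, 6, 0), faults(speculated, 6, 0))  # False True
```

Both versions agree whenever d is nonzero; they differ only on the trace the original code never executed the division on, which is the situation a speculation attribute has to rule out.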
Owen Anderson via llvm-dev
2015-Sep-14 19:30 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote:

>> Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may neither be sunk into nor hoisted out of, a condition. This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, but our intuition indicates that this dependence was somehow trivial. More concretely, all know uses of convergent are OK with full unrolling, making this semantic undesirable. Related problems arise in loop unswitching as well.
>
> I don't understand this point. Loop unrolling specifically won't change which indices actually run. It might result in code duplication with a subset of indices taken one of two paths. Does today's convergent also imply no-duplicate? Is that what you're trying to relax?

The definition today says that we cannot remove a control dependence. Since the loop counter is eliminated entirely, one can argue that we have eliminated the control dependence on it. I agree that there’s an intuitive argument that the dependence on the loop counter was trivial, but I have no idea how to formalize that.

While resolving the question re: loop unrolling is nice, I actually think providing answers on which loop unswitching transforms are legal is the more novel part of this change.

>> The proposed change is to split the semantics of convergent into two annotations:
>> convergent - this operation may not be made control dependent on any additional values (aka may not be sunk into a condition)
>
> To be clear, this is a restriction of current semantics only right?

That depends on what you mean by restriction. It allows strictly more code motion than is allowed by the current semantics.

>> nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?)
>
> Isn't this already true of all instructions? Unless we can *prove* that speculating an instruction can't introduce faults, we can't speculate it ever. An unknown call or intrinsic should already have this property right?

Possibly? We probably need a safetospeculate attribute, then.

> This part of the proposal doesn't feel mature to me. In particular, I think we need to talk about what other cases we want to handle w.r.t. speculation attributes and what our model is.
>
> One case I want to support is for small functions which only read argument memory (i.e. argmemonly readonly nounwind) which are guaranteed (by the frontend) to fault only if a) the pointer passed in is null, or b) the memory state on entry is different that the one the context should have ensured. (The second part is standard. The first allows speculation in more cases.)
>
> I'd suggest promoting this to it's own thread. Once we settle on a workable model for safe speculation attributes, we can revisit how we want to change the convergent attribute.

We can certainly start a separate conversation about safe speculation attributes. If what you suggest is true, that we already treat intrinsics conservatively WRT speculation, then I don’t think we need to block progress on convergent on that.

-Owen
Jingyue Wu via llvm-dev
2015-Sep-22 17:33 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
Hi Owen,

This is very interesting.

How different is "convergent" from "uniform"? An instruction is uniform if threads in the same SIMT unit (e.g. warp) do not diverge when executing this instruction.

I ask this because Bjarke recently came up with a mathematical definition of uniformity. I wonder if that is a foundation "convergent" needs as well. AFAICT, Bjarke's definition of "uniformity" is less restrictive than "convergent". For example, it allows loop unswitching the following code if "c" is uniform, which seems a case you ideally want to allow.

DISALLOWED:
  for (…) {
    if (c) { … }
    convergent();
  }

Jingyue
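To make the uniform versus divergent distinction concrete, here is a small sketch assuming a very simple SIMT model in which each thread in a group carries its own value of c. The model, the call-site labels, and the function names are illustrative assumptions and are not part of the proposal. It records which threads arrive at which copy of convergent() once the DISALLOWED loop has been unswitched.

```python
def call_sites_unswitched(c_per_thread):
    """Group thread ids by the copy of convergent() they reach after
    unswitching:
        if (c) { for (...) { ...; convergent(); } }
        else   { for (...) { convergent(); } }
    """
    sites = {}
    for tid, c in enumerate(c_per_thread):
        site = "then_loop" if c else "else_loop"
        sites.setdefault(site, set()).add(tid)
    return sites

def stays_converged(c_per_thread):
    """True if every thread still reaches the same call site."""
    everyone = set(range(len(c_per_thread)))
    groups = call_sites_unswitched(c_per_thread).values()
    return any(group == everyone for group in groups)

print(stays_converged([True, True, True, True]))    # True
print(stays_converged([True, False, True, False]))  # False
```

With a uniform c all threads land in the same copy of the loop, so the group executing convergent() is unchanged; with a divergent c the group is split across two call sites, which is the behavior the conservative rule guards against when uniformity is unknown.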
Owen Anderson via llvm-dev
2015-Sep-22 17:39 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
Hi Jingyue,

I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc. The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models (CUDA, OpenCL, individual GPU vendors, etc.) may layer more permissive semantics on top of it in code that is specific to that programming model.

-Owen