Owen Anderson via llvm-dev
2015-Sep-14 19:30 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote: > > On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote: >> Hi all, >> >> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas. >> >> Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may neither be sunk into nor hoisted out of, a condition. This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, but our intuition indicates that this dependence was somehow trivial. More concretely, all know uses of convergent are OK with full unrolling, making this semantic undesirable. Related problems arise in loop unswitching as well. > I don't understand this point. Loop unrolling specifically won't change which indices actually run. It might result in code duplication with a subset of indices taken one of two paths. Does today's convergent also imply no-duplicate? Is that what you're trying to relax?The definition today says that we cannot remove a control dependence. Since the loop counter is eliminated entirely, one can argue that we have eliminated the control dependence on it. I agree that there’s an intuitive argument that the dependence on the loop counter was trivial, but I have no idea how to formalize that. While resolving the question re: loop unrolling is nice, I actually thinking providing answers on which loop unswitching transforms are legal is actually the more novel part of this change.>> >> The proposed change is to split the semantics of convergent into two annotations: >> convergent - this operation may not be made control dependent on any additional values (aka may not be sunk into a condition) > To be clear, this is a restriction of current semantics only right?That depends on what you mean by restriction. It allow strictly more code motion than is allowed by the current semantics.>> nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?) > Isn't this already true of all instructions? Unless we can *prove* that speculating an instruction can't introduce faults, we can't speculate it ever. An unknown call or intrinsic should already have this property right?Possibly? We probably need a safetospeculate attribute, then.> > This part of the proposal doesn't feel mature to me. In particular, I think we need to talk about what other cases we want to handle w.r.t. speculation attributes and what our model is. > > One case I want to support is for small functions which only read argument memory (i.e. argmemonly readonly nounwind) which are guaranteed (by the frontend) to fault only if a) the pointer passed in is null, or b) the memory state on entry is different that the one the context should have ensured. (The second part is standard. The first allows speculation in more cases.) > > I'd suggest promoting this to it's own thread. Once we settle on a workable model for safe speculation attributes, we can revisit how we want to change the convergent attribute.We can certainly start a separate conversation about safe speculation attribute. If what you suggest is true that we already treat intrinsics conservatively WRT speculation, then I don’t think we need to block progress on convergent on that. —Owen
Philip Reames via llvm-dev
2015-Sep-14 22:41 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
On 09/14/2015 12:30 PM, Owen Anderson wrote:>> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote: >> >> On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote: >>> Hi all, >>> >>> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas. >>> >>> Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may neither be sunk into nor hoisted out of, a condition. This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, but our intuition indicates that this dependence was somehow trivial. More concretely, all know uses of convergent are OK with full unrolling, making this semantic undesirable. Related problems arise in loop unswitching as well. >> I don't understand this point. Loop unrolling specifically won't change which indices actually run. It might result in code duplication with a subset of indices taken one of two paths. Does today's convergent also imply no-duplicate? Is that what you're trying to relax? > The definition today says that we cannot remove a control dependence. Since the loop counter is eliminated entirely, one can argue that we have eliminated the control dependence on it. I agree that there’s an intuitive argument that the dependence on the loop counter was trivial, but I have no idea how to formalize that.I'm still not getting the challenge. I can write a loop unroll as a two step transform. 1) Introduce another predicated copy of the loop, reduce the iteration count by 1/2. 2) Try to prove the predicated version always execute. (1) involves code duplication, but doesn't change the dynamic branches executed along any path or the state of any index when hitting the convergent operation. (2) is merely constant folding and/or CSE. While it reduces the dynamic branches, it does so in a very structured way. Is the problem purely that our current definition is a purely static one? Does our current definition allow (1) above?> > While resolving the question re: loop unrolling is nice, I actually thinking providing answers on which loop unswitching transforms are legal is actually the more novel part of this change.I think you can probably break down the argument the same as I did for loop unswitch. Step 1 - Introduce a new control dependency which doesn't change the state of any index along any path. Step 2 - Prove away a trivial branch. Having said that, I don't care enough about the "convergent" attribute to block you here. Do what you want. :)> >>> The proposed change is to split the semantics of convergent into two annotations: >>> convergent - this operation may not be made control dependent on any additional values (aka may not be sunk into a condition) >> To be clear, this is a restriction of current semantics only right? > That depends on what you mean by restriction. It allow strictly more code motion than is allowed by the current semantics.Ok. That's what I was trying to say, but said badly.> >>> nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?) >> Isn't this already true of all instructions? Unless we can *prove* that speculating an instruction can't introduce faults, we can't speculate it ever. An unknown call or intrinsic should already have this property right? > Possibly? We probably need a safetospeculate attribute, then.The tricky part comes in defining what set of preconditions we want to encode in "safe". An unconditionally safe to speculate function isn't very common. So far, most of the cases I've seen come down to "safe if X" where X is "single argument which is non-null". But that's pretty specific to my frontend. I suspect we want to gather other inputs here.> >> This part of the proposal doesn't feel mature to me. In particular, I think we need to talk about what other cases we want to handle w.r.t. speculation attributes and what our model is. >> >> One case I want to support is for small functions which only read argument memory (i.e. argmemonly readonly nounwind) which are guaranteed (by the frontend) to fault only if a) the pointer passed in is null, or b) the memory state on entry is different that the one the context should have ensured. (The second part is standard. The first allows speculation in more cases.) >> >> I'd suggest promoting this to it's own thread. Once we settle on a workable model for safe speculation attributes, we can revisit how we want to change the convergent attribute. > We can certainly start a separate conversation about safe speculation attribute. If what you suggest is true that we already treat intrinsics conservatively WRT speculation, then I don’t think we need to block progress on convergent on that.Great! Looking at isSafeToSpeculate, we definitely return false for unknown targets. If you find a place we're not conservative here, please let me know. It'd be a potentially serious bug.> > —Owen
Owen Anderson via llvm-dev
2015-Sep-15 03:23 UTC
[llvm-dev] [RFC] Refinement of convergent semantics
> On Sep 14, 2015, at 3:41 PM, Philip Reames <listmail at philipreames.com> wrote: > > > > On 09/14/2015 12:30 PM, Owen Anderson wrote: >>> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote: >>> >>> On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote: >>>> Hi all, >>>> >>>> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas. >>>> >>>> Today, convergent operations may only be moved into control-equivalent locations, or, in layman’s terms, a convergent operation may neither be sunk into nor hoisted out of, a condition. This causes problems for full loop unrolling, as the control dependence on the loop counter is eliminated, but our intuition indicates that this dependence was somehow trivial. More concretely, all know uses of convergent are OK with full unrolling, making this semantic undesirable. Related problems arise in loop unswitching as well. >>> I don't understand this point. Loop unrolling specifically won't change which indices actually run. It might result in code duplication with a subset of indices taken one of two paths. Does today's convergent also imply no-duplicate? Is that what you're trying to relax? >> The definition today says that we cannot remove a control dependence. Since the loop counter is eliminated entirely, one can argue that we have eliminated the control dependence on it. I agree that there’s an intuitive argument that the dependence on the loop counter was trivial, but I have no idea how to formalize that. > I'm still not getting the challenge. I can write a loop unroll as a two step transform. 1) Introduce another predicated copy of the loop, reduce the iteration count by 1/2. 2) Try to prove the predicated version always execute. (1) involves code duplication, but doesn't change the dynamic branches executed along any path or the state of any index when hitting the convergent operation. (2) is merely constant folding and/or CSE. While it reduces the dynamic branches, it does so in a very structured way. > > Is the problem purely that our current definition is a purely static one? Does our current definition allow (1) above? >> >> While resolving the question re: loop unrolling is nice, I actually thinking providing answers on which loop unswitching transforms are legal is actually the more novel part of this change. > I think you can probably break down the argument the same as I did for loop unswitch. Step 1 - Introduce a new control dependency which doesn't change the state of any index along any path. Step 2 - Prove away a trivial branch. > > Having said that, I don't care enough about the "convergent" attribute to block you here. Do what you want. :)The problem in both proposed transformations sequences is that you have to add a new control dependence, which is not allowed by either the old or new definitions of convergent. Coming up with a definition that allows it will be very difficult without introducing more details of SPMD programming models, since the “real” constraint is that only uniform control dependences can be added. That happens to be true in both cases outlined here (since the loop count is thread-invariant), but would not necessarily be true in other transformations. However, we can achieve the same end result without needing to introduce more details of SPMD programming into LLVM IR by removing the rule preventing the removal of control dependences. That’s basically what I’m proposing here.>> >>>> nospeculate - this operation may not be added to any program trace on which it was not previously executed (same as notrap?) >>> Isn't this already true of all instructions? Unless we can *prove* that speculating an instruction can't introduce faults, we can't speculate it ever. An unknown call or intrinsic should already have this property right? >> Possibly? We probably need a safetospeculate attribute, then. > The tricky part comes in defining what set of preconditions we want to encode in "safe". An unconditionally safe to speculate function isn't very common. So far, most of the cases I've seen come down to "safe if X" where X is "single argument which is non-null". But that's pretty specific to my frontend. I suspect we want to gather other inputs here.I actually have plenty of intrinsics that are universally safe to speculate, because they are essentially convergent math operations. —Owen