Sanjoy Das via llvm-dev
2016-Feb-21 20:24 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Hi Andy, Thanks for replying, responses inline below: On Fri, Feb 19, 2016 at 11:12 AM, Andrew Trick <atrick at apple.com> wrote:> This clearly doesn't need operand bundles, but using an intrinsic > would permit special code motion semantics. It could be hoisted and > merged with other traps, but the condition could never be widened > beyond the union of the original conditions. Unlike deoptimizing > guards, it would need to be sequenced with memory barriers, but couldBy memory barrier, do you mean things like fences?> otherwise be hoisted as readnone.Can you clarify this a little bit? Are you talking about things like: *ptr = 42 @trap_if(%foo) => @trap_if(%foo) *ptr = 42 ? I.e. if a failing guard postdominates a point "P" in the program, then the guard can be failed early at P. (We'd want a stronger condition than postdomination, since we won't want to hoist while (!*cond) { } @trap_if(*cond) => @trap_if(*cond) while (!*cond) { } ) [snip]> Of course it would be nice to consider these intrinsics > readonly. (One thing we never fixed with patchpoints is the memory > effects and that was probably hurting performance.) But has this bug > been fixed yet: http://llvm.org/pr18912 Optimizer should consider > "maythrow" calls (those without "nounwind) as having side effects? I > vaguely remember some work being done to improve the situation, but > last I checked LICM was still violating it, and who knows what else?The specific case in the bug was fixed by David in http://reviews.llvm.org/rL256728. But I agree with your concern, that this notion of "readnone calls are always okay to remove" may have leaked into other parts of LLVM.> BTW, if you do want readonly semantics, why would you want readonly > to be implicit instead of explicit?I don't understand the question. :) What's explicit vs. implicit here?> I think you would need to mark @guard_on as may-unwind AND fix any > lingering assumptions in LLVM that readonly calls are nounwind. (Loads > can be CSE's across unwind calls but can't be hoisted across them).Yes, but a failed guard doesn't exactly "unwind". I.e. if A invokes B and B fails a guard, then the guard may not always unwind to the A->B callsite's landingpad, but can also return to its normal no-exception successor; unless you inline B into A, in which case "B"s guard (now physically present in the A+B function) will return from A if it fails. In other words, we'll have to re-purpose the term "unwind" to mean either "throwing an exception" or "failed a guard". I'm fine with this, but it is a choice we need to make explicitly.> Another good point mentioned later in the thread is that a readonly > callee should *not* imply a readonly @guard_on or other "deopt" call > site. The simple solution for this is to disallow "deopt" of readonly > calls.Assuming we're talking about the same thing, the concern was more along the lines of: you cannot do IPA on a callsite that calls something that contains a potential deoptimization point, be it either for a @guard_on call, or a "normal" deoptimization safepoint. This is because if you have: ``` void @something() void callee() { call @something() [ "deopt"(state0) ] *ptr = 100; } void caller() { call @callee() // whether this is a "deopt" call site or a // a normal call site doesn't matter int k = *ptr; } ``` you cannot, say, fold `k` to `100` in @caller() because the deopt state, state0, if resumed to may actually write something else to *ptr. Making the call to @something() read/write/may unwind does not solve the problem -- even if the call to @something() wrote to *ptr (or threw an exception), we could still fold k to 100. The novel control flow here is that `@caller`, if deoptimized with `state0`, will execute some arbitrary code, and **return** to @caller at the callsite to @callee. (Btw: things are a little different inside our JVM because of the way we register dependencies, but I'm trying to make a LLVM-centric argument here.) At this point, I think it is most straightforward to do a slight variant of what Philip suggested to me offline: if a function can deoptimize, it needs to be marked as "mayBeOverridden", preferably by introducing a new function attribute or by setting the linkage type to one of those checked by GlobalValue::mayBeOverridden. This is the responsibility of the frontend, and resuming to the "deopt" continuation in a physical frame where the enclosing function was not marked mayBeOverridden is UB. -- Sanjoy
Andrew Trick via llvm-dev
2016-Feb-22 18:08 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
> On Feb 21, 2016, at 12:24 PM, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi Andy, > > Thanks for replying, responses inline below: > > On Fri, Feb 19, 2016 at 11:12 AM, Andrew Trick <atrick at apple.com> wrote: >> This clearly doesn't need operand bundles, but using an intrinsic >> would permit special code motion semantics. It could be hoisted and >> merged with other traps, but the condition could never be widened >> beyond the union of the original conditions. Unlike deoptimizing >> guards, it would need to be sequenced with memory barriers, but could > > By memory barrier, do you mean things like fences?Yes, the ‘fence’ instruction (not to be confused with GC barriers)>> otherwise be hoisted as readnone. > > Can you clarify this a little bit? Are you talking about things like: > > *ptr = 42 > @trap_if(%foo) > => > @trap_if(%foo) > *ptr = 42 > > ? I.e. if a failing guard postdominates a point "P" in the program, > then the guard can be failed early at P.That’s right. Conservatively, I would not hoist at the LLVM level past opaque non-readonly calls or fences.> (We'd want a stronger condition than postdomination, since we won't > want to hoist > > while (!*cond) { } > @trap_if(*cond) > => > @trap_if(*cond) > while (!*cond) { } > )Sorry, if you need an infinite loop you need to hide it in an opaque call or perform a volatile read.>> Of course it would be nice to consider these intrinsics >> readonly. (One thing we never fixed with patchpoints is the memory >> effects and that was probably hurting performance.) But has this bug >> been fixed yet: http://llvm.org/pr18912 Optimizer should consider >> "maythrow" calls (those without "nounwind) as having side effects? I >> vaguely remember some work being done to improve the situation, but >> last I checked LICM was still violating it, and who knows what else? > > The specific case in the bug was fixed by David in > http://reviews.llvm.org/rL256728. But I agree with your concern, that > this notion of "readnone calls are always okay to remove" may have > leaked into other parts of LLVM.I've begun to think that may-unwind (or may fail guard) and readonly should be mutually exclusive. readonly refers to all system memory, not just LLVM-visible memory. To achieve the effect of "aliases-with-nothing-in-llvm" it's cleaner to use alias analysis.>> BTW, if you do want readonly semantics, why would you want readonly >> to be implicit instead of explicit? > > I don't understand the question. :) What's explicit vs. implicit here?I meant readonly can always be a property of the call site as opposed an intrinsic property for more frontend control>> I think you would need to mark @guard_on as may-unwind AND fix any >> lingering assumptions in LLVM that readonly calls are nounwind. (Loads >> can be CSE's across unwind calls but can't be hoisted across them). > > Yes, but a failed guard doesn't exactly "unwind". I.e. if A invokes B > and B fails a guard, then the guard may not always unwind to the A->B > callsite's landingpad, but can also return to its normal no-exception > successor; unless you inline B into A, in which case "B"s guard (now > physically present in the A+B function) will return from A if it > fails. In other words, we'll have to re-purpose the term "unwind" to > mean either "throwing an exception" or "failed a guard". I'm fine > with this, but it is a choice we need to make explicitly.I was thinking that that "unwind" would be repurposed as you say. But after responding to your points below, I think that could be misleading. It's probably cleaner to adhere to the rule that unwinding can only resume in a landing pad.>> Another good point mentioned later in the thread is that a readonly >> callee should *not* imply a readonly @guard_on or other "deopt" call >> site. The simple solution for this is to disallow "deopt" of readonly >> calls. > > Assuming we're talking about the same thing, the concern was more > along the lines of: you cannot do IPA on a callsite that calls > something that contains a potential deoptimization point, be it either > for a @guard_on call, or a "normal" deoptimization safepoint. This is > because if you have: > > ``` > void @something() > > void callee() { > call @something() [ "deopt"(state0) ] > *ptr = 100; > } > > void caller() { > call @callee() // whether this is a "deopt" call site or a > // a normal call site doesn't matter > int k = *ptr; > } > ``` > > you cannot, say, fold `k` to `100` in @caller() because the deopt > state, state0, if resumed to may actually write something else to > *ptr. > > Making the call to @something() read/write/may unwind does not solve > the problem -- even if the call to @something() wrote to *ptr (or > threw an exception), we could still fold k to 100. The novel control > flow here is that `@caller`, if deoptimized with `state0`, will > execute some arbitrary code, and **return** to @caller at the callsite > to @callee. > > (Btw: things are a little different inside our JVM because of the way > we register dependencies, but I'm trying to make a LLVM-centric > argument here.) > > At this point, I think it is most straightforward to do a slight > variant of what Philip suggested to me offline: if a function can > deoptimize, it needs to be marked as "mayBeOverridden", preferably by > introducing a new function attribute or by setting the linkage type to > one of those checked by GlobalValue::mayBeOverridden. This is the > responsibility of the frontend, and resuming to the "deopt" > continuation in a physical frame where the enclosing function was not > marked mayBeOverridden is UB.I understand your problem. I was ignoring an aspect of analyzing the unwind path and the fact the LLVM could make assumptions about where control will resume. mayBeOverriden makes sense. I would think that the symbol at the deopt call site needs to be marked mayBeOverriden. -Andy
Sanjoy Das via llvm-dev
2016-Feb-22 19:02 UTC
[llvm-dev] RFC: Add guard intrinsics to LLVM
Hi Andy, Responses inline: On Mon, Feb 22, 2016 at 10:08 AM, Andrew Trick <atrick at apple.com> wrote:>> By memory barrier, do you mean things like fences? > That’s right. Conservatively, I would not hoist at the LLVM level past > opaque non-readonly calls or fences.Just for curiosity: why do you care about memory fences specifically?> I've begun to think that may-unwind (or may fail guard) and readonly > should be mutually exclusive. readonly refers to all system memory, > not just LLVM-visible memory. To achieve the effect of > "aliases-with-nothing-in-llvm" it's cleaner to use alias analysis.(I've put this on the bug as well:) https://llvm.org/bugs/show_bug.cgi?id=18912#c3 ``` (In reply to comment #2)> The test case in this bug report is fixed by > http://reviews.llvm.org/rL256728. > > I'm closing this because I don't have a test case and I no longer think it > makes much sense to mark may-unwind calls "readonly". An unwind path always > touches some memory. That fact that it doesn't alias with LLVM-visibleI'm not sure about this -- why does an unwind path *have* to write some memory? Can't you implement unwind as "read the appropriate exception handler RPC from a table, and return to that RPC"? ```> memory access can be handled by alias analysis.Btw, I think in this interpretation it is incorrect for -functionattrs to infer readnone for @foo in (which it does today): define void @foo({ i8*, i32 } %val) personality i8 8 { resume { i8*, i32 } %val }> I meant readonly can always be a property of the call site as opposed an intrinsic > property for more frontend controlYes, I agree that if we decide that readonly semantics are fine, we can always explicitly mark these as readonly, and not have to worry about specifying the memory effects as part of the intrinsics definition. Part of the discussion here has devolved to is "readonly" is correct.> I was thinking that that "unwind" would be repurposed as you say. But > after responding to your points below, I think that could be > misleading. It's probably cleaner to adhere to the rule that unwinding > can only resume in a landing pad.Ok.> I understand your problem. I was ignoring an aspect of analyzing the > unwind path and the fact the LLVM could make assumptions about where > control will resume. > > mayBeOverriden makes sense. I would think that the symbol at the deopt > call site needs to be marked mayBeOverriden.Ok. I'll probably split off an "introduce a `interposable` function attribute" into an independent review (I think re-using a linkage type for this will be somewhat of a hack). Thanks! -- Sanjoy