Dan Gohman <gohman at apple.com> writes:

> For example, suppose we want to convert the && to &, and the ?: to a
> select, in this code:
>
>   if (a && (b ? (c + d) : e)) {
>
> because we have a CPU architecture with poor branch prediction, or
> because we want more ILP, or because of some other reason. Here's what the
> LLVM IR for that might look like:
>
>   %t0 = add nsw i32 %c, %d
>   %t1 = select i1 %b, i32 %t0, i32 %e
>   %t2 = icmp ne i32 %t1, 0
>   %t3 = and i1 %a, %t2
>   br i1 %t3, ...
>
> The extra branching is gone, yay. But now we've put an add nsw out there
> to be executed unconditionally. If we make the select an observation
> point, we'd have introduced undefined behavior on a path that didn't
> previously have it.

Unless the undefined behavior only triggered if the select actually
produced a poisoned result. Then it should have the same behavior as
the branch, no?

> A foodtaster instruction doesn't really solve this problem, because
> we'd have to put it between the add and the select, and it would be
> just as problematic.

Or you put it immediately after the select.

> One could argue that aggressive speculation is a low-level optimization
> better suited to CodeGen than the optimizer, as LLVM divides them, and
> that perhaps the cost for providing this level of flexibility in the
> optimizer is too high, but that's a different argument.

No, I think we want the flexibility. But I believe there are sane ways
to do this.

-Dave
On Tue, Dec 6, 2011 at 9:06 AM, David A. Greene <greened at obbligato.org> wrote:

> Dan Gohman <gohman at apple.com> writes:
>
> > For example, suppose we want to convert the && to &, and the ?: to a
> > select, in this code:
> >
> >   if (a && (b ? (c + d) : e)) {
> >
> > because we have a CPU architecture with poor branch prediction, or
> > because we want more ILP, or because of some other reason. Here's what the
> > LLVM IR for that might look like:
> >
> >   %t0 = add nsw i32 %c, %d
> >   %t1 = select i1 %b, i32 %t0, i32 %e
> >   %t2 = icmp ne i32 %t1, 0
> >   %t3 = and i1 %a, %t2
> >   br i1 %t3, ...
> >
> > The extra branching is gone, yay. But now we've put an add nsw out there
> > to be executed unconditionally. If we make the select an observation
> > point, we'd have introduced undefined behavior on a path that didn't
> > previously have it.
>
> Unless the undefined behavior only triggered if the select actually
> produced a poisoned result. Then it should have the same behavior as
> the branch, no?
>
> > A foodtaster instruction doesn't really solve this problem, because
> > we'd have to put it between the add and the select, and it would be
> > just as problematic.
>
> Or you put it immediately after the select.

That was my thinking. The select is an observation point for its first
operand, but then merely propagates poison from the second or third, just
like any computational instruction would.
The icmp is an observation point for both inputs.

Pogo
On Dec 6, 2011, at 10:07 AM, Paul Robinson wrote:

> On Tue, Dec 6, 2011 at 9:06 AM, David A. Greene <greened at obbligato.org> wrote:
> Dan Gohman <gohman at apple.com> writes:
>
> > For example, suppose we want to convert the && to &, and the ?: to a
> > select, in this code:
> >
> >   if (a && (b ? (c + d) : e)) {
> >
> > because we have a CPU architecture with poor branch prediction, or
> > because we want more ILP, or because of some other reason. Here's what the
> > LLVM IR for that might look like:
> >
> >   %t0 = add nsw i32 %c, %d
> >   %t1 = select i1 %b, i32 %t0, i32 %e
> >   %t2 = icmp ne i32 %t1, 0
> >   %t3 = and i1 %a, %t2
> >   br i1 %t3, ...
> >
> > The extra branching is gone, yay. But now we've put an add nsw out there
> > to be executed unconditionally. If we make the select an observation
> > point, we'd have introduced undefined behavior on a path that didn't
> > previously have it.
>
> Unless the undefined behavior only triggered if the select actually
> produced a poisoned result. Then it should have the same behavior as
> the branch, no?
>
> > A foodtaster instruction doesn't really solve this problem, because
> > we'd have to put it between the add and the select, and it would be
> > just as problematic.
>
> Or you put it immediately after the select.
>
> That was my thinking. The select is an observation point for its first operand,
> but then merely propagates poison from the second or third, just like any
> computational instruction would.
> The icmp is an observation point for both inputs.

Put the observation point on the add, the select, the icmp, or even the
and, or between any of them, and it'll still be before the branch. That
means that the code will have unconditional undefined behavior when the
add overflows, which the original code did not have.

Dan
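
To make the disagreement concrete, here is a minimal sketch in C that models the five-instruction sequence with an explicit poison flag. Everything in it is an assumption made for illustration, not LLVM's actual semantics or API: the PVal type and the helper names are hypothetical, the select is given the most lenient reading from the thread (it forwards poison only from the operand it actually picks), and the only observation point is the final branch. Even under that reading, the case a = false, b = true with an overflowing c + d reaches the branch with poison, while the original short-circuit code never evaluates the add.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model: an i32/i1 value paired with a poison flag. */
typedef struct { int value; bool poison; } PVal;

static PVal pv(int v) { PVal r = { v, false }; return r; }

/* Models "add nsw": the result is poison on signed overflow. */
static PVal add_nsw(PVal x, PVal y) {
    PVal r = { 0, x.poison || y.poison };
    if (!r.poison) {
        long long wide = (long long)x.value + (long long)y.value;
        if (wide > INT_MAX || wide < INT_MIN) r.poison = true;
        else r.value = (int)wide;
    }
    return r;
}

/* Assumed lenient select: forwards poison only from the chosen operand. */
static PVal poison_select(PVal cond, PVal t, PVal f) {
    if (cond.poison) { PVal r = { 0, true }; return r; }
    return cond.value ? t : f;
}

/* icmp ne x, 0 and the i1 "and" both simply propagate poison. */
static PVal icmp_ne_zero(PVal x) { PVal r = { x.value != 0, x.poison }; return r; }
static PVal poison_and(PVal x, PVal y) {
    PVal r = { x.value & y.value, x.poison || y.poison };
    return r;
}

/* Branch-free form: true if the final br would see a poison condition. */
bool transformed_branches_on_poison(bool a, bool b, int c, int d, int e) {
    PVal t0 = add_nsw(pv(c), pv(d));
    PVal t1 = poison_select(pv(b), t0, pv(e));
    PVal t2 = icmp_ne_zero(t1);
    PVal t3 = poison_and(pv(a), t2);
    return t3.poison;
}

/* Original form: true only if c + d is actually evaluated and overflows. */
bool original_evaluates_overflow(bool a, bool b, int c, int d) {
    if (!a) return false;   /* && short-circuits before the ?: */
    if (!b) return false;   /* ?: picks e, so the add never runs */
    return add_nsw(pv(c), pv(d)).poison;
}

/* The case from the thread: a is false, b is true, c + d overflows. */
void check_counterexample(void) {
    assert(!original_evaluates_overflow(false, true, INT_MAX, 1));
    assert(transformed_branches_on_poison(false, true, INT_MAX, 1, 0));
}
```

In this model the difference comes from the and, which has no way to ignore its poisoned second operand when a is false. Moving the observation point earlier, onto the icmp or the select, can only make the transformed code undefined in more cases, never fewer.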