Friedman, Eli via llvm-dev
2017-May-16 18:18 UTC
[llvm-dev] [RFC] Canonicalization of unsigned subtraction with saturation
On 5/16/2017 6:30 AM, Sanjay Patel wrote:> Thanks for posting this question, Julia. > > I had a similar question about a signed min/max variant here: > http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html > > The 2nd version in each case contains a canonical max/min > representation in IR, and this could enable more IR analysis. > A secondary advantage is that the backend recognizes the max/min in > the second IR form when creating DAG nodes, > and this directly affects isel for many targets.This seems important. And pattern-matching max(x,y)-y to a saturating subtract seems easy in the backend.> A possibly important difference between the earlier example and the > current unsigned case: > is a select with a zero constant operand easier to reason about in IR > than the canonical min/max?It might be in some cases... maybe? I mean, it might be easier to analyze in ComputeMaskedBits or something, but we don't really do much to optimize selects involving zero. -Eli> On Tue, May 16, 2017 at 5:30 AM, Koval, Julia <julia.koval at intel.com > <mailto:julia.koval at intel.com>> wrote: > > (1.16) > %cmp = icmp ugt i16 %x, %y > %sub2 = sub i16 %y, %x > %res = select i1 %cmp, i16 0, i16 %sub2 > > or > > (2.16) > %cmp = icmp ugt i16 %x, %y > %sel = select i1 %cmp, i16 %x, i16 %y > %sub = sub i16 %sel, %x > > Which of these versions is canonical? I think first version is > better, because it can be converted to unsigned saturation > instruction(i.e. PSUBUS), using existing backend code. > >-- Employee of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170516/2346d49b/attachment.html>
Sanjay Patel via llvm-dev
2017-May-16 19:49 UTC
[llvm-dev] [RFC] Canonicalization of unsigned subtraction with saturation
On Tue, May 16, 2017 at 12:18 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:> On 5/16/2017 6:30 AM, Sanjay Patel wrote: > > Thanks for posting this question, Julia. > > I had a similar question about a signed min/max variant here: > http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html > > The 2nd version in each case contains a canonical max/min representation > in IR, and this could enable more IR analysis. > A secondary advantage is that the backend recognizes the max/min in the > second IR form when creating DAG nodes, > and this directly affects isel for many targets. > > > This seems important. And pattern-matching max(x,y)-y to a saturating > subtract seems easy in the backend. > > A possibly important difference between the earlier example and the > current unsigned case: > is a select with a zero constant operand easier to reason about in IR than > the canonical min/max? > > > It might be in some cases... maybe? I mean, it might be easier to analyze > in ComputeMaskedBits or something, but we don't really do much to optimize > selects involving zero. >Because of my CPU upbringing, I always see this: select A, B, 0 as: and (sext A), B Any chance of control-flow is bad! ...but now I know that's wrong for IR. :) So forming the min/max sounds like the right answer to me. Note that we don't actually canonicalize the signed min/max cases from the earlier thread yet. We detect those in value tracking, and that was good enough to produce the ideal backend results, but I haven't gotten back to doing the transform in IR.> -Eli > > On Tue, May 16, 2017 at 5:30 AM, Koval, Julia <julia.koval at intel.com> > wrote: > >> (1.16) >> %cmp = icmp ugt i16 %x, %y >> %sub2 = sub i16 %y, %x >> %res = select i1 %cmp, i16 0, i16 %sub2 >> >> or >> >> (2.16) >> %cmp = icmp ugt i16 %x, %y >> %sel = select i1 %cmp, i16 %x, i16 %y >> %sub = sub i16 %sel, %x >> >> Which of these versions is canonical? I think first version is better, >> because it can be converted to unsigned saturation instruction(i.e. >> PSUBUS), using existing backend code. >> > > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170516/0ebac87d/attachment.html>
Koval, Julia via llvm-dev
2017-May-26 09:46 UTC
[llvm-dev] [RFC] Canonicalization of unsigned subtraction with saturation
So, can we declare the second version as a cannonical? Second version also has another advantage. If operands of select are different types, it still works as max. In the first case it is separated by trunk and not converts to max instruction in the end. Here is an example for 32 and 16 bit: (1) void foo(unsigned short *p, int max, int n) { int i; unsigned m; for (i = 0; i < n; i++) { m = *--p; *p = (unsigned short)(m >= max ? m-max : 0); } } (2) void goo(unsigned short *p, int max, int n) { int i; unsigned m; for (i = 0; i < n; i++) { m = *--p; unsigned umax = m > max ? m : max; *p = (unsigned short)(umax - max); } } (1) %cmp = icmp ugt i32 %x, %y %sub2 = sub i32 %y, %x %tr = trunc i32 %sub2 to i16 %res = select i1 %cmp, i16 0, i16 %tr or (2) %cmp = icmp ugt i32 %x, %y %sel = select i1 %cmp, i32 %x, i32 %y %sub = sub i32 %sel, %x %res = trunc i32 %sub to i16 -Julia> -----Original Message----- > From: Sanjay Patel [mailto:spatel at rotateright.com] > Sent: Tuesday, May 16, 2017 9:49 PM > To: Friedman, Eli <efriedma at codeaurora.org> > Cc: Koval, Julia <julia.koval at intel.com>; llvm-dev at lists.llvm.org; Bozhenov, > Nikolai <nikolai.bozhenov at intel.com>; Elovikov, Andrei > <andrei.elovikov at intel.com>; Hal Finkel <hfinkel at anl.gov>; David Majnemer > <david.majnemer at gmail.com> > Subject: Re: [RFC] Canonicalization of unsigned subtraction with saturation > > > > On Tue, May 16, 2017 at 12:18 PM, Friedman, Eli <efriedma at codeaurora.org > <mailto:efriedma at codeaurora.org> > wrote: > > > > On 5/16/2017 6:30 AM, Sanjay Patel wrote: > > > Thanks for posting this question, Julia. > > I had a similar question about a signed min/max variant here: > http://lists.llvm.org/pipermail/llvm-dev/2016- > November/106868.html <http://lists.llvm.org/pipermail/llvm-dev/2016- > November/106868.html> > > The 2nd version in each case contains a canonical max/min > representation in IR, and this could enable more IR analysis. > A secondary advantage is that the backend recognizes the > max/min in the second IR form when creating DAG nodes, > and this directly affects isel for many targets. > > > > This seems important. And pattern-matching max(x,y)-y to a saturating > subtract seems easy in the backend. > > > > A possibly important difference between the earlier example > and the current unsigned case: > is a select with a zero constant operand easier to reason about > in IR than the canonical min/max? > > > > It might be in some cases... maybe? I mean, it might be easier to > analyze in ComputeMaskedBits or something, but we don't really do much to > optimize selects involving zero. > > > > Because of my CPU upbringing, I always see this: > > select A, B, 0 > > > as: > > and (sext A), B > > > Any chance of control-flow is bad! > ...but now I know that's wrong for IR. :) > > > So forming the min/max sounds like the right answer to me. > > > Note that we don't actually canonicalize the signed min/max cases from the > earlier thread yet. We detect those in value tracking, and that was good enough > to produce the ideal backend results, but I haven't gotten back to doing the > transform in IR. > > > > > > > > -Eli > > > > On Tue, May 16, 2017 at 5:30 AM, Koval, Julia > <julia.koval at intel.com <mailto:julia.koval at intel.com> > wrote: > > > (1.16) > %cmp = icmp ugt i16 %x, %y > %sub2 = sub i16 %y, %x > %res = select i1 %cmp, i16 0, i16 %sub2 > > or > > (2.16) > %cmp = icmp ugt i16 %x, %y > %sel = select i1 %cmp, i16 %x, i16 %y > %sub = sub i16 %sel, %x > > Which of these versions is canonical? I think first > version is better, because it can be converted to unsigned saturation > instruction(i.e. PSUBUS), using existing backend code. > > > > > > > > > > -- > Employee of Qualcomm Innovation Center, Inc. > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, > a Linux Foundation Collaborative Project >