Sanjay Patel via llvm-dev
2017-Jan-12 18:19 UTC
[llvm-dev] IR canonicalization: shufflevector or vector trunc?
On Thu, Jan 12, 2017 at 11:06 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:

> On 1/12/2017 9:04 AM, Sanjay Patel via llvm-dev wrote:
>
> It's time for another round of "What is the canonical IR?"
>
> Credit for this episode to Zvi and PR31551. :)
> https://llvm.org/bugs/show_bug.cgi?id=31551
>
> define <4 x i16> @shuffle(<16 x i16> %x) {
>   %shuf = shufflevector <16 x i16> %x, <16 x i16> undef, <4 x i32> <i32 0, i32 4, i32 8, i32 12>
>   ret <4 x i16> %shuf
> }
>
> define <4 x i16> @trunc(<16 x i16> %x) {
>   %bc = bitcast <16 x i16> %x to <4 x i64>
>   %tr = trunc <4 x i64> %bc to <4 x i16>
>   ret <4 x i16> %tr
> }
>
> Potential reasons to prefer one or the other:
> 1. Shuffle is the most compact.
> 2. Trunc is easier to read.
> 3. One of these is easier for value tracking.
> 4. Compatibility with existing IR transforms (eg, InterleavedAccess
>    recognizes the shuffle form).
> 5. We don't create arbitrary shuffle masks in IR because that's bad for a
>    lot of targets (but maybe this mask pattern should always be recognized as
>    special?).
>
> Hmm... not sure what the right answer is, but a couple more observations:
> 1. If we're going to canonicalize, we should probably canonicalize the
>    same way independent of the original argument type (so we would introduce
>    bitcasts either way).

Ah, right - kill #1 in my list.

> 2. Those two functions are only equivalent on little-endian platforms.

I was wondering about that. So yes, if we do want to canonicalize (until the recent compile-time complaints, I always thought this was the objective of InstCombine...maybe it still is), then the masks we're matching or generating will differ based on endianness.
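To make the endianness point concrete, here is a rough Python model of the two functions quoted above. It is only a sketch, not LLVM code: the helper names (shufflevector, bitcast, trunc, trunc_form) are invented for illustration, bitcast is modeled as "store the vector to memory, reload it with the new lane width", and the big-endian mask <3, 7, 11, 15> is derived from that model rather than stated anywhere in this thread.

```python
def shufflevector(a, b, mask):
    """Select lanes from the concatenation of a and b, like IR shufflevector."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def bitcast(lanes, src_bits, dst_bits, byteorder):
    """Model a vector bitcast as a store followed by a reload at a new lane width."""
    raw = b"".join(v.to_bytes(src_bits // 8, byteorder) for v in lanes)
    step = dst_bits // 8
    return [int.from_bytes(raw[i:i + step], byteorder)
            for i in range(0, len(raw), step)]


def trunc(lanes, dst_bits):
    """Keep only the low dst_bits of every lane."""
    return [v & ((1 << dst_bits) - 1) for v in lanes]


def trunc_form(x, byteorder):
    """The @trunc function: bitcast <16 x i16> to <4 x i64>, then trunc to i16."""
    return trunc(bitcast(x, 16, 64, byteorder), 16)


x = list(range(0x1000, 0x1010))  # 16 distinct i16 lane values
undef = [0] * 16                 # stand-in for undef; the masks never select it

le_mask = [0, 4, 8, 12]          # mask from the @shuffle function in the thread
be_mask = [3, 7, 11, 15]         # assumption: big-endian counterpart per this model

# Little-endian: the trunc form matches the shuffle from the thread.
assert trunc_form(x, "little") == shufflevector(x, undef, le_mask)
# Big-endian: the same trunc form selects the last i16 of each i64 instead.
assert trunc_form(x, "big") == shufflevector(x, undef, be_mask)
assert trunc_form(x, "big") != shufflevector(x, undef, le_mask)
```

Under this model the trunc form is written the same way for both byte orders, while the equivalent shuffle mask has to change.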
Rackover, Zvi via llvm-dev
2017-Jan-12 20:14 UTC
[llvm-dev] IR canonicalization: shufflevector or vector trunc?
Just to add, there is also the ‘zext’ – ‘shuffle with zero’ duality which can broaden the discussion.

--Zvi
Sanjay Patel via llvm-dev
2017-Jan-13 16:19 UTC
[llvm-dev] IR canonicalization: shufflevector or vector trunc?
Right - I think that case looks like this for little endian:

define <2 x i32> @zextshuffle(<2 x i16> %x) {
  %zext_shuffle = shufflevector <2 x i16> %x, <2 x i16> zeroinitializer, <4 x i32> <i32 0, i32 2, i32 1, i32 2>
  %bc = bitcast <4 x i16> %zext_shuffle to <2 x i32>
  ret <2 x i32> %bc
}

define <2 x i32> @zextvec(<2 x i16> %x) {
  %zext = zext <2 x i16> %x to <2 x i32>
  ret <2 x i32> %zext
}

IMO, the fact that we have to take endianness into account with the shuffles makes the trunc/zext forms the better choice. That way, we limit the endian dependency to one place in InstCombine, and other transforms don't have to worry about it. We also have lots of existing folds for trunc/zext and hardly any for shuffles.
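The zext / shuffle-with-zero duality can be checked with a small Python model along the same lines. Again this is only a sketch: the helper names are hypothetical, bitcast is modeled as a store followed by a reload at the new lane width, and the big-endian mask <2, 0, 2, 1> is an assumption worked out from that model, not something taken from LLVM or from this thread.

```python
def shufflevector(a, b, mask):
    """Select lanes from the concatenation of a and b, like IR shufflevector."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def bitcast(lanes, src_bits, dst_bits, byteorder):
    """Model a vector bitcast as a store followed by a reload at a new lane width."""
    raw = b"".join(v.to_bytes(src_bits // 8, byteorder) for v in lanes)
    step = dst_bits // 8
    return [int.from_bytes(raw[i:i + step], byteorder)
            for i in range(0, len(raw), step)]


def zext_shuffle_form(x, mask, byteorder):
    """Interleave x with zero lanes, then bitcast <4 x i16> to <2 x i32>."""
    zero = [0] * len(x)
    return bitcast(shufflevector(x, zero, mask), 16, 32, byteorder)


def zext_form(x):
    """zext <2 x i16> to <2 x i32>: the lane values themselves are unchanged."""
    return list(x)


x = [0xBEEF, 0x1234]
le_mask = [0, 2, 1, 2]  # mask from @zextshuffle above (lane 2 is a zero lane)
be_mask = [2, 0, 2, 1]  # assumption: zero lane first on big-endian, per this model

assert zext_shuffle_form(x, le_mask, "little") == zext_form(x)
assert zext_shuffle_form(x, be_mask, "big") == zext_form(x)
# Reusing the little-endian mask on big-endian puts the value in the high half.
assert zext_shuffle_form(x, le_mask, "big") == [0xBEEF0000, 0x12340000]
```

In this model the zext form needs no byte-order parameter at all, which is the property being argued for: only the code that matches or produces the shuffle has to know the target's endianness.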