Demikhovsky, Elena via llvm-dev
2016-Jan-23 20:06 UTC
[llvm-dev] how to force llvm generate gather intrinsic
> Can we legalize the same set of masked load/store operations for AVX1 as AVX2?

Yes, of course.

- Elena

From: Sanjay Patel [mailto:spatel at rotateright.com]
Sent: Saturday, January 23, 2016 18:42
To: Nema, Ashutosh <Ashutosh.Nema at amd.com>
Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; zhi chen <zchenhn at gmail.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force llvm generate gather intrinsic

On Sat, Jan 23, 2016 at 6:45 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote:

> Thanks Sanjay for highlighting this. A few days back I also faced a similar problem while generating a masked store in AVX1 mode; I found it is only supported under AVX2, otherwise we scalarize it.
>
>> 1) I did not switch-on masked_load/store to AVX1, I can do this.
>
> Yes Elena, this should be supported for FP types in AVX1 mode (for INT types, I doubt X86 has a masked_load/store instruction in AVX1 mode).

Thanks everyone for the answers. My immediate motivation is to improve the masked load/store ops for an AVX target. If we can fix scatter/gather similarly, that would be great.

Can we legalize the same set of masked load/store operations for AVX1 as AVX2? If I'm understanding them correctly, the AVX1 FP instructions (vmaskmovps/pd) can be used in place of the AVX2 int instructions (vpmaskmovd/q), just with domain crossing penalties thrown in. I think we do this for other missing integer ops for an AVX1 target, either in x86 lowering or in the tablegen patterns.

Elena - I'm not too familiar with the vectorizers or scatter/gather, but I'll certainly take a look at D15690. Thanks for pointing out the patch!

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
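The substitution Sanjay describes (using the AVX1 FP masked move on integer data) can be sketched at the C intrinsics level. This is only an illustration under stated assumptions, not the LLVM x86 lowering itself: the function names `masked_load_i32`, `masked_load_i32_avx1`, and `masked_load_i32_scalar` are hypothetical, the `target("avx")` attribute and `__builtin_cpu_supports` assume GCC or Clang, and the scalar path exists only so the sketch still runs on machines without AVX. The intrinsic `_mm256_maskload_ps` corresponds to vmaskmovps; it moves bits without interpreting them as floats, so reinterpreting the result as integers preserves the values.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define MASKLOAD_HAVE_X86 1
#endif

/* Scalar reference semantics: a lane is loaded when the sign bit of its
 * mask element is set, and is zeroed otherwise. */
static void masked_load_i32_scalar(const int32_t *src, const int32_t *mask,
                                   int32_t *dst) {
    for (int i = 0; i < 8; ++i)
        dst[i] = (mask[i] < 0) ? src[i] : 0;
}

#ifdef MASKLOAD_HAVE_X86
/* AVX1 has no integer masked load (vpmaskmovd is AVX2), so the FP form
 * (vmaskmovps) is used and the result is reinterpreted as integers. */
__attribute__((target("avx")))
static void masked_load_i32_avx1(const int32_t *src, const int32_t *mask,
                                 int32_t *dst) {
    __m256i m = _mm256_loadu_si256((const __m256i *)mask);
    __m256 v = _mm256_maskload_ps((const float *)src, m);
    _mm256_storeu_si256((__m256i *)dst, _mm256_castps_si256(v));
}
#endif

/* Hypothetical entry point: picks the AVX1 path when the CPU supports it. */
void masked_load_i32(const int32_t *src, const int32_t *mask, int32_t *dst) {
    assert(src && mask && dst);
#ifdef MASKLOAD_HAVE_X86
    if (__builtin_cpu_supports("avx")) {
        masked_load_i32_avx1(src, mask, dst);
        return;
    }
#endif
    masked_load_i32_scalar(src, mask, dst);
}
```

Calling `masked_load_i32` with eight source values and a mask whose selected lanes are `-1` fills the selected lanes and zeroes the rest. The cost Sanjay mentions is not visible in the source: on some microarchitectures, feeding the result of an FP-domain instruction into integer instructions may add a bypass delay.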
zhi chen via llvm-dev
2016-Feb-24 23:20 UTC
[llvm-dev] how to force llvm generate gather intrinsic
Hi Elena,

Are the masked_load and gather working now?

Best,
Zhi

On Sat, Jan 23, 2016 at 12:06 PM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:

> > Can we legalize the same set of masked load/store operations for AVX1 as AVX2?
>
> Yes, of course.
>
> - Elena
Demikhovsky, Elena via llvm-dev
2016-Feb-25 06:39 UTC
[llvm-dev] how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed.

- Elena

From: zhi chen [mailto:zchenhn at gmail.com]
Sent: Thursday, February 25, 2016 01:20
To: Demikhovsky, Elena <elena.demikhovsky at intel.com>
Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] how to force llvm generate gather intrinsic

Hi Elena,

Are the masked_load and gather working now?

Best,
Zhi