Demikhovsky, Elena
2015-Apr-16 19:17 UTC
[LLVMdev] Code review for gather and scatter intrinsics
Hi Renato, I fully agree with you, but indexed load and store is the next step. I'm asking to review gather and scatter code. Thanks. - Elena -----Original Message----- From: Renato Golin [mailto:renato.golin at linaro.org] Sent: Thursday, April 16, 2015 17:17 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; Chandler Carruth; James Molloy Subject: Re: [LLVMdev] Code review for gather and scatter intrinsics On 16 April 2015 at 14:44, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:> http://reviews.llvm.org/D7665. > I presented this work on LLVM Euro and people are interested in this > feature. > Can anybody review this code, please?Hi Elena, Sorry for the delay, I'm still catching up with my emails after a long holiday. :) The only concern to this feature I remember was Chandler's comment that we should try to encode everything into loads and shuffles. Correct me if I'm wrong, but on the strided vectorizer thread we have reached a consensus that indexed intrinsics would be the least problematic (compared to strided access intrinsics or plain load+shuffle) because they're restricted to load/store of patterns that could later be lowered to shuffles, if the hardware doesn't support it, but they'd also keep the pattern intact even after other optimization passes have gone through the same code. Chandler, As I said at EuroLLVM, the fear is that we'd get the pattern destroyed and lose the ability of using masked / strided access at all. However, I'm haven't looked at great depth at this problem to know what are the cases and why they could break. Maybe Elena or James could help with that. cheers, --renato --------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
Renato Golin
2015-Apr-16 19:48 UTC
[LLVMdev] Code review for gather and scatter intrinsics
I see. I probably got it wrong, then. I was under the impression that we wouldn't do stride.load/store or gather/scatter directly, but implement both of them via index.load/store and differentiate gather/scatter with the argument as a vector of pointers, not a pointer of vectors. I also thought that we'd use the mask to complement the indexed load, which would in turn help the vectorizer to create prologues/epilogues and the back end to generate the best instruction. cheers, --renato On 16 April 2015 at 20:17, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:> Hi Renato, > > I fully agree with you, but indexed load and store is the next step. > I'm asking to review gather and scatter code. > > Thanks. > > - Elena > > -----Original Message----- > From: Renato Golin [mailto:renato.golin at linaro.org] > Sent: Thursday, April 16, 2015 17:17 > To: Demikhovsky, Elena > Cc: llvmdev at cs.uiuc.edu; Chandler Carruth; James Molloy > Subject: Re: [LLVMdev] Code review for gather and scatter intrinsics > > On 16 April 2015 at 14:44, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote: >> http://reviews.llvm.org/D7665. >> I presented this work on LLVM Euro and people are interested in this >> feature. >> Can anybody review this code, please? > > Hi Elena, > > Sorry for the delay, I'm still catching up with my emails after a long holiday. :) > > The only concern to this feature I remember was Chandler's comment that we should try to encode everything into loads and shuffles. > > Correct me if I'm wrong, but on the strided vectorizer thread we have reached a consensus that indexed intrinsics would be the least problematic (compared to strided access intrinsics or plain > load+shuffle) because they're restricted to load/store of patterns > that could later be lowered to shuffles, if the hardware doesn't support it, but they'd also keep the pattern intact even after other optimization passes have gone through the same code. > > Chandler, > > As I said at EuroLLVM, the fear is that we'd get the pattern destroyed and lose the ability of using masked / strided access at all. However, I'm haven't looked at great depth at this problem to know what are the cases and why they could break. Maybe Elena or James could help with that. > > cheers, > --renato > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies.
Demikhovsky, Elena
2015-Apr-16 20:24 UTC
[LLVMdev] Code review for gather and scatter intrinsics
We had a long discussion in the dev list and I suggested to implement masked.index.load(%base, %<vector of indices>, %scale, %mask), but people said that it is Intel-wise form. So we decided to go for gather/scatter -> masked.gather(<vector of pointers>, %mask) - common case of random memory access. For A[B[i]]. The first patch for Gather/Scatter is already submitted. As the next step, that good for Intel and for ARM we see this form: masked.index.load(%ptr, <vector of const-indices>, %mask), but unlike the first proposal, the index is not a random variable, but a vector of compile time constants. ARM does not need masks, but it can be "all-ones". I suppose, that Hao, from ARM will go forward with index.load, otherwise I'll take it when gather/scatter will be completed. - Elena -----Original Message----- From: Renato Golin [mailto:renato.golin at linaro.org] Sent: Thursday, April 16, 2015 22:49 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; Chandler Carruth; James Molloy Subject: Re: [LLVMdev] Code review for gather and scatter intrinsics I see. I probably got it wrong, then. I was under the impression that we wouldn't do stride.load/store or gather/scatter directly, but implement both of them via index.load/store and differentiate gather/scatter with the argument as a vector of pointers, not a pointer of vectors. I also thought that we'd use the mask to complement the indexed load, which would in turn help the vectorizer to create prologues/epilogues and the back end to generate the best instruction. cheers, --renato On 16 April 2015 at 20:17, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:> Hi Renato, > > I fully agree with you, but indexed load and store is the next step. > I'm asking to review gather and scatter code. > > Thanks. > > - Elena > > -----Original Message----- > From: Renato Golin [mailto:renato.golin at linaro.org] > Sent: Thursday, April 16, 2015 17:17 > To: Demikhovsky, Elena > Cc: llvmdev at cs.uiuc.edu; Chandler Carruth; James Molloy > Subject: Re: [LLVMdev] Code review for gather and scatter intrinsics > > On 16 April 2015 at 14:44, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote: >> http://reviews.llvm.org/D7665. >> I presented this work on LLVM Euro and people are interested in this >> feature. >> Can anybody review this code, please? > > Hi Elena, > > Sorry for the delay, I'm still catching up with my emails after a long > holiday. :) > > The only concern to this feature I remember was Chandler's comment that we should try to encode everything into loads and shuffles. > > Correct me if I'm wrong, but on the strided vectorizer thread we have > reached a consensus that indexed intrinsics would be the least > problematic (compared to strided access intrinsics or plain > load+shuffle) because they're restricted to load/store of patterns > that could later be lowered to shuffles, if the hardware doesn't support it, but they'd also keep the pattern intact even after other optimization passes have gone through the same code. > > Chandler, > > As I said at EuroLLVM, the fear is that we'd get the pattern destroyed and lose the ability of using masked / strided access at all. However, I'm haven't looked at great depth at this problem to know what are the cases and why they could break. Maybe Elena or James could help with that. > > cheers, > --renato > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies.--------------------------------------------------------------------- Intel Israel (74) Limited This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.