thr3ads.net - llvm dev - [LLVMdev] Code review for gather and scatter intrinsics [Apr 2015]

Demikhovsky, Elena

2015-Apr-16 19:17 UTC

[LLVMdev] Code review for gather and scatter intrinsics

Hi Renato,

I fully agree with you, but indexed load and store is the next step. 
I'm asking to review gather and scatter code.

Thanks.

-  Elena

-----Original Message-----
From: Renato Golin [mailto:renato.golin at linaro.org] 
Sent: Thursday, April 16, 2015 17:17
To: Demikhovsky, Elena
Cc: llvmdev at cs.uiuc.edu; Chandler Carruth; James Molloy
Subject: Re: [LLVMdev] Code review for gather and scatter intrinsics

On 16 April 2015 at 14:44, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> http://reviews.llvm.org/D7665.
> I presented this work on LLVM Euro and people are interested in this
> feature.
> Can anybody review this code, please?

Hi Elena,

Sorry for the delay, I'm still catching up with my emails after a long
holiday. :)

The only concern about this feature that I remember was Chandler's comment
that we should try to encode everything into loads and shuffles.

Correct me if I'm wrong, but on the strided vectorizer thread we reached a
consensus that indexed intrinsics would be the least problematic option
(compared to strided access intrinsics or plain load+shuffle): they're
restricted to load/store patterns that could later be lowered to shuffles
if the hardware doesn't support them, and they'd also keep the pattern
intact even after other optimization passes have gone through the same code.

Chandler,

As I said at EuroLLVM, the fear is that we'd get the pattern destroyed and
lose the ability to use masked / strided access at all. However, I haven't
looked at this problem in enough depth to know what the cases are and why
they could break. Maybe Elena or James could help with that.

cheers,
--renato
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Renato Golin

2015-Apr-16 19:48 UTC

[LLVMdev] Code review for gather and scatter intrinsics

I see. I probably got it wrong, then.

I was under the impression that we wouldn't do stride.load/store or
gather/scatter directly, but implement both of them via
index.load/store and differentiate gather/scatter with the argument as
a vector of pointers, not a pointer of vectors.

I also thought that we'd use the mask to complement the indexed load,
which would in turn help the vectorizer to create prologues/epilogues
and the back end to generate the best instruction.
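
To illustrate the point about masks and epilogues: a minimal sketch, in
Python, of how a per-lane mask could cover the leftover iterations of a
vectorized loop so that no separate scalar epilogue is needed. The vector
width VF and the helper names tail_mask and masked_vector_copy are
hypothetical, chosen only for this illustration; this is not a description
of what the LLVM vectorizer actually emits.

```python
from typing import List, Sequence

VF = 4  # assumed vector width (lanes per vector iteration)


def tail_mask(start: int, n: int, vf: int = VF) -> List[bool]:
    """Lane i is active only while start + i is still inside the array."""
    return [start + lane < n for lane in range(vf)]


def masked_vector_copy(src: Sequence[int], vf: int = VF) -> List[int]:
    """Copy src in vf-wide steps; the final step is masked, not peeled off."""
    n = len(src)
    dst = [0] * n
    for start in range(0, n, vf):
        mask = tail_mask(start, n, vf)
        for lane, active in enumerate(mask):
            if active:  # masked-off lanes never touch memory
                dst[start + lane] = src[start + lane]
    return dst
```

With six elements and VF = 4, the second vector iteration runs with only its
first two lanes active, which is the work a scalar epilogue would otherwise do.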

cheers,
--renato

Demikhovsky, Elena

2015-Apr-16 20:24 UTC

[LLVMdev] Code review for gather and scatter intrinsics

We had a long discussion on the dev list, and I suggested implementing
masked.index.load(%base, %<vector of indices>, %scale, %mask), but people
said that this form is Intel-specific.

So we decided to go for gather/scatter: masked.gather(<vector of
pointers>, %mask), the common case of random memory access, e.g. A[B[i]].
The first patch for gather/scatter is already submitted.
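
For readers who want the semantics spelled out, here is a minimal scalar
sketch in Python of what a masked gather computes for A[B[i]]. The flat
memory list, the integer "pointers", the passthru argument, and the helper
names masked_gather and gather_a_of_b are assumptions made for
illustration; they are not the signature under review in D7665.

```python
from typing import List, Sequence


def masked_gather(memory: Sequence[int],
                  pointers: Sequence[int],
                  mask: Sequence[bool],
                  passthru: Sequence[int]) -> List[int]:
    """Scalar model of a masked gather (hypothetical helper).

    Each lane loads memory[pointers[lane]] only when its mask bit is set;
    masked-off lanes never touch memory and take the passthru value.
    """
    if not (len(pointers) == len(mask) == len(passthru)):
        raise ValueError("pointers, mask and passthru need the same lane count")
    result = []
    for ptr, enabled, fallback in zip(pointers, mask, passthru):
        result.append(memory[ptr] if enabled else fallback)
    return result


def gather_a_of_b(a_base: int, b: Sequence[int]) -> List[int]:
    """Build the vector of pointers for A[B[i]]: one address per lane."""
    return [a_base + index for index in b]
```

For example, with A starting at offset 2 and B = [3, 0, 5, 1], the pointer
vector is [5, 2, 7, 3], and a lane whose mask bit is clear returns its
passthru value without reading memory at all.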

As the next step, which is good for both Intel and ARM, we see this form:

masked.index.load(%ptr, <vector of const-indices>, %mask). Unlike the
first proposal, the index is not an arbitrary variable but a vector of
compile-time constants.
ARM does not need masks, but the mask can be "all-ones".
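
A minimal Python sketch of why constant indices matter: when the indices
are known at compile time, the indexed load can fall back to one contiguous
load followed by a shuffle, as discussed earlier in the thread. The helper
names, the flat memory list, the scalar passthru default, and the
restriction to non-negative indices are assumptions for illustration only.

```python
from typing import List, Sequence


def masked_index_load(memory: Sequence[int],
                      base: int,
                      const_indices: Sequence[int],
                      mask: Sequence[bool],
                      passthru: int = 0) -> List[int]:
    """Scalar model of masked.index.load(%ptr, <const-indices>, %mask)."""
    return [memory[base + idx] if on else passthru
            for idx, on in zip(const_indices, mask)]


def load_then_shuffle(memory: Sequence[int],
                      base: int,
                      const_indices: Sequence[int]) -> List[int]:
    """Fallback lowering sketch: one contiguous load, then a shuffle.

    Only valid when every element in [base, base + max index] may be read,
    because the wide load also touches the elements between the indices.
    """
    width = max(const_indices) + 1               # known at compile time
    wide = list(memory[base:base + width])       # single contiguous load
    return [wide[idx] for idx in const_indices]  # shuffle picks the lanes
```

With an all-ones mask both helpers return the same lanes, for example the
stride-2 pattern [0, 2, 4, 6]; with a partial mask only the indexed form
avoids reading the disabled elements.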

I suppose that Hao, from ARM, will go forward with index.load; otherwise
I'll take it once gather/scatter is completed.

-  Elena
