Demikhovsky, Elena

2015-Mar-03 07:30 UTC

[LLVMdev] Extending Vector GEP - proposal

> This problem can be solved by sinking the broadcast instruction at
> codegen-prepare time.

I considered this option. We currently don’t have target-specific optimizations
at codegen-prepare time. (Or am I wrong?)
And it would be a very X86-directed optimization. Even the gather/scatter
intrinsics are considered common to all targets.

The second reason I’d prefer to generate a splat-GEP is compile time.
With the broadcast form, I would have to generate 2 (or more, for each splat
element) redundant instructions (a broadcast is insert+shuffle), which are
hoisted out of the loop at some stage; then I would have to find them in the
CodeGenPrepare pass, sink them back, and rebuild the CFG.

- Elena
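
To make the "broadcast is insert+shuffle" point concrete, here is a minimal Python sketch. It is a toy model, not LLVM code: vectors are Python lists, pointers are plain integers, and every function name is a hypothetical stand-in for the IR instruction of the same name.

```python
# Toy model of insertelement / shufflevector on Python lists.
# All names are hypothetical stand-ins; this is not an LLVM API.

UNDEF = None  # stands in for an undefined vector lane


def insertelement(vec, value, index):
    """Return a copy of vec with lane `index` replaced by `value`."""
    out = list(vec)
    out[index] = value
    return out


def shufflevector(a, b, mask):
    """Pick lanes from the concatenation of a and b according to mask."""
    lanes = list(a) + list(b)
    return [lanes[i] for i in mask]


def broadcast(scalar, width):
    """Splat a scalar: one insertelement followed by one shufflevector."""
    undef = [UNDEF] * width
    tmp = insertelement(undef, scalar, 0)           # instruction 1
    return shufflevector(tmp, undef, [0] * width)   # instruction 2


def vector_gep(ptrs, offsets, elem_size=1):
    """Lane-wise address computation; both operands must be vectors."""
    if len(ptrs) != len(offsets):
        raise ValueError("vector GEP operands must have the same length")
    return [p + o * elem_size for p, o in zip(ptrs, offsets)]


base = 0x1000
addresses = vector_gep(broadcast(base, 4), [0, 8, 16, 24])
print([hex(a) for a in addresses])
```

In this model every scalar operand costs two extra operations before the GEP itself, and those are the operations that can end up hoisted out of the loop and would later have to be found and sunk again.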

From: Nadav Rotem [mailto:nrotem at apple.com]
Sent: Monday, March 02, 2015 19:01
To: Demikhovsky, Elena
Cc: llvmdev at cs.uiuc.edu; Duncan P. N. Exon Smith; dag at cray.com; Philip
Reames (listmail at philipreames.com); Hal Finkel (hfinkel at anl.gov); Chandler
Carruth (chandlerc at gmail.com)
Subject: Re: Extending Vector GEP - proposal

I don’t have a strong opinion on this. The current GEP syntax is more
restrictive, and the single-base-pointer case can be emulated using a broadcast +
vector-gep, which can easily be pattern-matched at codegen time. The problem with
the current syntax is that the ‘broadcast’ instruction can be hoisted outside of
loops, and this can be a problem with our "one block at a time" codegen
implementation. This problem can be solved by sinking the broadcast instruction
at codegen-prepare time.

Is there a strong motivation to prefer one representation over the other?


On Mar 1, 2015, at 2:10 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:

Hi,

According to the current GEP syntax, vector GEP requires that each index must be
a vector with the same number of elements.

%A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets

I propose relaxing this requirement. Let each index be either a vector or a
scalar. All vector indices must have the same number of elements. A scalar value
is treated as a splat vector.

%A = getelementptr i8* %ptr, <4 x i64> %offsets
or
%A = getelementptr <4 x i8*> %ptrs, i64 %offset

In this case we don’t have to add a “broadcast” before the GEP. Which form to
use will be the developer’s decision.
I plan to use vector GEP in gather/scatter, and “broadcasting” the scalar value
would make it harder to narrow this operation to the “common base, multiple
indices” form in the future.

What do you think?
Thanks.

- Elena
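
The proposed rule (each operand is a scalar or a vector, all vectors share one length, a scalar means a splat) can be sketched in Python as follows. This is only an illustration of the intended semantics under simplifying assumptions: pointers are plain integers, there is a single index, and `splat_gep` and `elem_size` are hypothetical names that do not come from LLVM.

```python
# Toy model of the proposed rule for a GEP with one index.
# `splat_gep` and `elem_size` are hypothetical names, not LLVM APIs.


def splat_gep(base, index, elem_size=1):
    """Compute base + index * elem_size, splatting any scalar operand."""
    operands = [base, index]
    widths = {len(op) for op in operands if isinstance(op, list)}
    if len(widths) > 1:
        raise ValueError("all vector operands must have the same length")
    if not widths:
        # Both operands are scalars: an ordinary scalar GEP.
        return base + index * elem_size
    width = widths.pop()
    # A scalar operand behaves as if it had been broadcast to `width` lanes.
    bases = base if isinstance(base, list) else [base] * width
    indices = index if isinstance(index, list) else [index] * width
    return [b + i * elem_size for b, i in zip(bases, indices)]


# Scalar base pointer, vector of offsets ("common base, multiple indices").
common_base = splat_gep(100, [0, 1, 2, 3])

# Vector of pointers, scalar offset.
common_offset = splat_gep([100, 200, 300, 400], 2)

print(common_base, common_offset)
```

Both mixed forms yield a vector of addresses without any explicit broadcast in the input, so the "common base" information stays visible on the operation itself.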



---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Nadav Rotem

2015-Mar-03 17:38 UTC

[LLVMdev] Extending Vector GEP - proposal

> On Mar 2, 2015, at 11:30 PM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
>
> > This problem can be solved by sinking the broadcast instruction at
> > codegen-prepare time.
>
> I considered this option. We currently don’t have target-specific
> optimizations at codegen-prepare time. (Or am I wrong?)
> And it would be a very X86-directed optimization. Even the gather/scatter
> intrinsics are considered common to all targets.
>
> The second reason I’d prefer to generate a splat-GEP is compile time.
> With the broadcast form, I would have to generate 2 (or more, for each splat
> element) redundant instructions (a broadcast is insert+shuffle), which are
> hoisted out of the loop at some stage; then I would have to find them in the
> CodeGenPrepare pass, sink them back, and rebuild the CFG.

Okay. I think that it’s reasonable to add support for GEP with a single base
pointer and a vector of indices.

Hal Finkel

2015-Mar-03 17:47 UTC

[LLVMdev] Extending Vector GEP - proposal

----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Elena Demikhovsky" <elena.demikhovsky at intel.com>
> Cc: llvmdev at cs.uiuc.edu, "Duncan P. N. Exon Smith" <dexonsmith at apple.com>,
>     dag at cray.com,
>     "Philip Reames (listmail at philipreames.com)" <listmail at philipreames.com>,
>     "Hal Finkel (hfinkel at anl.gov)" <hfinkel at anl.gov>,
>     "Chandler Carruth (chandlerc at gmail.com)" <chandlerc at gmail.com>
> Sent: Tuesday, March 3, 2015 11:38:47 AM
> Subject: Re: Extending Vector GEP - proposal
>
> > On Mar 2, 2015, at 11:30 PM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> >
> > > This problem can be solved by sinking the broadcast instruction at
> > > codegen-prepare time.
> >
> > I considered this option. We currently don’t have target-specific
> > optimizations at codegen-prepare time. (Or am I wrong?)
> > And it would be a very X86-directed optimization. Even the gather/scatter
> > intrinsics are considered common to all targets.
> >
> > The second reason I’d prefer to generate a splat-GEP is compile time.
> > With the broadcast form, I would have to generate 2 (or more, for each
> > splat element) redundant instructions (a broadcast is insert+shuffle),
> > which are hoisted out of the loop at some stage; then I would have to find
> > them in the CodeGenPrepare pass, sink them back, and rebuild the CFG.
>
> Okay. I think that it’s reasonable to add support for GEP with a
> single base pointer and a vector of indices.

I agree; the splat case, especially when you're indexing into a structure,
seems as though it will be very common.

-Hal

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 

dag at cray.com

2015-Mar-03 20:36 UTC

[LLVMdev] Extending Vector GEP - proposal

"Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:
> With the broadcast form, I would have to generate 2 (or more, for each
> splat element) redundant instructions (a broadcast is insert+shuffle),
> which are hoisted out of the loop at some stage; then I would have to find
> them in the CodeGenPrepare pass, sink them back, and rebuild the CFG.

I agree with Elena.  These are common operations and ought to be
directly representable in the IR.  Hoisting and sinking have been
constant pain points for us for exactly the reason described.  Getting
the sinking right isn't trivial.  It's not especially hard but it's
extra work that supporting the operations actually desired in the IR
would eliminate.

                             -David

dag at cray.com

2015-Mar-03 21:00 UTC

[LLVMdev] Extending Vector GEP - proposal

Nadav Rotem <nrotem at apple.com> writes:
> Okay. I think that it’s reasonable to add support for GEP with a
> single base pointer and a vector of indices. 
We should also support a vector of pointers and a scalar index, I think.

                                       -David
