Demikhovsky, Elena
2014-Oct-27 07:02 UTC
[LLVMdev] Adding masked vector load and store intrinsics
We are just following the common recommendation to start with intrinsics: http://llvm.org/docs/ExtendingLLVM.html

- Elena

On Sunday, October 26, 2014, Owen Anderson <resistor at mac.com> wrote:
> What is the motivation for using intrinsics versus adding new instructions?
>
> —Owen

On Oct 24, 2014, at 4:24 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> Hi,
>
> We would like to add support for masked vector loads and stores by introducing new target-independent intrinsics. The loop vectorizer will then be enhanced to optimize loops containing conditional memory accesses by generating these intrinsics for existing targets such as AVX2 and AVX-512. The vectorizer will first ask the target about the availability of masked vector loads and stores. The SLP vectorizer can potentially be enhanced to use these intrinsics as well.
>
> The intrinsics would be legal for all targets; targets that do not support masked vector loads or stores will scalarize them. The addressed memory will not be touched for masked-off lanes. In particular, if all lanes are masked off, no address will be accessed.
>
> call void @llvm.masked.store (i32* %addr, <16 x i32> %data, i32 4, <16 x i1> %mask)
>
> %data = call <8 x i32> @llvm.masked.load (i32* %addr, <8 x i32> %passthru, i32 4, <8 x i1> %mask)
>
> where %passthru is used to fill the elements of %data that are masked off (if any); it can be zeroinitializer or undef.
>
> Comments so far, before we dive into more details?
>
> Thank you.
>
> - Elena and Ayal
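For concreteness, here is a minimal sketch of how a vectorized conditional loop body such as "if (mask[i]) out[i] = a[i] + b[i];" might look using the proposed intrinsics. The type-suffixed declarations, the function name @vectorized_body, and the choice of undef as pass-through are illustrative assumptions; the intrinsics do not exist yet, so the final names and signatures may differ.

; Sketch only: declarations follow the signatures proposed above.
declare <8 x i32> @llvm.masked.load.v8i32(i32*, <8 x i32>, i32, <8 x i1>)
declare void @llvm.masked.store.v8i32(i32*, <8 x i32>, i32, <8 x i1>)

define void @vectorized_body(i32* %out, i32* %a, i32* %b, <8 x i1> %mask) {
entry:
  ; Masked-off lanes take the undef pass-through value; per the proposed
  ; semantics, their addresses are never accessed.
  %va  = call <8 x i32> @llvm.masked.load.v8i32(i32* %a, <8 x i32> undef, i32 4, <8 x i1> %mask)
  %vb  = call <8 x i32> @llvm.masked.load.v8i32(i32* %b, <8 x i32> undef, i32 4, <8 x i1> %mask)
  %sum = add <8 x i32> %va, %vb
  ; Only the active lanes of %out are written.
  call void @llvm.masked.store.v8i32(i32* %out, <8 x i32> %sum, i32 4, <8 x i1> %mask)
  ret void
}

Because inactive lanes are guaranteed not to be accessed, the vectorizer could emit this form even when the addresses behind masked-off lanes might fault.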
Chandler Carruth
2014-Oct-27 07:08 UTC
[LLVMdev] Adding masked vector load and store intrinsics
It's not clear that these would need to be totally new instructions, as opposed to a mask operand on the existing store instruction. I'm curious what Owen is actually imagining here...

On Mon, Oct 27, 2014 at 12:02 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> We are just following the common recommendation to start with intrinsics:
> http://llvm.org/docs/ExtendingLLVM.html
Owen Anderson
2014-Oct-27 16:57 UTC
[LLVMdev] Adding masked vector load and store intrinsics
Adding a mask operand to the existing store instructions seems risky, as lots of existing code would not necessarily preserve or respect it.

—Owen

On Oct 27, 2014, at 12:08 AM, Chandler Carruth <chandlerc at google.com> wrote:
> It's not clear that these would need to be totally new instructions, as opposed to a mask operand on the existing store instruction. I'm curious what Owen is actually imagining here...
Owen Anderson
2014-Oct-27 16:58 UTC
[LLVMdev] Adding masked vector load and store intrinsics
Since this is something that you expect to be supported on all targets, and which requires extensive type overloading, it seems like a perfect candidate for being an Instruction rather than an intrinsic.

—Owen

On Oct 27, 2014, at 12:02 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
> We are just following the common recommendation to start with intrinsics:
> http://llvm.org/docs/ExtendingLLVM.html
Demikhovsky, Elena
2014-Oct-28 12:26 UTC
[LLVMdev] Adding masked vector load and store intrinsics
Many overloaded intrinsics could likewise be replaced with instructions - fabs, fma, or sqrt, for example. Chandler will probably explain the criteria. What is the difference between fma and fadd? Or between fptrunc and fabs?

A new instruction like

%a = loadm <4 x i32>* %addr, <4 x i32> %passthru, i32 4, <4 x i1> %mask

is possible, but may not be very useful for most targets. So we start with intrinsics.

- Elena

On Monday, October 27, 2014, Owen Anderson <resistor at mac.com> wrote:
> Since this is something that you expect to be supported on all targets, and which requires extensive type overloading, it seems like a perfect candidate for being an Instruction rather than an intrinsic.
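For reference, a rough sketch of the per-lane scalarization a target without masked-load hardware might produce for the proposed masked load; the function and block names are illustrative assumptions, only lane 0 is shown (lanes 1-3 would repeat the same pattern), and the IR uses the typed-pointer syntax current at the time.

; Illustration only: masked-off lanes keep the pass-through value and their
; addresses are never dereferenced, matching the proposed semantics.
define <4 x i32> @scalarized_masked_load(i32* %addr, <4 x i32> %passthru, <4 x i1> %mask) {
entry:
  %m0 = extractelement <4 x i1> %mask, i32 0
  br i1 %m0, label %cond.load0, label %else0

cond.load0:
  ; The address for lane 0 is dereferenced only when its mask bit is set.
  %p0 = getelementptr i32* %addr, i32 0
  %v0 = load i32* %p0, align 4
  %ins0 = insertelement <4 x i32> %passthru, i32 %v0, i32 0
  br label %else0

else0:
  %res0 = phi <4 x i32> [ %ins0, %cond.load0 ], [ %passthru, %entry ]
  ; ... lanes 1 through 3 elided ...
  ret <4 x i32> %res0
}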