dag at cray.com
2014-Oct-24 16:48 UTC
[LLVMdev] Adding masked vector load and store intrinsics
Hal Finkel <hfinkel at anl.gov> writes:

> For the loads, I'm much less sure. Why can't we represent the loads as
> select(mask, load(addr), passthru)?

Because that does not specify the correct semantics. This formulation
expects the load to happen before the mask is applied. The load could
trap. The operation needs to be presented as an atomic unit.

The same problem exists with any potentially trapping instruction
(e.g. all floating point computations). The need for intrinsics goes
way beyond loads and stores.

-David
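[Editorial illustration, not part of the original message.] The difference between the two formulations can be sketched with a toy memory model. Everything here is an assumption made for illustration: memory is a Python list, an out-of-range address raises a hypothetical `Trap` exception standing in for a hardware fault, and the function names `select_of_load` and `masked_load` are invented; none of this is LLVM API.

```python
class Trap(Exception):
    """Hypothetical stand-in for a hardware fault (e.g. a page fault)."""


def load_lane(memory, addr):
    # Toy memory model: any address outside the list "traps".
    if not 0 <= addr < len(memory):
        raise Trap(f"invalid address {addr}")
    return memory[addr]


def select_of_load(memory, base, mask, passthru):
    # select(mask, load(addr), passthru): the full-width load runs first,
    # touching every lane, and only afterwards is the mask applied.
    loaded = [load_lane(memory, base + i) for i in range(len(mask))]
    return [v if m else p for m, v, p in zip(mask, loaded, passthru)]


def masked_load(memory, base, mask, passthru):
    # Masked load as one unit: lanes that are masked off never touch memory.
    return [
        load_lane(memory, base + i) if m else p
        for i, (m, p) in enumerate(zip(mask, passthru))
    ]


memory = [10, 20, 30]
mask = [True, True, False, False]   # lanes 2 and 3 are disabled
passthru = [0, 0, 0, 0]

print(masked_load(memory, 1, mask, passthru))   # enabled lanes read addresses 1 and 2 only

try:
    select_of_load(memory, 1, mask, passthru)
except Trap as exc:
    print("select-of-load trapped:", exc)       # lane 2 reads address 3, which is out of range
```

In this model the masked form returns a result while the select-of-load form faults on a lane the mask would have discarded, which is the speculation hazard described above.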
Tian, Xinmin
2014-Oct-24 17:57 UTC
[LLVMdev] Adding masked vector load and store intrinsics
> select(mask, load(addr), passthru)?

David is right. "select(mask, load(addr), passthru)" is like a vector load followed by a blend, which involves memory access speculation and is not safe in some cases, so it does not have the same semantics as masking lanes off.

Xinmin
Demikhovsky, Elena
2014-Oct-25 11:30 UTC
[LLVMdev] Adding masked vector load and store intrinsics
> The same problem exists with any potentially trapping instruction
> (e.g. all floating point computations). The need for intrinsics goes
> way beyond loads and stores.

We are definitely looking at them, but decided to start with load and store. All FP plus gather/scatter are in our long-term plan. It will be about 20 intrinsics. But step by step.

- Elena
> On Oct 25, 2014, at 4:30 AM, Demikhovsky, Elena <elena.demikhovsky at intel.com> wrote:
>
>> The same problem exists with any potentially trapping instruction (e.g. all floating point computations). The need for intrinsics goes way beyond loads and stores.
>
> We definitely looking at them, but decided to start from load and store. All FP + gather/scatter are in our long term plan. It will be about 20 intrinsics.
> But step-by-step.

Hi Elena,

Can you please elaborate on the list? I don't see how 20 intrinsics would cover "All FP". But do you really have to do all FP, or only instructions that can trap in LLVM (e.g. division by zero)?

I do agree that we want to go step by step, but we also need to see the end goal to make sure the design will scale.

Thanks,
Adam
dag at cray.com
2014-Oct-27 17:19 UTC
[LLVMdev] Adding masked vector load and store intrinsics
"Demikhovsky, Elena" <elena.demikhovsky at intel.com> writes:>> The same problem exists with any potentially trapping instruction >> (e.g. all floating point computations). The need for intrinsics >> goes way beyond loads and stores. > > We definitely looking at them, but decided to start from load and > store. All FP + gather/scatter are in our long term plan. It will be > about 20 intrinsics. > But step-by-step.Makes total sense. Glad to hear the rest is coming! -David