----- Original Message -----
> From: dag at cray.com
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Elena Demikhovsky" <elena.demikhovsky at intel.com>, llvmdev at cs.uiuc.edu
> Sent: Friday, October 24, 2014 11:56:14 AM
> Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics
>
> Hal Finkel <hfinkel at anl.gov> writes:
>
> >> If this were really a question of safety, I'd agree. And if we were
> >> talking about gather loads, I'd agree. For a regular vector loads, I
> >> don't see this as a safety issue. We should outline what the
> >> downside of emitting a regular load would actually be should some
> >> optimization be done to the select. Can you please elaborate on this?
> >
> > Nevermind ;) -- I changed my mind, the safety issue is with
> > non-aligned loads that might cross page boundaries. Is that right?
>
> That's just one safety issue. There are others.

Can you be more specific? You mentioned overindexing in your other e-mail, exactly what do you mean by that?

Thanks again,
Hal

> > If so, I think this proposal is good (although obviously the docs need
> > to make clear what the faulting behavior of these intrinsics is).
>
> The behavior should be not to ever fault on an element whose mask bit is
> false, and behave as a regular load (wrt trapping) for any element whose
> mask bit is true.
>
> -David

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
Stephen Canon
2014-Oct-24 17:17 UTC
[LLVMdev] Adding masked vector load and store intrinsics
One can at least imagine using a masked load to access device memory which might have access granularity smaller than the vector size (this seems like a *terrible* idea to me, but at least I can conceive of cases where the semantics would matter beyond just page-crossing loads).

That said, page-crossing loads are a good-enough reason to support this on their own.

- Steve

> On Oct 24, 2014, at 12:58 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: dag at cray.com
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "Elena Demikhovsky" <elena.demikhovsky at intel.com>, llvmdev at cs.uiuc.edu
>> Sent: Friday, October 24, 2014 11:56:14 AM
>> Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics
>>
>> Hal Finkel <hfinkel at anl.gov> writes:
>>
>>>> If this were really a question of safety, I'd agree. And if we were
>>>> talking about gather loads, I'd agree. For a regular vector loads, I
>>>> don't see this as a safety issue. We should outline what the
>>>> downside of emitting a regular load would actually be should some
>>>> optimization be done to the select. Can you please elaborate on this?
>>>
>>> Nevermind ;) -- I changed my mind, the safety issue is with
>>> non-aligned loads that might cross page boundaries. Is that right?
>>
>> That's just one safety issue. There are others.
>
> Can you be more specific? You mentioned overindexing in your other e-mail, exactly what do you mean by that?
>
> Thanks again,
> Hal
>
>>> If so, I think this proposal is good (although obviously the docs need
>>> to make clear what the faulting behavior of these intrinsics is).
>>
>> The behavior should be not to ever fault on an element whose mask bit is
>> false, and behave as a regular load (wrt trapping) for any element whose
>> mask bit is true.
>>
>> -David
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
dag at cray.com
2014-Oct-24 17:39 UTC
[LLVMdev] Adding masked vector load and store intrinsics
Hal Finkel <hfinkel at anl.gov> writes:

>> > Nevermind ;) -- I changed my mind, the safety issue is with
>> > non-aligned loads that might cross page boundaries. Is that right?
>>
>> That's just one safety issue. There are others.
>
> Can you be more specific? You mentioned overindexing in your other
> e-mail, exactly what do you mean by that?

Accessing past the end of an array. Some vector optimizations do that and assume the masking will prevent traps. Aggressive vectorizers can do all kinds of "unsafe" transformations that are safe in the presence of masks.

Any time there is control flow in the loop protecting a dereference of a NULL pointer, a mask is needed and it needs to be applied at the time of the load, not at the time of the write to the loaded-to register. That's why select doesn't work.

This same issue extends to any trap situation like a divide-by-zero or use of a NaN. It's not only the write to the register that needs protection, it's the operation itself.

-David
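
[Editorial illustration, not part of the original message.] The distinction above between masking the load and masking the register write can be sketched with a scalar model in C. This is only a sketch under stated assumptions: the names masked_load_i32, load_then_select_i32, masked_sum and the 4-lane width are hypothetical, chosen for illustration, and are not the proposed intrinsics or any LLVM API. It shows the overindexing case: the last chunk of a 6-element array has only 2 valid lanes, and the masked form never touches the other 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LANES 4 /* hypothetical vector width, for illustration only */

/* Scalar model of a masked load: a lane whose mask bit is false is
   never dereferenced, so it cannot fault. Lanes whose mask bit is
   true behave like an ordinary load. */
static void masked_load_i32(int dst[LANES], const int *src,
                            const bool mask[LANES],
                            const int passthru[LANES])
{
    assert(src != NULL);
    for (size_t i = 0; i < LANES; ++i)
        dst[i] = mask[i] ? src[i] : passthru[i];
}

/* Load-then-select: every lane is read unconditionally and the mask
   only chooses which values are kept. The masked-off lanes are still
   accessed, so this is valid only if all LANES elements are readable. */
static void load_then_select_i32(int dst[LANES], const int *src,
                                 const bool mask[LANES],
                                 const int passthru[LANES])
{
    int loaded[LANES];
    assert(src != NULL);
    for (size_t i = 0; i < LANES; ++i)
        loaded[i] = src[i]; /* unconditional access */
    for (size_t i = 0; i < LANES; ++i)
        dst[i] = mask[i] ? loaded[i] : passthru[i];
}

/* Sums n elements in chunks of LANES. The final chunk may extend past
   the end of the array; the mask keeps those lanes from being read. */
static int masked_sum(const int *a, size_t n)
{
    const int zeros[LANES] = {0};
    int total = 0;
    for (size_t base = 0; base < n; base += LANES) {
        bool mask[LANES];
        int v[LANES];
        for (size_t i = 0; i < LANES; ++i)
            mask[i] = (base + i) < n;
        masked_load_i32(v, a + base, mask, zeros);
        for (size_t i = 0; i < LANES; ++i)
            total += v[i];
    }
    return total;
}
```

Swapping load_then_select_i32 into masked_sum would read a[6] and a[7] for a 6-element array, which is exactly the out-of-bounds access that the mask is supposed to suppress.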
Smith, Kevin B
2014-Oct-24 18:13 UTC
[LLVMdev] Adding masked vector load and store intrinsics
I strongly agree with all these reasons, and it is for all those reasons that the proposal is written this way.

Kevin B. Smith

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of dag at cray.com
Sent: Friday, October 24, 2014 10:39 AM
To: Hal Finkel
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics

Hal Finkel <hfinkel at anl.gov> writes:

>> > Nevermind ;) -- I changed my mind, the safety issue is with
>> > non-aligned loads that might cross page boundaries. Is that right?
>>
>> That's just one safety issue. There are others.
>
> Can you be more specific? You mentioned overindexing in your other
> e-mail, exactly what do you mean by that?

Accessing past the end of an array. Some vector optimizations do that and assume the masking will prevent traps. Aggressive vectorizers can do all kinds of "unsafe" transformations that are safe in the presence of masks.

Any time there is control flow in the loop protecting a dereference of a NULL pointer, a mask is needed and it needs to be applied at the time of the load, not at the time of the write to the loaded-to register. That's why select doesn't work.

This same issue extends to any trap situation like a divide-by-zero or use of a NaN. It's not only the write to the register that needs protection, it's the operation itself.

-David