On 12/04/2013 04:29 PM, Andrew Trick wrote:
> On Dec 4, 2013, at 3:33 PM, Matt Arsenault <Matthew.Arsenault at amd.com> wrote:
>
>> On 11/11/2013 03:13 PM, Andrew Trick wrote:
>>> On Nov 9, 2013, at 1:39 PM, Matt Arsenault <arsenm2 at gmail.com> wrote:
>>>
>>>> On Nov 9, 2013, at 3:14 AM, Chandler Carruth <chandlerc at google.com> wrote:
>>>>
>>>>> Perhaps you're instead trying to say that with certain address spaces "noalias" (and by inference, "restrict" at the language level) has a different semantic model than other address spaces? While it's less worrisome than the first interpretation, I still don't really like it.
>>>>>
>>>> This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of.
>>> FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possibly other optimization) across these regardless of alias analysis.
>>>
>> I think I'll try implementing this. Ideally it would be parameterized over the address space, so it makes more sense for it to be a memfence attribute rather than a nomemfence. You would then have an arbitrary number of memfence(N) attributes for each required address space.
> So for correctness, would we need to tag all functions with memfence(0..M) until we can prove otherwise? That seems heinous.

I was thinking the absence of it would mean no memfence in any address space, which is the current behavior. This adds the option of fencing.

> Better to have an optional attribute that can be added to expose optimization. Is it important in practice to optimize the case of memfence(I) + nomemfence(J)?

I think it would be important for the GPU case. You never need a memfence for private address space / addrspace 0, but you frequently want them for local or global. The local or global writes can't be reordered, but it could be very beneficial to move the private accesses across fences, which might help reduce register usage.

> If so, is there a problem with nomemfence(N)?

nomemfence is the current assumption made on an arbitrary call, and it's the common case. Specifying the absence of a fence seems backwards from how this is used and more cumbersome to deal with. To match the current behavior, it would require littering nomemfence for any possible address space everywhere. In OpenCL you specify your fences, so it would be more straightforward to map that. If I have a memfence intrinsic, I just need to mark it with the fence attribute, and then propagate it to its callers. There would generally only be a few of them in any program compared to fenceless calls. To implement this with nomemfence, I would have to mark every function with at least 4 nomemfences, and remove them when encountering the memfence intrinsic.

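
The propagation step described in the last paragraph can be sketched as a worklist over the call graph. What follows is a toy Python model, not LLVM code: the function names (`barrier_local`, `helper`, `kernel`, `pure_math`), the `propagate_memfence` helper, and the use of address space 3 for local memory are all assumptions made for illustration, and `memfence(N)` itself is only the attribute proposed in this thread.

```python
from collections import defaultdict


def propagate_memfence(call_graph, fence_intrinsics):
    """Propagate hypothetical memfence(N) attributes from fences to callers.

    call_graph:       dict mapping caller name -> set of callee names
    fence_intrinsics: dict mapping intrinsic name -> set of fenced address spaces
    Returns a dict mapping function name -> set of address spaces it may fence.
    """
    # Build reverse edges so we can walk from a callee up to its callers.
    callers = defaultdict(set)
    for caller, callees in call_graph.items():
        for callee in callees:
            callers[callee].add(caller)

    fenced = defaultdict(set)
    worklist = []
    for name, spaces in fence_intrinsics.items():
        fenced[name] |= set(spaces)
        worklist.append(name)

    # Standard worklist fixpoint: revisit a caller only when it learns
    # about a new fenced address space.
    while worklist:
        fn = worklist.pop()
        for caller in callers[fn]:
            new_spaces = fenced[fn] - fenced[caller]
            if new_spaces:
                fenced[caller] |= new_spaces
                worklist.append(caller)

    # Functions with no fenced address space carry no attribute at all.
    return {fn: spaces for fn, spaces in fenced.items() if spaces}


# Illustrative module: only `helper` (and therefore `kernel`) reaches a fence.
call_graph = {
    "kernel": {"helper", "pure_math"},
    "helper": {"barrier_local"},
    "pure_math": set(),
}
fence_intrinsics = {"barrier_local": {3}}  # 3 = local memory, by assumption

attrs = propagate_memfence(call_graph, fence_intrinsics)
```

Only the functions that can reach the fence end up annotated; `pure_math` carries no attribute, which matches the expectation that fences are few and fenceless calls are the common case.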
On Dec 4, 2013, at 5:19 PM, Matt Arsenault <Matthew.Arsenault at amd.com> wrote:
> On 12/04/2013 04:29 PM, Andrew Trick wrote:
>> If so, is there a problem with nomemfence(N)?
> nomemfence is the current assumption made on an arbitrary call, and it's the common case. Specifying the absence of a fence seems backwards from how this is used and more cumbersome to deal with. To match the current behavior, it would require littering nomemfence for any possible address space everywhere. In OpenCL you specify your fences, so it would be more straightforward to map that. If I have a memfence intrinsic, I just need to mark it with the fence attribute, and then propagate it to its callers. There would generally only be a few of them in any program compared to fenceless calls. To implement this with nomemfence, I would have to mark every function with at least 4 nomemfences, and remove them when encountering the memfence intrinsic.

Sure, but the program still needs to be correct if you skip attribute propagation.

-Andy

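
One way to read the objection above is as a question about what a missing attribute means. The sketch below is a toy Python model with hypothetical helper names, where attributes are plain tuples rather than real LLVM attributes and address space 3 stands in for local memory. It compares how an optimizer would answer "may I move an access in address space N across this call?" under each proposal when propagation has not run.

```python
def may_reorder_with_memfence(attrs, addrspace):
    # Opt-in scheme: a call fences `addrspace` only if it says so,
    # so a missing attribute is read as "no fence".
    return ("memfence", addrspace) not in attrs


def may_reorder_with_nomemfence(attrs, addrspace):
    # Opt-out scheme: a call is assumed to fence `addrspace` unless it
    # says otherwise, so a missing attribute is read as "may fence".
    return ("nomemfence", addrspace) in attrs


LOCAL = 3  # illustrative address space number

# A function that really does contain a local fence, but whose attributes
# were never propagated from the fence intrinsic.
unpropagated_attrs = set()

memfence_answer = may_reorder_with_memfence(unpropagated_attrs, LOCAL)
nomemfence_answer = may_reorder_with_nomemfence(unpropagated_attrs, LOCAL)
```

Under the opt-in attribute the missing annotation permits moving the access across a real fence, while under the opt-out attribute the same missing annotation only costs an optimization.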
On 12/04/2013 05:25 PM, Andrew Trick wrote:
> Sure, but the program still needs to be correct if you skip attribute propagation.
> -Andy

Is that actually a real concern? My main problem with nomemfence is how do you mark a function as not fencing any other address space you might care about around call sites? I suppose nomemfence without an address space could indicate nomemfence for any address space, but then that just restricts the problem to when you do have a few fenced address spaces. How do you know what other address spaces are relevant to be marked? Add a nomemfence for any address spaces encountered in functions with call sites? What if those are in another module?

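
The "which address spaces do I list?" problem can be made concrete with a small model. This is only a sketch in Python under stated assumptions: `annotate_nomemfence` and `may_reorder` are hypothetical helpers, attributes are modeled as tuples, and the address space numbers (0, 1, 3, 5) are arbitrary placeholders.

```python
def annotate_nomemfence(fenced_spaces, known_spaces):
    # A module can only emit nomemfence(N) for address spaces it knows
    # about and that the function does not fence.
    return {("nomemfence", n) for n in known_spaces - fenced_spaces}


def may_reorder(attrs, addrspace):
    # Opt-out scheme: reordering across the call is allowed only when the
    # callee explicitly promises not to fence this address space.
    return ("nomemfence", addrspace) in attrs


# Module A defines a function that fences address space 3 and has only
# ever seen address spaces 0, 1 and 3.
attrs = annotate_nomemfence(fenced_spaces={3}, known_spaces={0, 1, 3})

# Module B calls that function around accesses in address space 5,
# which module A never encountered.
reorder_known = may_reorder(attrs, 0)
reorder_fenced = may_reorder(attrs, 3)
reorder_unseen = may_reorder(attrs, 5)
```

The call stays correct for address space 5, but the access there cannot be moved even though the callee never fences it, because the defining module had no way to know that address space was relevant.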
On Dec 4, 2013, at 8:25 PM, Andrew Trick <atrick at apple.com> wrote:
> Sure, but the program still needs to be correct if you skip attribute propagation.
> -Andy

Is this a requirement for an attribute? This would be a problem for the already existing noduplicate. If a function has a call to a noduplicate function, the calling function could still be duplicated if the attribute isn’t propagated, which isn’t allowed.

- Matt