Hal Finkel via llvm-dev
2016-Jul-16 03:06 UTC
[llvm-dev] RFC: Strong GC References in LLVM
----- Original Message -----
> From: "Andrew Trick" <atrick at apple.com>
> To: "Sanjoy Das" <sanjoy at playingwithpointers.com>
> Cc: "Daniel Berlin" <dberlin at dberlin.org>, "llvm-dev" <llvm-dev at lists.llvm.org>, "Joseph Tremoulet"
> <jotrem at microsoft.com>, "Oscar Blumberg" <oscar.blumberg at normalesup.org>, "Chandler Carruth" <chandlerc at gmail.com>,
> "Nick Lewycky" <nlewycky at google.com>, "Hal Finkel" <hfinkel at anl.gov>, "Philip Reames" <listmail at philipreames.com>,
> "Manuel Jacob" <me at manueljacob.de>, "Eli Friedman" <eli.friedman at gmail.com>, "David Majnemer"
> <david.majnemer at gmail.com>
> Sent: Friday, July 15, 2016 7:40:51 PM
> Subject: Re: RFC: Strong GC References in LLVM
>
> > On Jul 15, 2016, at 5:37 PM, Sanjoy Das
> > <sanjoy at playingwithpointers.com> wrote:
> >
> > Hi Andy,
> >
> > Andrew Trick wrote:
> > > At some point I stopped thinking about this as a bug and realized
> > > that you just need to think of LLVM as modeling speculative code
> > > barriers as memory dependence. In LLVM, it makes no sense to have
> > > a readonly may-throw call.
> >
> > The problem is that that model breaks down with aggressive aliasing
> > like:
> >
> > void foo(int* restrict ptr) {
> >   *ptr = 40;
> >   may_throw(); // read/write call
> >   *ptr = 50;
> > }
> >
> > Now it is tempting to CSE the store of 40 to *ptr. If we can't do
> > that then what does restrict/noalias even mean?
>
> I thought it meant ‘ptr’ doesn’t alias with other ‘restrict’ pointer
> args. Not that it’s an exclusive way to access the memory. I could
> be wrong though...

It means that, within the scope of ptr, any object accessed via a pointer based on ptr is not accessed via a pointer not based on ptr.

 -Hal

> In the same way you can’t have readonly-maythrow, you wouldn’t have
> TBAA+maythrow. Yeah, it’s not great.
>
> -Andy

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
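As a minimal sketch of the rule Hal states above, the C function below mirrors foo() from the thread. A hypothetical callback stands in for may_throw(); C has no exceptions, so it models only the "opaque read/write call" part. The names store_twice, bump_counter and callback_calls are illustrative and do not come from the thread. The call is well defined only because the callback never touches the object that dst points to. If it reached that object through some other pointer, the restrict contract would be broken and the behavior would be undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side channel that the callback is allowed to touch. */
int callback_calls = 0;

/* Stand-in for may_throw(): an opaque read/write call that does not
   access the object handed to store_twice through any pointer. */
void bump_counter(void) {
    callback_calls += 1;
}

/* Mirrors foo() from the thread. Because dst is restrict-qualified and
   *dst is modified here, every access to that object during this call
   must go through a pointer based on dst. */
void store_twice(int *restrict dst, void (*callback)(void)) {
    assert(dst != NULL && callback != NULL);
    *dst = 40;
    callback();   /* must not read or write *dst through another pointer */
    *dst = 50;
}
```

Under this reading a compiler may assume that callback() cannot observe the store of 40. Whether that store may also be dropped when the call can unwind is the question being debated in the thread, and this sketch does not settle it.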
Sanjoy Das via llvm-dev
2016-Jul-18 18:54 UTC
[llvm-dev] RFC: Strong GC References in LLVM
Hi all,

I think it is time to start getting more concrete here. As a starting
point, I want to send out for review (roughly) the following changes:

 - Add a "gc" address space to the datalayout string
 - Start implementing the non-controversial rules (i.e. everything
   except the bits that initiated the "nospeculate" attribute
   discussion):
   - No pointer <-> integer casts for GC address spaces to begin with
   - Add an intrinsic (with control dependence) to
     convert GCrefs -> integers (we need this for GC load/store
     barriers)
   - Disable some of the problematic "cast by round tripping through
     memory" type optimizations for loads and stores that are of GC
     ref type

The things above are things we know we need, and even if all we do is
implement those, we will be in a better position overall.


One thing I want a design opinion on (already discussed on IRC): I'm
planning to phrase RewriteStatepointsForGC (a ModulePass) as a pass
that "implements" GC references "in terms of" normal pointers. One way
to do this is to rewrite each def and user of GC refs to use a normal
pointer, but that's unnecessary data structure churn, so I was
wondering if we can instead flip the meaning of what a GC ref is by
modifying the datalayout? RewriteStatepointsForGC can then be
seen as changing IR that can be lowered to run only on a "machine"
that directly supports GC pointers to IR that can be lowered to run on
machines that don't. That is, RewriteStatepointsForGC will change IR
from

  "No explicit relocations, addrspace(k) is marked as 'gc' in the
  datalayout" to "All relocations explicit, addrspace(k) is not marked
  specially in the datalayout"

However, Chandler had some (strong?) reservations on IRC about
modifying datalayout in an optimization, in the face of which I have a
couple of alternatives:

 - Have RewriteStatepointsForGC rewrite defs and users of GC
   references to use a "normal" pointer type. I'm a little hesitant
   to do this since it seems wasteful (no evidence yet that it will
   matter), and may complicate keeping side data structures correct in
   the face of mass invalidations.

 - Represent the gc address space in something other than the
   datalayout that we all can agree is fair game to be modified by a
   ModulePass. Not a great option since datalayout seems the most
   natural place to put the "gc-ref-addrspace" information.

 - Don't do anything, i.e. RewriteStatepointsForGC does what it does
   today: it rewrites pointers of addrspace(1) (or addrspace(k) for
   some k) to be explicit but does not change the meaning of
   "addrspace(k)". I'm hesitant to do this because then I can't
   concisely answer "what does RewriteStatepointsForGC do?".

I want to see what others think about this, but in the absence of any
specific opinion here I'll go with the first option (and consider
using mutateType if things turn out to be too slow).


In parallel with all this, I'll try to come up with a concrete notion
of what the nospeculate attributes on loads and function calls will
look like, how they would interact with optimizations like mem2reg etc.
I'll consider potential interactions with
https://reviews.llvm.org/D20116 "Add speculatable function attribute"
and generally just kick it around to see if the idea holds up and
gives us all of the constraints we need.

Sounds good?
-- Sanjoy
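To make the cast rule and the datalayout question above more concrete, here is a toy model in C. It is not LLVM code and uses no LLVM API: toy_datalayout, toy_is_gc_addrspace, toy_cast_is_legal and toy_rewrite_statepoints are hypothetical names, and the "gc" marking is modeled as a bitmask over address spaces 0..31, which is purely an assumption of the sketch. It shows only the bookkeeping: pointer <-> integer casts are rejected while addrspace(k) is marked gc, and the "flip the meaning" variant of RewriteStatepointsForGC amounts to clearing that mark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_cast_kind { TOY_PTRTOINT, TOY_INTTOPTR };

/* Hypothetical stand-in for the datalayout: bit k set means addrspace(k)
   is marked as a GC address space. Only address spaces 0..31 are modeled. */
struct toy_datalayout {
    uint32_t gc_addrspace_mask;
};

bool toy_is_gc_addrspace(const struct toy_datalayout *dl, unsigned addrspace) {
    assert(dl != NULL);
    return addrspace < 32 && ((dl->gc_addrspace_mask >> addrspace) & 1u) != 0;
}

/* Rule from the proposal: no pointer <-> integer casts on GC address spaces. */
bool toy_cast_is_legal(const struct toy_datalayout *dl,
                       enum toy_cast_kind kind, unsigned addrspace) {
    (void)kind; /* both cast directions are treated the same in this model */
    return !toy_is_gc_addrspace(dl, addrspace);
}

/* Models only the datalayout side of the "flip the meaning" option: after
   the pass, addrspace(k) is no longer marked specially. The real pass would
   also make relocations explicit, which this sketch does not attempt. */
void toy_rewrite_statepoints(struct toy_datalayout *dl, unsigned addrspace) {
    assert(dl != NULL);
    if (addrspace < 32) {
        dl->gc_addrspace_mask &= ~(UINT32_C(1) << addrspace);
    }
}
```

In this model the same cast on addrspace(1) is illegal before toy_rewrite_statepoints runs and legal afterwards, with no change to the values themselves. That is the appeal of the datalayout option, and the mutation of module-level state inside a pass is what the reservation relayed above is about.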
Daniel Berlin via llvm-dev
2016-Jul-18 18:57 UTC
[llvm-dev] RFC: Strong GC References in LLVM
+1

Sounds like a good plan to me.

On Mon, Jul 18, 2016 at 11:54 AM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Hi all,
>
> I think it is time to start getting more concrete here. As a starting
> point, I want to send out for review (roughly) the following changes:
> [...]
> Sounds good?
> -- Sanjoy
Chandler Carruth via llvm-dev
2016-Jul-18 23:46 UTC
[llvm-dev] RFC: Strong GC References in LLVM
Sorry, I missed this at first but I have one issue here:

On Mon, Jul 18, 2016 at 11:55 AM Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> I think it is time to start getting more concrete here. As a starting
> point, I want to send out for review (roughly) the following changes:
>
>  - Add a "gc" address space to the datalayout string

I don't really understand the need for this yet, because of the following point:

>  - Start implementing the non-controversial rules (i.e. everything
>    except the bits that initiated the "nospeculate" attribute
>    discussion):

I think everything here should apply to *all* non-zero address spaces. I think the thing we would want is for a tagged address space to opt *out* of this conservative behavior, not the other way around.

So I don't think you need a tagged address space to implement everything here, and I'd like to avoid tagging the address space until the last possible second to make sure that this is implemented as generically as possible. I'm actually hopeful that the tagging isn't necessary at all.

-Chandler

>    - No pointer <-> integer casts for GC address spaces to begin with
>    - Add an intrinsic (with control dependence) to
>      convert GCrefs -> integers (we need this for GC load/store
>      barriers)
>    - Disable some of the problematic "cast by round tripping through
>      memory" type optimizations for loads and stores that are of GC
>      ref type
>
> The things above are things we know we need, and even if all we do is
> implement those, we will be in a better position overall.
> [...]