Sanjoy Das wrote:
> In your
> example, foo will have to treat its argument differently depending on
> whether it is a GC pointer or not.

In practice, this is not true of many functions that don't call other
functions. Take the example of a simple "print" function that takes a
void * to cast and print, and a type_int to determine what to cast to: why
should it care about whether the pointer is GC'able or not? At the
callsite, I have this information, and I accordingly emit
statepoint/relocate information. But "print" doesn't call other
functions, and doesn't need to emit statepoint/relocate.

Let's say I made the void * argument addrspace(0). Then, in callsites
where I have an addrspace(1) to pass, I have to emit:

  addrspacecast 1 -> 0
  call print
  addrspacecast 0 -> 1

Is this the ideal workflow, or should we have some sort of addrspaceany?

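For concreteness, the cast-call-cast sequence above can be sketched as a tiny frontend helper that prints IR-like text. This is only an illustration under stated assumptions: `emit_print_call` and `ptr_type` are hypothetical names, the `ptr` / `ptr addrspace(1)` spelling follows current LLVM IR rather than the syntax in use when this thread was written, and no statepoint/relocate sequence is emitted.

```python
def ptr_type(addrspace):
    """Spell a pointer type; address space 0 is the default and stays implicit."""
    return "ptr" if addrspace == 0 else f"ptr addrspace({addrspace})"


def emit_print_call(arg, arg_addrspace, type_tag, param_addrspace=0):
    """Return IR-like lines for calling @print on the SSA value %arg.

    Hypothetical helper: casts are inserted only when the argument's address
    space differs from the address space of @print's pointer parameter.
    """
    param_ty = ptr_type(param_addrspace)
    if arg_addrspace == param_addrspace:
        # Same address space: a plain call is enough.
        return [f"call void @print({param_ty} %{arg}, i32 {type_tag})"]

    arg_ty = ptr_type(arg_addrspace)
    return [
        # addrspacecast 1 -> 0
        f"%{arg}.cast = addrspacecast {arg_ty} %{arg} to {param_ty}",
        # call print
        f"call void @print({param_ty} %{arg}.cast, i32 {type_tag})",
        # addrspacecast 0 -> 1
        f"%{arg}.back = addrspacecast {param_ty} %{arg}.cast to {arg_ty}",
    ]
```

Calling `emit_print_call("obj", 1, 0)` yields the three-line sequence, while `emit_print_call("s", 0, 2)` yields a single call with no casts.
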
On 18 Jan 2015, at 18:40, Ramkumar Ramachandra <artagnon at gmail.com> wrote:
> In practice, this is not true of many functions that don't call other
> functions. Take the example of a simple "print" function that takes a
> void * to cast and print, and a type_int to determine what to cast to: why
> should it care about whether the pointer is GC'able or not? At the
> callsite, I have this information, and I accordingly emit
> statepoint/relocate information. But "print" doesn't call other
> functions, and doesn't need to emit statepoint/relocate.
>
> Let's say I made the void * argument addrspace(0). Then, in callsites
> where I have an addrspace(1) to pass, I have to emit:
>
> addrspacecast 1 -> 0
> call print
> addrspacecast 0 -> 1
>
> Is this the ideal workflow, or should we have some sort of addrspaceany?

The requirements ought to be captured by the nocapture attribute (though that still places some limitations on the GC - it isn't allowed to relocate an object while a pointer to it is passed to GC-oblivious code, which may not be an invariant that's easy to enforce in some designs).

I'm wary of an addrspaceany attribute though - we have different address spaces with different sizes and different register assignments for calling conventions, so this is a bit broad. I'm not totally convinced by the use of address spaces to indicate GC vs non-GC pointers in this way, because we don't have a good way of describing interactions between address spaces in IR currently. DataLayout can tell us the size and alignment for pointers to each AS, but can't currently tell us:

- Whether one address space is contained within another.
- Whether casts from one address space are lossy (if you do addrspacecast from n->m then back, are you guaranteed the same pointer?).
- Whether address space casts between a pair of address spaces are valid always, never, or sometimes.

Your addrspaceany is really a union of the two pointer types (which your high-level language's type system may or may not like), with the assumption that they have the same representation.

David

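The three questions in the list above can be pictured as a small record per ordered pair of address spaces. The sketch below is purely hypothetical: `AddrSpaceRelation`, `CastValidity`, and `can_cast_blindly` are made-up names and do not correspond to anything DataLayout or the IR provides. It only shows what information a frontend would need before emitting a cast-call-cast sequence without further checks, and it ignores GC relocation entirely.

```python
from dataclasses import dataclass
from enum import Enum


class CastValidity(Enum):
    ALWAYS = "always"
    NEVER = "never"
    SOMETIMES = "sometimes"


@dataclass(frozen=True)
class AddrSpaceRelation:
    """Hypothetical facts about casting from address space `src` to `dst`."""
    src: int
    dst: int
    src_contained_in_dst: bool  # is every src pointer representable in dst?
    round_trip_lossless: bool   # does src -> dst -> src return the same pointer?
    validity: CastValidity      # is the cast valid always, never, or sometimes?


def can_cast_blindly(rel: AddrSpaceRelation) -> bool:
    """True if a cast and a cast back can be emitted with no runtime check."""
    return rel.validity is CastValidity.ALWAYS and rel.round_trip_lossless
```

Under this toy model, a relation such as `AddrSpaceRelation(1, 0, True, True, CastValidity.ALWAYS)` would allow the casts, while a lossy or only sometimes valid pair would not.
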
On 01/18/2015 10:40 AM, Ramkumar Ramachandra wrote:
> Sanjoy Das wrote:
>> In your
>> example, foo will have to treat its argument differently depending on
>> whether it is a GC pointer or not.
> In practice, this is not true of many functions that don't call other
> functions. Take the example of a simple "print" function that takes a
> void * to cast and print, and a type_int to determine what to cast to: why
> should it care about whether the pointer is GC'able or not? At the
> callsite, I have this information, and I accordingly emit
> statepoint/relocate information. But "print" doesn't call other
> functions, and doesn't need to emit statepoint/relocate.

You are right that there are some functions which cannot trigger garbage collection and thus are not sensitive to the 'type' of the pointer they've been given. I've been calling such functions "gc leaf functions" for lack of a better name.

However, there's a good chance that your "simple print function" is not, in fact, such a function. If your print routine contains any non gc-leaf call, or a loop whose bounds are not known at compile time, it may in fact need to do relocation. Depending on your collector, the routine may also need a load or store barrier for one use or the other. It's highly unlikely that the code generated for the GC address space and the non-GC address space is actually the same.

There's lots of room to experiment with a gc-leaf function attribute, and - in particular - the inference of such.

Having said all that, I'm really curious why this matters to you. In practice, we haven't found there to be many functions at all which are needed on both gc and non-gc pointers (where the function is *also* a gc-leaf.) Unless you're seeing a bunch of cases like this, I'd just duplicate the shared routines.

> Let's say I made the void * argument addrspace(0). Then, in callsites
> where I have an addrspace(1) to pass, I have to emit:
>
> addrspacecast 1 -> 0
> call print
> addrspacecast 0 -> 1
>
> Is this the ideal workflow, or should we have some sort of addrspaceany?

I strongly advise against introducing such casts. Doing so makes it much harder to reason about correctness.

I would be open to a proposal for a "generic address space" mechanism, but that's a large project. I don't really see the motivation for it currently. You'd need to send a proposal to llvmdev and get feedback on the idea.

Philip

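As a rough illustration of what inferring a gc-leaf property might look like, here is a conservative sketch over a call graph. It is not an existing LLVM pass or attribute: `infer_gc_leaf` and the `"calls"` / `"has_unbounded_loop"` keys are hypothetical, and the rule applied is just the one stated above (no non gc-leaf calls, no loops with bounds unknown at compile time). Barriers are not modelled at all.

```python
def infer_gc_leaf(functions):
    """Conservatively infer which functions are "gc leaf" functions.

    `functions` maps a function name to a dict with two hypothetical keys:
      "calls": iterable of callee names
      "has_unbounded_loop": True if some loop bound is unknown at compile time
    Callees missing from `functions` are unknown and therefore never treated
    as gc-leaf. Recursive functions are never marked gc-leaf by this sketch.
    """
    leaves = set()
    changed = True
    while changed:  # iterate until no new gc-leaf function is discovered
        changed = False
        for name, info in functions.items():
            if name in leaves or info["has_unbounded_loop"]:
                continue
            # A function qualifies only once every callee is a known gc-leaf.
            if all(callee in leaves for callee in info["calls"]):
                leaves.add(name)
                changed = True
    return leaves
```

With this rule, a `print` that only calls a loop-free formatting helper is inferred as gc-leaf, while a routine that loops over a list of unknown length, or calls an external function, is not.
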
On 01/18/2015 10:56 AM, David Chisnall wrote:
> On 18 Jan 2015, at 18:40, Ramkumar Ramachandra <artagnon at gmail.com> wrote:
>> In practice, this is not true of many functions that don't call other
>> functions. Take the example of a simple "print" function that takes a
>> void * to cast and print, and a type_int to determine what to cast to: why
>> should it care about whether the pointer is GC'able or not? At the
>> callsite, I have this information, and I accordingly emit
>> statepoint/relocate information. But "print" doesn't call other
>> functions, and doesn't need to emit statepoint/relocate.
>>
>> Let's say I made the void * argument addrspace(0). Then, in callsites
>> where I have an addrspace(1) to pass, I have to emit:
>>
>> addrspacecast 1 -> 0
>> call print
>> addrspacecast 0 -> 1
>>
>> Is this the ideal workflow, or should we have some sort of addrspaceany?
> The requirements ought to be captured by the nocapture attribute (though that still places some limitations on the GC - it isn't allowed to relocate an object while a pointer to it is passed to GC-oblivious code, which may not be an invariant that's easy to enforce in some designs).

FYI, nocapture is *not* enough. A store to a GC pointer may require a store barrier; a store to a non-gc pointer may not. Just because a pointer isn't *captured* doesn't mean that the 'GCness' doesn't affect the code generated.

> I'm wary of an addrspaceany attribute though - we have different address spaces with different sizes and different register assignments for calling conventions, so this is a bit broad. I'm not totally convinced by the use of address spaces to indicate GC vs non-GC pointers in this way, because we don't have a good way of describing interactions between address spaces in IR currently. DataLayout can tell us the size and alignment for pointers to each AS, but can't currently tell us:
>
> - Whether one address space is contained within another.
>
> - Whether casts from one address space are lossy (if you do addrspacecast from n->m then back, are you guaranteed the same pointer?).
>
> - Whether address space casts between a pair of address spaces are valid always, never, or sometimes.
>
> Your addrspaceany is really a union of the two pointer types (which your high-level language's type system may or may not like), with the assumption that they have the same representation.

David raises a fair point: addrspaceany is clearly unworkable. Something like an addrspacevariant(X, Y) might be workable, but I'm still not really seeing much need for this.

I'm not particularly in support of the addrspaceany mechanism. I believe it might make sense to discuss, but I haven't seen a compelling need for it to date and it adds a major source of complexity and bugs.

Philip

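To make the store-barrier point concrete, the toy emitter below produces different IR-like text for the same logical store depending on whether the destination is a GC pointer. Everything here is an assumption for illustration: `emit_reference_store` and `@hypothetical_store_barrier` are invented names, address space 1 stands for GC pointers as elsewhere in the thread, and whether a real collector wants its barrier before or after the store (or at all) depends on the collector.

```python
GC_ADDRSPACE = 1  # assumption: address space 1 marks GC pointers


def emit_reference_store(value, dest, dest_addrspace):
    """Return IR-like lines storing the GC reference %value through %dest.

    Hypothetical helper: a barrier call is added only when the destination
    pointer is itself a GC pointer.
    """
    dest_ty = "ptr" if dest_addrspace == 0 else f"ptr addrspace({dest_addrspace})"
    lines = [f"store ptr addrspace({GC_ADDRSPACE}) %{value}, {dest_ty} %{dest}"]
    if dest_addrspace == GC_ADDRSPACE:
        # The function body differs here even though %dest is never captured.
        lines.append(f"call void @hypothetical_store_barrier({dest_ty} %{dest})")
    return lines
```

The two results differ in length, which is the sense in which a single body compiled once for "any" address space would be wrong for one of the two callers.
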
Philip Reames wrote:
> There's lots of room to experiment with a gc-leaf function attribute, and -
> in particular - the inference of such.

I'm not sure how this would help: do you have optimizations that we can apply specifically to gc-leaf functions?

> Having said all that, I'm really curious why this matters to you.

While refactoring my code to use statepoints, I noticed that I was alloca'ing things like global string pointers, while malloc'ing things like arrays; but since my language is untyped, I was boxing them up in a malloc'ed structure. The problem doesn't affect me in practice because I'm always passing around boxed objects, but I was wondering if others who are doing typed languages would need an addrspaceany. You don't see the need for it, so I suppose the discussion is closed until someone does.