Justin Lebar via llvm-dev
2016-Aug-29 21:04 UTC
[llvm-dev] invariant.load metadata semantics
> Yes, if you define "same pointer" as "pointers with the same bitwiserepresentation". Yes, this is the definition I was interested in, under the assumption that it's the definition LLVM uses. But it sounds like you've augmented LLVM so that bitwise-identical pointer values are not considered equal if they come from different allocations. In that case, I don't see any problem, although if this isn't already explicit in the langref (possibly not specifically here, since I figure this applies to lots of other stuff as well) clarifying it somewhere might help! -Justin> Are you talking about cases like these: > > void f(array_t* a) { > // Not sure if this is well-defined C++, but you get the picture > int l = a->length; // invariant > delete a; > new(a) array_t(other_length); > int l2 = a->length; // also invariant > }Yes, On Mon, Aug 29, 2016 at 1:21 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> Hi Justin, > > Justin Lebar wrote: >> Sanjoy, can you clarify this bit about JVM array lengths? >> >> Presumably the same pointer can point to two arrays of different lengths >> during a program's execution. > > Yes, if you define "same pointer" as "pointers with the same bitwise > representation". However, we do not have placement new or free, and > we do not have to worry about pointers that are visible to the program > being re-used. That is, a pointer pointing to an array in the heap > will not be re-used for any other allocation while the program can > still reach it. > >> Does this mean that you're relying on >> invariant.load having function scope? That is, correctness depends on >> the pointer not being reused for an array of a different length between >> the first invariant load of that array length in a function and the last >> (possibly not invariant) load in the function? > > Are you talking about cases like these: > > void f(array_t* a) { > // Not sure if this is well-defined C++, but you get the picture > int l = a->length; // invariant > delete a; > new(a) array_t(other_length); > int l2 = a->length; // also invariant > } > > A case like above cannot happen for us since we don't have placement > new. An allocation always returns a logically new reference, which > goes away once the application can provably no longer look at that > reference (not just load from it, but also compare it to other > references etc.). An allocation can return a "logically new" location > that happens to be bitwise equal to something it returned before (in > fact, it has to, otherwise we'll need infinite memory :) ), but the GC > (this includes the stuff we've added to LLVM) makes sure that we > re-use only locations that are no longer visible to the application. > > Another way to look at this is, the GC provides an illusion of an > infinitely large heap, where every "new" returns a new distinct > reference. For instance, in cases like: > > void do_something(array_t* a); > > void f(array_t* a) { > do_something(a); > array_t* b = gc_new_array(...); > boolean c = (a == b); > } > > `c` is *always* false. Since we have a use of `a` when computing `c`, > it _cannot_ have gone away at the point of the comparison[0]. > >> This seems like a very weak form of invariance. For example, when you >> inline, you would have to transform invariant loads to the scoped >> invariant thing. > > I don't think we have to, but if you have some specific examples, I'll > be more than happy to comment on those. And thanks for bringing this > up! > > -- Sanjoy > > [0]: Actually it can, since we can first fold `c` to `false`, after > which the GC can re-use the space for `a` when allocating `b`. But > that falls into "as-if" type optimization.-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/7b70268c/attachment.html>
Sanjoy Das via llvm-dev
2016-Aug-29 23:27 UTC
[llvm-dev] invariant.load metadata semantics
Hi Justin, Justin Lebar wrote: > > Yes, if you define "same pointer" as "pointers with the same bitwise > representation". > > Yes, this is the definition I was interested in, under the assumption > that it's the definition LLVM uses. But it sounds like you've augmented > LLVM so that bitwise-identical pointer values are not considered equal > if they come from different allocations. I may be misunderstanding you, but what you said above sounds misleading. It isn't that we do some extra magic at the equality level, the magic we do is at the _allocation_ level. The right mental picture is that we have a malloc that (magically) never runs out of memory, and our programs cannot explicitly call free or realloc on memory (so every allocation _is_, in fact, distinct). This is not specific to GC'ed environments -- you could do the same thing in C++ too, by adding (say) a programmer-level invariant (don't check in code that calls delete on class X, all instances of X live forever etc.). As an "optimization"[0] on top of that (this is the magic bit) we figure out which allocations are unnecessary from the program's point of view, and throw them away. Did that make sense? -- Sanjoy [0]: I'm being somewhat flippant here, this "optimization" is basically the whole of the GC. :)
Justin Lebar via llvm-dev
2016-Aug-30 12:29 UTC
[llvm-dev] invariant.load metadata semantics
My concern was just that, given what was happening with your Java stuff, invariant.load might not be as strong as I thought. But I am satisfied on that count. Thanks, Sanjoy! On Mon, Aug 29, 2016 at 4:27 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> Hi Justin, > > Justin Lebar wrote: >> > Yes, if you define "same pointer" as "pointers with the same bitwise >> representation". >> >> Yes, this is the definition I was interested in, under the assumption >> that it's the definition LLVM uses. But it sounds like you've augmented >> LLVM so that bitwise-identical pointer values are not considered equal >> if they come from different allocations. > > I may be misunderstanding you, but what you said above sounds > misleading. It isn't that we do some extra magic at the equality > level, the magic we do is at the _allocation_ level. The right mental > picture is that we have a malloc that (magically) never runs out of > memory, and our programs cannot explicitly call free or realloc on > memory (so every allocation _is_, in fact, distinct). This is not > specific to GC'ed environments -- you could do the same thing in C++ > too, by adding (say) a programmer-level invariant (don't check in code > that calls delete on class X, all instances of X live forever etc.). > > As an "optimization"[0] on top of that (this is the magic bit) we > figure out which allocations are unnecessary from the program's point > of view, and throw them away. > > Did that make sense? > > -- Sanjoy > > [0]: I'm being somewhat flippant here, this "optimization" is > basically the whole of the GC. :)