Justin Lebar via llvm-dev
2016-Aug-29  18:42 UTC
[llvm-dev] invariant.load metadata semantics
Sanjoy, can you clarify this bit about JVM array lengths? Presumably the same pointer can point to two arrays of different lengths during a program's execution. Does this mean that you're relying on invariant.load having function scope? That is, correctness depends on the pointer not being reused for an array of a different length between the first invariant load of that array length in a function and the last (possibly not invariant) load in the function? This seems like a very weak form of invariance. For example, when you inline, you would have to transform invariant loads to the scoped invariant thing. In NVPTX I am on a quest to use invariant loads for loads that must never change during the execution of a kernel. If the metadata signaled this more restricted scope, I guess I'd have to figure something else out. -Justin On Aug 25, 2016 7:13 PM, "Sanjoy Das via llvm-dev" <llvm-dev at lists.llvm.org> wrote:> Hi Charles, > > Caldarale, Charles R via llvm-dev wrote: > >> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of > Hal Finkel via llvm-dev > >> Subject: Re: [llvm-dev] invariant.load metadata semantics > > > >> Alternatively, we might phrase this as: The optimizer may assume that > all values loaded > >> from a location, where any of the loads are tagged with > !invariant.load, are identical. > > > > > This would seem to limit the usefulness of the invariant attribute. I > would expect invariant to indicate that loads reachable from the one marked > invariant are guaranteed to read the same value, but that prior ones are > not. This would allow updates to be made to the location of interest up to > the point of declared invariance, but not after. > > That is a very defensible point, but depending on your use case you > consider these "initial" writes part of making the memory > dereferenceable. For instance, for us (JVM) loading array lengths > satisfy the requirements of invariant loads, and we bake in the length > of the array as part of the array's allocation routine. LLVM never > "sees" a store the the length field of an array. > > However, I can imagine more complex use cases for which the above > approach won't work. > > -- Sanjoy > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/07563ccf/attachment.html>
Sanjoy Das via llvm-dev
2016-Aug-29  20:21 UTC
[llvm-dev] invariant.load metadata semantics
Hi Justin,
Justin Lebar wrote:
 > Sanjoy, can you clarify this bit about JVM array lengths?
 >
 > Presumably the same pointer can point to two arrays of different lengths
 > during a program's execution.
Yes, if you define "same pointer" as "pointers with the same
bitwise
representation".  However, we do not have placement new or free, and
we do not have to worry about pointers that are visible to the program
being re-used.  That is, a pointer pointing to an array in the heap
will not be re-used for any other allocation while the program can
still reach it.
 > Does this mean that you're relying on
 > invariant.load having function scope?  That is, correctness depends on
 > the pointer not being reused for an array of a different length between
 > the first invariant load of that array length in a function and the last
 > (possibly not invariant) load in the function?
Are you talking about cases like these:
void f(array_t* a) {
   // Not sure if this is well-defined C++, but you get the picture
   int l = a->length;  // invariant
   delete a;
   new(a) array_t(other_length);
   int l2 = a->length; // also invariant
}
A case like above cannot happen for us since we don't have placement
new.  An allocation always returns a logically new reference, which
goes away once the application can provably no longer look at that
reference (not just load from it, but also compare it to other
references etc.).  An allocation can return a "logically new" location
that happens to be bitwise equal to something it returned before (in
fact, it has to, otherwise we'll need infinite memory :) ), but the GC
(this includes the stuff we've added to LLVM) makes sure that we
re-use only locations that are no longer visible to the application.
Another way to look at this is, the GC provides an illusion of an
infinitely large heap, where every "new" returns a new distinct
reference.  For instance, in cases like:
void do_something(array_t* a);
void f(array_t* a) {
   do_something(a);
   array_t* b = gc_new_array(...);
   boolean c = (a == b);
}
`c` is *always* false.  Since we have a use of `a` when computing `c`,
it _cannot_ have gone away at the point of the comparison[0].
 > This seems like a very weak form of invariance.  For example, when you
 > inline, you would have to transform invariant loads to the scoped
 > invariant thing.
I don't think we have to, but if you have some specific examples, I'll
be more than happy to comment on those.  And thanks for bringing this
up!
-- Sanjoy
[0]: Actually it can, since we can first fold `c` to `false`, after
   which the GC can re-use the space for `a` when allocating `b`.  But
   that falls into "as-if" type optimization.
Justin Lebar via llvm-dev
2016-Aug-29  21:04 UTC
[llvm-dev] invariant.load metadata semantics
> Yes, if you define "same pointer" as "pointers with the same bitwiserepresentation". Yes, this is the definition I was interested in, under the assumption that it's the definition LLVM uses. But it sounds like you've augmented LLVM so that bitwise-identical pointer values are not considered equal if they come from different allocations. In that case, I don't see any problem, although if this isn't already explicit in the langref (possibly not specifically here, since I figure this applies to lots of other stuff as well) clarifying it somewhere might help! -Justin> Are you talking about cases like these: > > void f(array_t* a) { > // Not sure if this is well-defined C++, but you get the picture > int l = a->length; // invariant > delete a; > new(a) array_t(other_length); > int l2 = a->length; // also invariant > }Yes, On Mon, Aug 29, 2016 at 1:21 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:> Hi Justin, > > Justin Lebar wrote: >> Sanjoy, can you clarify this bit about JVM array lengths? >> >> Presumably the same pointer can point to two arrays of different lengths >> during a program's execution. > > Yes, if you define "same pointer" as "pointers with the same bitwise > representation". However, we do not have placement new or free, and > we do not have to worry about pointers that are visible to the program > being re-used. That is, a pointer pointing to an array in the heap > will not be re-used for any other allocation while the program can > still reach it. > >> Does this mean that you're relying on >> invariant.load having function scope? That is, correctness depends on >> the pointer not being reused for an array of a different length between >> the first invariant load of that array length in a function and the last >> (possibly not invariant) load in the function? > > Are you talking about cases like these: > > void f(array_t* a) { > // Not sure if this is well-defined C++, but you get the picture > int l = a->length; // invariant > delete a; > new(a) array_t(other_length); > int l2 = a->length; // also invariant > } > > A case like above cannot happen for us since we don't have placement > new. An allocation always returns a logically new reference, which > goes away once the application can provably no longer look at that > reference (not just load from it, but also compare it to other > references etc.). An allocation can return a "logically new" location > that happens to be bitwise equal to something it returned before (in > fact, it has to, otherwise we'll need infinite memory :) ), but the GC > (this includes the stuff we've added to LLVM) makes sure that we > re-use only locations that are no longer visible to the application. > > Another way to look at this is, the GC provides an illusion of an > infinitely large heap, where every "new" returns a new distinct > reference. For instance, in cases like: > > void do_something(array_t* a); > > void f(array_t* a) { > do_something(a); > array_t* b = gc_new_array(...); > boolean c = (a == b); > } > > `c` is *always* false. Since we have a use of `a` when computing `c`, > it _cannot_ have gone away at the point of the comparison[0]. > >> This seems like a very weak form of invariance. For example, when you >> inline, you would have to transform invariant loads to the scoped >> invariant thing. > > I don't think we have to, but if you have some specific examples, I'll > be more than happy to comment on those. And thanks for bringing this > up! > > -- Sanjoy > > [0]: Actually it can, since we can first fold `c` to `false`, after > which the GC can re-use the space for `a` when allocating `b`. But > that falls into "as-if" type optimization.-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160829/7b70268c/attachment.html>