thr3ads.net - llvm dev - [LLVMdev] Proposal for safe-to-execute meta-data for heap accesses [Nov 2013]

Filip Pizlo

2013-Nov-08 16:44 UTC

[LLVMdev] Proposal for safe-to-execute meta-data for heap accesses

> On Nov 8, 2013, at 1:13 AM, Chandler Carruth <chandlerc at google.com> wrote:
> 
>> On Thu, Nov 7, 2013 at 9:39 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> NEW PROPOSAL
>> 
>> The solution is to introduce meta-data that is explicit about how the
>> safe-to-execute condition ought to be evaluated.  Instead of an SSA use, we
>> can have meta-data that says:
>> 
>>         %v = load %p !notrap !{ @f, <args> }
>> 
>> where @f is a function in the current module and this function returns
>> i1, and <args> is zero or more arguments to pass to @f.  As with any
>> meta-data, this doesn’t imply anything different if you wanted to just
>> execute the code: executing the load doesn’t imply calling @f; indeed if you
>> dropped the meta-data the execution of this would still be the same.
> 
> So, first a clarifying question:
> 
> Is the expectation that to utilize this metadata an optimization pass would
> have to inspect the body of @f and reason about its behavior given <args>?

Yes.

> If so, then I think this is pretty bad. If we ever want to parallelize
> function passes, then they can't inspect the innards of other functions.

I must be missing something. Can't you do some simple locking?  Lock a
function if it's being transformed, or if you want to inspect it...

> So this would significantly constrain the utility here.

I think we can engineer around this problem. For example, the function @f is
meant to contain basically hand-written IR; it ought not be necessary to
optimize it in order to make use of it for safe-to-execute. It's also
reasonable to expect these to be small.

Hence you can imagine freezing a copy of those functions that are used in this
meta-data.
> 
> Also, this would create uses of the arguments that were "ephemeral" uses.

I think they're ephemeral in a very different sense than the previous
!notrap; for example here the uses continue to be meaningful even after
replaceAllUsesWith.

> It's not clear how that is better than any of the other proposals to
> represent constraint systems in the IR via "ephemeral" uses.

Hal Finkel

2013-Nov-08 19:20 UTC

[LLVMdev] Proposal for safe-to-execute meta-data for heap accesses

----- Original Message -----
> On Nov 8, 2013, at 1:13 AM, Chandler Carruth <chandlerc at google.com> wrote:
> 
> On Thu, Nov 7, 2013 at 9:39 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> 
> NEW PROPOSAL
> 
> The solution is to introduce meta-data that is explicit about how the
> safe-to-execute condition ought to be evaluated. Instead of an SSA
> use, we can have meta-data that says:
> 
> %v = load %p !notrap !{ @f, <args> }
> 
> where @f is a function in the current module and this function
> returns i1, and <args> is zero or more arguments to pass to @f. As
> with any meta-data, this doesn’t imply anything different if you
> wanted to just execute the code: executing the load doesn’t imply
> calling @f; indeed if you dropped the meta-data the execution of
> this would still be the same.
> 
> So, first a clarifying question:
> 
> 
> Is the expectation that to utilize this metadata an optimization pass
> would have to inspect the body of @f and reason about its behavior
> given <args>?
> 
> 
> Yes.
> 
> 
> 
> 
> 
> 
> 
> If so, then I think this is pretty bad. If we ever want to
> parallelize function passes, then they can't inspect the innards of
> other functions.
> 
> 
> I must be missing something. Can't you do some simple locking? Lock a
> function if it's being transformed, or if you want to inspect it...
> 
I think we'd exclude these functions from being acted upon by the regular
optimization passes. These functions would need to be special anyway: they are
functions with a special internal linkage that should not be deleted as dead,
even if 'unused' (likely we'd want them to survive in memory through
CodeGen), but CodeGen itself should ignore the functions (no code for them
should ever be generated).
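
To make that treatment concrete, here is a rough Python model of a pipeline that keeps such functions alive but never optimizes or emits them. It is only a sketch: the `is_predicate` flag, the `Function` record, and `run_pipeline` are made-up names for illustration and do not correspond to any existing LLVM linkage type or API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Function:
    name: str
    is_predicate: bool = False        # hypothetical marker for the "special" functions
    externally_visible: bool = False  # e.g. an exported entry point
    use_count: int = 0                # ordinary (non-meta-data) uses


def run_pipeline(module: List[Function]) -> Tuple[List[str], List[str], List[str]]:
    """Return (kept, optimized, emitted) function names for a toy pipeline."""
    # Dead-function elimination: predicates survive even with no ordinary uses.
    kept = [f for f in module
            if f.is_predicate or f.externally_visible or f.use_count > 0]
    # Regular optimization passes skip predicates, so their bodies stay fixed.
    optimized = [f.name for f in kept if not f.is_predicate]
    # CodeGen skips them too: no machine code is produced for a predicate.
    emitted = [f.name for f in kept if not f.is_predicate]
    return [f.name for f in kept], optimized, emitted
```

With a module holding an exported `@main`, an unused `@helper`, and a predicate `@isNotNull`, this keeps `@main` and `@isNotNull` but optimizes and emits only `@main`.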
> 
> 
> 
> 
> So this would significantly constrain the utility here.
> 
> 
> I think we can engineer around this problem. For example, the
> function @f is meant to contain basically hand-written IR; it ought
> not be necessary to optimize it in order to make use of it for
> safe-to-execute. It's also reasonable to expect these to be small.
> 
> 
> Hence you can imagine freezing a copy of those functions that are
> used in this meta-data.
> 
> Also, this would create uses of the arguments that were "ephemeral"
> uses.
> 
> I think they're ephemeral in a very different sense than the previous
> !notrap; for example here the uses continue to be meaningful even
> after replaceAllUsesWith.

I think that, to Chandler's point, it would be the responsibility of the
function creator to ensure that the special 'functions' would not need
any non-constant values without other reasonable uses. It seems like this can be
arranged for things like pointer alignment checks, pointer not-null assertions,
pointer dereferenceability (indicated by a load in the function I suppose), and
simple value constraints. I think that covers most of the intended use cases.
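
As a rough illustration of how small these checks are, the following Python sketch models three of them as pure functions over integers. In the proposal they would be i1-returning IR functions; the names and signatures here are assumptions made for the example, and dereferenceability is not modeled.

```python
def is_not_null(ptr: int) -> bool:
    """Pointer not-null assertion."""
    return ptr != 0


def is_aligned(ptr: int, alignment: int) -> bool:
    """Pointer alignment check; `alignment` is assumed to be a power of two."""
    return (ptr & (alignment - 1)) == 0


def in_bounds(index: int, length: int) -> bool:
    """Simple value constraint: index lies in [0, length)."""
    return 0 <= index < length
```

Every argument is either the pointer the load already uses, a value feeding that pointer, or a constant, so none of them needs a use of its own just for the check.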

 -Hal
> 
> 
> 
> 
> 
> It's not clear how that is better than any of the other proposals to
> represent constraint systems in the IR via "ephemeral" uses.
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Filip Pizlo

2013-Nov-08 21:18 UTC

[LLVMdev] Proposal for safe-to-execute meta-data for heap accesses

On Nov 8, 2013, at 11:20 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>> On Nov 8, 2013, at 1:13 AM, Chandler Carruth <chandlerc at google.com> wrote:
>> 
>> On Thu, Nov 7, 2013 at 9:39 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> 
>> NEW PROPOSAL
>> 
>> The solution is to introduce meta-data that is explicit about how the
>> safe-to-execute condition ought to be evaluated. Instead of an SSA
>> use, we can have meta-data that says:
>> 
>> %v = load %p !notrap !{ @f, <args> }
>> 
>> where @f is a function in the current module and this function
>> returns i1, and <args> is zero or more arguments to pass to @f. As
>> with any meta-data, this doesn’t imply anything different if you
>> wanted to just execute the code: executing the load doesn’t imply
>> calling @f; indeed if you dropped the meta-data the execution of
>> this would still be the same.
>> 
>> So, first a clarifying question:
>> 
>> 
>> Is the expectation that to utilize this metadata an optimization pass
>> would have to inspect the body of @f and reason about its behavior
>> given <args>?
>> 
>> 
>> Yes.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> If so, then I think this is pretty bad. If we ever want to
>> parallelize function passes, then they can't inspect the innards of
>> other functions.
>> 
>> 
>> I must be missing something. Can't you do some simple locking? Lock a
>> function if it's being transformed, or if you want to inspect it...
>> 
> 
> I think we'd exclude these functions from being acted upon by the
> regular optimization passes. These functions would need to be special anyway:
> they are functions with a special internal linkage that should not be deleted
> as dead, even if 'unused' (likely we'd want them to survive in memory
> through CodeGen), but CodeGen itself should ignore the functions (no code for
> them should ever be generated).
> 
>> 
>> 
>> 
>> 
>> So this would significantly constrain the utility here.
>> 
>> 
>> I think we can engineer around this problem. For example, the
>> function @f is meant to contain basically hand-written IR; it ought
>> not be necessary to optimize it in order to make use of it for
>> safe-to-execute. It's also reasonable to expect these to be small.
>> 
>> 
>> Hence you can imagine freezing a copy of those functions that are
>> used in this meta-data.
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> Also, this would create uses of the arguments that were "ephemeral"
>> uses.
>> 
>> I think they're ephemeral in a very different sense than the previous
>> !notrap; for example here the uses continue to be meaningful even
>> after replaceAllUsesWith.
> 
> I think that, to Chandler's point, it would be the responsibility of
> the function creator to ensure that the special 'functions' would not
> need any non-constant values without other reasonable uses. It seems like this
> can be arranged for things like pointer alignment checks, pointer not-null
> assertions, pointer dereferenceability (indicated by a load in the function I
> suppose), and simple value constraints. I think that covers most of the
> intended use cases.

Yeah, I think it is reasonable to say that given anything like:

	%v = load %p !notrap !{ @f, %a1, %a2, …, %an }

It must be the case that %a1…%an are reachable from %p in the sense that ADCE
could only kill %a1…%an if it killed %p.

Here are some concrete examples:

- Load that requires a pointer to be null-checked:
	
	%p = getelementptr %object, <things>
	%v = load %p !notrap !{ @isNotNull, %object }

- Load that requires a null check and an array bounds check:

	%p = getelementptr %object, %index  // possibly other things also, depending on your array object model
	%v = load %p !notrap !{ @isNotNullAndInBounds, %object, %index }

- Load that requires tagged value encoding, requires an object to have a
particular shape, and involves a doubly-indirected object model (for example when
implementing dynamic language heap accesses):

	%object = inttoptr %value
	%p1 = getelementptr %object, <things>
	%p2 = load %p1 !notrap !{ @isObject, %value }
	%p3 = getelementptr %p2, <more things>
	%v = load %p3 !notrap !{ @hasShape, %object, 0xstuff } // 0xstuff will be a compile-time constant

I can’t think of any examples where you’d want to use safe-to-execute and where
Hal’s rule won’t hold.
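
The rule can be checked mechanically. Below is a rough Python sketch, not LLVM code, that models the def-use graph as a plain dictionary and tests whether every non-constant predicate argument is something the pointer transitively depends on. The `Defs` map and both function names are invented for this example, values starting with `%` are treated as SSA values and anything else as a constant, and the third example is modeled with %p3 computed from %p2.

```python
from typing import Dict, Set, Tuple

# Toy def-use map: SSA value -> operands of the instruction that defines it.
Defs = Dict[str, Tuple[str, ...]]


def operand_closure(value: str, defs: Defs) -> Set[str]:
    """All SSA values that `value` transitively depends on, including itself."""
    seen: Set[str] = set()
    stack = [value]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(defs.get(v, ()))
    return seen


def satisfies_rule(pointer: str, args: Tuple[str, ...], defs: Defs) -> bool:
    """True if every non-constant predicate argument feeds the pointer."""
    kept_alive_by_pointer = operand_closure(pointer, defs)
    return all(a in kept_alive_by_pointer for a in args if a.startswith("%"))


# The doubly-indirected example from above.
defs_example: Defs = {
    "%object": ("%value",),
    "%p1": ("%object",),
    "%p2": ("%p1",),
    "%p3": ("%p2",),
}
```

Here `satisfies_rule("%p1", ("%value",), defs_example)` and `satisfies_rule("%p3", ("%object", "0xstuff"), defs_example)` both hold, while an argument that does not feed the pointer would fail the check.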

-Filip

> 
> -Hal
> 
>> 
>> 
>> 
>> 
>> 
>> It's not clear how that is better than any of the other proposals to
>> represent constraint systems in the IR via "ephemeral" uses.
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

Chandler Carruth

2013-Nov-09 05:36 UTC

[LLVMdev] Proposal for safe-to-execute meta-data for heap accesses

On Fri, Nov 8, 2013 at 8:44 AM, Filip Pizlo <fpizlo at apple.com> wrote:
> Is the expectation that to utilize this metadata an optimization pass
> would have to inspect the body of @f and reason about its behavior given
> <args>?
>
>
> Yes.
>
>
> If so, then I think this is pretty bad. If we ever want to parallelize
> function passes, then they can't inspect the innards of other functions.
>
>
> I must be missing something. Can't you do some simple locking?  Lock a
> function if it's being transformed, or if you want to inspect it...
>
I really, *really* don't like this.

I do *not* want parallelizing LLVM to require careful locking protocols to
be followed. Instead, I want the design to naturally arrange for different
threads to operate on different constructs and for the interconnecting
interfaces to be thread safe. The best system we have yet devised for this
is based around function passes not digging into the bodies of other
functions. Instead we rely on attributes to propagate information about the
body of another function to a caller.
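
A rough Python sketch of that discipline, under simplifying assumptions: each worker mutates only the function it was handed and reads nothing from other functions except an immutable attribute summary. The `Function` record, the `unused_calls` field, and the "drop unused calls to `readnone` and `nounwind` callees" rule are toy stand-ins for illustration, not a description of any real LLVM pass.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Function:
    name: str
    attrs: FrozenSet[str]  # immutable summary, safe to share across threads
    # Callees whose results are unused; stands in for the function body.
    unused_calls: List[str] = field(default_factory=list)


def drop_dead_pure_calls(fn: Function, summaries: Dict[str, FrozenSet[str]]) -> None:
    """Toy function pass: mutates `fn` only, reads only callee attributes."""
    fn.unused_calls = [
        callee for callee in fn.unused_calls
        if not {"readnone", "nounwind"} <= summaries[callee]
    ]


def run_parallel(module: List[Function]) -> None:
    # Summaries are collected once and are read-only while the pass runs.
    summaries = {f.name: f.attrs for f in module}
    with ThreadPoolExecutor() as pool:
        # list(...) forces completion and surfaces any worker exception.
        list(pool.map(lambda f: drop_dead_pure_calls(f, summaries), module))
```

No locks appear because no two workers ever touch the same mutable state; the cost is that a pass can only know about a callee what its attributes say.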

>
> So this would significantly constrain the utility here.
>
>
> I think we can engineer around this problem. For example, the function @f
> is meant to contain basically hand-written IR; it ought not be necessary to
> optimize it in order to make use of it for safe-to-execute. It's also
> reasonable to expect these to be small.
>
> Hence you can imagine freezing a copy of those functions that are used in
> this meta-data.
>
At this point, you are essentially proposing that these functions are a
similar but not quite the same IR... They will have the same concepts but
subtly different constraints or "expectations".

I'm not yet sure how I feel about this. It could work really well, or it
could end up looking a lot like ConstantExpr and being a pain for us going
forward. I'm going to keep thinking about this though and see if I can
contribute a more positive comment. =]

Filip Pizlo

2013-Nov-09 05:50 UTC

[LLVMdev] Proposal for safe-to-execute meta-data for heap accesses

On Nov 8, 2013, at 9:36 PM, Chandler Carruth <chandlerc at google.com> wrote:
> 
> On Fri, Nov 8, 2013 at 8:44 AM, Filip Pizlo <fpizlo at apple.com> wrote:
>> Is the expectation that to utilize this metadata an optimization pass
>> would have to inspect the body of @f and reason about its behavior given
>> <args>?
> 
> Yes. 
> 
>> 
>> If so, then I think this is pretty bad. If we ever want to parallelize
>> function passes, then they can't inspect the innards of other functions.
> 
> I must be missing something. Can't you do some simple locking?  Lock a
> function if it's being transformed, or if you want to inspect it...
> 
> I really, *really* don't like this.
> 
> I do *not* want parallelizing LLVM

So, I'm relatively new to LLVM, but I'm not new to parallelizing a
compiler - I've done it before.  And when I did it, it (a) did use locking
in a bunch of places, (b) wasn't a big deal, and (c) reliably scaled to 8
cores (the max number of cores I had at the time - I was a grad student and it
was, like, the last decade).

Is there any documented proposal that lays out the philosophy?  I'd like to
understand why locks are such a party pooper.

> to require careful locking protocols to be followed. Instead, I want the
> design to naturally arrange for different threads to operate on different
> constructs and for the interconnecting interfaces to be thread safe. The best
> system we have yet devised for this is based around function passes not digging
> into the bodies of other functions. Instead we rely on attributes to propagate
> information about the body of another function to a caller.

I kind of get what you're aiming at, but I'm curious what other
constraints are in play.  When I last wrote a parallel compiler, I had the
notion of predesignating functions that were candidates for cross-thread IR
introspection, and freeze-drying their IR at a certain phase that preceded a
global barrier.  It just so happened that in my case, I did this for inlining
candidates - which is different than what we're proposing here but the same
tricks apply.
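
For readers who want to see the shape of that trick, here is a rough Python sketch under simplifying assumptions. Function bodies are lists of instruction strings, the candidates are snapshotted into immutable tuples in a single-threaded phase, and after that point workers rewrite their own function in place while reading only the snapshots. All names (`freeze_candidates`, `inline_calls`, `compile_module`) are made up for the example; this is not how LLVM or any particular compiler implements it.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType
from typing import Dict, Iterable, List, Mapping, Tuple

Module = Dict[str, List[str]]  # function name -> mutable list of instructions


def freeze_candidates(module: Module,
                      candidates: Iterable[str]) -> Mapping[str, Tuple[str, ...]]:
    """Snapshot candidate bodies as immutable tuples, before the barrier."""
    return MappingProxyType({name: tuple(module[name]) for name in candidates})


def inline_calls(body: List[str],
                 frozen: Mapping[str, Tuple[str, ...]]) -> List[str]:
    """Replace 'call @f' with f's frozen body when f is a candidate."""
    out: List[str] = []
    for inst in body:
        callee = inst[len("call "):] if inst.startswith("call ") else None
        if callee in frozen:
            out.extend(frozen[callee])  # read the snapshot, never the live body
        else:
            out.append(inst)
    return out


def compile_module(module: Module, candidates: Iterable[str]) -> None:
    frozen = freeze_candidates(module, candidates)  # single-threaded phase

    def transform(name: str) -> None:
        # Each worker rewrites only its own function, in place.
        module[name][:] = inline_calls(module[name], frozen)

    # Global barrier: from here on, functions are processed concurrently.
    with ThreadPoolExecutor() as pool:
        list(pool.map(transform, list(module)))
```

Because the snapshots never change, a worker inlining `@helper` sees the same body no matter what another worker is doing to the live `@helper` at that moment, and no per-function lock is needed.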
>  
> 
>> So this would significantly constrain the utility here.
> 
> I think we can engineer around this problem. For example, the function @f
> is meant to contain basically hand-written IR; it ought not be necessary to
> optimize it in order to make use of it for safe-to-execute. It's also
> reasonable to expect these to be small.
> 
> Hence you can imagine freezing a copy of those functions that are used in
> this meta-data.
> 
> At this point, you are essentially proposing that these functions are a
> similar but not quite the same IR... They will have the same concepts but
> subtly different constraints or "expectations".

Sort of.  I'm only proposing that they get treated differently from the
standpoint of the compilation pipeline.  But, to clarify, the IR inside them
still has the same semantics as LLVM IR.

It's interesting that this is the second time that the thought of
"special" functions has arisen in my LLVM JIT adventures.  The other
time was when I wanted to create a module that contained one function that I
wanted to compile (i.e. it was a function that carried the IR that I actually
wanted to JIT) but I wanted to pre-load that module with runtime functions that
were inline candidates.  I did not want the JIT to compile those functions
except if they were inlined.

I bring this up not because I have any timetable for implementing this other
concept, but because I find it interesting that LLVM's "every function
in a module is a thing that will get compiled and be part of the resulting
object file" rule is a tad constraining for a bunch of things I want to do
that don't involve a C-like language.
> 
> I'm not yet sure how I feel about this. It could work really well, or
> it could end up looking a lot like ConstantExpr and being a pain for us going
> forward. I'm going to keep thinking about this though and see if I can
> contribute a more positive comment. =]

Fair enough!  I look forward to hearing more feedback.

-Filip



Dean Sutherland

2013-Nov-11 15:24 UTC

[LLVMdev] Proposal for safe-to-execute meta-data for heap accesses

FYI, we (DeLesley Hutchins at Google and my group at CERT) have a new static
analysis under development that will be able to express and enforce this kind of
property (once the analysis is up and working). When we get to the point where
it can handle the LLVM & Clang sources, implementers will be able to count on
having code that actually follows the rules, too.

Dean Sutherland
dsutherland at cert.org

On Nov 9, 2013, at 12:36 AM, Chandler Carruth <chandlerc at google.com> wrote:
[SNIP]
> I do *not* want parallelizing LLVM to require careful locking protocols to
> be followed. Instead, I want the design to naturally arrange for different
> threads to operate on different constructs and for the interconnecting
> interfaces to be thread safe. The best system we have yet devised for this is
> based around function passes not digging into the bodies of other functions.
> Instead we rely on attributes to propagate information about the body of
> another function to a caller.
