thr3ads.net - llvm dev - [llvm-dev] RFC: Add guard intrinsics to LLVM [Feb 2016]

Sanjoy Das via llvm-dev

2016-Feb-23 17:32 UTC

[llvm-dev] RFC: Add guard intrinsics to LLVM

On Mon, Feb 22, 2016 at 11:18 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>> # step A: Introduce an `interposable` function attribute
>>
>> We can bike shed on the name and the exact specification, but the
>> general idea is that you cannot do IPA / IPO over callsites calling
>> `interposable` functions without inlining them.  This attribute will
>> (usually) have to be used on function bodies that can deoptimize (e.g.
>> has a side exit / guard in it); but also has more general use cases.
>
> Note that we already have this *exact* concept in the IR via linkage for
> better or worse. I think it is really confusing as you are currently

I was going to have a more detailed discussion on this in the (yet to
be started) review thread for `interposable`: we'd like to be able to
inline `interposable` functions.  The "interposition" can only happen
at physical function boundaries, so opt is allowed to do as much
IPA/IPO as it wants once it makes the physical function boundary go away
via inlining.  None of the linkage types seem to have this property.

Part of the challenge here is to specify the attribute in a way that
allows inlining, but not IPA without inlining.  In fact, maybe it is
best to not call it "interposable" at all?

Actually, I think one of the problems we're trying to solve with
`interposable` is applicable to the available_externally linkage as
well.  Say we have

```
void foo() available_externally {
  %t0 = load atomic %ptr
  %t1 = load atomic %ptr
  if (%t0 != %t1) print("X");
}
void main() {
  foo();
  print("Y");
}
```

Now the possible behaviors of the above program are {print("X"),
print("Y")} or {print("Y")}.  But if we run opt then we have

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  foo();
  print("Y");
}
```

and some generic reordering

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  print("Y");
  foo();  // legal since we're moving a readnone nounwind function that
          // was guaranteed to execute (hence can't have UB)
}
```

Now if we do not inline @foo(), and instead re-link the call site in
@main to some non-optimized copy (or differently optimized copy) of
foo, then it is possible for the program to have the behavior
{print("Y"); print("X")}, which was disallowed in the earlier
program.

In other words, opt refined the semantics of @foo() (i.e. reduced the
set of behaviors it may have) in ways that would make later
optimizations invalid if we de-refine the implementation of @foo().
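
The hazard described above can be checked with a small model. This is only a sketch under stated assumptions: the two atomic loads are modeled as arbitrary values from {0, 1}, output is modeled as a tuple of printed strings, and every name (`foo_original`, `foo_refined`, `main_before`, `main_reordered`, `behaviors`) is made up for illustration and is not LLVM API.

```python
from itertools import product

def foo_original(t0, t1):
    """Unoptimized foo: the two atomic loads may observe different values."""
    return ["X"] if t0 != t1 else []

def foo_refined(t0, t1):
    """foo after CSE: both uses see one load, so the branch folds away."""
    return []

def main_before(foo, t0, t1):
    """main as written: call foo, then print Y."""
    return tuple(foo(t0, t1) + ["Y"])

def main_reordered(foo, t0, t1):
    """main after the call was sunk below the print (justified by readnone nounwind)."""
    return tuple(["Y"] + foo(t0, t1))

def behaviors(main, foo):
    """Every output trace over all value pairs the two loads could observe."""
    return {main(foo, t0, t1) for t0, t1 in product((0, 1), repeat=2)}

source = behaviors(main_before, foo_original)       # the program as written
optimized = behaviors(main_reordered, foo_refined)  # what opt produced
relinked = behaviors(main_reordered, foo_original)  # optimized main + de-refined foo

print(optimized <= source)  # True: the optimized program only refines the source
print(relinked <= source)   # False: ("Y", "X") was never a legal trace
```

The subset checks are the point of the sketch: refinement keeps the behavior set inside the original one, while re-linking against a less refined foo escapes it.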

Given this, I'd say we don't need a new attribute / linkage type, and
can add our restriction to the available_externally linkage.

> describing it because it seems deeply overlapping with linkage, which is
> where the whole interposition thing comes from, and yet you never mention
> how it interacts with linkage at all. What does it mean to have a common
> linkage function that lacks the interposable attribute? Or a LinkOnceODR
> function that does have that attribute?

What would you say about adding this as a new kind of linkage?  I was
trying to avoid doing that since the intended semantics of
GlobalValue::InterposableLinkage don't just describe what a linker
does, but also restrict what can be legally linked in (for the
can-inline-but-can't-IPA property to hold), but perhaps that's the
best way forward?

[Edit: I wrote this section before I wrote the available_externally
thing above.]

> If the goal is to factor replaceability out of linkage, we should actually
> factor it out rather than adding yet one more way to talk about this.
>
> And generally, we need to be *really* careful adding function attributes.
> Look at the challenges we had figuring out norecurse. Adding attributes
> needs to be viewed as nearly as high cost as adding instructions,
> substantially higher cost than intrinsics.

Only indirectly relevant to this discussion, but this is news to me --
my mental cost model was "attributes are easy to add and maintain", so
I didn't think too hard about alternatives.

> I think it would be really helpful to work to describe these things in terms
> of semantic contracts on the IR rather than in terms of implementation
> strategies. For example, not all IR interacts with an interpreter, and so I
> don't think we should use the term "interpreter" to specify the semantic
> model exposed by the IR.

That's what I was getting at by:

>> Chandler raised some points on IRC around making `guard_on` (and
>> possibly `side_exit`?) more generally applicable to unmanaged
>> languages; so we'd want to be careful to specify these in a way that
>> allows for implementations in unmanaged environments (by function
>> cloning, for instance).

-- Sanjoy

Chandler Carruth via llvm-dev

2016-Feb-23 18:55 UTC

On Tue, Feb 23, 2016 at 9:33 AM Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> On Mon, Feb 22, 2016 at 11:18 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
> >> # step A: Introduce an `interposable` function attribute
> >>
> >> We can bike shed on the name and the exact specification, but the
> >> general idea is that you cannot do IPA / IPO over callsites calling
> >> `interposable` functions without inlining them.  This attribute will
> >> (usually) have to be used on function bodies that can deoptimize (e.g.
> >> has a side exit / guard in it); but also has more general use cases.
> >
> > Note that we already have this *exact* concept in the IR via linkage for
> > better or worse. I think it is really confusing as you are currently
>
> I was going to have a more detailed discussion on this in the (yet to
> be started) review thread for `interposable`: we'd like to be able to
> inline `interposable` functions.  The "interposition" can only happen
> at physical function boundaries, so opt is allowed to do as much
> IPA/IPO as it wants once it makes the physical function boundary go away
> via inlining.  None of the linkage types seem to have this property.
>
> Part of the challenge here is to specify the attribute in a way that
> allows inlining, but not IPA without inlining.  In fact, maybe it is
> best to not call it "interposable" at all?
>
Yea, this is something *very* different from interposable. GCC and other
compilers that work to support symbol interposition make specific efforts
to not inline them in specific ways (that frankly I don't fully understand,
as it doesn't seem to be always which is what the definition of
interposable indicates to me...).

>
> Actually, I think one of the problems we're trying to solve with
> `interposable` is applicable to the available_externally linkage as
> well.  Say we have
>
> ```
> void foo() available_externally {
>   %t0 = load atomic %ptr
>   %t1 = load atomic %ptr
>   if (%t0 != %t1) print("X");
> }
> void main() {
>   foo();
>   print("Y");
> }
> ```
>
> Now the possible behaviors of the above program are {print("X"),
> print("Y")} or {print("Y")}.  But if we run opt then we have
>
> ```
> void foo() available_externally readnone nounwind {
>   ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
>   foo();
>   print("Y");
> }
> ```
>
> and some generic reordering
>
> ```
> void foo() available_externally readnone nounwind {
>   ;; After CSE'ing the two loads and folding the condition
> }
> void main() {
>   print("Y");
>   foo();  // legal since we're moving a readnone nounwind function that
>           // was guaranteed to execute (hence can't have UB)
> }
> ```
>
> Now if we do not inline @foo(), and instead re-link the call site in
> @main to some non-optimized copy (or differently optimized copy) of
> foo, then it is possible for the program to have the behavior
> {print("Y"); print("X")}, which was disallowed in the earlier
> program.
>
> In other words, opt refined the semantics of @foo() (i.e. reduced the
> set of behaviors it may have) in ways that would make later
> optimizations invalid if we de-refine the implementation of @foo().
>
> Given this, I'd say we don't need a new attribute / linkage type, and
> can add our restriction to the available_externally linkage.
>
Interesting example, I agree it seems quite broken. Even more interesting,
I can't see anything we do in LLVM that prevents this from breaking
essentially everywhere. =[[[[[[

link_once and link_once_odr at least seem equally broken because we don't
put the caller and callee into a single comdat or anything to ensure that
the optimized one is selected at link time.

But there are also multiple different kinds of overriding we should think
about:

1) Can the definition get replaced at link time (or at runtime via an
interpreter) with a differently *optimized* variant stemming from the same
definition (thus it has the same behavior but not the same refinement).
This is the "ODR" guarantee in some linkages (and vaguely implied for
available_externally)

2) Can the definition get replaced at link time (or at runtime via an
interpreter) with a function that has fundamentally different behavior

3) To support replacing the definition, the call edge must be preserved.

To support interposition you need #3, the most restrictive model. LLVM (I
think) actually does a decent job of modeling this as we say that the
function is totally opaque. We don't do IPA or inlining. But I don't think
that's what you're looking for.

I'm curious whether your use case is actually in the #1 bucket or #2
bucket. That is, I'm wondering if there is any way in which the
"different implementation" would actually break in the face of
optimizations on things like *non-deduced* function attributes, etc.

If your use case looks more like #1, then I actually think this is what we
want for link_once_odr and available_externally. You probably want the
former rather than the latter as you don't want it to be discardable.

If your use case looks more like #2, then I think it's essentially
"link_once" or "link_any", and it isn't clear that LLVM does a great job of
modeling this today.

I'd be mildly interested in factoring the discarding semantics from the
"what do other definitions look like" semantics. The former are what I
think fit cleanly into linkages, and the latter I think we wedged into them
because they seemed to correspond in some cases and because attributes used
to be very limited in number.

Sanjoy Das via llvm-dev

2016-Feb-23 23:06 UTC

On Tue, Feb 23, 2016 at 10:55 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
>> Part of the challenge here is to specify the attribute in a way that
>> allows inlining, but not IPA without inlining.  In fact, maybe it is
>> best to not call it "interposable" at all?
>
> Yea, this is something *very* different from interposable. GCC and other
> compilers that work to support symbol interposition make specific efforts to
> not inline them in specific ways (that frankly I don't fully understand, as
> it doesn't seem to be always which is what the definition of interposable
> indicates to me...).

Sure, not calling it interposable is fine for me.  Credit where credit
is due: Philip had warned me about this exact thing offline (that the
term "interposable" is already taken).

>> In other words, opt refined the semantics of @foo() (i.e. reduced the
>> set of behaviors it may have) in ways that would make later
>> optimizations invalid if we de-refine the implementation of @foo().
>>
>> Given this, I'd say we don't need a new attribute / linkage type, and
>> can add our restriction to the available_externally linkage.
>
> Interesting example, I agree it seems quite broken. Even more interesting, I
> can't see anything we do in LLVM that prevents this from breaking
> essentially everywhere. =[[[[[[
>
> link_once and link_once_odr at least seem equally broken because we don't
> put the caller and callee into a single comdat or anything to ensure that
> the optimized one is selected at link time.
>
> But there are also multiple different kinds of overriding we should think
> about:
>
> 1) Can the definition get replaced at link time (or at runtime via an
> interpreter) with a differently *optimized* variant stemming from the same
> definition (thus it has the same behavior but not the same refinement). This
> is the "ODR" guarantee in some linkages (and vaguely implied for
> available_externally)
>
> 2) Can the definition get replaced at link time (or at runtime via an
> interpreter) with a function that has fundamentally different behavior
>
> 3) To support replacing the definition, the call edge must be preserved.

I'm working in the context of an optimizer that does not know if its
input has been previously optimized or if its input is "raw" IR.
Realistically, I'd say deviating LLVM from this will be painful.
Given that, I don't see how (2) and (3) are different:

Firstly, (1) and (2) are not _that_ different -- a differently
optimized variant of a function can have completely different
observable behavior (e.g. the "original" function could have started
with "if (*ptr != *ptr) { call @unknown(); return; }").  The only
practical difference I can see between (1) and (2) is that in (2)
inlining is incorrect since it would be retroactively invalid on
replacement.  In (1) we have the invariant that the function in
question is always *a* valid implementation of what we started with,
but this can not be used to infer anything about the function we'll
actually call at runtime.  Thus, I don't understand the difference
between (2) and (3); both of them seem to imply "don't do IPA/IPO,
including inlining" while (1) implies "the only IPA/IPO you can do is
inlining".
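
To make that reading concrete, here is a minimal sketch that encodes it as a lookup. It reflects only the interpretation in the paragraph above, not anything LLVM implements; `Replacement`, `Policy`, and `policy_for` are hypothetical names invented for the illustration.

```python
from enum import Enum
from typing import NamedTuple

class Replacement(Enum):
    DIFFERENTLY_OPTIMIZED = 1  # case (1): same source definition, other refinement
    DIFFERENT_BEHAVIOR = 2     # case (2): fundamentally different function
    CALL_EDGE_PRESERVED = 3    # case (3): the call edge itself must survive

class Policy(NamedTuple):
    may_inline: bool
    may_ipa_without_inlining: bool

def policy_for(kind: Replacement) -> Policy:
    """What the optimizer may do at a call site, per the reading above."""
    if kind is Replacement.DIFFERENTLY_OPTIMIZED:
        # The visible body is *a* valid implementation, so inlining it is fine,
        # but nothing may be inferred about the body actually called at runtime.
        return Policy(may_inline=True, may_ipa_without_inlining=False)
    # Cases (2) and (3) collapse: no IPA/IPO at all, including inlining.
    return Policy(may_inline=False, may_ipa_without_inlining=False)

for kind in Replacement:
    print(kind.name, policy_for(kind))
```

Under this encoding (2) and (3) return the same policy, which is exactly the point of confusion raised above.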

> I'm curious whether your use case is actually in the #1 bucket or #2
> bucket. That is, I'm wondering if there is any way in which the
> "different implementation" would actually break in the face of
> optimizations on things like *non-deduced* function attributes, etc.

With the understanding I have at this time (that isn't complete, as I
say above) I'd say we're (1).  We can replace a possibly inlined
callee with another arbitrary function, but if that happens the runtime
will deoptimize the caller.  I'm not sure if I understood your second
statement -- but assuming I did -- we do "manually" attach attributes to
some well-known functions (e.g. in the standard library), but they never
get replaced.
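
As a rough illustration of "replace the callee, deoptimize the caller", the bookkeeping could look something like the sketch below. This is an assumption about how such a runtime might track inlining decisions, not a description of any actual managed runtime; the `Runtime` class and its methods are hypothetical.

```python
from collections import defaultdict

class Runtime:
    """Toy model: remembers which compiled callers inlined which callees."""

    def __init__(self):
        self.inlined_into = defaultdict(set)  # callee -> callers that inlined it
        self.compiled = set()                 # callers running optimized code

    def compile_with_inlining(self, caller, callees):
        """Record that `caller` was compiled with each of `callees` inlined."""
        for callee in callees:
            self.inlined_into[callee].add(caller)
        self.compiled.add(caller)

    def replace(self, callee):
        """Swap in a new definition of `callee`; deoptimize dependent callers."""
        invalidated = self.inlined_into.pop(callee, set()) & self.compiled
        self.compiled -= invalidated  # these fall back to unoptimized code
        return invalidated

rt = Runtime()
rt.compile_with_inlining("main", ["foo"])
rt.compile_with_inlining("other", [])
deopted = rt.replace("foo")
print(deopted)      # {'main'}
print(rt.compiled)  # {'other'}
```

Only callers that actually inlined the replaced callee lose their optimized code, which is why inlining stays legal in this model even though the callee can change.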

-- Sanjoy
