thr3ads.net - llvm dev - [LLVMdev] lifetime.start/end clarification [Nov 2014]

If this information is useful, please help other people find it:
Share via:

Reid Kleckner

2014-Nov-05 18:54 UTC

[LLVMdev] lifetime.start/end clarification

>
> This seems fine to me.  The optimizer can (soundly) conclude that %p is
> dead after the "lifetime.end" (for the two instructions), and
dead before
> the "lifetime.start" (for the *single* instruction in that basic
block,
> *not* for the previous BB).  This seems like the proper result for this
> example, am I missing something?
>
What if I put that in a loop, unroll it once, and prove that the
lifetime.start is unreachable? We would end up with IR like:

loop:
  ... use %p
  call void @lifetime.end( %p )
  ... use %p
  call void @lifetime.end( %p )
  br i1 %c, label %loop, label %exit

Are the second uses of %p uses of dead memory?

We have similar issues if the optimizer somehow removes the lifetime end
and keeps the start:

loop:
  call void @lifetime.start( %p )
  ... use %p
  call void @lifetime.start( %p )
  ... use %p
  br i1 %c, label %loop, label %exit

For this reason, it has been suggested that these intrinsics are horribly
broken, and both should be remodeled to just mean "store of undef bytes to
this memory". If "use %p" is a load, for example, in both cases
we can
safely say it returns undef, because it's a use-after-scope.

I think coming up with a new representation with simpler semantics is the
way to go. One allocation or lifetime start, and one deallocation and end.

Implementing this in Clang will be tricky, though. Clang's IRGen is
supposed to be a dumb AST walk, but it has already strayed from that path.
Needs more thought...
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141105/e7736105/attachment.html>

Philip Reames

2014-Nov-05 19:17 UTC

head link

[LLVMdev] lifetime.start/end clarification

On 11/05/2014 10:54 AM, Reid Kleckner wrote:>
>     This seems fine to me.  The optimizer can (soundly) conclude that
>     %p is dead after the "lifetime.end" (for the two
instructions),
>     and dead before the "lifetime.start" (for the *single*
instruction
>     in that basic block, *not* for the previous BB).  This seems like
>     the proper result for this example, am I missing something?
>
>
> What if I put that in a loop, unroll it once, and prove that the 
> lifetime.start is unreachable? We would end up with IR like:
>
> loop:
>   ... use %p
>   call void @lifetime.end( %p )
>   ... use %p
>   call void @lifetime.end( %p )
>   br i1 %c, label %loop, label %exit
>
> Are the second uses of %p uses of dead memory?It's hard to discuss this without being specific about the starting IR 
and transforms.  My general response is that either a) such a transform 
wouldn't be valid or b) the behaviour of the original program was 
undefined.

In the particular final IR result you gave, the second use of %p would 
be dead.  Moreover, if the optimizer can prove this loop iterates at 
least once, *all access* to %p in the loop is dead. This is exactly the 
semantics you want.>
> We have similar issues if the optimizer somehow removes the lifetime 
> end and keeps the start:
>
> loop:
>   call void @lifetime.start( %p )
>   ... use %p
>   call void @lifetime.start( %p )
>   ... use %p
>   br i1 %c, label %loop, label %exitBut we can't just drop arbitrary calls.  Doing so is
unsound.>
> For this reason, it has been suggested that these intrinsics are 
> horribly broken, and both should be remodeled to just mean "store of 
> undef bytes to this memory". If "use %p" is a load, for
example, in
> both cases we can safely say it returns undef, because it's a 
> use-after-scope.I'm not actually intrinsically opposed to just an alternative.  I would 
like to see a concrete example justifying the proposal though.  Nothing 
said in this thread to date warrants redefining these intrinsics.

Worth stating explicitly: your proposed semantics are strictly *less 
powerful* than the current ones.  They may be "simpler", but you loose
optimization possibilities.>
> I think coming up with a new representation with simpler semantics is 
> the way to go. One allocation or lifetime start, and one deallocation 
> and end.At least so far, I disagree.  I am open to being convinced though.
:)>
> Implementing this in Clang will be tricky, though. Clang's IRGen is 
> supposed to be a dumb AST walk, but it has already strayed from that 
> path. Needs more thought...Do you have motivating cases other than a goto into a scoped region?  
That seems like a fairly straight forward "special case". How far
would
you get by special casing a few cases that matter and leaving the 
general problem unsolved?  At worst, you're missing an optimization here.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141105/ffb4dd4b/attachment.html>

Hal Finkel

2014-Nov-05 19:51 UTC

head link

[LLVMdev] lifetime.start/end clarification

----- Original Message -----> From: "Reid Kleckner" <rnk at google.com>
> To: "Philip Reames" <listmail at philipreames.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, November 5, 2014 12:54:30 PM
> Subject: Re: [LLVMdev] lifetime.start/end clarification
> 
> This seems fine to me. The optimizer can (soundly) conclude that %p
> is dead after the "lifetime.end" (for the two instructions), and
> dead before the "lifetime.start" (for the *single* instruction in
> that basic block, *not* for the previous BB). This seems like the
> proper result for this example, am I missing something?
> 
> 
> What if I put that in a loop, unroll it once, and prove that the
> lifetime.start is unreachable? We would end up with IR like:
> 
> 
> loop:
> ... use %p
> call void @lifetime.end( %p )
> 
> ... use %p
> call void @lifetime.end( %p )
> br i1 %c, label %loop, label %exit
> 
> 
> Are the second uses of %p uses of dead memory?
> 
> 
> We have similar issues if the optimizer somehow removes the lifetime
> end and keeps the start:
> 
> 
> 
> loop:
> call void @lifetime.start( %p )
> 
> ... use %p
> call void @lifetime.start( %p )
> 
> 
> ... use %p
> br i1 %c, label %loop, label %exit
> 
> 
> For this reason, it has been suggested that these intrinsics are
> horribly broken,
I disagree, these just seem like bugs. lifetime_start are marked as
IntrReadWriteArgMem, but this is not really sufficient to prevent their removal
should the memory be subsequently unused. Plus there are other places that just
delete the lifetime intrinsics, like this in lib/Transforms/Scalar/SROA.cpp:

      // FIXME: Currently the SSAUpdater infrastructure doesn't reason about
      // lifetime intrinsics and so we strip them (and the bitcasts+GEPs
      // leading to them) here. Eventually it should use them to optimize the
      // scalar values produced.
      if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
        assert(II->getIntrinsicID() == Intrinsic::lifetime_start ||
               II->getIntrinsicID() == Intrinsic::lifetime_end);
        II->eraseFromParent();
        continue;
      }

we need to go through the various places that might delete these intrinsics and
fix them. The same will be true with any other mechanism.
> and both should be remodeled to just mean "store of
> undef bytes to this memory". 
This is a bad idea. Stores of undef bytes can be removed if we can prove that
the address is dereferenceable. And if they can't be removed, then they have
side effects that can't ever be removed. Please don't do that.

 -Hal
> If "use %p" is a load, for example, in
> both cases we can safely say it returns undef, because it's a
> use-after-scope.
> 
> 
> I think coming up with a new representation with simpler semantics is
> the way to go. One allocation or lifetime start, and one
> deallocation and end.
> 
> 
> Implementing this in Clang will be tricky, though. Clang's IRGen is
> supposed to be a dumb AST walk, but it has already strayed from that
> path. Needs more thought...
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Nick Lewycky

2014-Nov-05 20:39 UTC

head link

[LLVMdev] lifetime.start/end clarification

On 5 November 2014 11:51, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Reid Kleckner" <rnk at google.com>
> > To: "Philip Reames" <listmail at philipreames.com>
> > Cc: "LLVM Developers Mailing List" <llvmdev at
cs.uiuc.edu>
> > Sent: Wednesday, November 5, 2014 12:54:30 PM
> > Subject: Re: [LLVMdev] lifetime.start/end clarification
> >
> > This seems fine to me. The optimizer can (soundly) conclude that %p
> > is dead after the "lifetime.end" (for the two instructions),
and
> > dead before the "lifetime.start" (for the *single*
instruction in
> > that basic block, *not* for the previous BB). This seems like the
> > proper result for this example, am I missing something?
> >
> >
> > What if I put that in a loop, unroll it once, and prove that the
> > lifetime.start is unreachable? We would end up with IR like:
> >
> >
> > loop:
> > ... use %p
> > call void @lifetime.end( %p )
> >
> > ... use %p
> > call void @lifetime.end( %p )
> > br i1 %c, label %loop, label %exit
> >
> >
> > Are the second uses of %p uses of dead memory?
> >
> >
> > We have similar issues if the optimizer somehow removes the lifetime
> > end and keeps the start:
> >
> >
> >
> > loop:
> > call void @lifetime.start( %p )
> >
> > ... use %p
> > call void @lifetime.start( %p )
> >
> >
> > ... use %p
> > br i1 %c, label %loop, label %exit
> >
> >
> > For this reason, it has been suggested that these intrinsics are
> > horribly broken,
>
> I disagree, these just seem like bugs. lifetime_start are marked as
> IntrReadWriteArgMem, but this is not really sufficient to prevent their
> removal should the memory be subsequently unused. Plus there are other
> places that just delete the lifetime intrinsics, like this in
> lib/Transforms/Scalar/SROA.cpp:
>
>       // FIXME: Currently the SSAUpdater infrastructure doesn't reason
> about
>       // lifetime intrinsics and so we strip them (and the bitcasts+GEPs
>       // leading to them) here. Eventually it should use them to optimize
> the
>       // scalar values produced.
>       if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
>         assert(II->getIntrinsicID() == Intrinsic::lifetime_start ||
>                II->getIntrinsicID() == Intrinsic::lifetime_end);
>         II->eraseFromParent();
>         continue;
>       }
>
> we need to go through the various places that might delete these
> intrinsics and fix them. The same will be true with any other mechanism.
>
It removes them because it does (or will) remove the associated alloca
anyways as part of turning loads and stores into SSA. There's no need for
lifetime intrinsic equivalents on SSA given that we have use-lists and
tools like the dominator tree.

> > and both should be remodeled to just mean "store of
> > undef bytes to this memory".
>
> This is a bad idea. Stores of undef bytes can be removed if we can prove
> that the address is dereferenceable. And if they can't be removed, then
> they have side effects that can't ever be removed. Please don't do
that.
>
I think the idea is to define them with the semantics of storing undef
bytes, but keep them implemented as intrinsic function calls, so that the
optimizer does not simply delete them. It's a way of communicating that
these are deliberate and valuable stores to undef, as opposed to stores of
SSA values that were later found to be undef.

>
>  -Hal
>
> > If "use %p" is a load, for example, in
> > both cases we can safely say it returns undef, because it's a
> > use-after-scope.
> >
> >
> > I think coming up with a new representation with simpler semantics is
> > the way to go. One allocation or lifetime start, and one
> > deallocation and end.
> >
> >
> > Implementing this in Clang will be tricky, though. Clang's IRGen
is
> > supposed to be a dumb AST walk, but it has already strayed from that
> > path. Needs more thought...
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141105/b2fa0d16/attachment.html>

Reid Kleckner

2014-Nov-05 21:48 UTC

head link

[LLVMdev] lifetime.start/end clarification

On Wed, Nov 5, 2014 at 11:17 AM, Philip Reames <listmail at
philipreames.com>
wrote:
>
> On 11/05/2014 10:54 AM, Reid Kleckner wrote:
>
>   This seems fine to me.  The optimizer can (soundly) conclude that %p is
>> dead after the "lifetime.end" (for the two instructions), and
dead before
>> the "lifetime.start" (for the *single* instruction in that
basic block,
>> *not* for the previous BB).  This seems like the proper result for this
>> example, am I missing something?
>>
>
>  What if I put that in a loop, unroll it once, and prove that the
> lifetime.start is unreachable? We would end up with IR like:
>
>  loop:
>   ... use %p
>   call void @lifetime.end( %p )
>    ... use %p
>   call void @lifetime.end( %p )
>    br i1 %c, label %loop, label %exit
>
>  Are the second uses of %p uses of dead memory?
>
> It's hard to discuss this without being specific about the starting IR
and
> transforms.  My general response is that either a) such a transform
> wouldn't be valid or b) the behaviour of the original program was
> undefined.
>
The starting IR would be something that jumps into the middle of a lifetime
region, like the example that Arnaud gave. This was assuming the current
state of the world where we haven't added a second lifetime start call
prior to the goto branch.

Start with something like:

void f(int x) {
  while (x) {
    goto skip_start;
    {
      int y; // lifetime.start
skip_start:
      y = g();
      x -= y;
      // lifetime.end
    }
  }
}

The block containing the lifetime start of y is unreachable and can be
deleted.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20141105/d8004b30/attachment.html>

Seemingly Similar Threads

Search for more possibly parallel threads

llvm dev - Nov 2014 - [LLVMdev] lifetime.start/end clarification

[LLVMdev] lifetime.start/end clarification

[LLVMdev] lifetime.start/end clarification

[LLVMdev] lifetime.start/end clarification

[LLVMdev] lifetime.start/end clarification

[LLVMdev] lifetime.start/end clarification

Seemingly Similar Threads