thr3ads.net - llvm dev - [LLVMdev] lifetime.start/end clarification [Nov 2014]

Nick Lewycky

2014-Nov-05 20:39 UTC

[LLVMdev] lifetime.start/end clarification

On 5 November 2014 11:51, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
> > From: "Reid Kleckner" <rnk at google.com>
> > To: "Philip Reames" <listmail at philipreames.com>
> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> > Sent: Wednesday, November 5, 2014 12:54:30 PM
> > Subject: Re: [LLVMdev] lifetime.start/end clarification
> >
> > This seems fine to me. The optimizer can (soundly) conclude that %p
> > is dead after the "lifetime.end" (for the two instructions), and
> > dead before the "lifetime.start" (for the *single* instruction in
> > that basic block, *not* for the previous BB). This seems like the
> > proper result for this example, am I missing something?
> >
> >
> > What if I put that in a loop, unroll it once, and prove that the
> > lifetime.start is unreachable? We would end up with IR like:
> >
> >
> > loop:
> > ... use %p
> > call void @lifetime.end( %p )
> >
> > ... use %p
> > call void @lifetime.end( %p )
> > br i1 %c, label %loop, label %exit
> >
> >
> > Are the second uses of %p uses of dead memory?
> >
> >
> > We have similar issues if the optimizer somehow removes the lifetime
> > end and keeps the start:
> >
> >
> >
> > loop:
> > call void @lifetime.start( %p )
> >
> > ... use %p
> > call void @lifetime.start( %p )
> >
> >
> > ... use %p
> > br i1 %c, label %loop, label %exit
> >
> >
> > For this reason, it has been suggested that these intrinsics are
> > horribly broken,
>
> I disagree, these just seem like bugs. lifetime_start are marked as
> IntrReadWriteArgMem, but this is not really sufficient to prevent their
> removal should the memory be subsequently unused. Plus there are other
> places that just delete the lifetime intrinsics, like this in
> lib/Transforms/Scalar/SROA.cpp:
>
>       // FIXME: Currently the SSAUpdater infrastructure doesn't reason about
>       // lifetime intrinsics and so we strip them (and the bitcasts+GEPs
>       // leading to them) here. Eventually it should use them to optimize the
>       // scalar values produced.
>       if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
>         assert(II->getIntrinsicID() == Intrinsic::lifetime_start ||
>                II->getIntrinsicID() == Intrinsic::lifetime_end);
>         II->eraseFromParent();
>         continue;
>       }
>
> we need to go through the various places that might delete these
> intrinsics and fix them. The same will be true with any other mechanism.
>
It removes them because it does (or will) remove the associated alloca
anyways as part of turning loads and stores into SSA. There's no need for
lifetime intrinsic equivalents on SSA given that we have use-lists and
tools like the dominator tree.

> > and both should be remodeled to just mean "store of
> > undef bytes to this memory".
>
> This is a bad idea. Stores of undef bytes can be removed if we can prove
> that the address is dereferenceable. And if they can't be removed, then
> they have side effects that can't ever be removed. Please don't do that.
>
I think the idea is to define them with the semantics of storing undef
bytes, but keep them implemented as intrinsic function calls, so that the
optimizer does not simply delete them. It's a way of communicating that
these are deliberate and valuable stores to undef, as opposed to stores of
SSA values that were later found to be undef.
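
A minimal sketch of the "store of undef bytes" reading, written as a toy C++ model rather than LLVM code. Slot, Loaded, lifetime_marker and second_use_after_end are hypothetical names made up for illustration. The sketch replays the unrolled loop from the quoted example, where only lifetime.end survives, and treats each "use %p" as a load:

```cpp
#include <cassert>

// Hypothetical toy model, not LLVM code.
// Result of a load: either a defined value or undef.
struct Loaded {
    bool defined;  // false means the load observed undef
    int value;     // meaningful only when defined is true
};

// One stack slot whose contents are either defined or undef.
struct Slot {
    bool defined = false;
    int value = 0;

    void store(int v) { defined = true; value = v; }
    Loaded load() const { return Loaded{defined, value}; }

    // Assumption under discussion: lifetime.start and lifetime.end have the
    // same effect on memory, namely overwriting the slot with undef.
    void lifetime_marker() { defined = false; }
};

// The unrolled loop body: use %p; lifetime.end; use %p; lifetime.end.
// Returns what the second use observes.
Loaded second_use_after_end(int initial) {
    Slot p;
    p.store(initial);
    Loaded first = p.load();   // first use reads the defined value
    assert(first.defined);
    p.lifetime_marker();       // lifetime.end modelled as "store undef"
    Loaded second = p.load();  // second use: nothing stored since the marker
    p.lifetime_marker();
    return second;
}
```

In this model the second load observes undef, while a store placed after a marker makes the slot readable again. The model says nothing about whether an optimizer is allowed to delete the marker, which is the other half of the disagreement.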

>
>  -Hal
>
> > If "use %p" is a load, for example, in
> > both cases we can safely say it returns undef, because it's a
> > use-after-scope.
> >
> >
> > I think coming up with a new representation with simpler semantics is
> > the way to go. One allocation or lifetime start, and one
> > deallocation and end.
> >
> >
> > Implementing this in Clang will be tricky, though. Clang's IRGen is
> > supposed to be a dumb AST walk, but it has already strayed from that
> > path. Needs more thought...
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory

Hal Finkel

2014-Nov-05 20:48 UTC

[LLVMdev] lifetime.start/end clarification

----- Original Message -----
> From: "Nick Lewycky" <nlewycky at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Reid Kleckner" <rnk at google.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, November 5, 2014 2:39:38 PM
> Subject: Re: [LLVMdev] lifetime.start/end clarification
> 
> It removes them because it does (or will) remove the associated
> alloca anyways as part of turning loads and stores into SSA. There's
> no need for lifetime intrinsic equivalents on SSA given that we have
> use-lists and tools like the dominator tree.
Good point. I did not think too carefully about what the code was doing; I
was rather pointing out that there is special-case code dealing with lifetime
intrinsics that needs to be looked at, and code that does not deal specifically
with lifetime intrinsics that may have to do so. I certainly agree that we
don't need them for SSA values.

For the code in question, I don't see why you wouldn't just RAUW the
alloca with undef and then let DCE remove the intrinsics (this is, however,
somewhat off-topic for this thread).
> 
> 
> 
> 
> > and both should be remodeled to just mean "store of
> > undef bytes to this memory".
> 
> This is a bad idea. Stores of undef bytes can be removed if we can
> prove that the address is dereferenceable. And if they can't be
> removed, then they have side effects that can't ever be removed.
> Please don't do that.
> 
> I think the idea is to define them with the semantics of storing
> undef bytes, but keep them implemented as intrinsic function calls,
> so that the optimizer does not simply delete them. It's a way of
> communicating that these are deliberate and valuable stores to
> undef, as opposed to stores of SSA values that were later found to
> be undef.
I did not get that impression, and if that is what was proposed, I don't see
how that differs, in practice, from what we have now.

Thanks again,
Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Hal Finkel

2014-Nov-05 20:55 UTC

[LLVMdev] lifetime.start/end clarification

----- Original Message -----
> From: "Hal Finkel" <hfinkel at anl.gov>
> To: "Nick Lewycky" <nlewycky at google.com>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, November 5, 2014 2:48:32 PM
> Subject: Re: [LLVMdev] lifetime.start/end clarification
> 
> Good point, I did not think too carefully about what the code was
> doing, but rather pointing out that there is special-case code
> dealing with lifetime intrinsics that needs to be looked at, and
> code that does not deal specifically with lifetime intrinsics that
> may have to do so. I certainly agree that we don't need them for SSA
> values.
> 
> For the code in question, I don't see why you wouldn't just RAUW the
> alloca with undef and then let DCE remove the intrinsics (this is,
> however, somewhat off-topic for this thread).
Eh. Storing to undef is not a good idea either, I suppose, so never mind about
this.

 -Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Nick Lewycky

2014-Nov-05 20:59 UTC

[LLVMdev] lifetime.start/end clarification

On 5 November 2014 12:48, Hal Finkel <hfinkel at anl.gov> wrote:
> > > and both should be remodeled to just mean "store of
> > > undef bytes to this memory".
> >
> > This is a bad idea. Stores of undef bytes can be removed if we can
> > prove that the address is dereferenceable. And if they can't be
> > removed, then they have side effects that can't ever be removed.
> > Please don't do that.
> >
> > I think the idea is to define them with the semantics of storing
> > undef bytes, but keep them implemented as intrinsic function calls,
> > so that the optimizer does not simply delete them. It's a way of
> > communicating that these are deliberate and valuable stores to
> > undef, as opposed to stores of SSA values that were later found to
> > be undef.
>
> I did not get that impression, and if that is what was proposed, I don't
> see how that differs, in practice, from what we have now.
>
The LangRef definition looks like that, plus some special rules about how
*all* uses before the start are dead. *The* start? What about multiple
starts? What does it mean to have start/end/start/end? Can you use an
alloca normally, then lifetime.start it? According to LangRef, no: *all*
uses before the start may be nuked. It's a weird rule, but it's intended to
support the use case of stack slot colouring, where your starts and ends
are paired and tightly wrap the point where the variable is live.

If you remove that oddity, lifetime.start and lifetime.end become
semantically equivalent: both just mean "store undef there". They become
straightforward to reason about, though harder to use for stack slot
colouring (it becomes a bidirectional data flow problem, which is hard on
compile time). At this stage, I think the tradeoff is worthwhile.
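
One way to picture why the "store undef" reading makes slot colouring a two-direction problem is the toy C++ sketch below. It is an illustration under stated assumptions, not LLVM's stack colouring: the trace format, Op, Inst, live_after and can_share are all hypothetical, the code handles only straight-line traces with no branches, and it uses a single Marker operation for both lifetime.start and lifetime.end. A slot counts as live after an instruction only if it holds defined bytes (a forward fact) and a later load reads them before the next store or marker (a backward fact).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy IR, not LLVM's: a straight-line trace of operations on
// numbered stack slots. Marker stands for either lifetime.start or
// lifetime.end under the "both just store undef" reading.
enum class Op { Marker, Store, Load };

struct Inst {
    Op op;
    int slot;
};

// live[i] is true when, just after instruction i, `slot` holds a defined
// value that some later load reads before a store or marker overwrites it.
std::vector<bool> live_after(const std::vector<Inst>& trace, int slot) {
    const std::size_t n = trace.size();
    std::vector<bool> defined(n, false), needed(n, false), live(n, false);

    // Forward pass: does the slot hold defined bytes after instruction i?
    bool def = false;
    for (std::size_t i = 0; i < n; ++i) {
        if (trace[i].slot == slot) {
            if (trace[i].op == Op::Store) def = true;
            if (trace[i].op == Op::Marker) def = false;
        }
        defined[i] = def;
    }

    // Backward pass: will the contents present after instruction i be loaded?
    bool need = false;
    for (std::size_t i = n; i-- > 0;) {
        needed[i] = need;  // demand coming from instructions after i
        if (trace[i].slot == slot) {
            // A load demands the earlier contents; a store or marker
            // overwrites them, so earlier contents are not needed.
            need = (trace[i].op == Op::Load);
        }
    }

    for (std::size_t i = 0; i < n; ++i) live[i] = defined[i] && needed[i];
    return live;
}

// In this model two slots may share storage if they are never live together.
bool can_share(const std::vector<Inst>& trace, int a, int b) {
    assert(a != b && "compare two distinct slots");
    const std::vector<bool> la = live_after(trace, a);
    const std::vector<bool> lb = live_after(trace, b);
    for (std::size_t i = 0; i < trace.size(); ++i)
        if (la[i] && lb[i]) return false;
    return true;
}
```

With tightly paired start/end markers, the live range can be read directly off the markers. Here it has to be recovered from the stores and loads, and on a real control-flow graph both passes would need to iterate to a fixed point.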

