thr3ads.net - llvm dev - [llvm-dev] lifetime.start/end [Jan 2021]

If this information is useful, please help other people find it:
Share via:

Johannes Doerfert via llvm-dev

2021-Jan-07 00:02 UTC

[llvm-dev] lifetime.start/end

On 1/6/21 4:33 PM, Juneyoung Lee wrote:>>> Stepwisely defining the semantics of instructions is a desirable
> direction
>>> IMO.
>> I'm confused. What in the proposal would prevent us from defining
>> the semantics of instructions, or force us to do it in an
"undesirable
> way"?
>
> I meant it would be great if the output state after executing an
> instruction can be described using its input state.
> (that was the meaning of 'stepwise semantics', I should have been
more
> clear about this)
>
> For example, the semantics of 'z = add x y' can be defined as
follows:
> Given an input state s, next state s' = s[z -> s(x) + s(y)]
> where s(x) is the value of x in the previous state, and s[z -> v] is a
> state with z updated to v.
>
> Another example that involves memory: the semantics of 'i = ptrtoint
p' can
> be defined as follows:
> Given an input state s, next state s' = s[i -> s(p).obj.address +
> s(p).offset]
> where obj.address is the begin address of a memory object obj pointed by p
> & offset is p's byte offset. (Imagine a pointer to the offset of
some char
> array).
> Note that ptrtoint & add can be nicely defined w.r.t the input state.
>
>
> Now, the instruction that we're facing is 'p = alloca'.
> To describe the output state after executing 'p = alloca', the
address of
> new alloca should be determined.
> If observedness is involved, we need to know the future state again. :/ We
> don't know whether the alloca is going to be observed or not without
seeing
> the future.
> This is the problem of the current lifetime intrinsics as well.
No, you mix things up here.

Nobody proposed to modify the semantics of `alloca`.
`alloca` provides you with a fresh, unobserved block of
dereferenceable memory that is implicitly freed as the stack
unwinds. That is it. No context necessary.

If you want to modify the IR, you need to argue the observable
semantics which is nothing new. That this might require more than
a peephole view of the program is also not new.

> One possible approach to resolve this is adding an 'unobserved'
flag to
> alloca instruction (similar to what was suggested by Nicolai).
> And, we can say that if alloca with 'unobserved' is used by
ptrtoint/icmp,
> it is UB.
The flag can be added, like we add other attributes. It should not
be required for any optimization we talked about though. It basically
is a way to manifest derived or given information into the IR.

Attribute deduction, as well as frontends with domain knowledge,
can add such information. The flag we discussed in phab was not even
sufficient for all the transformation examples I presented in my mail,
that is why I extended my  argument. We could still have a "noescape"
flag for allocas, but I'm not sure how useful that really is. We can
certainly deduce it and manifest it, unsure if we have domain knowledge
we can use for non-trivial cases though.

> But this makes ptrtoint/icmp make UB-raising instructions, which
> contradicts with what LLVM does.
As with other violation of attributes I would, on first though, suggest
to produce poison, not UB.

> Also, existing optimizations like loop versioning can introduce
> ptrtoint/pointer comparisons too.
Sure. I am not certain why that is a problem. I get the feeling things
are still mixed up here.

What I proposed is twofold:

1) We stop folding comparisons between different allocas if changing the 
address
    of both might be observable. Thus, if both might have their address 
"taken"/escaped,
    other than the comparisons we want to fold, we cannot proceed.

2) We define lifetime markers to mean `memset(undef)`.

The first should be sufficient for the problem you were trying to solve 
in the
first place. The second makes lifetime markers less weird. Note that 1) 
is not changing
the semantics of the IR. We basically just argue there is a bug in our 
instcombine right
now as we do not check all necessary preconditions.

>
> I see that there are other questions that I didn't answer yet, but let
me
> answer this first to limit the length of the text :)
Sure, we can split the discussion :)

~ Johannes

>
> Thanks,
> Juneyoung
>
> On Thu, Jan 7, 2021 at 3:36 AM Johannes Doerfert <johannesdoerfert at
gmail.com>
> wrote:
>
>> On 1/5/21 8:00 PM, Juneyoung Lee wrote:
>>> Hi Johannes,
>>>
>>> I read your proposal and thought about the model.
>> Cool, thanks!
>>
>>
>>> As you concerned in A3, certain programs may be valid only when
memory
>>> blocks with overlapping lifetimes have disjoint addresses.
>>> Look at this example (I'm using malloc, but alloca also works):
>>>
>>> p1 = malloc(4)
>>> p2 = malloc(4) // for brevity, assume that there as enough space
& p1 and
>>> p2 != null
>>> set<char*> s;
>>> s.insert(p1); s.insert(p2); // If the second insert did nothing, it
would
>>> be surprise to programmers
>>> work(s);
>>> free(data1)
>>> free(data2)
>>>
>>> Clearly, IR semantics should guarantee that escaped blocks are
disjoint.
>> It
>>> would be great for verification tools on LLVM IR to be able to
answer
>> that
>>> the second insert will succeed.
>> I agree, the second insert should succeed, assuming `p1 && p2`.
>> I don't think my proposal would in any way impact the program
above,
>> if anything the A3 reasoning makes sure such a program with allocas
>> is not miscompiled.
>>
>> I'm also not sure I understand what you try to argue for. Maybe
>> elaborate a bit what it is you think is bad or needs to be changed.
>>
>>
>>> The problem is that definition of escapedness is not clear at the
>> semantic
>>> level. Describing the IR semantics w.r.t. LLVM's escaped
analysis
>> function
>>> isn't something we want.
>>>
>>> Semantic definition of escapedness of a pointer seems hard, I mean
in a
>>> stepwise manner.
>>> It isn't a single instruction such as 'escape i8* ptr',
and we need to
>> look
>>> over all instructions in the function. For example, '(int)(p+1)
- (int)p'
>>> isn't semantically escaping the pointer p because the result is
1
>>> regardless of the value of p.
>>> Stepwisely defining the semantics of instructions is a desirable
>> direction
>>> IMO.
>> I'm confused. What in the proposal would prevent us from defining
>> the semantics of instructions, or force us to do it in an
"undesirable
>> way"?
>>
>>
>>> In practice, syntactically checking escapedness + nice engineering
might
>>> not break optimizations in most cases (as undef/poison did); but it
would
>>> be great to move to another level, since LLVM IR is used in so many
>> places
>>> :)
>> The property under which you can coalesce objects is simple:
>>     It is not observable.
>>
>> Now, if the address of one of the two objects you coalesce is not
>> observed, coalescing is not observable. That is a sufficient condition
>> not a necessary one. Pointer "escaping" is one step further.
If the
>> address doesn't escape it is not observed. This does not mean the
>> "semantic conditions for coalescing", e.g., for the purpose
of translation
>> validation, is supposed to be build on top of our "definition of
escaping
>> pointers". That said, we use "does not escape" as a
precondition for
>> various transformation and I'm unsure what is any different now.
The
>> entire escape argument is only used in the validity of the pointer
folding.
>> Similarly, we can fold a comparison of a noalias pointer with another
one
>> if the former does not escape (and both are dereferenced and one is
>> written for sure).
>>
>>
>>> The pointer comparison is another beast. It indeed has a few
issues, and
>>> solving it might require nontrivial solution.
>> I think the main problem of the inconsistencies and such we've seen
is
>> rooted in the erroneous folding of pointer comparisons. Cleaning up the
>> lifetime marker semantics is actually unrelated and simply not folding
>> as described in A3 should solve the issue that has been reported.
>> Nevertheless,
>> we should take a crack at lifetime markers while we are here.
>>
>>
>> ~ Johannes
>>
>>
>>
>>> Thanks,
>>> Juneyoung
>>>
>>> On Tue, Jan 5, 2021 at 9:37 AM Johannes Doerfert <
>> johannesdoerfert at gmail.com>
>>> wrote:
>>>
>>>> Hi Juneyoung,
>>>>
>>>> Happy new year :)
>>>>
>>>> After we had a lengthy discussion on phab last year, I've
tried to
>>>> summarize my thoughts,
>>>> especially given that I had some time to think about things
over the
>> break.
>>>> Still, no promises on the quality ;)
>>>>
>>>> I start with general questions I asked myself and then go on
rambling
>>>> about a potential design.
>>>>
>>>>
>>>> Q1: Is lifetime a given property or a derived one, thus is it
fixed or
>>>> modifiable?
>>>>
>>>> This is a question I asked myself a lot recently. I think it is
derived
>>>> and modifiable,
>>>> at least I hope it is. Only that would allow transformations I
would
>> want
>>>> us to do. Here are some examples:
>>>>      https://godbolt.org/z/G8obj3
>>>>      https://godbolt.org/z/obaTc
>>>>
>>>>
>>>> Q2: Is a pointer comparison, or similar use, extending the
lifetime?
>>>>
>>>> Asked differently, can we hoist a pointer comparison into a
region where
>>>> the pointer is dead?
>>>> This is an important question which we haven't discussed
much as we
>>>> assumed LICM has to work.
>>>>
>>>> The current behavior is that non-dereferencing uses are not
extending
>>>> the lifetime and are
>>>> allowed outside of "lifetime regions" (as indicated
by markers). They
>>>> will always produce valid
>>>> results. Though, we might want to think about a lifetime marker
that
>>>> spits out a new pointer
>>>> value instead of reusing the old one.
>>>>
>>>>
>>>> Q3: Can we use lifetime to argue about addresses?
>>>>
>>>> The question here is, can we fold address comparisons based on
>>>> lifetimes, or not.
>>>>
>>>> So far, we fold comparisons based on "address
information". For example,
>>>> we "know" globals,
>>>> allocas, and mallocs cannot be equal to one another. Also, two
distinct
>>>> allocations, for globals
>>>> and allocas, are considered unequal. Now, the crux is that we
have to be
>>>> consistent if we do two
>>>> comparisons, and, as of now, we are not (bug number missing).
Since the
>>>> backend (or any other place
>>>> for that matter) might coalesce allocas, their addresses might
not be
>>>> different after all. If we
>>>> already folded a comparison to "unequal" we are
doomed if we later have
>>>> a comparison that results
>>>> in "equal". (Note, this is different from aliasing
rules as they can be
>>>> stricter.)
>>>>
>>>>
>>>> Design:
>>>>
>>>> I would hope we can come up with a model that treats memory
"the same",
>>>> regardless if it is global,
>>>> stack, or heap. I want to avoid special casing one of them wrt.
lifetime
>>>> as I believe most optimizations
>>>> would apply to any of them, potentially for different reasons
and with
>>>> different gains, but nevertheless.
>>>>
>>>>
>>>> Proposal (largely based on the conversation in phab):
>>>>
>>>> A1: Lifetime is a concept that talks about memory content
*only*.
>>>> Basically, the memory content is set to
>>>>        undefined by lifetime markers. It is derived/modifiable
as it is
>>>> just an "as-is" property of the memory
>>>>        content. The lifetimes of an object, as described by
markers,
>> might
>>>> change during the compilation. They
>>>>        might become smaller if we deduce the object is not
accessed and
>>>> the memory content is not used, they
>>>>        might become larger if objects with non-overlapping
lifetimes are
>>>> coalesced. (One could see the latter as
>>>>        introducing a new object though.)
>>>>
>>>> A2: If we define lifetime as above, it has nothing to do with
the
>>>> address of an object. Consequently, pointer
>>>>        comparisons and similar operations are valid outside the
lifetime.
>>>> Loads and stores are as well, they can
>>>>        even not be removed "easily". A store followed
by a lifetime
>> marker
>>>> or a load following a lifetime marker
>>>>        is dead or results in undef respectively.
>>>>
>>>> A3: We could not use lifetime to argue about addresses. This
means we
>>>> could/should also not argue that overlapping
>>>>        lifetimes result in different addresses. Thus, a
comparison
>> between
>>>> the address of two allocas could not
>>>>        immediately be folded. That said, there would be special
cases
>>>> though. Basically, if one of the allocas does
>>>>        not escape, other than the comparisons to be folded, we
can fold
>>>> them. Afterwards, coalescing or splitting
>>>>        would still be consistent because it is unobservable.
The problem
>>>> we have in-tree is that we fold even though
>>>>        the address is still observed (after the fold). It is
still
>> unclear
>>>> to me what the impact of this would be
>>>>        on real code. I suggested before that we run some
experiments
>> first
>>>> before we make any decision whatsoever.
>>>>
>>>> This is pretty much saying that lifetime markers are
`memset(undef)`, as
>>>> you suggested before (I think).
>>>> There are some implementation level differences but at the end
of the
>>>> day they are basically the same.
>>>>
>>>> Happy to hear some thoughts on this, especially if it fixes the
problems
>>>> that lead to D93376 in the first place.
>>>>
>>>> ~ Johannes
>>>>
>>>>
>>>> On 12/18/20 2:42 AM, Juneyoung Lee via llvm-dev wrote:
>>>>> Hello all,
>>>>>
>>>>> We're discussing the well-formedness of
lifetime.start/end intrinsic
>>>> here (
>>>>> https://reviews.llvm.org/D93376), deciding what is a
(syntactically &
>>>>> semantically) valid use of these intrinsics.
>>>>>
>>>>> I'd like to gather more context about the intrinsics.
>>>>>
>>>>> 1. Is there a frontend or framework that introduces
lifetime call with
>>>>> non-stack allocated objects?
>>>>> If exists, which behavior do they expect from it?
>>>>>
>>>>> 2. Can a block's lifetime be started bytewise or
elementwise?
>>>>> I imagine an optimization that allow using stack very
compactly, but
>>>> wonder
>>>>> there is a real-world use case.
>>>>>
>>>>> Thanks,
>>>>> Juneyoung
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

Juneyoung Lee via llvm-dev

2021-Jan-07 22:55 UTC

head link

[llvm-dev] lifetime.start/end

> No, you mix things up here.
>
> Nobody proposed to modify the semantics of `alloca`.
> `alloca` provides you with a fresh, unobserved block of
> dereferenceable memory that is implicitly freed as the stack
> unwinds. That is it. No context necessary.
It is, alloca must have its integral address decided at its *allocation*
time as well, like 0xFF00... .

For example, if 0xFFFF0000 ~ 0xFF000000 is already allocated, the new
alloca cannot be placed there unless it is not observed in the future,
according to your proposal.

But how do we know from the current state whether the stack variable is
going to be observed or not?
It is like writing an interpreter for IR; we cannot write an interpreter
that relies on the future memory state to make the current step.

Making icmp/ptrtoint yield poison will still make loop versioning or
pointer rewriting transformations unsound because these operations now can
create poison (even if pointers are noundef).
> What I proposed is twofold:
>
> 1) We stop folding comparisons between different allocas if changing the
> address
>     of both might be observable. Thus, if both might have their address
> "taken"/escaped,
>     other than the comparisons we want to fold, we cannot proceed.
>
> 2) We define lifetime markers to mean `memset(undef)`.
2) is fine, I think the suggestion semantically makes sense perfectly. 1)
is something I'm concerned about now.

There are more than pointer foldings, such as rewriting such expression,
code motion ptr cmp, introduce ptr cmp, etc. There is also analysis relying
on ptr cmp.
Defining the correctness of each of them is something we want to avoid, and
maybe that's why we want to define precise semantics for things.


If the sudden textual change at lifetime intrinsics is a concern, maybe we
can have a migration time and make the Verifier raise a warning if lifetime
is used with an unknown pointer.
I think this will be less aggressive and may give nice feedback to
potential projects that are using lifetime with non-alloca.
At the end, we can implement IR writer that lowers lifetime with non-alloca
into memset(undef). WDYT?

Thanks,
Juneyoung

p.s. The reply was late, sorry. I think I can spend more time on this today.

On Thu, Jan 7, 2021 at 9:02 AM Johannes Doerfert <johannesdoerfert at
gmail.com>
wrote:
>
> On 1/6/21 4:33 PM, Juneyoung Lee wrote:
> >>> Stepwisely defining the semantics of instructions is a
desirable
> > direction
> >>> IMO.
> >> I'm confused. What in the proposal would prevent us from
defining
> >> the semantics of instructions, or force us to do it in an
"undesirable
> > way"?
> >
> > I meant it would be great if the output state after executing an
> > instruction can be described using its input state.
> > (that was the meaning of 'stepwise semantics', I should have
been more
> > clear about this)
> >
> > For example, the semantics of 'z = add x y' can be defined as
follows:
> > Given an input state s, next state s' = s[z -> s(x) + s(y)]
> > where s(x) is the value of x in the previous state, and s[z -> v]
is a
> > state with z updated to v.
> >
> > Another example that involves memory: the semantics of 'i =
ptrtoint p'
> can
> > be defined as follows:
> > Given an input state s, next state s' = s[i -> s(p).obj.address
+
> > s(p).offset]
> > where obj.address is the begin address of a memory object obj pointed
by
> p
> > & offset is p's byte offset. (Imagine a pointer to the offset
of some
> char
> > array).
> > Note that ptrtoint & add can be nicely defined w.r.t the input
state.
> >
> >
> > Now, the instruction that we're facing is 'p = alloca'.
> > To describe the output state after executing 'p = alloca', the
address of
> > new alloca should be determined.
> > If observedness is involved, we need to know the future state again.
:/
> We
> > don't know whether the alloca is going to be observed or not
without
> seeing
> > the future.
> > This is the problem of the current lifetime intrinsics as well.
>
> No, you mix things up here.
>
> Nobody proposed to modify the semantics of `alloca`.
> `alloca` provides you with a fresh, unobserved block of
> dereferenceable memory that is implicitly freed as the stack
> unwinds. That is it. No context necessary.
>
> If you want to modify the IR, you need to argue the observable
> semantics which is nothing new. That this might require more than
> a peephole view of the program is also not new.
>
>
> > One possible approach to resolve this is adding an
'unobserved' flag to
> > alloca instruction (similar to what was suggested by Nicolai).
> > And, we can say that if alloca with 'unobserved' is used by
> ptrtoint/icmp,
> > it is UB.
>
> The flag can be added, like we add other attributes. It should not
> be required for any optimization we talked about though. It basically
> is a way to manifest derived or given information into the IR.
>
> Attribute deduction, as well as frontends with domain knowledge,
> can add such information. The flag we discussed in phab was not even
> sufficient for all the transformation examples I presented in my mail,
> that is why I extended my  argument. We could still have a
"noescape"
> flag for allocas, but I'm not sure how useful that really is. We can
> certainly deduce it and manifest it, unsure if we have domain knowledge
> we can use for non-trivial cases though.
>
>
> > But this makes ptrtoint/icmp make UB-raising instructions, which
> > contradicts with what LLVM does.
>
> As with other violation of attributes I would, on first though, suggest
> to produce poison, not UB.
>
>
> > Also, existing optimizations like loop versioning can introduce
> > ptrtoint/pointer comparisons too.
>
> Sure. I am not certain why that is a problem. I get the feeling things
> are still mixed up here.
>
> What I proposed is twofold:
>
> 1) We stop folding comparisons between different allocas if changing the
> address
>     of both might be observable. Thus, if both might have their address
> "taken"/escaped,
>     other than the comparisons we want to fold, we cannot proceed.
>
> 2) We define lifetime markers to mean `memset(undef)`.
>
> The first should be sufficient for the problem you were trying to solve
> in the
> first place. The second makes lifetime markers less weird. Note that 1)
> is not changing
> the semantics of the IR. We basically just argue there is a bug in our
> instcombine right
> now as we do not check all necessary preconditions.
>
>
> >
> > I see that there are other questions that I didn't answer yet, but
let me
> > answer this first to limit the length of the text :)
>
> Sure, we can split the discussion :)
>
> ~ Johannes
>
>
> >
> > Thanks,
> > Juneyoung
> >
> > On Thu, Jan 7, 2021 at 3:36 AM Johannes Doerfert <
> johannesdoerfert at gmail.com>
> > wrote:
> >
> >> On 1/5/21 8:00 PM, Juneyoung Lee wrote:
> >>> Hi Johannes,
> >>>
> >>> I read your proposal and thought about the model.
> >> Cool, thanks!
> >>
> >>
> >>> As you concerned in A3, certain programs may be valid only
when memory
> >>> blocks with overlapping lifetimes have disjoint addresses.
> >>> Look at this example (I'm using malloc, but alloca also
works):
> >>>
> >>> p1 = malloc(4)
> >>> p2 = malloc(4) // for brevity, assume that there as enough
space & p1
> and
> >>> p2 != null
> >>> set<char*> s;
> >>> s.insert(p1); s.insert(p2); // If the second insert did
nothing, it
> would
> >>> be surprise to programmers
> >>> work(s);
> >>> free(data1)
> >>> free(data2)
> >>>
> >>> Clearly, IR semantics should guarantee that escaped blocks are
> disjoint.
> >> It
> >>> would be great for verification tools on LLVM IR to be able to
answer
> >> that
> >>> the second insert will succeed.
> >> I agree, the second insert should succeed, assuming `p1 &&
p2`.
> >> I don't think my proposal would in any way impact the program
above,
> >> if anything the A3 reasoning makes sure such a program with
allocas
> >> is not miscompiled.
> >>
> >> I'm also not sure I understand what you try to argue for.
Maybe
> >> elaborate a bit what it is you think is bad or needs to be
changed.
> >>
> >>
> >>> The problem is that definition of escapedness is not clear at
the
> >> semantic
> >>> level. Describing the IR semantics w.r.t. LLVM's escaped
analysis
> >> function
> >>> isn't something we want.
> >>>
> >>> Semantic definition of escapedness of a pointer seems hard, I
mean in a
> >>> stepwise manner.
> >>> It isn't a single instruction such as 'escape i8*
ptr', and we need to
> >> look
> >>> over all instructions in the function. For example,
'(int)(p+1) -
> (int)p'
> >>> isn't semantically escaping the pointer p because the
result is 1
> >>> regardless of the value of p.
> >>> Stepwisely defining the semantics of instructions is a
desirable
> >> direction
> >>> IMO.
> >> I'm confused. What in the proposal would prevent us from
defining
> >> the semantics of instructions, or force us to do it in an
"undesirable
> >> way"?
> >>
> >>
> >>> In practice, syntactically checking escapedness + nice
engineering
> might
> >>> not break optimizations in most cases (as undef/poison did);
but it
> would
> >>> be great to move to another level, since LLVM IR is used in so
many
> >> places
> >>> :)
> >> The property under which you can coalesce objects is simple:
> >>     It is not observable.
> >>
> >> Now, if the address of one of the two objects you coalesce is not
> >> observed, coalescing is not observable. That is a sufficient
condition
> >> not a necessary one. Pointer "escaping" is one step
further. If the
> >> address doesn't escape it is not observed. This does not mean
the
> >> "semantic conditions for coalescing", e.g., for the
purpose of
> translation
> >> validation, is supposed to be build on top of our "definition
of
> escaping
> >> pointers". That said, we use "does not escape" as a
precondition for
> >> various transformation and I'm unsure what is any different
now. The
> >> entire escape argument is only used in the validity of the pointer
> folding.
> >> Similarly, we can fold a comparison of a noalias pointer with
another
> one
> >> if the former does not escape (and both are dereferenced and one
is
> >> written for sure).
> >>
> >>
> >>> The pointer comparison is another beast. It indeed has a few
issues,
> and
> >>> solving it might require nontrivial solution.
> >> I think the main problem of the inconsistencies and such we've
seen is
> >> rooted in the erroneous folding of pointer comparisons. Cleaning
up the
> >> lifetime marker semantics is actually unrelated and simply not
folding
> >> as described in A3 should solve the issue that has been reported.
> >> Nevertheless,
> >> we should take a crack at lifetime markers while we are here.
> >>
> >>
> >> ~ Johannes
> >>
> >>
> >>
> >>> Thanks,
> >>> Juneyoung
> >>>
> >>> On Tue, Jan 5, 2021 at 9:37 AM Johannes Doerfert <
> >> johannesdoerfert at gmail.com>
> >>> wrote:
> >>>
> >>>> Hi Juneyoung,
> >>>>
> >>>> Happy new year :)
> >>>>
> >>>> After we had a lengthy discussion on phab last year,
I've tried to
> >>>> summarize my thoughts,
> >>>> especially given that I had some time to think about
things over the
> >> break.
> >>>> Still, no promises on the quality ;)
> >>>>
> >>>> I start with general questions I asked myself and then go
on rambling
> >>>> about a potential design.
> >>>>
> >>>>
> >>>> Q1: Is lifetime a given property or a derived one, thus is
it fixed or
> >>>> modifiable?
> >>>>
> >>>> This is a question I asked myself a lot recently. I think
it is
> derived
> >>>> and modifiable,
> >>>> at least I hope it is. Only that would allow
transformations I would
> >> want
> >>>> us to do. Here are some examples:
> >>>>      https://godbolt.org/z/G8obj3
> >>>>      https://godbolt.org/z/obaTc
> >>>>
> >>>>
> >>>> Q2: Is a pointer comparison, or similar use, extending the
lifetime?
> >>>>
> >>>> Asked differently, can we hoist a pointer comparison into
a region
> where
> >>>> the pointer is dead?
> >>>> This is an important question which we haven't
discussed much as we
> >>>> assumed LICM has to work.
> >>>>
> >>>> The current behavior is that non-dereferencing uses are
not extending
> >>>> the lifetime and are
> >>>> allowed outside of "lifetime regions" (as
indicated by markers). They
> >>>> will always produce valid
> >>>> results. Though, we might want to think about a lifetime
marker that
> >>>> spits out a new pointer
> >>>> value instead of reusing the old one.
> >>>>
> >>>>
> >>>> Q3: Can we use lifetime to argue about addresses?
> >>>>
> >>>> The question here is, can we fold address comparisons
based on
> >>>> lifetimes, or not.
> >>>>
> >>>> So far, we fold comparisons based on "address
information". For
> example,
> >>>> we "know" globals,
> >>>> allocas, and mallocs cannot be equal to one another. Also,
two
> distinct
> >>>> allocations, for globals
> >>>> and allocas, are considered unequal. Now, the crux is that
we have to
> be
> >>>> consistent if we do two
> >>>> comparisons, and, as of now, we are not (bug number
missing). Since
> the
> >>>> backend (or any other place
> >>>> for that matter) might coalesce allocas, their addresses
might not be
> >>>> different after all. If we
> >>>> already folded a comparison to "unequal" we are
doomed if we later
> have
> >>>> a comparison that results
> >>>> in "equal". (Note, this is different from
aliasing rules as they can
> be
> >>>> stricter.)
> >>>>
> >>>>
> >>>> Design:
> >>>>
> >>>> I would hope we can come up with a model that treats
memory "the
> same",
> >>>> regardless if it is global,
> >>>> stack, or heap. I want to avoid special casing one of them
wrt.
> lifetime
> >>>> as I believe most optimizations
> >>>> would apply to any of them, potentially for different
reasons and with
> >>>> different gains, but nevertheless.
> >>>>
> >>>>
> >>>> Proposal (largely based on the conversation in phab):
> >>>>
> >>>> A1: Lifetime is a concept that talks about memory content
*only*.
> >>>> Basically, the memory content is set to
> >>>>        undefined by lifetime markers. It is
derived/modifiable as it
> is
> >>>> just an "as-is" property of the memory
> >>>>        content. The lifetimes of an object, as described
by markers,
> >> might
> >>>> change during the compilation. They
> >>>>        might become smaller if we deduce the object is not
accessed
> and
> >>>> the memory content is not used, they
> >>>>        might become larger if objects with non-overlapping
lifetimes
> are
> >>>> coalesced. (One could see the latter as
> >>>>        introducing a new object though.)
> >>>>
> >>>> A2: If we define lifetime as above, it has nothing to do
with the
> >>>> address of an object. Consequently, pointer
> >>>>        comparisons and similar operations are valid
outside the
> lifetime.
> >>>> Loads and stores are as well, they can
> >>>>        even not be removed "easily". A store
followed by a lifetime
> >> marker
> >>>> or a load following a lifetime marker
> >>>>        is dead or results in undef respectively.
> >>>>
> >>>> A3: We could not use lifetime to argue about addresses.
This means we
> >>>> could/should also not argue that overlapping
> >>>>        lifetimes result in different addresses. Thus, a
comparison
> >> between
> >>>> the address of two allocas could not
> >>>>        immediately be folded. That said, there would be
special cases
> >>>> though. Basically, if one of the allocas does
> >>>>        not escape, other than the comparisons to be
folded, we can
> fold
> >>>> them. Afterwards, coalescing or splitting
> >>>>        would still be consistent because it is
unobservable. The
> problem
> >>>> we have in-tree is that we fold even though
> >>>>        the address is still observed (after the fold). It
is still
> >> unclear
> >>>> to me what the impact of this would be
> >>>>        on real code. I suggested before that we run some
experiments
> >> first
> >>>> before we make any decision whatsoever.
> >>>>
> >>>> This is pretty much saying that lifetime markers are
`memset(undef)`,
> as
> >>>> you suggested before (I think).
> >>>> There are some implementation level differences but at the
end of the
> >>>> day they are basically the same.
> >>>>
> >>>> Happy to hear some thoughts on this, especially if it
fixes the
> problems
> >>>> that lead to D93376 in the first place.
> >>>>
> >>>> ~ Johannes
> >>>>
> >>>>
> >>>> On 12/18/20 2:42 AM, Juneyoung Lee via llvm-dev wrote:
> >>>>> Hello all,
> >>>>>
> >>>>> We're discussing the well-formedness of
lifetime.start/end intrinsic
> >>>> here (
> >>>>> https://reviews.llvm.org/D93376), deciding what is a
(syntactically
> &
> >>>>> semantically) valid use of these intrinsics.
> >>>>>
> >>>>> I'd like to gather more context about the
intrinsics.
> >>>>>
> >>>>> 1. Is there a frontend or framework that introduces
lifetime call
> with
> >>>>> non-stack allocated objects?
> >>>>> If exists, which behavior do they expect from it?
> >>>>>
> >>>>> 2. Can a block's lifetime be started bytewise or
elementwise?
> >>>>> I imagine an optimization that allow using stack very
compactly, but
> >>>> wonder
> >>>>> there is a real-world use case.
> >>>>>
> >>>>> Thanks,
> >>>>> Juneyoung
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> LLVM Developers mailing list
> >>>>> llvm-dev at lists.llvm.org
> >>>>>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
>

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210108/b614b7e0/attachment-0001.html>

llvm dev - Jan 2021 - lifetime.start/end

[llvm-dev] lifetime.start/end

[llvm-dev] lifetime.start/end