thr3ads.net - llvm dev - [llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree [Mar 2021]

If this information is useful, please help other people find it:
Share via:

Philip Reames via llvm-dev

2021-Mar-17 21:22 UTC

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

TLDR: We should change the existing dereferenceability related 
attributes to imply point in time facts only, and re-infer stronger 
global dereferenceability facts where needed.


    Meta
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>

If you prefer to read proposals in a browser, you can read this email 
here
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.

This proposal greatly benefited from multiple rounds of feedback from 
Johannes, Artur, and Nick. All remaining mistakes are my own.

Johannes deserves a lot of credit for driving previous iterations on 
this design. In particular, I want to note that we've basically returned 
to something Johannes first proposed several years ago, before we had 
specified the nofree attribute family.


    The Basic Problem
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>

We have a long standing semantic problem with the way we define 
dereferenceability facts which makes it difficult to express C++ 
references, or more generally, dereferenceability on objects which may 
be freed at some point in the program. The current structure does lend 
itself well to memory which can't be freed. As discussed in detail a bit 
later, we want to seamlessly support both use cases.

The basic statement of the problem is that a piece of memory marked with 
deref(N) is assumed to remain dereferenceable indefinitely. For an 
object which can be freed, marking it as deref can enable unsound 
transformations in cases like the following:

o = deref(N) alloc();
if (c) free(o)
while(true) {
   if (c) break;
   // With the current semantics, we will hoist o.f above the loop
   v = o.f;
}

Despite this, Clang does emit the existing dereferenceable attribute in 
some problematic cases. We have observed miscompiles as a result, and 
optimizer has an assortment of hacks to try not to be too aggressive and 
miscompile too 
widely.<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#havent-we-already-solved-this>


    Haven't we already solved this?
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>

This has been discussed relatively extensively in the past, included an 
accepted review (https://reviews.llvm.org/D61652 
<https://reviews.llvm.org/D61652>) which proposed splitting the 
dereferenceable attribute into two to adress this. However, this change 
never landed and recent findings reveal that we both need a broader 
solution, and have an interesting oppurtunity to take advantage of other 
recent work.

The need for a broader solution comes from the observation that deref(N) 
is not the only attribute with this problem. deref_or_null(N) is a 
fairly obvious case we'd known about with the previous proposal, but it 
was recently realized that other allocation related facts have this 
problem as well. We now have specific examples with allocsize(N,M) - and 
the baked in variants in MemoryBuiltins - and suspect there are other 
attributes, either current or future, with the same challenge.

The opportunity comes from the addition of "nofree" attribute. Up
until
recently, we really didn't have a good notion of "free"ing an
allocation
in the abstract machine model. We used to comingle this with our notion 
of capture. (i.e. We'd assume that functions which could free must also 
capture.) With the explicit notion of "nofree", we have an approach 
available to us we didn't before.


    The Proposal Itself
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>

The basic idea is that we're going to redefine the currently globally 
scoped attributes (deref, deref_or_null, and allocsize) such that they 
imply a point in time fact only and then combine that with nofree to 
recover the previous global semantics.

More specifically:

  * A deref attribute on a function parameter will imply that the memory
    is dereferenceable for a specified number of bytes at the instant
    the function call occurs.
  * A deref attribute on a function return will imply that the memory is
    dereferenceable at the moment of return.

We will then use the point in time fact combined with other information 
to drive inference of the global facts. While in principle we may loose 
optimization potential, we believe this is sufficient to infer the 
global facts in all practical cases we care about.

Sample inference cases:

  * A deref(N) argument to a function with the nofree and nosync
    function attribute is known to be globally dereferenceable within
    the scope of the function call. We need the nosync to ensure that no
    other thread is freeing the memory on behalf of the callee in a
    coordinated manner.
  * An argument with the attributes deref(N), noalias, and nofree is
    known to be globally dereferenceable within the scope of the
    function call. This relies on the fact that free is modeled as
    writing to the memory freed, and thus noalias ensures there is no
    other argument which can be freed. (See discussion below.)
  * A memory allocation in a function with a garbage collector which
    guarantees collection occurs only at explicit safepoints and uses
    the gc.statepoint infrastructure, is known to be globally
    dereferenceable if there are no calls to gc.statepoint anywhere in
    the module. This effectively refines the abstract machine model used
    for garbage collection before lowering by RS4GC to disallow explicit
    deallocation (for collectors which opt in).

The items above are described in terms of deref(N) for ease of 
description. The other attributes are handle analogously.

*Explanation*

The "deref(N), noalias, + nofree" argument case requires a bit of 
explanation as it involves a bunch of subtleties.

First, the current wording of nofree argument attribute implies that the 
callee can not arrange for another thread to free the object on it's 
behalf. This is different than the specification of the nofree function 
attribute. There is no "nosync" equivalent for function attributes.

Second, the noalias argument attribute is subtle. There's a couple of 
sub-cases worth discussing:

  * If the noalias argument is written to (reminder: free is modeled as
    a write), then it must be the only copy of the pointer passed to the
    function and there can be no copies passed through memory used in
    the scope of function.
  * If the noalias argument is only read from, then there may be other
    copies of the pointer. However, all of those copies must also be
    read only. If the object was freed through one of those other
    copies, then we must have at least one writeable copy and having the
    noalias on the read copy was undefined behavior to begin with.

Essentially, what we're doing with noalias is using it to promote a fact 
about the pointer to a fact about the object being pointed to. Code 
structure wise, we should probably write it exactly that way.

*Result*

It's important to acknowledge that with this change, we will lose the 
ability to specify global dereferenceability of arguments and return 
values in the general case. We believe the current proposal allows us to 
recover that fact for all interesting cases, but if we've missed an 
important use case we may need to iterate a bit.

We've discussed a few alternatives (below) which could be revisited if 
it turns out we are missing an important use case.


    Use Cases
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>

*C++ References* -- A C++ reference implies that the value pointed to is 
dereferenceable at point of declaration, and that the reference itself 
is non-null. Of particular note, an object pointed to through a 
reference can be freed without introducing UB.

class  A  {int  f; };

void  ugly_delete(A &a) {delete  &a; }
ugly_delete(*new  A());

void  ugly_delete2(A &a, A *a2) {
   if  (unknown)
     // a.f can be *proven* deref here as it's deref on entry,
     // and no free on path from entry to here.
     x = a.f;
   delete  a2;
}
auto  *a =new  A();
ugly_delete2(*a, a);

A &foo() {...}
A &a = foo();
if  (unknown)
   delete  b;
// If a and b point to the same object, a.f may not be deref here
if  (unknown2)
   a.f;

*Garbage Collected Objects (Java)* -- LLVM supports two models of GCed 
objects, the abstract machine and the physical machine model. The later 
is essentially the same as that for c++ as deallocation points (at 
safepoints) are explicit. The former has objects conceptually live 
forever (i.e. reclaimation is handled outside the model).

class  A  {int  f; }

void  foo(A  a) {
   ...
   // a.f is trivially deref anywhere in foo
   x=  a.f;
}

A  a=  new  A();
...
// a.f is trivially deref following it's definition
x=  a.f;

A  foo();
a=  foo();
...
// a.f is (still) trivially deref
x=  a.f;

*Rust Borrows* -- A rust reference argument (e.g. "borrow") points to
an
object whose lifetime is guaranteed to be longer than the reference's 
defining scope. As such, the object is dereferenceable through the scope 
of the function. Today, rustc does emit a dereferenceable attribute 
using the current globally dereferenceable semantic.

pub  fn  square(num:&i32) ->i32  {
   num*  num
}
square(&5);

// a could be noalias, but isn't today
pub  fn  bar(a:&mut  i32, b:&i32) {
   *a=  a*  b
}

bar(&mut  5,&2);

// At first appearance, rust does not allow returning references. So return
// attributes are not relevant. This seems like a major language hole, 
so this
// should probably be checked with a language expert.


    Migration
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>

Existing bytecode will be upgraded to the weaker non-global semantics. 
This provides forward compatibility, but does lose optimization 
potential for previously compiled bytecode.

C++ and GC'd language frontends don't change.

Rustc should emit noalias where possible. In particular, 'a' in the case
'bar' above is currently not marked noalias and results in lost 
optimization potential as a result of this change. According to the 
rustc code, this is legal, but currently blocked on a noalias related 
miscompile. See https://github.com/rust-lang/rust/issues/54462 
<https://github.com/rust-lang/rust/issues/54462> and 
https://github.com/rust-lang/rust/issues/54878 
<https://github.com/rust-lang/rust/issues/54878> for further details. 
(My current belief is that all llvm side blockers have been resolved.)

Frontends which want the global semantics should emit noalias, nofree, 
and nosync where appropriate. If this is not enough to recover 
optimizations in common cases, please explain why not. It's possible 
we've failed to account for something.


    Alternative Designs
   
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>

All of the alternate designs listed focus on recovering the full global 
deref semantics. Our hope is that any common case we've missed can be 
resolved with additional inference rules instead.


      Extend nofree to object semantics
     
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>

The nofree argument attribute current describes whether an object can 
freed through some particular copy of the pointer. We could strength the 
semantics to imply that the object is not freed through any copy of the 
pointer in the specified scope.

Doing so greatly weakens our ability to infer the nofree property. The 
current nofree property when combined with capture tracking in the 
caller is enough to prove interest deref facts over calls. We don't want 
to loose the ability to infer that since it enables interesting 
transforms (such as code reordering over calls).


      Add a separate nofreeobj attribute
     
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>

Rather than change nofree, we could add a parallel attribute with the 
stronger object property. This - combined with deref(N) as a point in 
time fact - would be enough to recover the current globally 
deferenceable semantics.

The downside of this alternative is a) possible overkill, and b) the 
"ugly" factor of having two similar but not quite identical
attributes.


      Add an orthogonal attribute to promote pointer facts to object
      ones
     
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>

To address the weakness of the former alternative, we could specify an 
attribute which strengthens arbitrary pointer facts to object facts. 
Examples of current pointer facts are attributes such as readonly, and 
writeonly.

This has not been well explored; there's a huge possible design space here.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210317/0689a105/attachment.html>

Nuno Lopes via llvm-dev

2021-Mar-18 10:39 UTC

head link

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

Hi Philip,

 

TBH I was slightly surprised with this email, as the interpretation about deref
I had in my head was exactly what you are proposing as the new semantics.

I just looked again at LangRef and I see it could be interpreted in the way you
describe. Because it mentions a pointer with deref attribute can be always
dereferenced, irrespective of whether the pointer is free’d or not.

 

FWIW, Alive2 already implements the semantics you describe. For example,
introducing a load after a call with a deref attribute is UB as the pointer may
be free’d by the callee: https://alive2.llvm.org/ce/z/dz2wYy  (but works with
nofree: https://alive2.llvm.org/ce/z/XsHEn-)

 

TL;DR: Please go ahead!

 

Nuno

 

 

From: Philip Reames 
Sent: 17 March 2021 21:22
To: llvm-dev at lists.llvm.org
Cc: Artur Pilipenko <apilipenko at azul.com>; Nick Lewycky <nicholas at
mxc.ca>
Subject: [llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

 

TLDR: We should change the existing dereferenceability related attributes to
imply point in time facts only, and re-infer stronger global dereferenceability
facts where needed.


Meta
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>


If you prefer to read proposals in a browser, you can read this email here
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst> .

This proposal greatly benefited from multiple rounds of feedback from Johannes,
Artur, and Nick. All remaining mistakes are my own.

Johannes deserves a lot of credit for driving previous iterations on this
design. In particular, I want to note that we've basically returned to
something Johannes first proposed several years ago, before we had specified the
nofree attribute family.


The Basic Problem
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>


We have a long standing semantic problem with the way we define
dereferenceability facts which makes it difficult to express C++ references, or
more generally, dereferenceability on objects which may be freed at some point
in the program. The current structure does lend itself well to memory which
can't be freed. As discussed in detail a bit later, we want to seamlessly
support both use cases.

The basic statement of the problem is that a piece of memory marked with
deref(N) is assumed to remain dereferenceable indefinitely. For an object which
can be freed, marking it as deref can enable unsound transformations in cases
like the following:

o = deref(N) alloc();
if (c) free(o)
while(true) {
  if (c) break;
  // With the current semantics, we will hoist o.f above the loop
  v = o.f;
}

Despite this, Clang does emit the existing dereferenceable attribute in some
problematic cases. We have observed miscompiles as a result, and optimizer has
an assortment of hacks to try not to be too aggressive and miscompile too
widely.


Haven't we already solved this?
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>


This has been discussed relatively extensively in the past, included an accepted
review (https://reviews.llvm.org/D61652) which proposed splitting the
dereferenceable attribute into two to adress this. However, this change never
landed and recent findings reveal that we both need a broader solution, and have
an interesting oppurtunity to take advantage of other recent work.

The need for a broader solution comes from the observation that deref(N) is not
the only attribute with this problem. deref_or_null(N) is a fairly obvious case
we'd known about with the previous proposal, but it was recently realized
that other allocation related facts have this problem as well. We now have
specific examples with allocsize(N,M) - and the baked in variants in
MemoryBuiltins - and suspect there are other attributes, either current or
future, with the same challenge.

The opportunity comes from the addition of "nofree" attribute. Up
until recently, we really didn't have a good notion of "free"ing
an allocation in the abstract machine model. We used to comingle this with our
notion of capture. (i.e. We'd assume that functions which could free must
also capture.) With the explicit notion of "nofree", we have an
approach available to us we didn't before.


The Proposal Itself
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>


The basic idea is that we're going to redefine the currently globally scoped
attributes (deref, deref_or_null, and allocsize) such that they imply a point in
time fact only and then combine that with nofree to recover the previous global
semantics.

More specifically:

*	A deref attribute on a function parameter will imply that the memory is
dereferenceable for a specified number of bytes at the instant the function call
occurs.
*	A deref attribute on a function return will imply that the memory is
dereferenceable at the moment of return.

We will then use the point in time fact combined with other information to drive
inference of the global facts. While in principle we may loose optimization
potential, we believe this is sufficient to infer the global facts in all
practical cases we care about.

Sample inference cases:

*	A deref(N) argument to a function with the nofree and nosync function
attribute is known to be globally dereferenceable within the scope of the
function call. We need the nosync to ensure that no other thread is freeing the
memory on behalf of the callee in a coordinated manner.
*	An argument with the attributes deref(N), noalias, and nofree is known to be
globally dereferenceable within the scope of the function call. This relies on
the fact that free is modeled as writing to the memory freed, and thus noalias
ensures there is no other argument which can be freed. (See discussion below.)
*	A memory allocation in a function with a garbage collector which guarantees
collection occurs only at explicit safepoints and uses the gc.statepoint
infrastructure, is known to be globally dereferenceable if there are no calls to
gc.statepoint anywhere in the module. This effectively refines the abstract
machine model used for garbage collection before lowering by RS4GC to disallow
explicit deallocation (for collectors which opt in).

The items above are described in terms of deref(N) for ease of description. The
other attributes are handle analogously.

Explanation

The "deref(N), noalias, + nofree" argument case requires a bit of
explanation as it involves a bunch of subtleties.

First, the current wording of nofree argument attribute implies that the callee
can not arrange for another thread to free the object on it's behalf. This
is different than the specification of the nofree function attribute. There is
no "nosync" equivalent for function attributes.

Second, the noalias argument attribute is subtle. There's a couple of
sub-cases worth discussing:

*	If the noalias argument is written to (reminder: free is modeled as a write),
then it must be the only copy of the pointer passed to the function and there
can be no copies passed through memory used in the scope of function.
*	If the noalias argument is only read from, then there may be other copies of
the pointer. However, all of those copies must also be read only. If the object
was freed through one of those other copies, then we must have at least one
writeable copy and having the noalias on the read copy was undefined behavior to
begin with.

Essentially, what we're doing with noalias is using it to promote a fact
about the pointer to a fact about the object being pointed to. Code structure
wise, we should probably write it exactly that way.

Result

It's important to acknowledge that with this change, we will lose the
ability to specify global dereferenceability of arguments and return values in
the general case. We believe the current proposal allows us to recover that fact
for all interesting cases, but if we've missed an important use case we may
need to iterate a bit.

We've discussed a few alternatives (below) which could be revisited if it
turns out we are missing an important use case.


Use Cases
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>


C++ References -- A C++ reference implies that the value pointed to is
dereferenceable at point of declaration, and that the reference itself is
non-null. Of particular note, an object pointed to through a reference can be
freed without introducing UB.

class A { int f; };
 
void ugly_delete(A &a) { delete &a; }
ugly_delete(*new A());
 
void ugly_delete2(A &a, A *a2) {
  if (unknown)
    // a.f can be *proven* deref here as it's deref on entry,
    // and no free on path from entry to here.
    x = a.f;
  delete a2;
}
auto *a = new A();
ugly_delete2(*a, a);
 
A &foo() {...}
A &a = foo();
if (unknown)
  delete b;
// If a and b point to the same object, a.f may not be deref here
if (unknown2)
  a.f;

Garbage Collected Objects (Java) -- LLVM supports two models of GCed objects,
the abstract machine and the physical machine model. The later is essentially
the same as that for c++ as deallocation points (at safepoints) are explicit.
The former has objects conceptually live forever (i.e. reclaimation is handled
outside the model).

class A { int f; }
 
void foo(A a) {
  ...
  // a.f is trivially deref anywhere in foo
  x = a.f;
}
 
A a = new A();
...
// a.f is trivially deref following it's definition
x = a.f;
 
A foo();
a = foo();
...
// a.f is (still) trivially deref
x = a.f;

Rust Borrows -- A rust reference argument (e.g. "borrow") points to an
object whose lifetime is guaranteed to be longer than the reference's
defining scope. As such, the object is dereferenceable through the scope of the
function. Today, rustc does emit a dereferenceable attribute using the current
globally dereferenceable semantic.

pub fn square(num: &i32) -> i32 {
  num * num
}
square(&5);
 
// a could be noalias, but isn't today
pub fn bar(a: &mut i32, b: &i32) {
  *a = a * b
}
 
bar(&mut 5, &2);
 
// At first appearance, rust does not allow returning references.  So return
// attributes are not relevant.  This seems like a major language hole, so this
// should probably be checked with a language expert.

Migration
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>


Existing bytecode will be upgraded to the weaker non-global semantics. This
provides forward compatibility, but does lose optimization potential for
previously compiled bytecode.

C++ and GC'd language frontends don't change.

Rustc should emit noalias where possible. In particular, 'a' in the case
'bar' above is currently not marked noalias and results in lost
optimization potential as a result of this change. According to the rustc code,
this is legal, but currently blocked on a noalias related miscompile. See
https://github.com/rust-lang/rust/issues/54462 and
https://github.com/rust-lang/rust/issues/54878 for further details. (My current
belief is that all llvm side blockers have been resolved.)

Frontends which want the global semantics should emit noalias, nofree, and
nosync where appropriate. If this is not enough to recover optimizations in
common cases, please explain why not. It's possible we've failed to
account for something.


Alternative Designs
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>


All of the alternate designs listed focus on recovering the full global deref
semantics. Our hope is that any common case we've missed can be resolved
with additional inference rules instead.


Extend nofree to object semantics
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>


The nofree argument attribute current describes whether an object can freed
through some particular copy of the pointer. We could strength the semantics to
imply that the object is not freed through any copy of the pointer in the
specified scope.

Doing so greatly weakens our ability to infer the nofree property. The current
nofree property when combined with capture tracking in the caller is enough to
prove interest deref facts over calls. We don't want to loose the ability to
infer that since it enables interesting transforms (such as code reordering over
calls).


Add a separate nofreeobj attribute
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>


Rather than change nofree, we could add a parallel attribute with the stronger
object property. This - combined with deref(N) as a point in time fact - would
be enough to recover the current globally deferenceable semantics.

The downside of this alternative is a) possible overkill, and b) the
"ugly" factor of having two similar but not quite identical
attributes.


Add an orthogonal attribute to promote pointer facts to object ones
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>


To address the weakness of the former alternative, we could specify an attribute
which strengthens arbitrary pointer facts to object facts. Examples of current
pointer facts are attributes such as readonly, and writeonly.

This has not been well explored; there's a huge possible design space here.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210318/fc51b150/attachment.html>

Juneyoung Lee via llvm-dev

2021-Mar-21 02:26 UTC

head link

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

+1


I have one minor question: according to
https://llvm.org/docs/Atomics.html#optimization-outside-atomic ,
introducing a store is problematic even if the pointer is known to be
dereferenceable. Does this also apply to dereferenceable+nosync+nofree?


On Thu, Mar 18, 2021 at 6:22 AM Philip Reames via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> TLDR: We should change the existing dereferenceability related attributes
> to imply point in time facts only, and re-infer stronger global
> dereferenceability facts where needed.
> Meta
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>
>
> If you prefer to read proposals in a browser, you can read this email here
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>
> This proposal greatly benefited from multiple rounds of feedback from
> Johannes, Artur, and Nick. All remaining mistakes are my own.
>
> Johannes deserves a lot of credit for driving previous iterations on this
> design. In particular, I want to note that we've basically returned to
> something Johannes first proposed several years ago, before we had
> specified the nofree attribute family.
> The Basic Problem
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>
>
> We have a long standing semantic problem with the way we define
> dereferenceability facts which makes it difficult to express C++
> references, or more generally, dereferenceability on objects which may be
> freed at some point in the program. The current structure does lend itself
> well to memory which can't be freed. As discussed in detail a bit
later, we
> want to seamlessly support both use cases.
>
> The basic statement of the problem is that a piece of memory marked with
> deref(N) is assumed to remain dereferenceable indefinitely. For an object
> which can be freed, marking it as deref can enable unsound transformations
> in cases like the following:
>
> o = deref(N) alloc();
> if (c) free(o)
> while(true) {
>   if (c) break;
>   // With the current semantics, we will hoist o.f above the loop
>   v = o.f;
> }
>
> Despite this, Clang does emit the existing dereferenceable attribute in
> some problematic cases. We have observed miscompiles as a result, and
> optimizer has an assortment of hacks to try not to be too aggressive and
> miscompile too widely.
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#havent-we-already-solved-this>
> Haven't we already solved this?
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>
>
> This has been discussed relatively extensively in the past, included an
> accepted review (https://reviews.llvm.org/D61652) which proposed
> splitting the dereferenceable attribute into two to adress this. However,
> this change never landed and recent findings reveal that we both need a
> broader solution, and have an interesting oppurtunity to take advantage of
> other recent work.
>
> The need for a broader solution comes from the observation that deref(N)
> is not the only attribute with this problem. deref_or_null(N) is a fairly
> obvious case we'd known about with the previous proposal, but it was
> recently realized that other allocation related facts have this problem as
> well. We now have specific examples with allocsize(N,M) - and the baked in
> variants in MemoryBuiltins - and suspect there are other attributes, either
> current or future, with the same challenge.
>
> The opportunity comes from the addition of "nofree" attribute. Up
until
> recently, we really didn't have a good notion of "free"ing an
allocation in
> the abstract machine model. We used to comingle this with our notion of
> capture. (i.e. We'd assume that functions which could free must also
> capture.) With the explicit notion of "nofree", we have an
approach
> available to us we didn't before.
> The Proposal Itself
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>
>
> The basic idea is that we're going to redefine the currently globally
> scoped attributes (deref, deref_or_null, and allocsize) such that they
> imply a point in time fact only and then combine that with nofree to
> recover the previous global semantics.
>
> More specifically:
>
>    - A deref attribute on a function parameter will imply that the memory
>    is dereferenceable for a specified number of bytes at the instant the
>    function call occurs.
>    - A deref attribute on a function return will imply that the memory is
>    dereferenceable at the moment of return.
>
> We will then use the point in time fact combined with other information to
> drive inference of the global facts. While in principle we may loose
> optimization potential, we believe this is sufficient to infer the global
> facts in all practical cases we care about.
>
> Sample inference cases:
>
>    - A deref(N) argument to a function with the nofree and nosync
>    function attribute is known to be globally dereferenceable within the
scope
>    of the function call. We need the nosync to ensure that no other thread
is
>    freeing the memory on behalf of the callee in a coordinated manner.
>    - An argument with the attributes deref(N), noalias, and nofree is
>    known to be globally dereferenceable within the scope of the function
call.
>    This relies on the fact that free is modeled as writing to the memory
>    freed, and thus noalias ensures there is no other argument which can be
>    freed. (See discussion below.)
>    - A memory allocation in a function with a garbage collector which
>    guarantees collection occurs only at explicit safepoints and uses the
>    gc.statepoint infrastructure, is known to be globally dereferenceable if
>    there are no calls to gc.statepoint anywhere in the module. This
>    effectively refines the abstract machine model used for garbage
collection
>    before lowering by RS4GC to disallow explicit deallocation (for
collectors
>    which opt in).
>
> The items above are described in terms of deref(N) for ease of
> description. The other attributes are handle analogously.
>
> *Explanation*
>
> The "deref(N), noalias, + nofree" argument case requires a bit of
> explanation as it involves a bunch of subtleties.
>
> First, the current wording of nofree argument attribute implies that the
> callee can not arrange for another thread to free the object on it's
> behalf. This is different than the specification of the nofree function
> attribute. There is no "nosync" equivalent for function
attributes.
>
> Second, the noalias argument attribute is subtle. There's a couple of
> sub-cases worth discussing:
>
>    - If the noalias argument is written to (reminder: free is modeled as
>    a write), then it must be the only copy of the pointer passed to the
>    function and there can be no copies passed through memory used in the
scope
>    of function.
>    - If the noalias argument is only read from, then there may be other
>    copies of the pointer. However, all of those copies must also be read
only.
>    If the object was freed through one of those other copies, then we must
>    have at least one writeable copy and having the noalias on the read copy
>    was undefined behavior to begin with.
>
> Essentially, what we're doing with noalias is using it to promote a
fact
> about the pointer to a fact about the object being pointed to. Code
> structure wise, we should probably write it exactly that way.
>
> *Result*
>
> It's important to acknowledge that with this change, we will lose the
> ability to specify global dereferenceability of arguments and return values
> in the general case. We believe the current proposal allows us to recover
> that fact for all interesting cases, but if we've missed an important
use
> case we may need to iterate a bit.
>
> We've discussed a few alternatives (below) which could be revisited if
it
> turns out we are missing an important use case.
> Use Cases
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>
>
> *C++ References* -- A C++ reference implies that the value pointed to is
> dereferenceable at point of declaration, and that the reference itself is
> non-null. Of particular note, an object pointed to through a reference can
> be freed without introducing UB.
>
> class A { int f; };
> void ugly_delete(A &a) { delete &a; }ugly_delete(*new A());
> void ugly_delete2(A &a, A *a2) {
>   if (unknown)
>     // a.f can be *proven* deref here as it's deref on entry,
>     // and no free on path from entry to here.
>     x = a.f;
>   delete a2;
> }auto *a = new A();ugly_delete2(*a, a);
>
> A &foo() {...}
> A &a = foo();if (unknown)
>   delete b;// If a and b point to the same object, a.f may not be deref
hereif (unknown2)
>   a.f;
>
> *Garbage Collected Objects (Java)* -- LLVM supports two models of GCed
> objects, the abstract machine and the physical machine model. The later is
> essentially the same as that for c++ as deallocation points (at safepoints)
> are explicit. The former has objects conceptually live forever (i.e.
> reclaimation is handled outside the model).
>
> class A { int f; }
> void foo(A a) {
>   ...
>   // a.f is trivially deref anywhere in foo
>   x = a.f;
> }
> A a = new A();...// a.f is trivially deref following it's definition
> x = a.f;
> A foo();
> a = foo();...// a.f is (still) trivially deref
> x = a.f;
>
> *Rust Borrows* -- A rust reference argument (e.g. "borrow")
points to an
> object whose lifetime is guaranteed to be longer than the reference's
> defining scope. As such, the object is dereferenceable through the scope of
> the function. Today, rustc does emit a dereferenceable attribute using the
> current globally dereferenceable semantic.
>
> pub fn square(num: &i32) -> i32 {
>   num * num
> }square(&5);
> // a could be noalias, but isn't todaypub fn bar(a: &mut i32, b:
&i32) {
>   *a = a * b
> }
> bar(&mut 5, &2);
> // At first appearance, rust does not allow returning references.  So
return// attributes are not relevant.  This seems like a major language hole, so
this// should probably be checked with a language expert.
>
> Migration
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>
>
> Existing bytecode will be upgraded to the weaker non-global semantics.
> This provides forward compatibility, but does lose optimization potential
> for previously compiled bytecode.
>
> C++ and GC'd language frontends don't change.
>
> Rustc should emit noalias where possible. In particular, 'a' in the
case
> 'bar' above is currently not marked noalias and results in lost
> optimization potential as a result of this change. According to the rustc
> code, this is legal, but currently blocked on a noalias related miscompile.
> See https://github.com/rust-lang/rust/issues/54462 and
> https://github.com/rust-lang/rust/issues/54878 for further details. (My
> current belief is that all llvm side blockers have been resolved.)
>
> Frontends which want the global semantics should emit noalias, nofree, and
> nosync where appropriate. If this is not enough to recover optimizations in
> common cases, please explain why not. It's possible we've failed to
account
> for something.
> Alternative Designs
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>
>
> All of the alternate designs listed focus on recovering the full global
> deref semantics. Our hope is that any common case we've missed can be
> resolved with additional inference rules instead.
> Extend nofree to object semantics
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>
>
> The nofree argument attribute current describes whether an object can
> freed through some particular copy of the pointer. We could strength the
> semantics to imply that the object is not freed through any copy of the
> pointer in the specified scope.
>
> Doing so greatly weakens our ability to infer the nofree property. The
> current nofree property when combined with capture tracking in the caller
> is enough to prove interest deref facts over calls. We don't want to
loose
> the ability to infer that since it enables interesting transforms (such as
> code reordering over calls).
> Add a separate nofreeobj attribute
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>
>
> Rather than change nofree, we could add a parallel attribute with the
> stronger object property. This - combined with deref(N) as a point in time
> fact - would be enough to recover the current globally deferenceable
> semantics.
>
> The downside of this alternative is a) possible overkill, and b) the
> "ugly" factor of having two similar but not quite identical
attributes.
> Add an orthogonal attribute to promote pointer facts to object ones
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>
>
> To address the weakness of the former alternative, we could specify an
> attribute which strengthens arbitrary pointer facts to object facts.
> Examples of current pointer facts are attributes such as readonly, and
> writeonly.
>
> This has not been well explored; there's a huge possible design space
here.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

-- 

Juneyoung Lee
Software Foundation Lab, Seoul National University
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210321/96b49976/attachment.html>

Philip Reames via llvm-dev

2021-Mar-25 18:38 UTC

head link

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

Replying to myself to clarify a bit on the scope of this proposal.

I am proposing changes to semantics of the dereferenceable, 
dereferenceable_or_null, and allocsize attributes.  I am also proposing 
changes to semantics of the analogous dereferenceable, and 
dereferenceable_or_null metadata since they are directly analogous.  I 
am also proposing changes to how we infer dereferenceability for known 
allocation sites (as this essentially infers the attributes even though 
we don't express it that way in code).

I am not proposing changes to the dereferenceability handling of any 
other IR construct.  Specifically, this proposal does not involve any 
changes to handling of allocas or lifetime intrinsics.  The semantics of 
those, and all other unmentioned IR features, should be unchanged after 
this work.

Philip

On 3/17/21 2:22 PM, Philip Reames via llvm-dev wrote:>
> TLDR: We should change the existing dereferenceability related 
> attributes to imply point in time facts only, and re-infer stronger 
> global dereferenceability facts where needed.
>
>
>     Meta
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>
>
> If you prefer to read proposals in a browser, you can read this email 
> here 
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>
> This proposal greatly benefited from multiple rounds of feedback from 
> Johannes, Artur, and Nick. All remaining mistakes are my own.
>
> Johannes deserves a lot of credit for driving previous iterations on 
> this design. In particular, I want to note that we've basically 
> returned to something Johannes first proposed several years ago, 
> before we had specified the nofree attribute family.
>
>
>     The Basic Problem
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>
>
> We have a long standing semantic problem with the way we define 
> dereferenceability facts which makes it difficult to express C++ 
> references, or more generally, dereferenceability on objects which may 
> be freed at some point in the program. The current structure does lend 
> itself well to memory which can't be freed. As discussed in detail a 
> bit later, we want to seamlessly support both use cases.
>
> The basic statement of the problem is that a piece of memory marked 
> with deref(N) is assumed to remain dereferenceable indefinitely. For 
> an object which can be freed, marking it as deref can enable unsound 
> transformations in cases like the following:
>
> o = deref(N) alloc();
> if (c) free(o)
> while(true) {
>    if (c) break;
>    // With the current semantics, we will hoist o.f above the loop
>    v = o.f;
> }
>
> Despite this, Clang does emit the existing dereferenceable attribute 
> in some problematic cases. We have observed miscompiles as a result, 
> and optimizer has an assortment of hacks to try not to be too 
> aggressive and miscompile too 
>
widely.<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#havent-we-already-solved-this>
>
>
>     Haven't we already solved this?
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>
>
> This has been discussed relatively extensively in the past, included 
> an accepted review (https://reviews.llvm.org/D61652 
> <https://reviews.llvm.org/D61652>) which proposed splitting the 
> dereferenceable attribute into two to adress this. However, this 
> change never landed and recent findings reveal that we both need a 
> broader solution, and have an interesting oppurtunity to take 
> advantage of other recent work.
>
> The need for a broader solution comes from the observation that 
> deref(N) is not the only attribute with this problem. deref_or_null(N) 
> is a fairly obvious case we'd known about with the previous proposal, 
> but it was recently realized that other allocation related facts have 
> this problem as well. We now have specific examples with 
> allocsize(N,M) - and the baked in variants in MemoryBuiltins - and 
> suspect there are other attributes, either current or future, with the 
> same challenge.
>
> The opportunity comes from the addition of "nofree" attribute. Up
> until recently, we really didn't have a good notion of
"free"ing an
> allocation in the abstract machine model. We used to comingle this 
> with our notion of capture. (i.e. We'd assume that functions which 
> could free must also capture.) With the explicit notion of
"nofree",
> we have an approach available to us we didn't before.
>
>
>     The Proposal Itself
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>
>
> The basic idea is that we're going to redefine the currently globally 
> scoped attributes (deref, deref_or_null, and allocsize) such that they 
> imply a point in time fact only and then combine that with nofree to 
> recover the previous global semantics.
>
> More specifically:
>
>   * A deref attribute on a function parameter will imply that the
>     memory is dereferenceable for a specified number of bytes at the
>     instant the function call occurs.
>   * A deref attribute on a function return will imply that the memory
>     is dereferenceable at the moment of return.
>
> We will then use the point in time fact combined with other 
> information to drive inference of the global facts. While in principle 
> we may loose optimization potential, we believe this is sufficient to 
> infer the global facts in all practical cases we care about.
>
> Sample inference cases:
>
>   * A deref(N) argument to a function with the nofree and nosync
>     function attribute is known to be globally dereferenceable within
>     the scope of the function call. We need the nosync to ensure that
>     no other thread is freeing the memory on behalf of the callee in a
>     coordinated manner.
>   * An argument with the attributes deref(N), noalias, and nofree is
>     known to be globally dereferenceable within the scope of the
>     function call. This relies on the fact that free is modeled as
>     writing to the memory freed, and thus noalias ensures there is no
>     other argument which can be freed. (See discussion below.)
>   * A memory allocation in a function with a garbage collector which
>     guarantees collection occurs only at explicit safepoints and uses
>     the gc.statepoint infrastructure, is known to be globally
>     dereferenceable if there are no calls to gc.statepoint anywhere in
>     the module. This effectively refines the abstract machine model
>     used for garbage collection before lowering by RS4GC to disallow
>     explicit deallocation (for collectors which opt in).
>
> The items above are described in terms of deref(N) for ease of 
> description. The other attributes are handle analogously.
>
> *Explanation*
>
> The "deref(N), noalias, + nofree" argument case requires a bit of
> explanation as it involves a bunch of subtleties.
>
> First, the current wording of nofree argument attribute implies that 
> the callee can not arrange for another thread to free the object on 
> it's behalf. This is different than the specification of the nofree 
> function attribute. There is no "nosync" equivalent for function 
> attributes.
>
> Second, the noalias argument attribute is subtle. There's a couple of 
> sub-cases worth discussing:
>
>   * If the noalias argument is written to (reminder: free is modeled
>     as a write), then it must be the only copy of the pointer passed
>     to the function and there can be no copies passed through memory
>     used in the scope of function.
>   * If the noalias argument is only read from, then there may be other
>     copies of the pointer. However, all of those copies must also be
>     read only. If the object was freed through one of those other
>     copies, then we must have at least one writeable copy and having
>     the noalias on the read copy was undefined behavior to begin with.
>
> Essentially, what we're doing with noalias is using it to promote a 
> fact about the pointer to a fact about the object being pointed to. 
> Code structure wise, we should probably write it exactly that way.
>
> *Result*
>
> It's important to acknowledge that with this change, we will lose the 
> ability to specify global dereferenceability of arguments and return 
> values in the general case. We believe the current proposal allows us 
> to recover that fact for all interesting cases, but if we've missed an 
> important use case we may need to iterate a bit.
>
> We've discussed a few alternatives (below) which could be revisited if 
> it turns out we are missing an important use case.
>
>
>     Use Cases
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>
>
> *C++ References* -- A C++ reference implies that the value pointed to 
> is dereferenceable at point of declaration, and that the reference 
> itself is non-null. Of particular note, an object pointed to through a 
> reference can be freed without introducing UB.
>
> class  A  {int  f; };
>
> void  ugly_delete(A &a) {delete  &a; }
> ugly_delete(*new  A());
>
> void  ugly_delete2(A &a, A *a2) {
>    if  (unknown)
>      // a.f can be *proven* deref here as it's deref on entry,
>      // and no free on path from entry to here.
>      x = a.f;
>    delete  a2;
> }
> auto  *a =new  A();
> ugly_delete2(*a, a);
>
> A &foo() {...}
> A &a = foo();
> if  (unknown)
>    delete  b;
> // If a and b point to the same object, a.f may not be deref here
> if  (unknown2)
>    a.f;
>
> *Garbage Collected Objects (Java)* -- LLVM supports two models of GCed 
> objects, the abstract machine and the physical machine model. The 
> later is essentially the same as that for c++ as deallocation points 
> (at safepoints) are explicit. The former has objects conceptually live 
> forever (i.e. reclaimation is handled outside the model).
>
> class  A  {int  f; }
>
> void  foo(A  a) {
>    ...
>    // a.f is trivially deref anywhere in foo
>    x=  a.f;
> }
>
> A  a=  new  A();
> ...
> // a.f is trivially deref following it's definition
> x=  a.f;
>
> A  foo();
> a=  foo();
> ...
> // a.f is (still) trivially deref
> x=  a.f;
>
> *Rust Borrows* -- A rust reference argument (e.g. "borrow")
points to
> an object whose lifetime is guaranteed to be longer than the 
> reference's defining scope. As such, the object is dereferenceable 
> through the scope of the function. Today, rustc does emit a 
> dereferenceable attribute using the current globally dereferenceable 
> semantic.
>
> pub  fn  square(num:&i32) ->i32  {
>    num*  num
> }
> square(&5);
>
> // a could be noalias, but isn't today
> pub  fn  bar(a:&mut  i32, b:&i32) {
>    *a=  a*  b
> }
>
> bar(&mut  5,&2);
>
> // At first appearance, rust does not allow returning references. So 
> return
> // attributes are not relevant. This seems like a major language hole, 
> so this
> // should probably be checked with a language expert.
>
>
>     Migration
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>
>
> Existing bytecode will be upgraded to the weaker non-global semantics. 
> This provides forward compatibility, but does lose optimization 
> potential for previously compiled bytecode.
>
> C++ and GC'd language frontends don't change.
>
> Rustc should emit noalias where possible. In particular, 'a' in the
> case 'bar' above is currently not marked noalias and results in
lost
> optimization potential as a result of this change. According to the 
> rustc code, this is legal, but currently blocked on a noalias related 
> miscompile. See https://github.com/rust-lang/rust/issues/54462 
> <https://github.com/rust-lang/rust/issues/54462> and 
> https://github.com/rust-lang/rust/issues/54878 
> <https://github.com/rust-lang/rust/issues/54878> for further details.
> (My current belief is that all llvm side blockers have been resolved.)
>
> Frontends which want the global semantics should emit noalias, nofree, 
> and nosync where appropriate. If this is not enough to recover 
> optimizations in common cases, please explain why not. It's possible 
> we've failed to account for something.
>
>
>     Alternative Designs
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>
>
> All of the alternate designs listed focus on recovering the full 
> global deref semantics. Our hope is that any common case we've missed 
> can be resolved with additional inference rules instead.
>
>
>       Extend nofree to object semantics
>      
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>
>
> The nofree argument attribute current describes whether an object can 
> freed through some particular copy of the pointer. We could strength 
> the semantics to imply that the object is not freed through any copy 
> of the pointer in the specified scope.
>
> Doing so greatly weakens our ability to infer the nofree property. The 
> current nofree property when combined with capture tracking in the 
> caller is enough to prove interest deref facts over calls. We don't 
> want to loose the ability to infer that since it enables interesting 
> transforms (such as code reordering over calls).
>
>
>       Add a separate nofreeobj attribute
>      
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>
>
> Rather than change nofree, we could add a parallel attribute with the 
> stronger object property. This - combined with deref(N) as a point in 
> time fact - would be enough to recover the current globally 
> deferenceable semantics.
>
> The downside of this alternative is a) possible overkill, and b) the 
> "ugly" factor of having two similar but not quite identical
attributes.
>
>
>       Add an orthogonal attribute to promote pointer facts to object
>       ones
>      
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>
>
> To address the weakness of the former alternative, we could specify an 
> attribute which strengthens arbitrary pointer facts to object facts. 
> Examples of current pointer facts are attributes such as readonly, and 
> writeonly.
>
> This has not been well explored; there's a huge possible design space 
> here.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210325/b84c4d4c/attachment.html>

Philip Reames via llvm-dev

2021-Jul-12 16:04 UTC

head link

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

At this point, I find myself needing to declare that the proposal below 
is a failure, and ask the community what next steps we'd prefer.

This effort stumbled into the fact that we don't seem to have any actual 
agreement on what the semantics of various attributes are. In 
particular, the semantics of nofree don't appear to be in a usable 
state, and my attempts at driving consensus have failed.  I am not 
willing to continue investing effort in that direction.

Given that, I see three options, and need input from the community as to 
which we should chose.

Option 1 - Back out the couple of changes which have landed, update 
LangRef to be explicit about the scoped dereferenceability we had 
historically, and consider this effort a failure.

Option 2 - Change the semantic of the attributes to the point in time 
semantic *without* attempting any further inference of the scoped 
semantics.  At the current moment, the Java use case is covered (via the 
GC rule), no one seems to care about the lost optimization power for 
C/C++, and I am unclear on the practical impact (if any) on rust.

Option 3 - Introduce a new 'nofreeobj' attribute whose semantics would 
be specifically that an object is not freed in the dynamic scope of the 
function through any mechanism (including concurrency).  This attribute 
would be basically uninferrable, and would exist only to support 
language guarantees being encoded by frontends.

My recommendation would be for option 2, than 3, than 1.  It's worth 
noting that we could also chose option 2, then implement option 3 lazily 
if anyone reports a practical performance regression.

Philip

On 3/17/21 2:22 PM, Philip Reames via llvm-dev wrote:>
> TLDR: We should change the existing dereferenceability related 
> attributes to imply point in time facts only, and re-infer stronger 
> global dereferenceability facts where needed.
>
>
>     Meta
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id1>
>
> If you prefer to read proposals in a browser, you can read this email 
> here 
>
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst>.
>
> This proposal greatly benefited from multiple rounds of feedback from 
> Johannes, Artur, and Nick. All remaining mistakes are my own.
>
> Johannes deserves a lot of credit for driving previous iterations on 
> this design. In particular, I want to note that we've basically 
> returned to something Johannes first proposed several years ago, 
> before we had specified the nofree attribute family.
>
>
>     The Basic Problem
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id2>
>
> We have a long standing semantic problem with the way we define 
> dereferenceability facts which makes it difficult to express C++ 
> references, or more generally, dereferenceability on objects which may 
> be freed at some point in the program. The current structure does lend 
> itself well to memory which can't be freed. As discussed in detail a 
> bit later, we want to seamlessly support both use cases.
>
> The basic statement of the problem is that a piece of memory marked 
> with deref(N) is assumed to remain dereferenceable indefinitely. For 
> an object which can be freed, marking it as deref can enable unsound 
> transformations in cases like the following:
>
> o = deref(N) alloc();
> if (c) free(o)
> while(true) {
>    if (c) break;
>    // With the current semantics, we will hoist o.f above the loop
>    v = o.f;
> }
>
> Despite this, Clang does emit the existing dereferenceable attribute 
> in some problematic cases. We have observed miscompiles as a result, 
> and optimizer has an assortment of hacks to try not to be too 
> aggressive and miscompile too 
>
widely.<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#havent-we-already-solved-this>
>
>
>     Haven't we already solved this?
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id3>
>
> This has been discussed relatively extensively in the past, included 
> an accepted review (https://reviews.llvm.org/D61652 
> <https://reviews.llvm.org/D61652>) which proposed splitting the 
> dereferenceable attribute into two to adress this. However, this 
> change never landed and recent findings reveal that we both need a 
> broader solution, and have an interesting oppurtunity to take 
> advantage of other recent work.
>
> The need for a broader solution comes from the observation that 
> deref(N) is not the only attribute with this problem. deref_or_null(N) 
> is a fairly obvious case we'd known about with the previous proposal, 
> but it was recently realized that other allocation related facts have 
> this problem as well. We now have specific examples with 
> allocsize(N,M) - and the baked in variants in MemoryBuiltins - and 
> suspect there are other attributes, either current or future, with the 
> same challenge.
>
> The opportunity comes from the addition of "nofree" attribute. Up
> until recently, we really didn't have a good notion of
"free"ing an
> allocation in the abstract machine model. We used to comingle this 
> with our notion of capture. (i.e. We'd assume that functions which 
> could free must also capture.) With the explicit notion of
"nofree",
> we have an approach available to us we didn't before.
>
>
>     The Proposal Itself
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id4>
>
> The basic idea is that we're going to redefine the currently globally 
> scoped attributes (deref, deref_or_null, and allocsize) such that they 
> imply a point in time fact only and then combine that with nofree to 
> recover the previous global semantics.
>
> More specifically:
>
>   * A deref attribute on a function parameter will imply that the
>     memory is dereferenceable for a specified number of bytes at the
>     instant the function call occurs.
>   * A deref attribute on a function return will imply that the memory
>     is dereferenceable at the moment of return.
>
> We will then use the point in time fact combined with other 
> information to drive inference of the global facts. While in principle 
> we may loose optimization potential, we believe this is sufficient to 
> infer the global facts in all practical cases we care about.
>
> Sample inference cases:
>
>   * A deref(N) argument to a function with the nofree and nosync
>     function attribute is known to be globally dereferenceable within
>     the scope of the function call. We need the nosync to ensure that
>     no other thread is freeing the memory on behalf of the callee in a
>     coordinated manner.
>   * An argument with the attributes deref(N), noalias, and nofree is
>     known to be globally dereferenceable within the scope of the
>     function call. This relies on the fact that free is modeled as
>     writing to the memory freed, and thus noalias ensures there is no
>     other argument which can be freed. (See discussion below.)
>   * A memory allocation in a function with a garbage collector which
>     guarantees collection occurs only at explicit safepoints and uses
>     the gc.statepoint infrastructure, is known to be globally
>     dereferenceable if there are no calls to gc.statepoint anywhere in
>     the module. This effectively refines the abstract machine model
>     used for garbage collection before lowering by RS4GC to disallow
>     explicit deallocation (for collectors which opt in).
>
> The items above are described in terms of deref(N) for ease of 
> description. The other attributes are handle analogously.
>
> *Explanation*
>
> The "deref(N), noalias, + nofree" argument case requires a bit of
> explanation as it involves a bunch of subtleties.
>
> First, the current wording of nofree argument attribute implies that 
> the callee can not arrange for another thread to free the object on 
> it's behalf. This is different than the specification of the nofree 
> function attribute. There is no "nosync" equivalent for function 
> attributes.
>
> Second, the noalias argument attribute is subtle. There's a couple of 
> sub-cases worth discussing:
>
>   * If the noalias argument is written to (reminder: free is modeled
>     as a write), then it must be the only copy of the pointer passed
>     to the function and there can be no copies passed through memory
>     used in the scope of function.
>   * If the noalias argument is only read from, then there may be other
>     copies of the pointer. However, all of those copies must also be
>     read only. If the object was freed through one of those other
>     copies, then we must have at least one writeable copy and having
>     the noalias on the read copy was undefined behavior to begin with.
>
> Essentially, what we're doing with noalias is using it to promote a 
> fact about the pointer to a fact about the object being pointed to. 
> Code structure wise, we should probably write it exactly that way.
>
> *Result*
>
> It's important to acknowledge that with this change, we will lose the 
> ability to specify global dereferenceability of arguments and return 
> values in the general case. We believe the current proposal allows us 
> to recover that fact for all interesting cases, but if we've missed an 
> important use case we may need to iterate a bit.
>
> We've discussed a few alternatives (below) which could be revisited if 
> it turns out we are missing an important use case.
>
>
>     Use Cases
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id5>
>
> *C++ References* -- A C++ reference implies that the value pointed to 
> is dereferenceable at point of declaration, and that the reference 
> itself is non-null. Of particular note, an object pointed to through a 
> reference can be freed without introducing UB.
>
> class  A  {int  f; };
>
> void  ugly_delete(A &a) {delete  &a; }
> ugly_delete(*new  A());
>
> void  ugly_delete2(A &a, A *a2) {
>    if  (unknown)
>      // a.f can be *proven* deref here as it's deref on entry,
>      // and no free on path from entry to here.
>      x = a.f;
>    delete  a2;
> }
> auto  *a =new  A();
> ugly_delete2(*a, a);
>
> A &foo() {...}
> A &a = foo();
> if  (unknown)
>    delete  b;
> // If a and b point to the same object, a.f may not be deref here
> if  (unknown2)
>    a.f;
>
> *Garbage Collected Objects (Java)* -- LLVM supports two models of GCed 
> objects, the abstract machine and the physical machine model. The 
> later is essentially the same as that for c++ as deallocation points 
> (at safepoints) are explicit. The former has objects conceptually live 
> forever (i.e. reclaimation is handled outside the model).
>
> class  A  {int  f; }
>
> void  foo(A  a) {
>    ...
>    // a.f is trivially deref anywhere in foo
>    x=  a.f;
> }
>
> A  a=  new  A();
> ...
> // a.f is trivially deref following it's definition
> x=  a.f;
>
> A  foo();
> a=  foo();
> ...
> // a.f is (still) trivially deref
> x=  a.f;
>
> *Rust Borrows* -- A rust reference argument (e.g. "borrow")
points to
> an object whose lifetime is guaranteed to be longer than the 
> reference's defining scope. As such, the object is dereferenceable 
> through the scope of the function. Today, rustc does emit a 
> dereferenceable attribute using the current globally dereferenceable 
> semantic.
>
> pub  fn  square(num:&i32) ->i32  {
>    num*  num
> }
> square(&5);
>
> // a could be noalias, but isn't today
> pub  fn  bar(a:&mut  i32, b:&i32) {
>    *a=  a*  b
> }
>
> bar(&mut  5,&2);
>
> // At first appearance, rust does not allow returning references. So 
> return
> // attributes are not relevant. This seems like a major language hole, 
> so this
> // should probably be checked with a language expert.
>
>
>     Migration
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id6>
>
> Existing bytecode will be upgraded to the weaker non-global semantics. 
> This provides forward compatibility, but does lose optimization 
> potential for previously compiled bytecode.
>
> C++ and GC'd language frontends don't change.
>
> Rustc should emit noalias where possible. In particular, 'a' in the
> case 'bar' above is currently not marked noalias and results in
lost
> optimization potential as a result of this change. According to the 
> rustc code, this is legal, but currently blocked on a noalias related 
> miscompile. See https://github.com/rust-lang/rust/issues/54462 
> <https://github.com/rust-lang/rust/issues/54462> and 
> https://github.com/rust-lang/rust/issues/54878 
> <https://github.com/rust-lang/rust/issues/54878> for further details.
> (My current belief is that all llvm side blockers have been resolved.)
>
> Frontends which want the global semantics should emit noalias, nofree, 
> and nosync where appropriate. If this is not enough to recover 
> optimizations in common cases, please explain why not. It's possible 
> we've failed to account for something.
>
>
>     Alternative Designs
>    
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id7>
>
> All of the alternate designs listed focus on recovering the full 
> global deref semantics. Our hope is that any common case we've missed 
> can be resolved with additional inference rules instead.
>
>
>       Extend nofree to object semantics
>      
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id8>
>
> The nofree argument attribute current describes whether an object can 
> freed through some particular copy of the pointer. We could strength 
> the semantics to imply that the object is not freed through any copy 
> of the pointer in the specified scope.
>
> Doing so greatly weakens our ability to infer the nofree property. The 
> current nofree property when combined with capture tracking in the 
> caller is enough to prove interest deref facts over calls. We don't 
> want to loose the ability to infer that since it enables interesting 
> transforms (such as code reordering over calls).
>
>
>       Add a separate nofreeobj attribute
>      
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id9>
>
> Rather than change nofree, we could add a parallel attribute with the 
> stronger object property. This - combined with deref(N) as a point in 
> time fact - would be enough to recover the current globally 
> deferenceable semantics.
>
> The downside of this alternative is a) possible overkill, and b) the 
> "ugly" factor of having two similar but not quite identical
attributes.
>
>
>       Add an orthogonal attribute to promote pointer facts to object
>       ones
>      
<https://github.com/preames/public-notes/blob/master/deref+nofree.rst#id10>
>
> To address the weakness of the former alternative, we could specify an 
> attribute which strengthens arbitrary pointer facts to object facts. 
> Examples of current pointer facts are attributes such as readonly, and 
> writeonly.
>
> This has not been well explored; there's a huge possible design space 
> here.
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210712/f6d09bf3/attachment.html>

llvm dev - Mar 2021 - RFC: Decomposing deref(N) into deref(N) + nofree

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree

[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree