thr3ads.net - llvm dev - [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete [Jul 2018]

If this information is useful, please help other people find it:
Share via:

Hal Finkel via llvm-dev

2018-Jul-11 23:12 UTC

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

[+Richard]


On 07/11/2018 08:29 AM, Sanjoy Das wrote:> I'm not sure if nosynch is sufficient.  What if we had:
>
> void f(int& x) {
>   if (false) {
>     int r0 = x;
>   }
> }
>
> // other thread
> free(<pointer to x>);
>
> The source program is race free, but LLVM may speculate the read from
> x (seeing that it is dereferenceable) creating a race.
Interestingly, I'm not sure. I trust that Richard can answer this
question. :-)

So, if we had:

int y = ...;
...
f(y);

then I think that Clang's use of dereferenceable is almost certainly
okay (because the standard explicitly says, 9.2.3.2p5, "A reference
shall be initialized to refer to a valid object or
function."). Because the reference must have been valid when f(y) began
executing, unless it synchronizes somehow with the other thread, any
asynchronous deletion of y must be a race.

On the other hand, if we have:

int &y = ...;
...
f(y);

do we know that, when f(y) begins executing, the reference points to a
valid object? My reading of 9.3.3p2, which says, "Argument passing
(7.6.1.2) and
function value return (8.6.3) are initializations.", combined with the
statement above, implies that, perhaps surprisingly, the same holds
here. When the argument to f is initialized, it must refer to a valid
object (even if the initializer is another reference).

Richard, what do you think?

Thanks again,
Hal

P.S. If I'm right, then I might be happy, but it's also somewhat scary
(although we've been doing this optimization for multiple releases and I
don't think we have a bug along these lines), and I'd at least smell the
need for a sanitizer.
>
> -- Sanjoy
> On Tue, Jul 10, 2018 at 7:01 PM Hal Finkel via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>> Hi, everyone,
>>
>> I'd like to propose adding a nofree function attribute to indicate
that
>> a function does not, directly or indirectly, call a memory-deallocation
>> function (e.g., free, C++'s operator delete). Clang/LLVM can
currently
>> misoptimize functions that:
>>
>>  1. Have a reference argument.
>>
>>  2. Free the memory backing the object to which the reference is bound
>> during the function's execution.
>>
>> Because we tag, in Clang, all reference arguments using the
>> dereferenceable attribute, LLVM assumes that the pointer is
>> unconditionally dereferenceable throughout the course of the entire
>> function. This isn't true, however, if the memory is freed during
the
>> execution of the function. For more information, please see the
>> discussion in https://reviews.llvm.org/D48239.
>>
>> To solve this problem, we need to give LLVM more information in order
to
>> help it determine when a pointer, which is dereferenceable when the
>> functions begins to execute, will still be dereferenceable later on in
>> the function's execution. This nofree attribute can be part of that
>> solution. If we know that free (and friends) are not called by the
>> function (nor by any function called by the function, and so on), then
>> we know that pointers that started out dereferenceable will stay that
>> way (except as explained below).
>>
>> I'm initially proposing this to be only a function attribute,
although
>> one could easily imagine a parameter attribute as well (that indicates
>> that a particular pointer argument is not freed by the function). This
>> might be useful, but for the use case of helping dereferenceable, it
>> would be subtle to use, unless the parameter was also marked as
noalias,
>> because you'd need to know that the parameter was not also aliased
with
>> another argument (or had not been captured). Another analysis would
need
>> to provide this kind of information.
>>
>> Also, just because a function does not, directly or indirectly, call
>> free does not mean that it cannot cause memory to be deallocated. The
>> function might communicate (synchronize) with another thread causing
>> that other thread to delete the memory. For this reason, to use
>> dereferenceable as we currently do, we also need to know that the
>> function does not synchronize with any other threads. To solve this
>> problem, like nofree, I propose to add a nosynch attribute (to indicate
>> that a function does not use (non-relaxed) atomics or otherwise
>> synchronize with any other threads (e.g., perform I/O or, as a
practical
>> matter, use volatile accesses).
>>
>> I've posted a patch for the nofree attribute
>> (https://reviews.llvm.org/D49165). nosynch's implementation would
be
>> very similar (except instead of looking for calls to free, it would
look
>> for uses of non-relaxed atomics, volatile ops, and known functions that
>> are not I/O functions).
>>
>> With both of these attributes (nofree and nosynch), a function argument
>> with the dereferenceable attribute will be known to be dereferenceable
>> throughout the execution of the attributed function. We can update
>> isDereferenceableAndAlignedPointer to include these additional checks
on
>> the current function.
>>
>> One more choice we have: We can, as I proposed above, essentially
weaken
>> the current semantics of dereferenceable to not exclude
>> mid-function-execution deallocation. We can also add a second attribute
>> with the current, stronger, semantics. We can keep the current
attribute
>> as-is, and add a second attribute with the weaker semantics (and switch
>> Clang to use that).
>>
>> Please let me know what you think.
>>
>> Thanks again,
>>
>> Hal
>>
>> --
>> Hal Finkel
>> Lead, Compiler Technology and Programming Languages
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Sanjoy Das via llvm-dev

2018-Jul-11 23:41 UTC

head link

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

On Wed, Jul 11, 2018 at 4:13 PM Hal Finkel <hfinkel at anl.gov>
wrote:> Interestingly, I'm not sure. I trust that Richard can answer this
> question. :-)
>
> So, if we had:
>
> int y = ...;
> ...
> f(y);
>
> then I think that Clang's use of dereferenceable is almost certainly
> okay (because the standard explicitly says, 9.2.3.2p5, "A reference
> shall be initialized to refer to a valid object or
> function."). Because the reference must have been valid when f(y)
began
> executing, unless it synchronizes somehow with the other thread, any
> asynchronous deletion of y must be a race.
>
> On the other hand, if we have:
>
> int &y = ...;
> ...
> f(y);
>
> do we know that, when f(y) begins executing, the reference points to a
> valid object? My reading of 9.3.3p2, which says, "Argument passing
> (7.6.1.2) and
> function value return (8.6.3) are initializations.", combined with the
> statement above, implies that, perhaps surprisingly, the same holds
> here. When the argument to f is initialized, it must refer to a valid
> object (even if the initializer is another reference).
Ok, I didn't know this.  If this is true then nosynch + nofree seems
sufficient to me.  And I realized my example is needlessly complex; if
arg passing isn't initialization then this is a problem too:

int&y = *ptr;
free(ptr);
f(y)

-- Sanjoy

Richard Smith via llvm-dev

2018-Jul-11 23:43 UTC

head link

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

On Wed, 11 Jul 2018 at 16:13, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> [+Richard]
>
>
> On 07/11/2018 08:29 AM, Sanjoy Das wrote:
> > I'm not sure if nosynch is sufficient.  What if we had:
> >
> > void f(int& x) {
> >   if (false) {
> >     int r0 = x;
> >   }
> > }
> >
> > // other thread
> > free(<pointer to x>);
> >
> > The source program is race free, but LLVM may speculate the read from
> > x (seeing that it is dereferenceable) creating a race.
>
> Interestingly, I'm not sure. I trust that Richard can answer this
> question. :-)
>
> So, if we had:
>
> int y = ...;
> ...
> f(y);
>
> then I think that Clang's use of dereferenceable is almost certainly
> okay (because the standard explicitly says, 9.2.3.2p5, "A reference
> shall be initialized to refer to a valid object or
> function."). Because the reference must have been valid when f(y)
began
> executing, unless it synchronizes somehow with the other thread, any
> asynchronous deletion of y must be a race.
>
> On the other hand, if we have:
>
> int &y = ...;
> ...
> f(y);
>
> do we know that, when f(y) begins executing, the reference points to a
> valid object? My reading of 9.3.3p2, which says, "Argument passing
> (7.6.1.2) and
> function value return (8.6.3) are initializations.", combined with the
> statement above, implies that, perhaps surprisingly, the same holds
> here. When the argument to f is initialized, it must refer to a valid
> object (even if the initializer is another reference).
>
> Richard, what do you think?
>
First, see also core issue 453 <http://wg21.link/cwg453>, under the guise
of which we're fixing the wording in [dcl.ref](9.2.3.2)p5 from

  "A reference shall be initialized to refer to a valid object or
function."

to something like

  "If an lvalue to which a reference is directly bound designates neither
an existing object or function of an appropriate type (11.6.3
[dcl.init.ref]), nor a region of storage of suitable size and alignment to
contain an object of the reference's type (4.5 [intro.object], 6.8
[basic.life], 6.9 [basic.types]), the behavior is undefined."

My take is that, if the end of the duration of the region of storage is
unsequenced with respect to the binding of the reference, then behavior is
undefined. Generally when we refer to a thing happening while some
condition is true, we mean that the execution point when the condition
became true is sequenced before the thing happening, and the execution
point where it becomes not true again is sequenced after.

So the behavior of that program is undefined regardless of whether 'f'
actually loads through 'x'.

> Thanks again,
> Hal
>
> P.S. If I'm right, then I might be happy, but it's also somewhat
scary
> (although we've been doing this optimization for multiple releases and
I
> don't think we have a bug along these lines), and I'd at least
smell the
> need for a sanitizer.
>
> >
> > -- Sanjoy
> > On Tue, Jul 10, 2018 at 7:01 PM Hal Finkel via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >> Hi, everyone,
> >>
> >> I'd like to propose adding a nofree function attribute to
indicate that
> >> a function does not, directly or indirectly, call a
memory-deallocation
> >> function (e.g., free, C++'s operator delete). Clang/LLVM can
currently
> >> misoptimize functions that:
> >>
> >>  1. Have a reference argument.
> >>
> >>  2. Free the memory backing the object to which the reference is
bound
> >> during the function's execution.
> >>
> >> Because we tag, in Clang, all reference arguments using the
> >> dereferenceable attribute, LLVM assumes that the pointer is
> >> unconditionally dereferenceable throughout the course of the
entire
> >> function. This isn't true, however, if the memory is freed
during the
> >> execution of the function. For more information, please see the
> >> discussion in https://reviews.llvm.org/D48239.
> >>
> >> To solve this problem, we need to give LLVM more information in
order to
> >> help it determine when a pointer, which is dereferenceable when
the
> >> functions begins to execute, will still be dereferenceable later
on in
> >> the function's execution. This nofree attribute can be part of
that
> >> solution. If we know that free (and friends) are not called by the
> >> function (nor by any function called by the function, and so on),
then
> >> we know that pointers that started out dereferenceable will stay
that
> >> way (except as explained below).
> >>
> >> I'm initially proposing this to be only a function attribute,
although
> >> one could easily imagine a parameter attribute as well (that
indicates
> >> that a particular pointer argument is not freed by the function).
This
> >> might be useful, but for the use case of helping dereferenceable,
it
> >> would be subtle to use, unless the parameter was also marked as
noalias,
> >> because you'd need to know that the parameter was not also
aliased with
> >> another argument (or had not been captured). Another analysis
would need
> >> to provide this kind of information.
> >>
> >> Also, just because a function does not, directly or indirectly,
call
> >> free does not mean that it cannot cause memory to be deallocated.
The
> >> function might communicate (synchronize) with another thread
causing
> >> that other thread to delete the memory. For this reason, to use
> >> dereferenceable as we currently do, we also need to know that the
> >> function does not synchronize with any other threads. To solve
this
> >> problem, like nofree, I propose to add a nosynch attribute (to
indicate
> >> that a function does not use (non-relaxed) atomics or otherwise
> >> synchronize with any other threads (e.g., perform I/O or, as a
practical
> >> matter, use volatile accesses).
> >>
> >> I've posted a patch for the nofree attribute
> >> (https://reviews.llvm.org/D49165). nosynch's implementation
would be
> >> very similar (except instead of looking for calls to free, it
would look
> >> for uses of non-relaxed atomics, volatile ops, and known functions
that
> >> are not I/O functions).
> >>
> >> With both of these attributes (nofree and nosynch), a function
argument
> >> with the dereferenceable attribute will be known to be
dereferenceable
> >> throughout the execution of the attributed function. We can update
> >> isDereferenceableAndAlignedPointer to include these additional
checks on
> >> the current function.
> >>
> >> One more choice we have: We can, as I proposed above, essentially
weaken
> >> the current semantics of dereferenceable to not exclude
> >> mid-function-execution deallocation. We can also add a second
attribute
> >> with the current, stronger, semantics. We can keep the current
attribute
> >> as-is, and add a second attribute with the weaker semantics (and
switch
> >> Clang to use that).
> >>
> >> Please let me know what you think.
> >>
> >> Thanks again,
> >>
> >> Hal
> >>
> >> --
> >> Hal Finkel
> >> Lead, Compiler Technology and Programming Languages
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180711/42c7a678/attachment.html>

Hal Finkel via llvm-dev

2018-Jul-12 00:20 UTC

head link

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Thanks, Richard.

Based on the feedback from this thread, I'll move forward with the
patches for nofree, nosync, adding a new corresponding dereferenceable
attribute (my suggestion is to name this dereferenceable_on_entry;
suggestions welcome), and updating Clang is emit this new attribute
instead of the current one.

 -Hal

On 07/11/2018 06:43 PM, Richard Smith wrote:> On Wed, 11 Jul 2018 at 16:13, Hal Finkel via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>
>     [+Richard]
>
>
>     On 07/11/2018 08:29 AM, Sanjoy Das wrote:
>     > I'm not sure if nosynch is sufficient.  What if we had:
>     >
>     > void f(int& x) {
>     >   if (false) {
>     >     int r0 = x;
>     >   }
>     > }
>     >
>     > // other thread
>     > free(<pointer to x>);
>     >
>     > The source program is race free, but LLVM may speculate the read
>     from
>     > x (seeing that it is dereferenceable) creating a race.
>
>     Interestingly, I'm not sure. I trust that Richard can answer this
>     question. :-)
>
>     So, if we had:
>
>     int y = ...;
>     ...
>     f(y);
>
>     then I think that Clang's use of dereferenceable is almost
certainly
>     okay (because the standard explicitly says, 9.2.3.2p5, "A
reference
>     shall be initialized to refer to a valid object or
>     function."). Because the reference must have been valid when f(y)
>     began
>     executing, unless it synchronizes somehow with the other thread, any
>     asynchronous deletion of y must be a race.
>
>     On the other hand, if we have:
>
>     int &y = ...;
>     ...
>     f(y);
>
>     do we know that, when f(y) begins executing, the reference points to a
>     valid object? My reading of 9.3.3p2, which says, "Argument passing
>     (7.6.1.2) and
>     function value return (8.6.3) are initializations.", combined with
the
>     statement above, implies that, perhaps surprisingly, the same holds
>     here. When the argument to f is initialized, it must refer to a valid
>     object (even if the initializer is another reference).
>
>     Richard, what do you think?
>
>
> First, see also core issue 453 <http://wg21.link/cwg453>, under the
> guise of which we're fixing the wording in [dcl.ref](9.2.3.2)p5 from
>
>   "A reference shall be initialized to refer to a valid object or
> function."
>
> to something like
>
>   "If an lvalue to which a reference is directly bound designates
> neither an existing object or function of an appropriate type (11.6.3
> [dcl.init.ref]), nor a region of storage of suitable size and
> alignment to contain an object of the reference's type (4.5
> [intro.object], 6.8 [basic.life], 6.9 [basic.types]), the behavior is
> undefined."
>
> My take is that, if the end of the duration of the region of storage
> is unsequenced with respect to the binding of the reference, then
> behavior is undefined. Generally when we refer to a thing happening
> while some condition is true, we mean that the execution point when
> the condition became true is sequenced before the thing happening, and
> the execution point where it becomes not true again is sequenced after.
>
> So the behavior of that program is undefined regardless of whether
'f'
> actually loads through 'x'.
>  
>
>     Thanks again,
>     Hal
>
>     P.S. If I'm right, then I might be happy, but it's also
somewhat scary
>     (although we've been doing this optimization for multiple releases
>     and I
>     don't think we have a bug along these lines), and I'd at least
>     smell the
>     need for a sanitizer.
>
>     >
>     > -- Sanjoy
>     > On Tue, Jul 10, 2018 at 7:01 PM Hal Finkel via llvm-dev
>     > <llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>> wrote:
>     >> Hi, everyone,
>     >>
>     >> I'd like to propose adding a nofree function attribute to
>     indicate that
>     >> a function does not, directly or indirectly, call a
>     memory-deallocation
>     >> function (e.g., free, C++'s operator delete). Clang/LLVM
can
>     currently
>     >> misoptimize functions that:
>     >>
>     >>  1. Have a reference argument.
>     >>
>     >>  2. Free the memory backing the object to which the reference
>     is bound
>     >> during the function's execution.
>     >>
>     >> Because we tag, in Clang, all reference arguments using the
>     >> dereferenceable attribute, LLVM assumes that the pointer is
>     >> unconditionally dereferenceable throughout the course of the
entire
>     >> function. This isn't true, however, if the memory is freed
>     during the
>     >> execution of the function. For more information, please see
the
>     >> discussion in https://reviews.llvm.org/D48239.
>     >>
>     >> To solve this problem, we need to give LLVM more information
in
>     order to
>     >> help it determine when a pointer, which is dereferenceable
when the
>     >> functions begins to execute, will still be dereferenceable
>     later on in
>     >> the function's execution. This nofree attribute can be
part of that
>     >> solution. If we know that free (and friends) are not called by
the
>     >> function (nor by any function called by the function, and so
>     on), then
>     >> we know that pointers that started out dereferenceable will
>     stay that
>     >> way (except as explained below).
>     >>
>     >> I'm initially proposing this to be only a function
attribute,
>     although
>     >> one could easily imagine a parameter attribute as well (that
>     indicates
>     >> that a particular pointer argument is not freed by the
>     function). This
>     >> might be useful, but for the use case of helping
>     dereferenceable, it
>     >> would be subtle to use, unless the parameter was also marked
as
>     noalias,
>     >> because you'd need to know that the parameter was not also
>     aliased with
>     >> another argument (or had not been captured). Another analysis
>     would need
>     >> to provide this kind of information.
>     >>
>     >> Also, just because a function does not, directly or
indirectly,
>     call
>     >> free does not mean that it cannot cause memory to be
>     deallocated. The
>     >> function might communicate (synchronize) with another thread
>     causing
>     >> that other thread to delete the memory. For this reason, to
use
>     >> dereferenceable as we currently do, we also need to know that
the
>     >> function does not synchronize with any other threads. To solve
this
>     >> problem, like nofree, I propose to add a nosynch attribute (to
>     indicate
>     >> that a function does not use (non-relaxed) atomics or
otherwise
>     >> synchronize with any other threads (e.g., perform I/O or, as a
>     practical
>     >> matter, use volatile accesses).
>     >>
>     >> I've posted a patch for the nofree attribute
>     >> (https://reviews.llvm.org/D49165). nosynch's
implementation
>     would be
>     >> very similar (except instead of looking for calls to free, it
>     would look
>     >> for uses of non-relaxed atomics, volatile ops, and known
>     functions that
>     >> are not I/O functions).
>     >>
>     >> With both of these attributes (nofree and nosynch), a function
>     argument
>     >> with the dereferenceable attribute will be known to be
>     dereferenceable
>     >> throughout the execution of the attributed function. We can
update
>     >> isDereferenceableAndAlignedPointer to include these additional
>     checks on
>     >> the current function.
>     >>
>     >> One more choice we have: We can, as I proposed above,
>     essentially weaken
>     >> the current semantics of dereferenceable to not exclude
>     >> mid-function-execution deallocation. We can also add a second
>     attribute
>     >> with the current, stronger, semantics. We can keep the current
>     attribute
>     >> as-is, and add a second attribute with the weaker semantics
>     (and switch
>     >> Clang to use that).
>     >>
>     >> Please let me know what you think.
>     >>
>     >> Thanks again,
>     >>
>     >> Hal
>     >>
>     >> --
>     >> Hal Finkel
>     >> Lead, Compiler Technology and Programming Languages
>     >> Leadership Computing Facility
>     >> Argonne National Laboratory
>     >>
>     >> _______________________________________________
>     >> LLVM Developers mailing list
>     >> llvm-dev at lists.llvm.org <mailto:llvm-dev at
lists.llvm.org>
>     >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>     -- 
>     Hal Finkel
>     Lead, Compiler Technology and Programming Languages
>     Leadership Computing Facility
>     Argonne National Laboratory
>
>     _______________________________________________
>     LLVM Developers mailing list
>     llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20180711/b4ac5cc2/attachment.html>

Apparently Analagous Threads

Search for more reasonably related threads

llvm dev - Jul 2018 - [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Apparently Analagous Threads