JF Bastien via llvm-dev
2019-Jun-11 16:07 UTC
[llvm-dev] @llvm.memcpy not honoring volatile?
> On Jun 11, 2019, at 6:27 AM, Guillaume Chatelet <gchatelet at google.com> wrote:
>
> I spent some time reading the C standard <https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf>:
>
> 5.1.2.3 Program execution
> 2. Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment...
>
> 6.7.3 Type qualifiers
> 8. An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously. What constitutes an access to an object that has volatile-qualified type is implementation-defined.
>
> My intuition is that it's unreasonable to do the overlap because it may create side effects in the middle of the read. At the same time, "accessing a volatile object" is implementation-defined...
> As such, "volatile" does not provide a lot of guarantees. I guess you have to resort to assembly if you want to really control what's going on.
>
> gcc, icc, and msvc do not implement the overlap trick. They either read the memory region sequentially or use rep movs.
> Let's note that rep movs does not provide any guarantees on the memory access patterns either.
>
> Besides, I'm not even sure that volatile in the context of @llvm.memcpy intrinsics really relates to the C volatile semantics.
>
> Now, I see several ways to move forward:
> 1. Do nothing and call it implementation-defined.
> 2. Fix the code that generates the loads/stores for the isVolatile case so no overlap can occur.
> 3. Remove the "volatile" argument from @llvm.memcpy, @llvm.memmove and @llvm.memset and generate the code in the front end using volatile loads/stores.
>
> 3 is probably controversial and involves a lot of changes; it would move some complexity from the backend to the frontends.
>
> I'm interested in thoughts from other developers here.

Volatile isn’t specified in any decent normative way. There’s substantial documentation of intent, but the C and C++ standards are lacking. We should therefore implement something that’s sensible and works kind-of as expected.

I’ve documented specification and intent here: wg21.link/P1152R0
If you want the follow-on (without all that documentation): wg21.link/P1152R1

I think we want option 2: keep volatile memcpy, and implement it as touching each byte exactly once. That’s unlikely to be particularly useful for every direct-to-hardware use, but it behaves intuitively enough that I think it’s desirable.

> On Mon, Jun 10, 2019 at 10:51 PM John Regehr via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I agree, this is a bug.
>
> John
>
> On 6/7/19 11:48 AM, JF Bastien via llvm-dev wrote:
> >
> >> On Jun 5, 2019, at 2:28 PM, Tim Northover via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>
> >> On Wed, 5 Jun 2019 at 13:49, Eli Friedman via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>> I don’t see any particular reason to guarantee that a volatile memcpy will access each byte exactly once. How is that useful?
> >>
> >> I agree it's probably not that useful, but I think the non-duplicating
> >> property of volatile is ingrained strongly enough that viewing a
> >> memcpy as a single load and store to each unit (in an unspecified
> >> order) should be legitimate; so I think this actually is a bug.
> >>
> >> As the documentation says though, it's unwise to depend on the
> >> behaviour of a volatile memcpy.
> >
> > I agree with Tim, this seems like a bug. My expectation is that volatile touches each memory location exactly once, unless absolutely impossible (e.g. bitfields on most ISAs).
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
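For readers unfamiliar with the "overlap trick" under discussion, here is a minimal C sketch of the idea (an illustration of the lowering strategy, not LLVM's actual code): a copy of 4 to 8 bytes can be done with just two 4-byte accesses whose ranges overlap. For plain memory this is harmless and fast, but the bytes in the overlap region are read and written twice — exactly what a volatile access is expected not to do:

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Sketch of the overlapping lowering: copy n bytes (4 <= n <= 8) using
 * two 4-byte accesses. For n == 7, "head" covers bytes [0,4) and "tail"
 * covers bytes [3,7), so byte 3 is read twice and written twice.
 * The result is still a correct copy because the overlapping bytes are
 * rewritten with the same values -- but the access pattern is wrong for
 * volatile/memory-mapped regions, where each touch can have a side effect. */
static void copy_with_overlap(unsigned char *dst, const unsigned char *src,
                              size_t n)
{
    uint32_t head, tail;
    memcpy(&head, src, 4);         /* bytes [0, 4)            */
    memcpy(&tail, src + n - 4, 4); /* bytes [n-4, n): overlaps when n < 8 */
    memcpy(dst, &head, 4);
    memcpy(dst + n - 4, &tail, 4);
}
```

The function name and structure are hypothetical; the point is only that the overlapping ranges double-touch the middle bytes, which is the behavior the bug report is about.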
Guillaume Chatelet via llvm-dev
2019-Jun-12 13:37 UTC
[llvm-dev] @llvm.memcpy not honoring volatile?
Thanks for the reply. I've created https://bugs.llvm.org/show_bug.cgi?id=42254 and will work on a fix.

On Tue, Jun 11, 2019 at 6:08 PM JF Bastien <jfbastien at apple.com> wrote:
>
> Volatile isn’t specified in any decent normative way. There’s substantial
> documentation of intent, but the C and C++ standards are lacking. We should
> therefore implement something that’s sensible and works kind-of as expected.
>
> I’ve documented specification and intent here: wg21.link/P1152R0
> If you want the follow-on (without all that documentation): wg21.link/P1152R1
>
> I think we want option 2: keep volatile memcpy, and implement it as
> touching each byte exactly once. That’s unlikely to be particularly useful
> for every direct-to-hardware use, but it behaves intuitively enough that I
> think it’s desirable.
>
> [...]
James Y Knight via llvm-dev
2019-Jun-13 04:38 UTC
[llvm-dev] @llvm.memcpy not honoring volatile?
On Tue, Jun 11, 2019 at 12:08 PM JF Bastien via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I think we want option 2: keep volatile memcpy, and implement it as
> touching each byte exactly once. That’s unlikely to be particularly useful
> for every direct-to-hardware use, but it behaves intuitively enough that I
> think it’s desirable.

As Eli pointed out, that precludes lowering a volatile memcpy into a call to the memcpy library function. The usual "memcpy" library function may well use the same overlapping-memory trick, and there is no "volatile_memcpy" libc function which would provide a guarantee of not touching bytes multiple times. Perhaps it's okay to just always emit an inline loop instead of falling back to a memcpy call.

But possibly option 3 would be better. Maybe it's better to force people/compiler-frontends to emit the raw load/store operations, so that it's more clear exactly what semantics are desired.

The fundamental issue to me is that for reasonable usages of volatile, the operand size and number of memory instructions generated for a given operation actually *matter*. Certainly, this is a somewhat unfortunate situation, since the C standard explicitly doesn't forbid implementing any volatile access with smaller memory operations. (Which, among other issues, allows tearing, as your wg21 doc nicely points out.) Nevertheless, it _is_ an important property -- required by POSIX for accesses of a volatile sig_atomic_t, even -- and is a property which LLVM/Clang does provide when dealing with volatile accesses of target-specific appropriate sizes and alignments.

But what does that mean for volatile memcpy? What size should it use? Always a byte-by-byte copy? May it do larger-sized reads/writes as well? *Must* it do so? Does it have to read/write the data in order? Or can it do so in reverse order? Can it use the CPU's block-copy instructions (e.g. rep movsb on x86), which may sometimes cause effectively-arbitrarily-sized memory ops, in arbitrary order, in hardware?

If we're going to keep volatile memcpy support, possibly those other questions ought to be answered too? I dunno...
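The "inline loop" alternative mentioned above can be sketched in C as follows (a hypothetical illustration of one possible lowering, not what LLVM emits): one volatile load and one volatile store per byte, in ascending address order, so each location is touched exactly once and no library call is involved:

```c
#include <stddef.h>

/* Hypothetical byte-by-byte lowering for a volatile memcpy.
 * The volatile qualifiers on the pointees force the compiler to emit
 * exactly one load and one store per byte, in program order; it may not
 * widen, reorder, duplicate, or elide the accesses. */
static void volatile_memcpy_bytewise(volatile unsigned char *dst,
                                     const volatile unsigned char *src,
                                     size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i]; /* volatile load, then volatile store */
}
```

This is the most conservative answer to the "what size?" question — always byte-sized accesses — which is safe but, as the questions above suggest, not obviously the only acceptable behavior.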
JF Bastien via llvm-dev
2019-Jun-13 04:44 UTC
[llvm-dev] @llvm.memcpy not honoring volatile?
> On Jun 12, 2019, at 9:38 PM, James Y Knight <jyknight at google.com> wrote:
>
>> On Tue, Jun 11, 2019 at 12:08 PM JF Bastien via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> I think we want option 2: keep volatile memcpy, and implement it as touching each byte exactly once. [...]
>
> As Eli pointed out, that precludes lowering a volatile memcpy into a call
> to the memcpy library function. The usual "memcpy" library function may well
> use the same overlapping-memory trick, and there is no "volatile_memcpy"
> libc function which would provide a guarantee of not touching bytes multiple
> times. Perhaps it's okay to just always emit an inline loop instead of
> falling back to a memcpy call.

In which circumstances does this matter?

> But, possibly option 3 would be better. Maybe it's better to force
> people/compiler-frontends to emit the raw load/store operations, so that
> it's more clear exactly what semantics are desired.
>
> The fundamental issue to me is that for reasonable usages of volatile,
> the operand size and number of memory instructions generated for a given
> operation actually matter. [...]
>
> But, what does that mean for volatile memcpy? What size should it use?

Any size that makes sense to HW.

> Always a byte-by-byte copy?

It can.

> May it do larger-sized reads/writes as well?

Any size, but no larger than memcpy’s specified size parameter.

> *Must* it do so?

No, but it has to be sensible (whatever that means).

> Does it have to read/write the data in order?

No.

> Or can it do so in reverse order?

Yes.

> Can it use the CPU's block-copy instructions (e.g. rep movsb on x86), which may sometimes cause effectively-arbitrarily-sized memory ops, in arbitrary order, in hardware?

Sure.

> If we're going to keep volatile memcpy support, possibly those other questions ought to be answered too?

Paul McKenney has a follow-on paper (linked from R2 of mine) which addresses some of your questions, I think. LLVM can do what it wants for now since there’s no standard, but there’s likely to be one eventually and we probably should match what it’s likely to be.

> I dunno...
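A lowering consistent with the answers above — accesses of any convenient size, but none wider than the requested copy size, and each byte touched exactly once — might look like this sketch (hypothetical names; it assumes 4-byte-aligned pointers and uses a type-punning cast that a real in-compiler lowering would not need):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wider lowering: 4-byte volatile accesses while at least
 * 4 bytes remain, then byte-sized volatile accesses for the tail.
 * No access is wider than the remaining size, and every byte is still
 * touched exactly once, in ascending address order.
 * ASSUMPTION: dst and src are 4-byte aligned; the cast through
 * volatile uint32_t* is an illustration-only shortcut. */
static void volatile_memcpy_wide(volatile unsigned char *dst,
                                 const volatile unsigned char *src,
                                 size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) /* 4-byte chunks */
        *(volatile uint32_t *)(dst + i) =
            *(const volatile uint32_t *)(src + i);
    for (; i < n; ++i)         /* byte-sized tail */
        dst[i] = src[i];
}
```

Whether such widening is permitted, required, or forbidden is precisely the open question in this thread; the sketch merely shows that "each byte exactly once" and "wider-than-byte accesses" are compatible.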