thr3ads.net - llvm dev - [llvm-dev] Intrinsics InstrReadMem memory properties [Jul 2019]

Son Tuan VU via llvm-dev

2019-Jul-24 22:51 UTC

[llvm-dev] Intrinsics InstrReadMem memory properties

Ok, now I think I've found a bug:

Consider this C code:
void bar(int b) {
  int a[10];
  memset(a, b, 10);
}

which generates this IR code:
define dso_local void @bar(i32 %b) #0 {
entry:
  %b.addr = alloca i32, align 4
  %a = alloca [10 x i32], align 16
  store i32 %b, i32* %b.addr, align 4
  %arraydecay = getelementptr inbounds [10 x i32], [10 x i32]* %a, i64 0, i64 0
  %0 = bitcast i32* %arraydecay to i8*
  %1 = load i32, i32* %b.addr, align 4
  %2 = trunc i32 %1 to i8
  call void @llvm.memset.p0i8.i64(i8* align 16 %0, i8 %2, i64 10, i1 false)
  ret void
}
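
For readers mapping the C call onto the IR above: memset() converts its fill value to unsigned char and counts bytes, which is why the IR truncates %b to i8 and passes a length of 10. That covers only the first 10 bytes of the array, not all ten ints. A small sketch that makes this visible; the helper name byte_after_memset is made up for illustration and is not part of the original post:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: repeat the memset() from bar() on a zeroed array and
 * return the byte found at `index`, so the effect can be inspected. */
unsigned char byte_after_memset(int b, size_t index) {
    int a[10] = {0};
    unsigned char bytes[sizeof a];

    memset(a, b, 10);              /* writes 10 bytes, each (unsigned char)b */
    memcpy(bytes, a, sizeof a);    /* view the array as raw bytes */
    return index < sizeof a ? bytes[index] : 0;
}
```

With b = 0x101, bytes 0 through 9 hold 0x01 and byte 10 is still zero.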

Now I have a pass that inserts an intrinsic with IntrReadMem into the IR:
define dso_local void @bar(i32 %b) #0 {
entry:
  %b.addr = alloca i32, align 4
  %a = alloca [10 x i32], align 16
  store i32 %b, i32* %b.addr, align 4
  %arraydecay = getelementptr inbounds [10 x i32], [10 x i32]* %a, i64 0, i64 0
  %0 = bitcast i32* %arraydecay to i8*
  %1 = load i32, i32* %b.addr, align 4
  %2 = trunc i32 %1 to i8
  call void @llvm.memset.p0i8.i64(i8* align 16 %0, i8 %2, i64 10, i1 false)
  tail call void @mem_read_test(i8* %0)
  ret void
}

; Function Attrs: nounwind readonly
declare void @mem_read_test(i8*) #2

However, the call to memset() still got optimized away by DSE. What am I
missing here? Or is this indeed a bug in DSE?

Son Tuan Vu


On Wed, Jul 24, 2019 at 6:47 PM Doerfert, Johannes <jdoerfert at anl.gov>
wrote:
> You are on the right track. Addresses could get exposed in various ways;
> a probably non-exhaustive list is:
>  - passed as arguments
>  - communicated through a global
>  - via I/O, or more generally, system calls. This includes all forms of
>    synchronization, e.g., inter-lane communication.
>  - transitively passed by any of the means above, e.g., the address of a
>    pointer to the object could be exposed.
>
> So if we take the example below and add:
>   bar(&A[50]);
> just before the call to unknown, we have to assume A is known to unknown
> now, at least if we do not have information about bar that would suggest
> otherwise.
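
A minimal C sketch of the escape rule described in the reply above, under stated assumptions: the global `exposed` and the bodies of bar() and unknown() are invented for illustration (the thread leaves both functions opaque), and observe() is a hypothetical driver. It shows that the callee can only read an object whose address reached it by one of the listed routes, here a global.

```c
#include <stdlib.h>

/* Hypothetical global through which an address can be communicated. */
static int *exposed = NULL;

/* Stand-in for bar(): it publishes its argument through the global. */
static void bar(int *p) { exposed = p; }

/* Stand-in for the opaque callee: it can only reach memory whose address
 * was exposed to it, here via the global. */
static int unknown(void) { return exposed ? *exposed : -1; }

/* Returns what unknown() observes. With expose != 0 the store to A[50] is
 * visible to unknown() and must be kept; with expose == 0 nothing in
 * unknown() can reach A. */
int observe(int expose) {
    int *A = malloc(100 * sizeof(A[0]));
    if (A == NULL)
        return -2;
    for (int i = 0; i < 100; i++)
        A[i] = i;

    exposed = NULL;
    if (expose)
        bar(&A[50]);

    int seen = unknown();
    exposed = NULL;   /* do not leave a dangling pointer behind */
    free(A);
    return seen;
}
```

Calling observe(1) yields the value stored in A[50], while observe(0) yields the sentinel -1 because no address was ever exposed.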
>
>
> On 07/24, Son Tuan VU wrote:
> > Hi Johannes,
> >
> > Thanks for your reply. I now see more clearly how things work with these
> > properties. However, what would be an object whose address is potentially
> > known by a callee? I suppose the intrinsic arguments and global variables?
> >
> > So IIUC, if not restricted by *only properties, an intrinsic could access:
> > - only its arguments if IntrArgMemOnly is specified,
> > - its arguments and global variables as well if Intr*Mem (other than
> > IntrNoMem) is specified.
> >
> > Please tell me if I'm correct or not!
> >
> > Thanks again,
> >
> >
> >
> > On Wed, Jul 24, 2019, 17:27 Doerfert, Johannes <jdoerfert at anl.gov> wrote:
> >
> > > Hi Son Tuan Vu,
> > >
> > > if not restricted by *writeonly*, *readonly*, or *readnone* (basically), a
> > > call can access any object for which the callee could potentially know
> > > the address. That means, if the address of an object cannot be known to
> > > the callee, it cannot access that object. An example is given below.
> > > Thus, a dead store can be eliminated if the memory cannot be read by any
> > > subsequent operation. If you think there is a bug, could you provide a
> > > reproducer?
> > >
> > > Example:
> > >
> > > #include <stdlib.h>
> > >
> > > void unknown();
> > > void foo() {
> > >   int *A = malloc(100 * sizeof(A[0]));
> > >   int B[100];
> > >   for (int i = 0; i < 100; i++)
> > >     A[i] = B[i] = i;
> > >
> > >   // The addresses/objects A and B are not known to the unknown
> > >   // function, and the stores above can be removed.
> > >   unknown();
> > >
> > >   free(A);
> > > }
> > >
> > > I hope this helps,
> > >   Johannes
> > >
> > >
> > > ________________________________________
> > > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Son Tuan VU
> > > via llvm-dev <llvm-dev at lists.llvm.org>
> > > Sent: Wednesday, July 24, 2019 08:20
> > > To: llvm-devmemory
> > > Subject: [llvm-dev] Intrinsics InstrReadMem memory properties
> > >
> > > Hello,
> > >
> > > According to include/llvm/IR/Intrinsics.td, the IntrReadMem property
> > > indicates that the intrinsic only reads from and does not write to memory.
> > >
> > > Does this mean that it can read anywhere in memory? Because we already
> > > have 'IntrArgMemOnly' for intrinsics which only access memory that their
> > > argument(s) point(s) to.
> > >
> > > If 'IntrReadMem' really means read from anywhere in memory, this
> > > should imply that, if there's an intrinsic having this property *after* a
> > > dead store, the latter should not be eliminated by optimizations?
> > >
> > > This is not the current behavior of LLVM though, so it seems that my
> > > guesses are wrong... But at least, can someone show me the mistake here?
> > >
> > > Thanks for your time,
> > >
> > > Son Tuan Vu
> > >
>
> --
>
> Johannes Doerfert
> Researcher
>
> Argonne National Laboratory
> Lemont, IL 60439, USA
>
> jdoerfert at anl.gov

Doerfert, Johannes via llvm-dev

2019-Jul-24 22:57 UTC

[llvm-dev] Intrinsics InstrReadMem memory properties

Does the behavior change if you remove the tail from the call to your intrinsic?

I can later look in more detail.


Son Tuan VU via llvm-dev

2019-Jul-25 08:53 UTC

[llvm-dev] Intrinsics InstrReadMem memory properties

So I removed the 'tail' from the call and tried out different properties:
- IntrNoMem: memset() and the intrinsic are both optimized away, as expected.
- IntrWriteMem: memset() is optimized away by DSE but the intrinsic isn't. I
would expect both to be removed, since the intrinsic is now also a dead
store.
- IntrReadMem: memset() and the intrinsic are both optimized away
*unexpectedly* (CSE removes the intrinsic, then InstCombine removes
memset). The latter is understandable, but why does the intrinsic get
optimized away in the first place?
- IntrArgMemOnly: neither gets optimized away, as expected.
- ReadOnly<0>: neither gets optimized away, as expected.
- ReadNone<0> / WriteOnly<0>: neither gets optimized away, *unexpectedly*.

Am I missing something, or are there indeed bugs here? Btw, can you
tell me how and why 'tail' changes the optimizer behavior?

Thanks a lot for your explanation!

Son Tuan Vu
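
One possible reading of the IntrReadMem result, offered as an assumption and not as an answer given in this thread: a call that only reads memory, cannot unwind, and has no used result has no observable effect, so it can be deleted. Once it is gone nothing reads the buffer, and the memset() becomes a dead store. The C sketch below illustrates the same situation with the GCC/Clang `pure` function attribute, which promises that a function only reads memory; the function names are made up for illustration.

```c
#include <string.h>

/* Hypothetical read-only helper: `pure` promises it only reads memory. */
__attribute__((pure)) static int sum10(const unsigned char *p) {
    int s = 0;
    for (int i = 0; i < 10; i++)
        s += p[i];
    return s;
}

/* Result unused: the compiler may drop the call, and then the memset(). */
void fill_unused(int b) {
    unsigned char a[10];
    memset(a, b, sizeof a);
    (void)sum10(a);
}

/* Result used: the read is observable, so the memset() must stay. */
int fill_used(int b) {
    unsigned char a[10];
    memset(a, b, sizeof a);
    return sum10(a);
}
```

Comparing the optimized output of fill_unused() and fill_used() (for example with -O2 -S) is one way to check whether a given compiler version behaves this way; the mem_read_test intrinsic in the reproducer returns void, so it corresponds to the unused case.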


