Son Tuan VU via llvm-dev
2019-Jul-25 08:53 UTC
[llvm-dev] Intrinsics InstrReadMem memory properties
So I removed the 'tail' from the call and tried out different properties:

- IntrNoMem: memset() and the intrinsic are both optimized away, as expected.
- IntrWriteMem: memset() is optimized away by DSE but the intrinsic isn't. I would expect both to be removed, since the intrinsic is now also a dead store.
- IntrReadMem: memset() and the intrinsic are both optimized away *unexpectedly* (CSE removes the intrinsic, then InstCombine removes memset). The latter is understandable, but why does the intrinsic get optimized away in the first place?
- IntrArgMemOnly: neither gets optimized away, as expected.
- ReadOnly<0>: neither gets optimized away, as expected.
- ReadNone<0> / WriteOnly<0>: neither gets optimized away, *unexpectedly*.

Am I missing something here, or are there indeed bugs here?

Btw, can you tell me how and why 'tail' changes the optimizer behavior?

Thanks a lot for your explanation!

Son Tuan Vu

On Thu, Jul 25, 2019 at 12:57 AM Doerfert, Johannes <jdoerfert at anl.gov> wrote:

> Does the behavior change if you remove the tail from the call to your
> intrinsic?
>
> I can later look in more detail.
>
> ------------------------------
> *From:* Son Tuan VU <sontuan.vu119 at gmail.com>
> *Sent:* Wednesday, July 24, 2019 6:51:10 PM
> *To:* Doerfert, Johannes
> *Cc:* llvm-dev
> *Subject:* Re: [llvm-dev] Intrinsics InstrReadMem memory properties
>
> Ok, now I think I've found a bug:
>
> Consider this C code:
>
> #include <string.h>
>
> void bar(int b) {
>   int a[10];
>   memset(a, b, 10);
> }
>
> which generates this IR code:
>
> define dso_local void @bar(i32 %b) #0 {
> entry:
>   %b.addr = alloca i32, align 4
>   %a = alloca [10 x i32], align 16
>   store i32 %b, i32* %b.addr, align 4
>   %arraydecay = getelementptr inbounds [10 x i32], [10 x i32]* %a, i64 0, i64 0
>   %0 = bitcast i32* %arraydecay to i8*
>   %1 = load i32, i32* %b.addr, align 4
>   %2 = trunc i32 %1 to i8
>   call void @llvm.memset.p0i8.i64(i8* align 16 %0, i8 %2, i64 10, i1 false)
>   ret void
> }
>
> Now I have a pass that inserts an intrinsic with IntrReadMem into the IR:
>
> define dso_local void @bar(i32 %b) #0 {
> entry:
>   %b.addr = alloca i32, align 4
>   %a = alloca [10 x i32], align 16
>   store i32 %b, i32* %b.addr, align 4
>   %arraydecay = getelementptr inbounds [10 x i32], [10 x i32]* %a, i64 0, i64 0
>   %0 = bitcast i32* %arraydecay to i8*
>   %1 = load i32, i32* %b.addr, align 4
>   %2 = trunc i32 %1 to i8
>   call void @llvm.memset.p0i8.i64(i8* align 16 %0, i8 %2, i64 10, i1 false)
>   tail call void @mem_read_test(i8* %0)   ; inserted by the pass
>   ret void
> }
>
> ; Function Attrs: nounwind readonly
> declare void @mem_read_test(i8*) #2
>
> However, the call to memset() still got optimized away by DSE. What am I
> missing here? Or is this indeed a bug in DSE?
>
> Son Tuan Vu
>
>
> On Wed, Jul 24, 2019 at 6:47 PM Doerfert, Johannes <jdoerfert at anl.gov>
> wrote:
>
>> You are on the right track. Addresses could get exposed in various ways;
>> a probably non-exhaustive list is:
>> - passed as arguments
>> - communicated through a global
>> - via I/O or, more generally, system calls. This includes all forms of
>>   synchronization, e.g., inter-lane communication.
>> - transitively passed by any of the means above, e.g., the address of a
>>   pointer to the object could be exposed.
>>
>> So if we take the example below and add:
>>   bar(&A[50]);
>> just before the call to unknown, we have to assume A is known to unknown
>> now, at least if we do not have information about bar that would suggest
>> otherwise.
>>
>>
>> On 07/24, Son Tuan VU wrote:
>> > Hi Johannes,
>> >
>> > Thanks for your reply. I now see more clearly how things work with these
>> > properties. However, what would be an object whose address is potentially
>> > known by a callee? I suppose the intrinsic arguments and global variables?
>> >
>> > So IIUC, if not restricted by *only properties, an intrinsic could access:
>> > - only its arguments if IntrArgMemOnly is specified,
>> > - its arguments and global variables as well if Intr*Mem (other than
>> >   IntrNoMem) is specified.
>> >
>> > Please tell me if I'm correct or not!
>> >
>> > Thanks again,
>> >
>> >
>> > On Wed, Jul 24, 2019, 17:27 Doerfert, Johannes <jdoerfert at anl.gov> wrote:
>> >
>> > > Hi Son Tuan Vu,
>> > >
>> > > if not restricted by *writeonly*, *readonly*, or *readnone* (basically),
>> > > a call can access any object for which the callee could potentially
>> > > know the address. That means, if the address of an object cannot be
>> > > known to the callee, it cannot access that object. An example is given
>> > > below. Thus, a dead store can be eliminated if the memory cannot be
>> > > read by any subsequent operation. If you think there is a bug, could
>> > > you provide a reproducer?
>> > >
>> > > Example:
>> > >
>> > > #include <stdlib.h>
>> > >
>> > > void unknown();
>> > > void foo() {
>> > >   int *A = malloc(100 * sizeof(A[0]));
>> > >   int B[100];
>> > >   for (int i = 0; i < 100; i++)
>> > >     A[i] = B[i] = i;
>> > >
>> > >   // The addresses/objects A and B are not known to the unknown function
>> > >   // and the stores above can be removed.
>> > >   unknown();
>> > >
>> > >   free(A);
>> > > }
>> > >
>> > > I hope this helps,
>> > > Johannes
>> > >
>> > >
>> > > ________________________________________
>> > > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Son Tuan VU
>> > > via llvm-dev <llvm-dev at lists.llvm.org>
>> > > Sent: Wednesday, July 24, 2019 08:20
>> > > To: llvm-dev
>> > > Subject: [llvm-dev] Intrinsics InstrReadMem memory properties
>> > >
>> > > Hello,
>> > >
>> > > According to include/llvm/IR/Intrinsics.td, the IntrReadMem property
>> > > indicates that the intrinsic only reads from and does not write to memory.
>> > >
>> > > Does this mean that it can read anywhere in memory? Because we already
>> > > have 'IntrArgMemOnly' for intrinsics which only access memory that their
>> > > argument(s) point(s) to.
>> > >
>> > > If 'IntrReadMem' really means read from anywhere in memory, this
>> > > should imply that, if there's an intrinsic having this property *after* a
>> > > dead store, the latter should not be eliminated by optimizations?
>> > >
>> > > This is not the current behavior of LLVM though, so it seems that my
>> > > guesses are wrong... But at least, can someone show me the mistake here?
>> > >
>> > > Thanks for your time,
>> > >
>> > > Son Tuan Vu
>> > >
>>
>> --
>>
>> Johannes Doerfert
>> Researcher
>>
>> Argonne National Laboratory
>> Lemont, IL 60439, USA
>>
>> jdoerfert at anl.gov
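The escape rule Johannes describes in the quoted reply can be sketched in plain C. This is only an illustration, not LLVM code: the global `g_exposed`, the callee `unknown_reader`, and the two wrapper functions are hypothetical names, and an ordinary C function stands in for an opaque callee. The point is that a callee with no pointer arguments can reach a local buffer only if its address was published somewhere the callee can see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BUF_LEN = 10 };

/* Hypothetical global through which an address can escape. */
static const unsigned char *g_exposed = NULL;

/* Stand-in for an opaque callee without pointer arguments. It can only
 * reach objects whose address was published via the global. Returns the
 * byte sum of the exposed buffer, or -1 if nothing was exposed. */
int unknown_reader(void) {
    if (g_exposed == NULL)
        return -1;
    int sum = 0;
    for (size_t i = 0; i < BUF_LEN; i++)
        sum += g_exposed[i];
    return sum;
}

/* The address of `a` never leaves this function, so the callee cannot
 * observe the memset: the store is dead and may be eliminated. */
int no_escape(int b) {
    unsigned char a[BUF_LEN];
    memset(a, b, sizeof a);
    return unknown_reader();
}

/* The address of `a` escapes through the global before the call, so the
 * callee may read it and the memset has to stay. */
int escapes_via_global(int b) {
    unsigned char a[BUF_LEN];
    memset(a, b, sizeof a);
    g_exposed = a;
    int seen = unknown_reader();
    g_exposed = NULL; /* do not leave a dangling pointer behind */
    return seen;
}

/* Small driver exercising both cases. */
void run_escape_demo(void) {
    assert(no_escape(3) == -1);
    assert(escapes_via_global(3) == 3 * BUF_LEN);
}
```

In `no_escape` the result is the same whether or not the compiler keeps the memset, which is exactly why removing it is legal. In `escapes_via_global` the returned value depends on the stored bytes, so the store is observable.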
Tim Northover via llvm-dev
2019-Jul-25 09:58 UTC
[llvm-dev] Intrinsics InstrReadMem memory properties
On Thu, 25 Jul 2019 at 09:53, Son Tuan VU via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> - IntrWriteMem: memset() optimized away by DSE but the intrinsic isn't. I would expect both to be removed, since the intrinsic is now also a dead store.

IntrWriteMem means the intrinsic could write to memory *anywhere*, not just based on its argument.

> - IntrReadMem: memset() and the intrinsic are both optimized away *unexpectedly* (CSE removes the intrinsic, then InstCombine removes memset). The latter is understandable, but why the intrinsic gets optimized in the first place?

I haven't checked the code, but an intrinsic that only reads memory (no other side effects) and returns void can't actually accomplish anything observable.

> Am I missing something here or there are indeed bugs here?

It all looks as expected to me.

> Btw, can you tell me how and why 'tail' changes the optimizer behavior?

From the LangRef about tail (and musttail): "Both markers imply that the callee does not access allocas from the caller". That seems directly applicable to your example.

The reason, of course, is that if a call is actually implemented as a tail call then the current stack frame is reused for the new callee. So the lifetime of objects on it has ended and accessing them is just not possible in a well-defined program.

Cheers.

Tim.
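Tim's first two points can be illustrated with a rough C analogy. The sketch below does not model LLVM intrinsics exactly, and `read_only_probe`, `write_anywhere`, and `g_state` are hypothetical names. It shows that a void function which only reads memory leaves nothing behind for a caller to detect, whereas a writer that is not restricted to its argument can change unrelated memory and therefore cannot be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical global memory unrelated to any call argument. */
static unsigned char g_state[4];

/* Models a callee that only reads memory and returns void: the value it
 * computes is discarded, so no caller can tell whether it ran. */
void read_only_probe(const unsigned char *p, size_t n) {
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    (void)sum;
}

/* Models a write-only callee that is NOT limited to its argument: it
 * ignores `p` and writes to unrelated global memory instead. */
void write_anywhere(unsigned char *p) {
    (void)p;
    g_state[0] = 42;
}

/* Returns 1 if memory looks the same with and without the read-only call. */
int probe_is_unobservable(void) {
    unsigned char with_call[4] = {1, 2, 3, 4};
    unsigned char without_call[4] = {1, 2, 3, 4};
    read_only_probe(with_call, sizeof with_call);
    return memcmp(with_call, without_call, sizeof with_call) == 0;
}

/* Returns 1 if the writer changed memory it was never handed a pointer to. */
int writer_is_observable(void) {
    unsigned char local[4] = {0};
    g_state[0] = 0;
    write_anywhere(local);
    return g_state[0] == 42 && local[0] == 0;
}

/* Small driver exercising both cases. */
void run_observability_demo(void) {
    assert(probe_is_unobservable() == 1);
    assert(writer_is_observable() == 1);
}
```

Deleting the call to `read_only_probe` changes no result, which mirrors why the void IntrReadMem call is removable. Deleting the call to `write_anywhere` would change `g_state`, which mirrors why the IntrWriteMem call is kept even though the store to the local buffer is dead.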
Son Tuan VU via llvm-dev
2019-Jul-25 10:20 UTC
[llvm-dev] Intrinsics InstrReadMem memory properties
Thanks Tim for your reply.

What about the case where the intrinsic is ReadNone and doesn't get optimized away? Also, why does memset() not get DSE'd when the intrinsic is WriteOnly?

Son Tuan Vu