Hi - I've got a question about what optimizations the "inbounds"
keyword of "getelementptr" allows you to use. In the code below, %five
is loaded from an inbounds offset of either a null pointer or %mem:

target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"

define i8 @func(i8* %mem) {
  %test = icmp eq i8* %mem, null
  br i1 %test, label %done, label %mem.is.valid
mem.is.valid:
  %1 = getelementptr inbounds i8, i8* %mem, i64 4
  store i8 5, i8* %1, align 4
  br label %done
done:
  %after.phi = phi i8* [ %mem, %mem.is.valid ], [ null, %0 ]
  %2 = getelementptr inbounds i8, i8* %after.phi, i64 4
  %five = load i8, i8* %2, align 4
  ret i8 %five
}

According to the documentation, "the result value of the getelementptr
is a poison value if the base pointer is not an in bounds address of
an allocated object", so does this mean it's valid to optimise the
function to:

define i8 @func(i8* %mem) {
  %test = icmp eq i8* %mem, null
  br i1 %test, label %done, label %mem.is.valid
mem.is.valid:
  ret i8 5
done:
  ret i8 undef
}

Or even this:

define i8 @func(i8* %mem) {
  ret i8 5
}

...? This is a reduced example of something I saw while running "opt"
on a test case that missed a null check. Thanks -

Nick

On Sun, May 3, 2015 at 6:26 PM, Nicholas White <n.j.white at gmail.com> wrote:
> [...] so does this mean it's valid to optimise the function to:
>
> define i8 @func(i8* %mem) {
>   %test = icmp eq i8* %mem, null
>   br i1 %test, label %done, label %mem.is.valid
> mem.is.valid:
>   ret i8 5
> done:
>   ret i8 undef
> }
>
> Or even this:
>
> define i8 @func(i8* %mem) {
>   ret i8 5
> }

No, neither is semantics-preserving, because neither retains the store
through '%mem'.

Let's start by hoisting both sides of the branch into the entry block.
This will leave you with:

define i8 @func(i8* %mem) {
  %test = icmp eq i8* %mem, null
  %after.phi = select i1 %test, i8* null, i8* %mem
  %1 = getelementptr inbounds i8, i8* %mem, i64 4
  store i8 5, i8* %1, align 4
  %2 = getelementptr inbounds i8, i8* %after.phi, i64 4
  %five = load i8, i8* %2, align 4
  ret i8 %five
}

The SSA value '%after.phi' can be trivially simplified to '%mem' (when
'%test' is true, '%mem' is itself null, so both arms of the select
yield '%mem'). This leaves us with:

define i8 @func(i8* %mem) {
  %1 = getelementptr inbounds i8, i8* %mem, i64 4
  store i8 5, i8* %1, align 4
  %2 = getelementptr inbounds i8, i8* %mem, i64 4
  %five = load i8, i8* %2, align 4
  ret i8 %five
}

The SSA values '%1' and '%2' are equivalent, so the load reads back the
value that was just stored. This leaves us with:

define i8 @func(i8* %mem) {
  %1 = getelementptr inbounds i8, i8* %mem, i64 4
  store i8 5, i8* %1, align 4
  ret i8 5
}

I do not believe further simplification is possible.
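
A minimal sketch of why the store has to survive, assuming '@func' from
the original message is defined in the same module. The caller
'@caller' and the names '%buf', '%base', '%slot' and '%seen' are
hypothetical and do not appear in the thread; the IR uses the same
typed-pointer syntax as the messages above. The caller passes a valid
8-byte buffer and then reads back the byte at offset 4:

```llvm
; Hypothetical caller, not part of the thread: it observes the byte
; that @func stores at offset 4 of the buffer it is given.
define i8 @caller() {
  ; 8 bytes of uninitialized stack memory, so %base is non-null
  %buf = alloca [8 x i8], align 4
  %base = getelementptr inbounds [8 x i8], [8 x i8]* %buf, i64 0, i64 0
  ; @func takes the non-null path and stores 5 at %base + 4
  %r = call i8 @func(i8* %base)
  ; read the same byte back after the call returns
  %slot = getelementptr inbounds i8, i8* %base, i64 4
  %seen = load i8, i8* %slot, align 4
  ret i8 %seen
}
```

With the original '@func', or with the final simplified form that keeps
the store, '%seen' is 5. With either of the two proposed replacements
the store is gone, so '%seen' would be a load of uninitialized stack
memory, which is why the store is observable and cannot simply be
dropped.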

On Sun, May 3, 2015 at 9:57 PM, David Majnemer <david.majnemer at gmail.com> wrote:
> [...]
> I do not believe further simplification is possible.

One final point that I forgot to mention: the transformations that I
performed did not rely on the presence of 'inbounds'.