Kevin Choi via llvm-dev
2016-Jul-20 21:44 UTC
[llvm-dev] load instruction erroneously removed by GVN v2
before inlining all 20005 after inlining somewhere here changed made it NoAlias after Global Variable Optimizer 20014 20373 20255 20372 20254 before GVN 19993 20011 19991 20010 20030 It appears that TBAA metadata certainly changed after inlining and subsequent passes. I have attached the .bc file. I think I will try to dump out more TBAA metadata between passes. The method in interest is @main. On 20 July 2016 at 11:32, Daniel Berlin <dberlin at dberlin.org> wrote:> > > On Wed, Jul 20, 2016 at 11:24 AM, Kevin Choi <code.kchoi at gmail.com> wrote: > >> Thanks for quick reply Daniel, >> >> I tried to make a simple C testcase, but could not reproduce the same >> condition with output from Clang. >> > > even if you have a .ll file you can share, that would be helpful. > > >> I suppose I could modify the C code to make it look similar with TBAA's; >> I may be able to provide this by eod. >> >> > store %ptr above the load. >> >> My mistake; I was referring to the store $lcssa in bb2. Looking at the C >> source code, it should definitely alias with %dest. I will try and see if I >> can find out why it thinks there is no local dependence. >> > > Yes, i would start by seeing why the getdependency call is returning no > dep :) > > At a glance from your file, the load and that store are tagged with > different TBAA tags. > THat does not necessarily mean they don't alias, i'd have to see the TBAA > tree to say for sure. > The TBAA rule is that a tag aliases all of it's descendants and ancestors > in the tree. > > So if the tree look like this: > > !something > / \ > !20010 !20030 > > It will say no alias > but if it looks like > > !something > | > !20010 > | > !20030 > > TBAA will say they alias (something else may still say no-alias for other > reasons, of course) > > >> Thanks, >> Kevin >> >> On 20 July 2016 at 11:06, Daniel Berlin <dberlin at dberlin.org> wrote: >> >>> >>> >>> On Wed, Jul 20, 2016 at 9:56 AM, Kevin Choi via llvm-dev < >>> llvm-dev at lists.llvm.org> wrote: >>> >>>> Hello to whom this may concern, >>>> >>>> Versioned this as I saw identical title before. I'm compiling a clang >>>> project where I'm seeing GVN mess up and replace a load with a wrong def >>>> value. >>>> >>> >>> >>> Do you have a c testcase or a .ll file i can actually compile? >>> >>> >>>> I am using LLVM-3.5, but the problem has been observed upto 3.8. >>>> To illustrate the problem, >>>> >>>> define i32 @main >>>> scalar.ph: >>>> <initialize [80 x i16] %dest> >>>> ... >>>> preheader: >>>> %index=0 >>>> br test, loop1, bb2 >>>> loop1: >>>> ... write to %dest in increasing index // ptr-based while loop >>>> %ptr++; >>>> br test, loop1, bb2 >>>> bb2: >>>> %lcssa = phi [%ptr, loop1], [%ptr, preheader] >>>> store i16 0, i16* %lcssa !dbg !20094 !tbaa 20030 // write null byte >>>> at end >>>> %76 = getelementptr inbounds [80 x i16]* %dest, i64 0, i64 7, !dbg >>>> !20095 // load addr of null byte (7th index) >>>> %77 = load i16* %76, align 2, !dbg !20095, !tbaa !20010 >>>> %78 = icmp eq i16 %77, 0, !dbg !20095 >>>> br i1 %78, label %80, label %79, !dbg !20095 >>>> >>>> GVN calls processNonLocalLoad() on "%77 = load..." and replaces it with >>>> init value from scalar.ph. >>>> >>>> >>> >>> >>>> bb2: >>>> %lcssa = phi [%ptr, loop1], [%ptr, preheader] >>>> store i16 0, i16* %lcssa, !dbg !20094, !tbaa !20030 >>>> %76 = getelementptr inbounds [80 x i16]* %dest, i64 0, i64 7, !dbg >>>> !20095 >>>> br i1 icmp eq (i16 trunc (i128 lshr (i128 bitcast (<8 x i16> <i16 120, >>>> i16 120, i16 120, i16 120, i16 120, i16 120, i16 120, i16 120> to i128), >>>> i128 96) to i16), i16 0), label %78, label %77, !dbg !20095 // simplifies >>>> to "icmp eq (i16 120, i16 0) --> false" >>>> >>>> I first suspected problem might be TBAA; invoking clang with >>>> -fno-strict-aliasing makes the test pass (similarly, opt with -basicaa does >>>> not make GVN transform the load). When I look at the C/C++ source code, I >>>> cannot find any type-based aliasing violations from eyeballing. >>>> >>> >>> Can you excerpt the actual source code? >>> >>>> >>>> I started looking at the aliasing and landed at >>>> getNonLocalPointerDepFromBB(), in which the worklist algorithm to find >>>> MemDep reported the result from the init block, ignoring all the kills >>>> after it. I did see one of the parm was SkipFirstBlock and this appears to >>>> ignore the store %ptr above the load. Is there a reason why it skips first >>>> block? >>>> >>> >>> >>> Yes. It is looking for *non-local* dependences. The block it should >>> skip is bb2, not loop1. >>> There is another call for local dependences. If you ask for local >>> dependences, then non-local ones, as GVN does, there is no point in having >>> it go and look at the local ones again :) >>> >>> Look at how GVN calls this: >>> MemDepResult Dep = MD->getDependency(L); >>> >>> // If it is defined in another block, try harder. >>> if (Dep.isNonLocal()) >>> return processNonLocalLoad(L); >>> >>> (IE get local dependence, if there is none, go looking at non-local >>> dependences). >>> >>> >>> The list of non-local dependences it includes should definitely include >>> the one from block "loop1", and if it is ignoring it, it is most likely >>> thinks it does not alias for some reason. >>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160720/eb798fca/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: test.bc Type: application/octet-stream Size: 3047784 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160720/eb798fca/attachment-0001.obj>
Kevin Choi via llvm-dev
2016-Jul-20 23:58 UTC
[llvm-dev] load instruction erroneously removed by GVN v2
I think I've narrowed it to a failure point: I am seeing that after [0x1a5b080] %35 = call i16* @PAL_wcsncat(i16* %7, i16* %33, i64 %34), !dbg !20064 is inlined into main, the alias of load and the preceding store no longer are aliased. I verified this via inserting modulePrinterPass right after inliningPass. I am wondering if this is because pointer aliasing is incorrectly set up on call to PAL_wcsncat or something is wrong in the inliner. I will see if I can pull the pointer aliases on the callsite. -Kevin On 20 July 2016 at 14:44, Kevin Choi <code.kchoi at gmail.com> wrote:> before inlining > all 20005 > > after inlining > somewhere here changed made it NoAlias > after Global Variable Optimizer > 20014 > 20373 20255 > 20372 20254 > > > before GVN > 19993 > 20011 19991 > 20010 20030 > > It appears that TBAA metadata certainly changed after inlining and > subsequent passes. I have attached the .bc file. I think I will try to dump > out more TBAA metadata between passes. The method in interest is @main. > > > > On 20 July 2016 at 11:32, Daniel Berlin <dberlin at dberlin.org> wrote: > >> >> >> On Wed, Jul 20, 2016 at 11:24 AM, Kevin Choi <code.kchoi at gmail.com> >> wrote: >> >>> Thanks for quick reply Daniel, >>> >>> I tried to make a simple C testcase, but could not reproduce the same >>> condition with output from Clang. >>> >> >> even if you have a .ll file you can share, that would be helpful. >> >> >>> I suppose I could modify the C code to make it look similar with TBAA's; >>> I may be able to provide this by eod. >>> >>> > store %ptr above the load. >>> >>> My mistake; I was referring to the store $lcssa in bb2. Looking at the C >>> source code, it should definitely alias with %dest. I will try and see if I >>> can find out why it thinks there is no local dependence. >>> >> >> Yes, i would start by seeing why the getdependency call is returning no >> dep :) >> >> At a glance from your file, the load and that store are tagged with >> different TBAA tags. >> THat does not necessarily mean they don't alias, i'd have to see the TBAA >> tree to say for sure. >> The TBAA rule is that a tag aliases all of it's descendants and ancestors >> in the tree. >> >> So if the tree look like this: >> >> !something >> / \ >> !20010 !20030 >> >> It will say no alias >> but if it looks like >> >> !something >> | >> !20010 >> | >> !20030 >> >> TBAA will say they alias (something else may still say no-alias for other >> reasons, of course) >> >> >>> Thanks, >>> Kevin >>> >>> On 20 July 2016 at 11:06, Daniel Berlin <dberlin at dberlin.org> wrote: >>> >>>> >>>> >>>> On Wed, Jul 20, 2016 at 9:56 AM, Kevin Choi via llvm-dev < >>>> llvm-dev at lists.llvm.org> wrote: >>>> >>>>> Hello to whom this may concern, >>>>> >>>>> Versioned this as I saw identical title before. I'm compiling a clang >>>>> project where I'm seeing GVN mess up and replace a load with a wrong def >>>>> value. >>>>> >>>> >>>> >>>> Do you have a c testcase or a .ll file i can actually compile? >>>> >>>> >>>>> I am using LLVM-3.5, but the problem has been observed upto 3.8. >>>>> To illustrate the problem, >>>>> >>>>> define i32 @main >>>>> scalar.ph: >>>>> <initialize [80 x i16] %dest> >>>>> ... >>>>> preheader: >>>>> %index=0 >>>>> br test, loop1, bb2 >>>>> loop1: >>>>> ... write to %dest in increasing index // ptr-based while loop >>>>> %ptr++; >>>>> br test, loop1, bb2 >>>>> bb2: >>>>> %lcssa = phi [%ptr, loop1], [%ptr, preheader] >>>>> store i16 0, i16* %lcssa !dbg !20094 !tbaa 20030 // write null byte >>>>> at end >>>>> %76 = getelementptr inbounds [80 x i16]* %dest, i64 0, i64 7, !dbg >>>>> !20095 // load addr of null byte (7th index) >>>>> %77 = load i16* %76, align 2, !dbg !20095, !tbaa !20010 >>>>> %78 = icmp eq i16 %77, 0, !dbg !20095 >>>>> br i1 %78, label %80, label %79, !dbg !20095 >>>>> >>>>> GVN calls processNonLocalLoad() on "%77 = load..." and replaces it >>>>> with init value from scalar.ph. >>>>> >>>>> >>>> >>>> >>>>> bb2: >>>>> %lcssa = phi [%ptr, loop1], [%ptr, preheader] >>>>> store i16 0, i16* %lcssa, !dbg !20094, !tbaa !20030 >>>>> %76 = getelementptr inbounds [80 x i16]* %dest, i64 0, i64 7, !dbg >>>>> !20095 >>>>> br i1 icmp eq (i16 trunc (i128 lshr (i128 bitcast (<8 x i16> <i16 120, >>>>> i16 120, i16 120, i16 120, i16 120, i16 120, i16 120, i16 120> to i128), >>>>> i128 96) to i16), i16 0), label %78, label %77, !dbg !20095 // simplifies >>>>> to "icmp eq (i16 120, i16 0) --> false" >>>>> >>>>> I first suspected problem might be TBAA; invoking clang with >>>>> -fno-strict-aliasing makes the test pass (similarly, opt with -basicaa does >>>>> not make GVN transform the load). When I look at the C/C++ source code, I >>>>> cannot find any type-based aliasing violations from eyeballing. >>>>> >>>> >>>> Can you excerpt the actual source code? >>>> >>>>> >>>>> I started looking at the aliasing and landed at >>>>> getNonLocalPointerDepFromBB(), in which the worklist algorithm to find >>>>> MemDep reported the result from the init block, ignoring all the kills >>>>> after it. I did see one of the parm was SkipFirstBlock and this appears to >>>>> ignore the store %ptr above the load. Is there a reason why it skips first >>>>> block? >>>>> >>>> >>>> >>>> Yes. It is looking for *non-local* dependences. The block it should >>>> skip is bb2, not loop1. >>>> There is another call for local dependences. If you ask for local >>>> dependences, then non-local ones, as GVN does, there is no point in having >>>> it go and look at the local ones again :) >>>> >>>> Look at how GVN calls this: >>>> MemDepResult Dep = MD->getDependency(L); >>>> >>>> // If it is defined in another block, try harder. >>>> if (Dep.isNonLocal()) >>>> return processNonLocalLoad(L); >>>> >>>> (IE get local dependence, if there is none, go looking at non-local >>>> dependences). >>>> >>>> >>>> The list of non-local dependences it includes should definitely include >>>> the one from block "loop1", and if it is ignoring it, it is most likely >>>> thinks it does not alias for some reason. >>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160720/d011e453/attachment.html>
Kevin Choi via llvm-dev
2016-Jul-21 22:18 UTC
[llvm-dev] load instruction erroneously removed by GVN v2
Does anyone know how LLVM encodes aliasing on a call? I'm looking at the code in GVN and it seems like it's pulling TBAA metadata as below. (ImmutableCallSite*)CS.getInstruction()->getMetadata(LLVMContext::MD_tbaa)) but I do not see !tbaa attached to the call instruction, only !dbg. This is bitcode from LTO link step and I see that LTO pass manager uses both TypeAA and BasicAA. Any input is appreciated, -Kevin On 20 July 2016 at 16:58, Kevin Choi <code.kchoi at gmail.com> wrote:> I think I've narrowed it to a failure point: > I am seeing that after > [0x1a5b080] %35 = call i16* @PAL_wcsncat(i16* %7, i16* %33, i64 %34), !dbg > !20064 > > is inlined into main, the alias of load and the preceding store no longer > are aliased. I verified this via inserting modulePrinterPass right after > inliningPass. > > I am wondering if this is because pointer aliasing is incorrectly set up > on call to PAL_wcsncat or something is wrong in the inliner. I will see if > I can pull the pointer aliases on the callsite. > > -Kevin > > On 20 July 2016 at 14:44, Kevin Choi <code.kchoi at gmail.com> wrote: > >> before inlining >> all 20005 >> >> after inlining >> somewhere here changed made it NoAlias >> after Global Variable Optimizer >> 20014 >> 20373 20255 >> 20372 20254 >> >> >> before GVN >> 19993 >> 20011 19991 >> 20010 20030 >> >> It appears that TBAA metadata certainly changed after inlining and >> subsequent passes. I have attached the .bc file. I think I will try to dump >> out more TBAA metadata between passes. The method in interest is @main. >> >> >> >> On 20 July 2016 at 11:32, Daniel Berlin <dberlin at dberlin.org> wrote: >> >>> >>> >>> On Wed, Jul 20, 2016 at 11:24 AM, Kevin Choi <code.kchoi at gmail.com> >>> wrote: >>> >>>> Thanks for quick reply Daniel, >>>> >>>> I tried to make a simple C testcase, but could not reproduce the same >>>> condition with output from Clang. >>>> >>> >>> even if you have a .ll file you can share, that would be helpful. >>> >>> >>>> I suppose I could modify the C code to make it look similar with >>>> TBAA's; I may be able to provide this by eod. >>>> >>>> > store %ptr above the load. >>>> >>>> My mistake; I was referring to the store $lcssa in bb2. Looking at the >>>> C source code, it should definitely alias with %dest. I will try and see if >>>> I can find out why it thinks there is no local dependence. >>>> >>> >>> Yes, i would start by seeing why the getdependency call is returning no >>> dep :) >>> >>> At a glance from your file, the load and that store are tagged with >>> different TBAA tags. >>> THat does not necessarily mean they don't alias, i'd have to see the >>> TBAA tree to say for sure. >>> The TBAA rule is that a tag aliases all of it's descendants and >>> ancestors in the tree. >>> >>> So if the tree look like this: >>> >>> !something >>> / \ >>> !20010 !20030 >>> >>> It will say no alias >>> but if it looks like >>> >>> !something >>> | >>> !20010 >>> | >>> !20030 >>> >>> TBAA will say they alias (something else may still say no-alias for >>> other reasons, of course) >>> >>> >>>> Thanks, >>>> Kevin >>>> >>>> On 20 July 2016 at 11:06, Daniel Berlin <dberlin at dberlin.org> wrote: >>>> >>>>> >>>>> >>>>> On Wed, Jul 20, 2016 at 9:56 AM, Kevin Choi via llvm-dev < >>>>> llvm-dev at lists.llvm.org> wrote: >>>>> >>>>>> Hello to whom this may concern, >>>>>> >>>>>> Versioned this as I saw identical title before. I'm compiling a clang >>>>>> project where I'm seeing GVN mess up and replace a load with a wrong def >>>>>> value. >>>>>> >>>>> >>>>> >>>>> Do you have a c testcase or a .ll file i can actually compile? >>>>> >>>>> >>>>>> I am using LLVM-3.5, but the problem has been observed upto 3.8. >>>>>> To illustrate the problem, >>>>>> >>>>>> define i32 @main >>>>>> scalar.ph: >>>>>> <initialize [80 x i16] %dest> >>>>>> ... >>>>>> preheader: >>>>>> %index=0 >>>>>> br test, loop1, bb2 >>>>>> loop1: >>>>>> ... write to %dest in increasing index // ptr-based while loop >>>>>> %ptr++; >>>>>> br test, loop1, bb2 >>>>>> bb2: >>>>>> %lcssa = phi [%ptr, loop1], [%ptr, preheader] >>>>>> store i16 0, i16* %lcssa !dbg !20094 !tbaa 20030 // write null >>>>>> byte at end >>>>>> %76 = getelementptr inbounds [80 x i16]* %dest, i64 0, i64 7, !dbg >>>>>> !20095 // load addr of null byte (7th index) >>>>>> %77 = load i16* %76, align 2, !dbg !20095, !tbaa !20010 >>>>>> %78 = icmp eq i16 %77, 0, !dbg !20095 >>>>>> br i1 %78, label %80, label %79, !dbg !20095 >>>>>> >>>>>> GVN calls processNonLocalLoad() on "%77 = load..." and replaces it >>>>>> with init value from scalar.ph. >>>>>> >>>>>> >>>>> >>>>> >>>>>> bb2: >>>>>> %lcssa = phi [%ptr, loop1], [%ptr, preheader] >>>>>> store i16 0, i16* %lcssa, !dbg !20094, !tbaa !20030 >>>>>> %76 = getelementptr inbounds [80 x i16]* %dest, i64 0, i64 7, !dbg >>>>>> !20095 >>>>>> br i1 icmp eq (i16 trunc (i128 lshr (i128 bitcast (<8 x i16> <i16 >>>>>> 120, i16 120, i16 120, i16 120, i16 120, i16 120, i16 120, i16 120> to >>>>>> i128), i128 96) to i16), i16 0), label %78, label %77, !dbg !20095 // >>>>>> simplifies to "icmp eq (i16 120, i16 0) --> false" >>>>>> >>>>>> I first suspected problem might be TBAA; invoking clang with >>>>>> -fno-strict-aliasing makes the test pass (similarly, opt with -basicaa does >>>>>> not make GVN transform the load). When I look at the C/C++ source code, I >>>>>> cannot find any type-based aliasing violations from eyeballing. >>>>>> >>>>> >>>>> Can you excerpt the actual source code? >>>>> >>>>>> >>>>>> I started looking at the aliasing and landed at >>>>>> getNonLocalPointerDepFromBB(), in which the worklist algorithm to find >>>>>> MemDep reported the result from the init block, ignoring all the kills >>>>>> after it. I did see one of the parm was SkipFirstBlock and this appears to >>>>>> ignore the store %ptr above the load. Is there a reason why it skips first >>>>>> block? >>>>>> >>>>> >>>>> >>>>> Yes. It is looking for *non-local* dependences. The block it should >>>>> skip is bb2, not loop1. >>>>> There is another call for local dependences. If you ask for local >>>>> dependences, then non-local ones, as GVN does, there is no point in having >>>>> it go and look at the local ones again :) >>>>> >>>>> Look at how GVN calls this: >>>>> MemDepResult Dep = MD->getDependency(L); >>>>> >>>>> // If it is defined in another block, try harder. >>>>> if (Dep.isNonLocal()) >>>>> return processNonLocalLoad(L); >>>>> >>>>> (IE get local dependence, if there is none, go looking at non-local >>>>> dependences). >>>>> >>>>> >>>>> The list of non-local dependences it includes should definitely >>>>> include the one from block "loop1", and if it is ignoring it, it is most >>>>> likely thinks it does not alias for some reason. >>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160721/f3233f86/attachment-0001.html>