Daniel Berlin
2015-Jan-15 21:10 UTC
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
Yes. I've attached an updated patch that does the following: 1. Fixes the partialalias of globals/arguments 2. Enables partialalias for cases where nothing has been unified to a global/argument 3. Fixes that select was unifying the condition to the other pieces (the condition does not need to be processed :P). This was causing unnecessary aliasing. 4. Adds a regression test to must-and-partial 5. Fixes the pass ordering 6. Does not alias non-pointer arguments to random things :) 7. Fixes the CFL test cases that break with above. On Thu, Jan 15, 2015 at 11:33 AM, Ana Pazos <apazos at codeaurora.org> wrote:> Daniel, don’t we need to fix the order of invoking alias analyses in > lib/Transforms/IPO/PassManagerBuilder.cpp as well? > > > > Your patch fixed the order in lib/CodeGen/Passes.cpp and the delegation > code in lib/Analysis/CFLAliasAnalysis.cpp. > > > > Thanks, > > Ana. > > > > *From:* Daniel Berlin [mailto:dberlin at dberlin.org] > *Sent:* Wednesday, January 14, 2015 1:10 PM > *To:* Ana Pazos > *Cc:* George Burgess IV; Nick Lewycky; Jiangning Liu; LLVM Developers > Mailing List > > *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting > a57 numbers > > > > Oh, sorry, i didn't rebase it when i changed the fix, you would have had > to apply the first on top of the second. > > Here is one against HEAD > > > > On Wed, Jan 14, 2015 at 12:32 PM, Ana Pazos <apazos at codeaurora.org> wrote: > > Daniel, your patch does not apply cleanly. Are you on the tip? > > The code I see there is no line if (QueryResult == MayAlias|| QueryResult == PartialAlias) to be removed. > > Thanks, > > Ana. > > > > *From:* George Burgess IV [mailto:george.burgess.iv at gmail.com] > *Sent:* Wednesday, January 14, 2015 10:31 AM > *To:* Daniel Berlin > *Cc:* Nick Lewycky; Ana Pazos; Jiangning Liu; LLVM Developers Mailing List > *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting > a57 numbers > > > > Inline > > - George > > > On Jan 14, 2015, at 10:49 AM, Daniel Berlin <dberlin at dberlin.org> wrote: > > > > > > On Tue, Jan 13, 2015 at 11:26 PM, Nick Lewycky <nlewycky at google.com> > wrote: > > On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org> wrote: > > This is caused by CFLAA returning PartialAlias for a query that BasicAA > can prove is NoAlias. > > > > One of them is wrong. Which one? > > > > CFL-AA. > > > > Right now it checks whether two things come out to be the same stratified > info and have the same index, and if so, returns PartialAlias. This is > wrong, because it never knows why two things got unified, only that they > did. > > > > Therefore, the current check it does where it returns PartialAlias seems > wrong. > > > > George, why does it return PartialAlias? > > It shouldn't -- unless I misinterpreted an earlier discussion, I was under > the impression that we were testing with that line updated to return > MayAlias. We do need to update that in LLVM though; thanks for the fix. > > > > IMHO, you can't ever prove any kind of must-alias or partialalias info > with CFL-AA, unless i'm missing something. > > Agreed. > > > > > > I'm not sure from your description that this is a chaining issue. > PartialAlias doesn't chain and isn't supposed to, it's a final answer just > like NoAlias and MustAlias are. You return partial alias when you have > proven that the accesses are overlapping, meaning that it is proven to > neither be no alias nor must alias. > > Well, it's still chaining wrong, because it did not change on MayAlias, > but you are right that it is not the issue i identified :) > > > > Anyway, updated patch attached. > > > > > > Nick > > > > I've attached a patch that fixes it, but truthfully, chaining seems > somewhat badly designed IMHO. > > > > It should let you pass along the result you've gotten so far, so that you > don't have CFLAA get PartialAlias, chain, and have BasicAA get MayAlias. > > > > (IE it should let it work strictly down the hierarchy). > > > > Note that this patch breaks the CFLAliasAnalysis tests because it no > longer returns PartialAlias in some cases it is expecting it. > > > > I don't have time ATM to fix chaining completely, so if someone wants to > shepherd this patch through, that would be appreciated. > > > > > > > > > > > > On Tue, Jan 13, 2015 at 8:58 PM, Daniel Berlin <dberlin at dberlin.org> > wrote: > > Can you send me actual LLVM IR or a preprocessed source from using -E? > I don't have a machine handy that has headers that target that arch. > > > > > > On Tue Jan 13 2015 at 4:33:29 PM Daniel Berlin <dberlin at dberlin.org> > wrote: > > Anything other than noalias or mustalias should be getting passed down the > stack, so either that is not happening or CFL aa is giving better answers > and something does worse with those better answers. I'll take a look this > evening > > On Jan 13, 2015 3:58 PM, "Ana Pazos" <apazos at codeaurora.org> wrote: > > Hi folks, > > Moving the discussion to llvm.dev. > > None of the changes we talked earlier help. > > Find attached the C source code that you can use to reproduce the issue. > > clang --target=aarch64-linux-gnu -c -mcpu=cortex-a57 -Ofast > -fno-math-errno test.c -S -o test.s -mllvm -debug-only=licm > LICM hoisting to while.body.lr.ph: %21 = load double** %arrayidx8, > align 8, !tbaa !5 > LICM hoisting to while.body.lr.ph: %arrayidx72 = getelementptr inbounds > double* %11, i64 1 > LICM hoisting to while.body.lr.ph: %arrayidx81 = getelementptr inbounds > double* %11, i64 2 > LICM hoisting to for.body.lr.ph: %2 = ptrtoint i32* %Index to i64 > > clang --target=aarch64-linux-gnu -c -mcpu=cortex-a57 -Ofast > -fno-math-errno test.c -S -o test-cflaa.s -mllvm -use-cfl-aa -mllvm > -debug-only=licm > LICM hoisting to for.body.lr.ph: %2 = ptrtoint i32* %Index to i64 > > Why CFL AA cannot allow hoisting this out: %21 = load double** > %arrayidx8, align 8, !tbaa !5 > > Which leads to this extra load in assembly code: ldr x14, [x4, x9, lsl > #3] > > Thanks! > Ana. > > -----Original Message----- > From: Hal Finkel [mailto:hfinkel at anl.gov] > Sent: Tuesday, January 13, 2015 12:51 AM > To: Chandler Carruth > Cc: Jiangning Liu; Pazos, Ana; Ana Pazos; Daniel Berlin; George Burgess IV > Subject: Re: question about enabling cfl-aa and collecting a57 numbers > > ----- Original Message ----- > > From: "Chandler Carruth" <chandlerc at gmail.com> > > To: "Ana Pazos" <apazos at codeaurora.org>, "Daniel Berlin" < > dberlin at dberlin.org>, "George Burgess IV" > > <george.burgess.iv at gmail.com>, "Hal Finkel" <hfinkel at anl.gov> > > Cc: "Jiangning Liu" <Jiangning.Liu at arm.com>, "Ana Pazos" > > <apazos at quicinc.com> > > Sent: Monday, January 12, 2015 6:43:44 PM > > Subject: Re: question about enabling cfl-aa and collecting a57 numbers > > > > Not quite: > > > > > > On Mon Jan 12 2015 at 4:27:30 PM Ana Pazos < apazos at codeaurora.org > > > wrote: > > > > Thanks George and Daniel, > > > > > > > > If the recommended order is “–basicaa –cfla-aa” it means we should fix > > the trunk code that processes the flags -mllvm use-cfl-aa-in-codegen > > and –mllvm use-cfl-aa, right? > > > > > > > > In Transforms/IPO/PassManagerBuilder.cpp and CodeGen/Passes.cpp we see > > this sequence: > > > > > > > > if (UseCFLAA) > > > > PM.add(createCFLAliasAnalysisPass()); > > > > PM.add(createTypeBasedAliasAnalysisPass()); > > > > PM.add(createScopedNoAliasAAPass()); > > > > PM.add(createBasicAliasAnalysisPass()); > > > > > > > > So are you recommending changing to the sequence below instead? > > > > > > Not quite: > > > > > > addPass(createTypeBasedAliasAnalysisPass()); > > > > addPass(createScopedNoAliasAAPass()); > > You want it here. > > > > addPass(createBasicAliasAnalysisPass()); > > > > if (UseCFLAA) > > > > addPass(createCFLAliasAnalysisPass()); > > > > > > The way to think about this is as follows: > > > > > > if TBAA or ScopedNoAlias *can* return results, you want them to. They > > are reflecting a-priori knowledge of aliasing. > > > > > > if both of those say "maybe", you want to ask CFL. If CFL says "maybe" > > you want to ask "BasicAA". > > I could very well be misunderstanding something, but I recall that we add > BasicAA last so that it will be queried first. We do this because of our > BasicAA delegation hack: We want a MustAlias from BasicAA to override a > NoAlias from TBAA, so that we can catch cases of basic local type punning. > Thus, assuming CFL delegates properly, I fail to see how switching its > order w.r.t. the metadata-based analysis changes anything. > > -Hal > > > > > > > It's possible that you could as a compile-time hack flip the last two > > to your suggested ordering, but a) it should really only be a compile > > time hack, and b) we haven't really tested that BasicAA actually > > delegates rather than directly returning "maybe". > > > > > > Thanks, > > > > Ana. > > > > > > > > From: Daniel Berlin [mailto: dberlin at dberlin.org ] > > Sent: Friday, January 09, 2015 6:41 PM > > To: George Burgess IV; Hal Finkel > > Cc: Ana Pazos; Pazos, Ana; Jiangning Liu; chandlerc > > > > > > Subject: Re: question about enabling cfl-aa and collecting a57 numbers > > > > > > Right. I would not try with -cfl-aa -basicaa as the order, it should > > be the other way around. > > > > > > If you discover performance regressions with *that*, that would be > > highly interesting. > > > > > > On Fri Jan 09 2015 at 6:32:28 PM George Burgess IV < > > george.burgess.iv at gmail.com > wrote: > > > > > > > > Thanks for the notes and observations Ana! :) > > > > >> Hal, what do you think? Does it meant that cfl-aa is more > > >> conservative (note I set the right order -cfa-aa followed by > > >> -basicaa)? > > Currently, yes. The implementation as it is is focused heavily on > > being fast, and on trying to cover cases that BasicAA can’t easily > > cover itself. If our end goal is to use this as a replacement for > > BasicAA, there’s probably quite a few low-hanging fruit for accuracy > > improvements, and we can start playing with sacrificing speed in > > pursuit of greater accuracy. On the other hand, if the end goal is to > > append this pass to the list of AA passes already being performed, > > then I’d be interested in seeing the difference in perf between > > `-basicaa` and `-basicaa -cfl-aa`, if any. > > > > WRT this issue in particular, I believe CFLAA answers conservatively > > because: > > A. it doesn’t take into account constant pointer offsets (i.e. it > > considers a[0]..a[n] all as the same “address”) B. CFLAA is entirely > > context insensitive > > > > IIRC, BasicAA is somewhat context sensitive, and it does take into > > account constant pointer offsets, so it can be more accurate in this > > case. That being said, I don’t have access to the SPEC code, so I can > > only speculate. :) > > > > George > > > > > On Jan 9, 2015, at 4:25 PM, Hal Finkel < hfinkel at anl.gov > wrote: > > > > > > Ana, > > > > > > Thanks so much for looking into this. I'm adding some additional > > > relevant people to the CC line... > > > > > > -Hal > > > > > > ----- Original Message ----- > > >> From: "Ana Pazos" < apazos at codeaurora.org > > > >> To: "Ana Pazos" < apazos at quicinc.com >, "Hal Finkel" < > > >> hfinkel at anl.gov >, "Jiangning Liu" < Jiangning.Liu at arm.com > > > >> Sent: Friday, January 9, 2015 3:12:04 PM > > >> Subject: RE: question about enabling cfl-aa and collecting a57 > > >> numbers > > >> > > >> Hi Hal and Jiangning, > > >> > > >> I started to look at the effect of cfl-aa on SPEC. > > >> > > >> The first observation is that I see more redundant loads not being > > >> hoisted out of loops by LICM. > > >> > > >> Due to license restrictions, I cannot paste SPEC code here, but I > > >> think you have access to it. > > >> > > >> I tried to simplify a top function in equake (quake.c smvp) to show > > >> the issue. See the attached reduced test case. > > >> > > >> If you look at the the source code you will see something like: > > >> for { > > >> ...load v[i][0]... > > >> while { > > >> ...load v[i][0]... > > >> store w[col][0].... > > >> } > > >> ...load w[i][0]... > > >> store w[i][0]... > > >> } > > >> > > >> It should be possible to avoid load &v[i][0], &v[i][1], &vi[i][2] > > >> inside the while loop. > > >> > > >> In the simplified example that would be this instruction: > > >> %21 = load double** %arrayidx8, align 8, !tbaa !5 > > >> > > >> And the pointer for the load instruction: > > >> %arrayidx8 = getelementptr inbounds double** %v, i64 %idxprom > > >> > > >> The decision to hoist it fails at LICM.cpp:189: AliasSet &AS > > >> CurAST->getAliasSetForPointer(V, Size, AAInfo).isMod. > > >> > > >> With basicaa, only loads are in the alias set, but with cfl-aa you > > >> find loads and stores, so isMod returns true. > > >> > > >> AliasSet with basicaa: > > >> AliasSet[0x2b35aa0, 7] may alias, Ref Pointers: (double*** > > >> %arrayidx32, 8), (double** %12, 8), (double** %arrayidx36, 8), > > >> (double** %arrayidx8, 8), (double** %arrayidx68, 8), (double** > > >> %arrayidx77, 8), (double** %arrayidx85, 8) > > >> $5 = void > > >> > > >> AliasSet with cfa-aa: > > >> AliasSet[0x2b37df0, 20] may alias, Mod/Ref Pointers: (i32* > > >> %arrayidx30, 4), (double*** %arrayidx32, 8), (double** %12, 8), > > >> (double* %13, 8), (double** %arrayidx36, 8), (double* %15, 8), > > >> (double* %arrayidx42, 8), (double* %arrayidx45, 8), (double* > > >> %arrayidx51, 8), (double* %arrayidx54, 8), (double** %arrayidx8, > > >> 8), (double* %21, 8), (double** %arrayidx68, 8), (double* %23, 8), > > >> (double* %arrayidx72, 8), (double** %arrayidx77, 8), (double* %26, > > >> 8), (double* %arrayidx81, 8), (double** %arrayidx85, 8), (double* > > >> %29, 8) > > >> $8 = void > > >> > > >> You can reproduce it with the commands: > > >> opt -O3 -S -cfl-aa -basicaa -licm -debug-only=licm reduce.ll -o out > > >> > > >> opt -O3 -S - -basicaa -licm -debug-only=licm reduce.ll -o out LICM > > >> hoisting to while.body.lr.ph : %21 = load double** %arrayidx8, > > >> align 8, !tbaa !5 LICM hoisting to while.body.lr.ph : %arrayidx72 > > >> getelementptr inbounds double* %11, i64 1 LICM hoisting to > > >> while.body.lr.ph : %arrayidx81 = getelementptr inbounds double* > > >> %11, i64 2 > > >> > > >> Hal, what do you think? Does it meant that cfl-aa is more > > >> conservative (note I set the right order -cfa-aa followed by > > >> -basicaa)? Could it be an issue with building the AliasSet tracker? > > >> Any pointers on how to fix this? > > >> > > >> Thanks, > > >> Ana. > > >> > > >> > > >> > > >> -----Original Message----- > > >> From: Jiangning Liu [mailto: Jiangning.Liu at arm.com ] > > >> Sent: Wednesday, December 24, 2014 6:27 PM > > >> To: Hal Finkel; Ana Pazos > > >> Subject: RE: question about enabling cfl-aa and collecting a57 > > >> numbers > > >> > > >> Hi, > > >> > > >> I saw big regressions on cortex-A57 for the following SPEC > > >> benchmarks after using "-mllvm -use-cfl-aa ". > > >> > > >> spec.cpu2006.ref.462_libquantum 9.22% spec.cpu2000.ref.179_art > > >> 5.32% > > >> spec.cpu2000.ref.256_bzip2 4.91% > > >> > > >> Thanks, > > >> -Jiangning > > >> > > >>> -----Original Message----- > > >>> From: Hal Finkel [mailto: hfinkel at anl.gov ] > > >>> Sent: Wednesday, December 24, 2014 3:56 AM > > >>> To: Ana Pazos > > >>> Cc: Jiangning Liu > > >>> Subject: Re: question about enabling cfl-aa and collecting a57 > > >>> numbers > > >>> > > >>> ----- Original Message ----- > > >>>> From: "Ana Pazos" < apazos at codeaurora.org > > > >>>> To: "Hal Finkel" < hfinkel at anl.gov >, "Jiangning Liu" > > >>>> < Jiangning.Liu at arm.com > > > >>>> Sent: Monday, December 22, 2014 4:25:07 PM > > >>>> Subject: question about enabling cfl-aa and collecting a57 > > >>>> numbers > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> Hi Hal, > > >>>> > > >>>> > > >>>> > > >>>> Can you clarify to enable cfl-aa from clang we need both these > > >>>> flags? > > >>>> > > >>>> > > >>> > > >>> Hi Ana, > > >>> > > >>> Thanks for following-up on this! You only need (-mllvm > > >>> -use-cfl-aa) > > >>> to > > >>> use CFL AA during the main optimization pipeline, and just this is > > >>> enough to reproduce the performance regressions as far as I know. > > >>> You > > >>> can also use CFL AA during code generation (-mllvm > > >>> -use-cfl-aa-in-codegen), which matters only if your using AA at > > >>> all for code generation (which AArch64 does only for the > > >>> Cortex/A53). > > >>> > > >>> -Hal > > >>> > > >>>> > > >>>> -mllvm -use-cfl-aa > > >>>> > > >>>> -mllvm -cfl-aa-in-codegen > > >>>> > > >>>> > > >>>> > > >>>> Hi Jiangning, > > >>>> > > >>>> Did you collect a57 perf results for SPEC 2000 and 2006 enabling > > >>>> these two flags? > > >>>> > > >>>> Did you notice any correctness and performance regression? > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> Thanks, > > >>>> > > >>>> Ana. > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> Ana Pazos > > >>>> Qualcomm Innovation Center, Inc. > > >>>> The Qualcomm Innovation Center, Inc. is a member of the Code > > >>>> Aurora Forum, a Linux Foundation Collaborative Project. > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> > > >>> > > >>> -- > > >>> Hal Finkel > > >>> Assistant Computational Scientist > > >>> Leadership Computing Facility > > >>> Argonne National Laboratory > > >> > > >> > > >> -- IMPORTANT NOTICE: The contents of this email and any attachments > > >> are confidential and may also be privileged. If you are not the > > >> intended recipient, please notify the sender immediately and do not > > >> disclose the contents to any other person, use it for any purpose, > > >> or store or copy the information in any medium. Thank you. > > >> > > >> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 > > >> 9NJ, Registered in England & Wales, Company No: 2557590 ARM > > >> Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 > > >> 9NJ, Registered in England & Wales, Company No: 2548782 > > >> > > > > > > -- > > > Hal Finkel > > > Assistant Computational Scientist > > > Leadership Computing Facility > > > Argonne National Laboratory > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > <cflaafix.diff> > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/17ec90ed/attachment.html> -------------- next part -------------- diff --git a/lib/Analysis/CFLAliasAnalysis.cpp b/lib/Analysis/CFLAliasAnalysis.cpp index 88bb84a..95fd377 100644 --- a/lib/Analysis/CFLAliasAnalysis.cpp +++ b/lib/Analysis/CFLAliasAnalysis.cpp @@ -227,10 +227,13 @@ public: // Comparisons between global variables and other constants should be // handled by BasicAA. if (isa<Constant>(LocA.Ptr) && isa<Constant>(LocB.Ptr)) { - return MayAlias; + return AliasAnalysis::alias(LocA, LocB); } + AliasResult QueryResult = query(LocA, LocB); + if (QueryResult == MayAlias) + return AliasAnalysis::alias(LocA, LocB); - return query(LocA, LocB); + return QueryResult; } void initializePass() override { InitializeAliasAnalysis(this); } @@ -295,8 +298,9 @@ public: } void visitSelectInst(SelectInst &Inst) { - auto *Condition = Inst.getCondition(); - Output.push_back(Edge(&Inst, Condition, EdgeType::Assign, AttrNone)); + // Condition is irrelevant, it is evaluated, but not loaded, + // stored, or assigned + // The results of the select are assigned auto *TrueVal = Inst.getTrueValue(); Output.push_back(Edge(&Inst, TrueVal, EdgeType::Assign, AttrNone)); auto *FalseVal = Inst.getFalseValue(); @@ -768,7 +772,10 @@ static Optional<StratifiedAttr> valueToAttrIndex(Value *Val) { return AttrGlobalIndex; if (auto *Arg = dyn_cast<Argument>(Val)) - if (!Arg->hasNoAliasAttr()) + // Only pointer arguments should have the argument attribute, + // because things can't escape through scalars without us seeing a + // cast, and thus, interaction with them doesn't matter. + if (!Arg->hasNoAliasAttr() && Arg->getType()->isPointerTy()) return argNumberToAttrIndex(Arg->getArgNo()); return NoneType(); } @@ -991,10 +998,6 @@ CFLAliasAnalysis::query(const AliasAnalysis::Location &LocA, auto SetA = *MaybeA; auto SetB = *MaybeB; - - if (SetA.Index == SetB.Index) - return AliasAnalysis::PartialAlias; - auto AttrsA = Sets.getLink(SetA.Index).Attrs; auto AttrsB = Sets.getLink(SetB.Index).Attrs; // Stratified set attributes are used as markets to signify whether a member @@ -1009,5 +1012,12 @@ CFLAliasAnalysis::query(const AliasAnalysis::Location &LocA, if (AttrsA.any() && AttrsB.any()) return AliasAnalysis::MayAlias; + // If we haven't interacted with globals, and they still got + // unified, they are partial alias right now. + // NOTE: This assumes we don't unify for any reason other than + // assignments + if (SetA.Index == SetB.Index) + return AliasAnalysis::PartialAlias; + return AliasAnalysis::NoAlias; } diff --git a/lib/CodeGen/Passes.cpp b/lib/CodeGen/Passes.cpp index 28e9f84..82ca024 100644 --- a/lib/CodeGen/Passes.cpp +++ b/lib/CodeGen/Passes.cpp @@ -400,10 +400,10 @@ void TargetPassConfig::addIRPasses() { // Add TypeBasedAliasAnalysis before BasicAliasAnalysis so that // BasicAliasAnalysis wins if they disagree. This is intended to help // support "obvious" type-punning idioms. - if (UseCFLAA) - addPass(createCFLAliasAnalysisPass()); addPass(createTypeBasedAliasAnalysisPass()); addPass(createScopedNoAliasAAPass()); + if (UseCFLAA) + addPass(createCFLAliasAnalysisPass()); addPass(createBasicAliasAnalysisPass()); // Before running any passes, run the verifier to determine if the input diff --git a/lib/Transforms/IPO/PassManagerBuilder.cpp b/lib/Transforms/IPO/PassManagerBuilder.cpp index bb776ef..645686c 100644 --- a/lib/Transforms/IPO/PassManagerBuilder.cpp +++ b/lib/Transforms/IPO/PassManagerBuilder.cpp @@ -132,10 +132,10 @@ PassManagerBuilder::addInitialAliasAnalysisPasses(PassManagerBase &PM) const { // Add TypeBasedAliasAnalysis before BasicAliasAnalysis so that // BasicAliasAnalysis wins if they disagree. This is intended to help // support "obvious" type-punning idioms. - if (UseCFLAA) - PM.add(createCFLAliasAnalysisPass()); PM.add(createTypeBasedAliasAnalysisPass()); PM.add(createScopedNoAliasAAPass()); + if (UseCFLAA) + PM.add(createCFLAliasAnalysisPass()); PM.add(createBasicAliasAnalysisPass()); } diff --git a/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll b/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll index a0195d7..f68edaf 100644 --- a/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll +++ b/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll @@ -5,7 +5,8 @@ target datalayout = "e-p:32:32:32" ; CHECK: 1 partial alias response -define i32 @test(i32* %tab, i32 %indvar) nounwind { +define i32 @test(i32 %indvar) nounwind { + %tab = alloca i32, align 4 %tmp31 = mul i32 %indvar, -2 %tmp32 = add i32 %tmp31, 30 %t.5 = getelementptr i32* %tab, i32 %tmp32 diff --git a/test/Analysis/CFLAliasAnalysis/must-and-partial.ll b/test/Analysis/CFLAliasAnalysis/must-and-partial.ll index df7de38..2c66d90 100644 --- a/test/Analysis/CFLAliasAnalysis/must-and-partial.ll +++ b/test/Analysis/CFLAliasAnalysis/must-and-partial.ll @@ -7,8 +7,9 @@ target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64" ; CHECK: PartialAlias: i16* %bigbase0, i8* %phi -define i8 @test0(i8* %base, i1 %x) { +define i8 @test0(i1 %x) { entry: + %base = alloca i8, align 4 %baseplusone = getelementptr i8* %base, i64 1 br i1 %x, label %red, label %green red: @@ -25,8 +26,9 @@ green: } ; CHECK: PartialAlias: i16* %bigbase1, i8* %sel -define i8 @test1(i8* %base, i1 %x) { +define i8 @test1(i1 %x) { entry: + %base = alloca i8, align 4 %baseplusone = getelementptr i8* %base, i64 1 %sel = select i1 %x, i8* %baseplusone, i8* %base store i8 0, i8* %sel @@ -37,3 +39,15 @@ entry: %loaded = load i8* %sel ret i8 %loaded } + +; Arguments should not be partial alias +; CHECK: MayAlias: double* %A, double* %Index +define void @testr2(double* nocapture readonly %A, double* nocapture readonly %Index) { + %arrayidx22 = getelementptr inbounds double* %Index, i64 2 + %1 = load double* %arrayidx22 + %arrayidx25 = getelementptr inbounds double* %A, i64 2 + %2 = load double* %arrayidx25 + %mul26 = fmul double %1, %2 + ret void +} +
Nick Lewycky
2015-Jan-15 21:26 UTC
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
On 15 January 2015 at 13:10, Daniel Berlin <dberlin at dberlin.org> wrote:> Yes. > I've attached an updated patch that does the following: > > 1. Fixes the partialalias of globals/arguments > 2. Enables partialalias for cases where nothing has been unified to a > global/argument > 3. Fixes that select was unifying the condition to the other pieces (the > condition does not need to be processed :P). This was causing unnecessary > aliasing. >Consider this: void *p = ...; uintptr_t i = p; uintptr_t j = 0; for (int a = 0; a < sizeof(uintptr_t); ++a) { j = i >> (sizeof(uintptr_t) - a - 1) ? 1 : 0; j <<= 1; } void *q = j; alias(p, q) isn't NoAlias. (Okay, it kinda is in C++, but not in the equivalent LLVM IR. Please don't make me rewrite my example in LLVM IR.) Nick> 4. Adds a regression test to must-and-partial > 5. Fixes the pass ordering > 6. Does not alias non-pointer arguments to random things :) > 7. Fixes the CFL test cases that break with above. > > > > > On Thu, Jan 15, 2015 at 11:33 AM, Ana Pazos <apazos at codeaurora.org> wrote: > >> Daniel, don’t we need to fix the order of invoking alias analyses in >> lib/Transforms/IPO/PassManagerBuilder.cpp as well? >> >> >> >> Your patch fixed the order in lib/CodeGen/Passes.cpp and the delegation >> code in lib/Analysis/CFLAliasAnalysis.cpp. >> >> >> >> Thanks, >> >> Ana. >> >> >> >> *From:* Daniel Berlin [mailto:dberlin at dberlin.org] >> *Sent:* Wednesday, January 14, 2015 1:10 PM >> *To:* Ana Pazos >> *Cc:* George Burgess IV; Nick Lewycky; Jiangning Liu; LLVM Developers >> Mailing List >> >> *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting >> a57 numbers >> >> >> >> Oh, sorry, i didn't rebase it when i changed the fix, you would have had >> to apply the first on top of the second. >> >> Here is one against HEAD >> >> >> >> On Wed, Jan 14, 2015 at 12:32 PM, Ana Pazos <apazos at codeaurora.org> >> wrote: >> >> Daniel, your patch does not apply cleanly. Are you on the tip? >> >> The code I see there is no line if (QueryResult == MayAlias|| QueryResult == PartialAlias) to be removed. >> >> Thanks, >> >> Ana. >> >> >> >> *From:* George Burgess IV [mailto:george.burgess.iv at gmail.com] >> *Sent:* Wednesday, January 14, 2015 10:31 AM >> *To:* Daniel Berlin >> *Cc:* Nick Lewycky; Ana Pazos; Jiangning Liu; LLVM Developers Mailing >> List >> *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting >> a57 numbers >> >> >> >> Inline >> >> - George >> >> >> On Jan 14, 2015, at 10:49 AM, Daniel Berlin <dberlin at dberlin.org> wrote: >> >> >> >> >> >> On Tue, Jan 13, 2015 at 11:26 PM, Nick Lewycky <nlewycky at google.com> >> wrote: >> >> On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org> wrote: >> >> This is caused by CFLAA returning PartialAlias for a query that BasicAA >> can prove is NoAlias. >> >> >> >> One of them is wrong. Which one? >> >> >> >> CFL-AA. >> >> >> >> Right now it checks whether two things come out to be the same stratified >> info and have the same index, and if so, returns PartialAlias. This is >> wrong, because it never knows why two things got unified, only that they >> did. >> >> >> >> Therefore, the current check it does where it returns PartialAlias seems >> wrong. >> >> >> >> George, why does it return PartialAlias? >> >> It shouldn't -- unless I misinterpreted an earlier discussion, I was >> under the impression that we were testing with that line updated to return >> MayAlias. We do need to update that in LLVM though; thanks for the fix. >> >> >> >> IMHO, you can't ever prove any kind of must-alias or partialalias info >> with CFL-AA, unless i'm missing something. >> >> Agreed. >> >> >> >> >> >> I'm not sure from your description that this is a chaining issue. >> PartialAlias doesn't chain and isn't supposed to, it's a final answer just >> like NoAlias and MustAlias are. You return partial alias when you have >> proven that the accesses are overlapping, meaning that it is proven to >> neither be no alias nor must alias. >> >> Well, it's still chaining wrong, because it did not change on MayAlias, >> but you are right that it is not the issue i identified :) >> >> >> >> Anyway, updated patch attached. >> >> >> >> >> >> Nick >> >> >> >> I've attached a patch that fixes it, but truthfully, chaining seems >> somewhat badly designed IMHO. >> >> >> >> It should let you pass along the result you've gotten so far, so that you >> don't have CFLAA get PartialAlias, chain, and have BasicAA get MayAlias. >> >> >> >> (IE it should let it work strictly down the hierarchy). >> >> >> >> Note that this patch breaks the CFLAliasAnalysis tests because it no >> longer returns PartialAlias in some cases it is expecting it. >> >> >> >> I don't have time ATM to fix chaining completely, so if someone wants to >> shepherd this patch through, that would be appreciated. >> >> >> >> >> >> >> >> >> >> >> >> On Tue, Jan 13, 2015 at 8:58 PM, Daniel Berlin <dberlin at dberlin.org> >> wrote: >> >> Can you send me actual LLVM IR or a preprocessed source from using -E? >> I don't have a machine handy that has headers that target that arch. >> >> >> >> >> >> On Tue Jan 13 2015 at 4:33:29 PM Daniel Berlin <dberlin at dberlin.org> >> wrote: >> >> Anything other than noalias or mustalias should be getting passed down >> the stack, so either that is not happening or CFL aa is giving better >> answers and something does worse with those better answers. I'll take a >> look this evening >> >> On Jan 13, 2015 3:58 PM, "Ana Pazos" <apazos at codeaurora.org> wrote: >> >> Hi folks, >> >> Moving the discussion to llvm.dev. >> >> None of the changes we talked earlier help. >> >> Find attached the C source code that you can use to reproduce the issue. >> >> clang --target=aarch64-linux-gnu -c -mcpu=cortex-a57 -Ofast >> -fno-math-errno test.c -S -o test.s -mllvm -debug-only=licm >> LICM hoisting to while.body.lr.ph: %21 = load double** %arrayidx8, >> align 8, !tbaa !5 >> LICM hoisting to while.body.lr.ph: %arrayidx72 = getelementptr >> inbounds double* %11, i64 1 >> LICM hoisting to while.body.lr.ph: %arrayidx81 = getelementptr >> inbounds double* %11, i64 2 >> LICM hoisting to for.body.lr.ph: %2 = ptrtoint i32* %Index to i64 >> >> clang --target=aarch64-linux-gnu -c -mcpu=cortex-a57 -Ofast >> -fno-math-errno test.c -S -o test-cflaa.s -mllvm -use-cfl-aa -mllvm >> -debug-only=licm >> LICM hoisting to for.body.lr.ph: %2 = ptrtoint i32* %Index to i64 >> >> Why CFL AA cannot allow hoisting this out: %21 = load double** >> %arrayidx8, align 8, !tbaa !5 >> >> Which leads to this extra load in assembly code: ldr x14, [x4, x9, lsl >> #3] >> >> Thanks! >> Ana. >> >> -----Original Message----- >> From: Hal Finkel [mailto:hfinkel at anl.gov] >> Sent: Tuesday, January 13, 2015 12:51 AM >> To: Chandler Carruth >> Cc: Jiangning Liu; Pazos, Ana; Ana Pazos; Daniel Berlin; George Burgess IV >> Subject: Re: question about enabling cfl-aa and collecting a57 numbers >> >> ----- Original Message ----- >> > From: "Chandler Carruth" <chandlerc at gmail.com> >> > To: "Ana Pazos" <apazos at codeaurora.org>, "Daniel Berlin" < >> dberlin at dberlin.org>, "George Burgess IV" >> > <george.burgess.iv at gmail.com>, "Hal Finkel" <hfinkel at anl.gov> >> > Cc: "Jiangning Liu" <Jiangning.Liu at arm.com>, "Ana Pazos" >> > <apazos at quicinc.com> >> > Sent: Monday, January 12, 2015 6:43:44 PM >> > Subject: Re: question about enabling cfl-aa and collecting a57 numbers >> > >> > Not quite: >> > >> > >> > On Mon Jan 12 2015 at 4:27:30 PM Ana Pazos < apazos at codeaurora.org > >> > wrote: >> > >> > Thanks George and Daniel, >> > >> > >> > >> > If the recommended order is “–basicaa –cfla-aa” it means we should fix >> > the trunk code that processes the flags -mllvm use-cfl-aa-in-codegen >> > and –mllvm use-cfl-aa, right? >> > >> > >> > >> > In Transforms/IPO/PassManagerBuilder.cpp and CodeGen/Passes.cpp we see >> > this sequence: >> > >> > >> > >> > if (UseCFLAA) >> > >> > PM.add(createCFLAliasAnalysisPass()); >> > >> > PM.add(createTypeBasedAliasAnalysisPass()); >> > >> > PM.add(createScopedNoAliasAAPass()); >> > >> > PM.add(createBasicAliasAnalysisPass()); >> > >> > >> > >> > So are you recommending changing to the sequence below instead? >> > >> > >> > Not quite: >> > >> > >> > addPass(createTypeBasedAliasAnalysisPass()); >> > >> > addPass(createScopedNoAliasAAPass()); >> > You want it here. >> > >> > addPass(createBasicAliasAnalysisPass()); >> > >> > if (UseCFLAA) >> > >> > addPass(createCFLAliasAnalysisPass()); >> > >> > >> > The way to think about this is as follows: >> > >> > >> > if TBAA or ScopedNoAlias *can* return results, you want them to. They >> > are reflecting a-priori knowledge of aliasing. >> > >> > >> > if both of those say "maybe", you want to ask CFL. If CFL says "maybe" >> > you want to ask "BasicAA". >> >> I could very well be misunderstanding something, but I recall that we add >> BasicAA last so that it will be queried first. We do this because of our >> BasicAA delegation hack: We want a MustAlias from BasicAA to override a >> NoAlias from TBAA, so that we can catch cases of basic local type punning. >> Thus, assuming CFL delegates properly, I fail to see how switching its >> order w.r.t. the metadata-based analysis changes anything. >> >> -Hal >> >> > >> > >> > It's possible that you could as a compile-time hack flip the last two >> > to your suggested ordering, but a) it should really only be a compile >> > time hack, and b) we haven't really tested that BasicAA actually >> > delegates rather than directly returning "maybe". >> > >> > >> > Thanks, >> > >> > Ana. >> > >> > >> > >> > From: Daniel Berlin [mailto: dberlin at dberlin.org ] >> > Sent: Friday, January 09, 2015 6:41 PM >> > To: George Burgess IV; Hal Finkel >> > Cc: Ana Pazos; Pazos, Ana; Jiangning Liu; chandlerc >> > >> > >> > Subject: Re: question about enabling cfl-aa and collecting a57 numbers >> > >> > >> > Right. I would not try with -cfl-aa -basicaa as the order, it should >> > be the other way around. >> > >> > >> > If you discover performance regressions with *that*, that would be >> > highly interesting. >> > >> > >> > On Fri Jan 09 2015 at 6:32:28 PM George Burgess IV < >> > george.burgess.iv at gmail.com > wrote: >> > >> > >> > >> > Thanks for the notes and observations Ana! :) >> > >> > >> Hal, what do you think? Does it meant that cfl-aa is more >> > >> conservative (note I set the right order -cfa-aa followed by >> > >> -basicaa)? >> > Currently, yes. The implementation as it is is focused heavily on >> > being fast, and on trying to cover cases that BasicAA can’t easily >> > cover itself. If our end goal is to use this as a replacement for >> > BasicAA, there’s probably quite a few low-hanging fruit for accuracy >> > improvements, and we can start playing with sacrificing speed in >> > pursuit of greater accuracy. On the other hand, if the end goal is to >> > append this pass to the list of AA passes already being performed, >> > then I’d be interested in seeing the difference in perf between >> > `-basicaa` and `-basicaa -cfl-aa`, if any. >> > >> > WRT this issue in particular, I believe CFLAA answers conservatively >> > because: >> > A. it doesn’t take into account constant pointer offsets (i.e. it >> > considers a[0]..a[n] all as the same “address”) B. CFLAA is entirely >> > context insensitive >> > >> > IIRC, BasicAA is somewhat context sensitive, and it does take into >> > account constant pointer offsets, so it can be more accurate in this >> > case. That being said, I don’t have access to the SPEC code, so I can >> > only speculate. :) >> > >> > George >> > >> > > On Jan 9, 2015, at 4:25 PM, Hal Finkel < hfinkel at anl.gov > wrote: >> > > >> > > Ana, >> > > >> > > Thanks so much for looking into this. I'm adding some additional >> > > relevant people to the CC line... >> > > >> > > -Hal >> > > >> > > ----- Original Message ----- >> > >> From: "Ana Pazos" < apazos at codeaurora.org > >> > >> To: "Ana Pazos" < apazos at quicinc.com >, "Hal Finkel" < >> > >> hfinkel at anl.gov >, "Jiangning Liu" < Jiangning.Liu at arm.com > >> > >> Sent: Friday, January 9, 2015 3:12:04 PM >> > >> Subject: RE: question about enabling cfl-aa and collecting a57 >> > >> numbers >> > >> >> > >> Hi Hal and Jiangning, >> > >> >> > >> I started to look at the effect of cfl-aa on SPEC. >> > >> >> > >> The first observation is that I see more redundant loads not being >> > >> hoisted out of loops by LICM. >> > >> >> > >> Due to license restrictions, I cannot paste SPEC code here, but I >> > >> think you have access to it. >> > >> >> > >> I tried to simplify a top function in equake (quake.c smvp) to show >> > >> the issue. See the attached reduced test case. >> > >> >> > >> If you look at the the source code you will see something like: >> > >> for { >> > >> ...load v[i][0]... >> > >> while { >> > >> ...load v[i][0]... >> > >> store w[col][0].... >> > >> } >> > >> ...load w[i][0]... >> > >> store w[i][0]... >> > >> } >> > >> >> > >> It should be possible to avoid load &v[i][0], &v[i][1], &vi[i][2] >> > >> inside the while loop. >> > >> >> > >> In the simplified example that would be this instruction: >> > >> %21 = load double** %arrayidx8, align 8, !tbaa !5 >> > >> >> > >> And the pointer for the load instruction: >> > >> %arrayidx8 = getelementptr inbounds double** %v, i64 %idxprom >> > >> >> > >> The decision to hoist it fails at LICM.cpp:189: AliasSet &AS >> > >> CurAST->getAliasSetForPointer(V, Size, AAInfo).isMod. >> > >> >> > >> With basicaa, only loads are in the alias set, but with cfl-aa you >> > >> find loads and stores, so isMod returns true. >> > >> >> > >> AliasSet with basicaa: >> > >> AliasSet[0x2b35aa0, 7] may alias, Ref Pointers: (double*** >> > >> %arrayidx32, 8), (double** %12, 8), (double** %arrayidx36, 8), >> > >> (double** %arrayidx8, 8), (double** %arrayidx68, 8), (double** >> > >> %arrayidx77, 8), (double** %arrayidx85, 8) >> > >> $5 = void >> > >> >> > >> AliasSet with cfa-aa: >> > >> AliasSet[0x2b37df0, 20] may alias, Mod/Ref Pointers: (i32* >> > >> %arrayidx30, 4), (double*** %arrayidx32, 8), (double** %12, 8), >> > >> (double* %13, 8), (double** %arrayidx36, 8), (double* %15, 8), >> > >> (double* %arrayidx42, 8), (double* %arrayidx45, 8), (double* >> > >> %arrayidx51, 8), (double* %arrayidx54, 8), (double** %arrayidx8, >> > >> 8), (double* %21, 8), (double** %arrayidx68, 8), (double* %23, 8), >> > >> (double* %arrayidx72, 8), (double** %arrayidx77, 8), (double* %26, >> > >> 8), (double* %arrayidx81, 8), (double** %arrayidx85, 8), (double* >> > >> %29, 8) >> > >> $8 = void >> > >> >> > >> You can reproduce it with the commands: >> > >> opt -O3 -S -cfl-aa -basicaa -licm -debug-only=licm reduce.ll -o out >> > >> >> > >> opt -O3 -S - -basicaa -licm -debug-only=licm reduce.ll -o out LICM >> > >> hoisting to while.body.lr.ph : %21 = load double** %arrayidx8, >> > >> align 8, !tbaa !5 LICM hoisting to while.body.lr.ph : %arrayidx72 >> > >> getelementptr inbounds double* %11, i64 1 LICM hoisting to >> > >> while.body.lr.ph : %arrayidx81 = getelementptr inbounds double* >> > >> %11, i64 2 >> > >> >> > >> Hal, what do you think? Does it meant that cfl-aa is more >> > >> conservative (note I set the right order -cfa-aa followed by >> > >> -basicaa)? Could it be an issue with building the AliasSet tracker? >> > >> Any pointers on how to fix this? >> > >> >> > >> Thanks, >> > >> Ana. >> > >> >> > >> >> > >> >> > >> -----Original Message----- >> > >> From: Jiangning Liu [mailto: Jiangning.Liu at arm.com ] >> > >> Sent: Wednesday, December 24, 2014 6:27 PM >> > >> To: Hal Finkel; Ana Pazos >> > >> Subject: RE: question about enabling cfl-aa and collecting a57 >> > >> numbers >> > >> >> > >> Hi, >> > >> >> > >> I saw big regressions on cortex-A57 for the following SPEC >> > >> benchmarks after using "-mllvm -use-cfl-aa ". >> > >> >> > >> spec.cpu2006.ref.462_libquantum 9.22% spec.cpu2000.ref.179_art >> > >> 5.32% >> > >> spec.cpu2000.ref.256_bzip2 4.91% >> > >> >> > >> Thanks, >> > >> -Jiangning >> > >> >> > >>> -----Original Message----- >> > >>> From: Hal Finkel [mailto: hfinkel at anl.gov ] >> > >>> Sent: Wednesday, December 24, 2014 3:56 AM >> > >>> To: Ana Pazos >> > >>> Cc: Jiangning Liu >> > >>> Subject: Re: question about enabling cfl-aa and collecting a57 >> > >>> numbers >> > >>> >> > >>> ----- Original Message ----- >> > >>>> From: "Ana Pazos" < apazos at codeaurora.org > >> > >>>> To: "Hal Finkel" < hfinkel at anl.gov >, "Jiangning Liu" >> > >>>> < Jiangning.Liu at arm.com > >> > >>>> Sent: Monday, December 22, 2014 4:25:07 PM >> > >>>> Subject: question about enabling cfl-aa and collecting a57 >> > >>>> numbers >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> Hi Hal, >> > >>>> >> > >>>> >> > >>>> >> > >>>> Can you clarify to enable cfl-aa from clang we need both these >> > >>>> flags? >> > >>>> >> > >>>> >> > >>> >> > >>> Hi Ana, >> > >>> >> > >>> Thanks for following-up on this! You only need (-mllvm >> > >>> -use-cfl-aa) >> > >>> to >> > >>> use CFL AA during the main optimization pipeline, and just this is >> > >>> enough to reproduce the performance regressions as far as I know. >> > >>> You >> > >>> can also use CFL AA during code generation (-mllvm >> > >>> -use-cfl-aa-in-codegen), which matters only if your using AA at >> > >>> all for code generation (which AArch64 does only for the >> > >>> Cortex/A53). >> > >>> >> > >>> -Hal >> > >>> >> > >>>> >> > >>>> -mllvm -use-cfl-aa >> > >>>> >> > >>>> -mllvm -cfl-aa-in-codegen >> > >>>> >> > >>>> >> > >>>> >> > >>>> Hi Jiangning, >> > >>>> >> > >>>> Did you collect a57 perf results for SPEC 2000 and 2006 enabling >> > >>>> these two flags? >> > >>>> >> > >>>> Did you notice any correctness and performance regression? >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> Thanks, >> > >>>> >> > >>>> Ana. >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> Ana Pazos >> > >>>> Qualcomm Innovation Center, Inc. >> > >>>> The Qualcomm Innovation Center, Inc. is a member of the Code >> > >>>> Aurora Forum, a Linux Foundation Collaborative Project. >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>>> >> > >>> >> > >>> -- >> > >>> Hal Finkel >> > >>> Assistant Computational Scientist >> > >>> Leadership Computing Facility >> > >>> Argonne National Laboratory >> > >> >> > >> >> > >> -- IMPORTANT NOTICE: The contents of this email and any attachments >> > >> are confidential and may also be privileged. If you are not the >> > >> intended recipient, please notify the sender immediately and do not >> > >> disclose the contents to any other person, use it for any purpose, >> > >> or store or copy the information in any medium. Thank you. >> > >> >> > >> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 >> > >> 9NJ, Registered in England & Wales, Company No: 2557590 ARM >> > >> Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 >> > >> 9NJ, Registered in England & Wales, Company No: 2548782 >> > >> >> > > >> > > -- >> > > Hal Finkel >> > > Assistant Computational Scientist >> > > Leadership Computing Facility >> > > Argonne National Laboratory >> >> -- >> Hal Finkel >> Assistant Computational Scientist >> Leadership Computing Facility >> Argonne National Laboratory >> >> >> >> >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> >> >> >> >> >> <cflaafix.diff> >> >> >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/268442a9/attachment.html>
Daniel Berlin
2015-Jan-15 23:05 UTC
[LLVMdev] question about enabling cfl-aa and collecting a57 numbers
On Thu, Jan 15, 2015 at 1:26 PM, Nick Lewycky <nlewycky at google.com> wrote:> On 15 January 2015 at 13:10, Daniel Berlin <dberlin at dberlin.org> wrote: > >> Yes. >> I've attached an updated patch that does the following: >> >> 1. Fixes the partialalias of globals/arguments >> 2. Enables partialalias for cases where nothing has been unified to a >> global/argument >> 3. Fixes that select was unifying the condition to the other pieces (the >> condition does not need to be processed :P). This was causing unnecessary >> aliasing. >> > > Consider this: > > void *p = ...; > uintptr_t i = p; > uintptr_t j = 0; > for (int a = 0; a < sizeof(uintptr_t); ++a) { > j = i >> (sizeof(uintptr_t) - a - 1) ? 1 : 0; > j <<= 1; > } > void *q = j; > > alias(p, q) isn't NoAlias. (Okay, it kinda is in C++, but not in the > equivalent LLVM IR. Please don't make me rewrite my example in LLVM IR.) > > Agreed :)But after chatting with you, i think we both agree that this change does not affect. I probably should not have said "the condition does not need to be processed". It would be more accurate to say "the reference to a condition in a select instruction, by itself, does not cause aliasing" What happens now is: given %4 = select %1, %2, %3 we do aliasset(%4) += %1 aliasset(%4) += %2 aliasset(%4) += %3 The first one is unnecessary. There can be no alias caused simply because it is referenced in condition of the select.> We still need to process what %1 refers to (and we do).To make this empirical, in your example, we get the right answer in CFL-AA. Interestingly, i'll point out that basic-aa says: NoAlias: i8* %p, i8** %q NoAlias: i8** %p.addr, i8** %q (I translated your case as best i can :P) So you may want to implement it for real if you think it's supposed to be handled right in basic-aa, because I don't believe it is :) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/1b1eacb5/attachment.html>