thr3ads.net - llvm dev - [LLVMdev] question about enabling cfl-aa and collecting a57 numbers [Jan 2015]

If this information is useful, please help other people find it:
Share via:

Daniel Berlin

2015-Jan-15 21:10 UTC

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

Yes.
I've attached an updated patch that does the following:

1. Fixes the partialalias of globals/arguments
2. Enables partialalias for cases where nothing has been unified to a
global/argument
3. Fixes that select was unifying the condition to the other pieces (the
condition does not need to be processed :P). This was causing unnecessary
aliasing.
4. Adds a regression test to must-and-partial
5. Fixes the pass ordering
6. Does not alias non-pointer arguments to random things :)
7. Fixes the CFL test cases that break with above.




On Thu, Jan 15, 2015 at 11:33 AM, Ana Pazos <apazos at codeaurora.org>
wrote:
> Daniel, don’t we need to fix the order of invoking alias analyses in
> lib/Transforms/IPO/PassManagerBuilder.cpp as well?
>
>
>
> Your patch fixed the order in lib/CodeGen/Passes.cpp and the delegation
> code in lib/Analysis/CFLAliasAnalysis.cpp.
>
>
>
> Thanks,
>
> Ana.
>
>
>
> *From:* Daniel Berlin [mailto:dberlin at dberlin.org]
> *Sent:* Wednesday, January 14, 2015 1:10 PM
> *To:* Ana Pazos
> *Cc:* George Burgess IV; Nick Lewycky; Jiangning Liu; LLVM Developers
> Mailing List
>
> *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting
> a57 numbers
>
>
>
> Oh, sorry, i didn't rebase it when i changed the fix, you would have
had
> to apply the first on top of the second.
>
> Here is one against HEAD
>
>
>
> On Wed, Jan 14, 2015 at 12:32 PM, Ana Pazos <apazos at
codeaurora.org> wrote:
>
> Daniel, your patch does not apply cleanly. Are you on the tip?
>
> The code I see there is no line    if (QueryResult == MayAlias||
QueryResult == PartialAlias) to be removed.
>
> Thanks,
>
> Ana.
>
>
>
> *From:* George Burgess IV [mailto:george.burgess.iv at gmail.com]
> *Sent:* Wednesday, January 14, 2015 10:31 AM
> *To:* Daniel Berlin
> *Cc:* Nick Lewycky; Ana Pazos; Jiangning Liu; LLVM Developers Mailing List
> *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting
> a57 numbers
>
>
>
> Inline
>
> - George
>
>
> On Jan 14, 2015, at 10:49 AM, Daniel Berlin <dberlin at dberlin.org>
wrote:
>
>
>
>
>
> On Tue, Jan 13, 2015 at 11:26 PM, Nick Lewycky <nlewycky at
google.com>
> wrote:
>
> On 13 January 2015 at 22:11, Daniel Berlin <dberlin at dberlin.org>
wrote:
>
> This is caused by CFLAA returning PartialAlias for a query that BasicAA
> can prove is NoAlias.
>
>
>
> One of them is wrong. Which one?
>
>
>
> CFL-AA.
>
>
>
> Right now it checks whether two things come out to be the same stratified
> info and have the same index, and if so, returns PartialAlias. This is
> wrong, because it never knows why two things got unified, only that they
> did.
>
>
>
> Therefore, the current check it does where it returns PartialAlias seems
> wrong.
>
>
>
> George, why does it return PartialAlias?
>
> It shouldn't -- unless I misinterpreted an earlier discussion, I was
under
> the impression that we were testing with that line updated to return
> MayAlias. We do need to update that in LLVM though; thanks for the fix.
>
>
>
> IMHO, you can't ever prove any kind of must-alias or partialalias info
> with CFL-AA, unless i'm missing something.
>
> Agreed.
>
>
>
>
>
> I'm not sure from your description that this is a chaining issue.
> PartialAlias doesn't chain and isn't supposed to, it's a final
answer just
> like NoAlias and MustAlias are. You return partial alias when you have
> proven that the accesses are overlapping, meaning that it is proven to
> neither be no alias nor must alias.
>
> Well, it's still chaining wrong, because it did not change on MayAlias,
> but you are right that it is not the issue i identified :)
>
>
>
> Anyway, updated patch attached.
>
>
>
>
>
> Nick
>
>
>
> I've attached a patch that fixes it, but truthfully, chaining seems
> somewhat badly designed IMHO.
>
>
>
> It should let you pass along the result you've gotten so far, so that
you
> don't have CFLAA get PartialAlias, chain, and have BasicAA get
MayAlias.
>
>
>
> (IE it should let it work strictly down the hierarchy).
>
>
>
> Note that this patch breaks the CFLAliasAnalysis tests because it no
> longer returns PartialAlias in some cases it is expecting it.
>
>
>
> I don't have time ATM to fix chaining completely, so if someone wants
to
> shepherd this patch through, that would be appreciated.
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Jan 13, 2015 at 8:58 PM, Daniel Berlin <dberlin at
dberlin.org>
> wrote:
>
> Can you send me actual LLVM IR or a preprocessed source from using -E?
> I don't have a machine handy that has headers that target that arch.
>
>
>
>
>
> On Tue Jan 13 2015 at 4:33:29 PM Daniel Berlin <dberlin at
dberlin.org>
> wrote:
>
> Anything other than noalias or mustalias should be getting passed down the
> stack, so either that is not happening or CFL aa is giving better answers
> and something does worse with those better answers.  I'll take a look
this
> evening
>
> On Jan 13, 2015 3:58 PM, "Ana Pazos" <apazos at
codeaurora.org> wrote:
>
> Hi folks,
>
> Moving the discussion to llvm.dev.
>
> None of the changes we talked earlier help.
>
> Find attached the C source code that you can use to reproduce the issue.
>
> clang --target=aarch64-linux-gnu  -c -mcpu=cortex-a57 -Ofast
>  -fno-math-errno   test.c -S -o test.s  -mllvm -debug-only=licm
> LICM hoisting to while.body.lr.ph:   %21 = load double** %arrayidx8,
> align 8, !tbaa !5
> LICM hoisting to while.body.lr.ph:   %arrayidx72 = getelementptr inbounds
> double* %11, i64 1
> LICM hoisting to while.body.lr.ph:   %arrayidx81 = getelementptr inbounds
> double* %11, i64 2
> LICM hoisting to for.body.lr.ph:   %2 = ptrtoint i32* %Index to i64
>
> clang --target=aarch64-linux-gnu  -c   -mcpu=cortex-a57  -Ofast
>  -fno-math-errno   test.c -S -o test-cflaa.s -mllvm -use-cfl-aa -mllvm
> -debug-only=licm
> LICM hoisting to for.body.lr.ph:   %2 = ptrtoint i32* %Index to i64
>
> Why CFL AA cannot allow hoisting this out:  %21 = load double**
> %arrayidx8, align 8, !tbaa !5
>
> Which leads to this extra load in assembly code: ldr    x14, [x4, x9, lsl
> #3]
>
> Thanks!
> Ana.
>
> -----Original Message-----
> From: Hal Finkel [mailto:hfinkel at anl.gov]
> Sent: Tuesday, January 13, 2015 12:51 AM
> To: Chandler Carruth
> Cc: Jiangning Liu; Pazos, Ana; Ana Pazos; Daniel Berlin; George Burgess IV
> Subject: Re: question about enabling cfl-aa and collecting a57 numbers
>
> ----- Original Message -----
> > From: "Chandler Carruth" <chandlerc at gmail.com>
> > To: "Ana Pazos" <apazos at codeaurora.org>,
"Daniel Berlin" <
> dberlin at dberlin.org>, "George Burgess IV"
> > <george.burgess.iv at gmail.com>, "Hal Finkel"
<hfinkel at anl.gov>
> > Cc: "Jiangning Liu" <Jiangning.Liu at arm.com>,
"Ana Pazos"
> > <apazos at quicinc.com>
> > Sent: Monday, January 12, 2015 6:43:44 PM
> > Subject: Re: question about enabling cfl-aa and collecting a57 numbers
> >
> > Not quite:
> >
> >
> > On Mon Jan 12 2015 at 4:27:30 PM Ana Pazos < apazos at
codeaurora.org >
> > wrote:
> >
> > Thanks George and Daniel,
> >
> >
> >
> > If the recommended order is “–basicaa –cfla-aa” it means we should fix
> > the trunk code that processes the flags -mllvm use-cfl-aa-in-codegen
> > and –mllvm use-cfl-aa, right?
> >
> >
> >
> > In Transforms/IPO/PassManagerBuilder.cpp and CodeGen/Passes.cpp we see
> > this sequence:
> >
> >
> >
> > if (UseCFLAA)
> >
> > PM.add(createCFLAliasAnalysisPass());
> >
> > PM.add(createTypeBasedAliasAnalysisPass());
> >
> > PM.add(createScopedNoAliasAAPass());
> >
> > PM.add(createBasicAliasAnalysisPass());
> >
> >
> >
> > So are you recommending changing to the sequence below instead?
> >
> >
> > Not quite:
> >
> >
> > addPass(createTypeBasedAliasAnalysisPass());
> >
> > addPass(createScopedNoAliasAAPass());
> > You want it here.
> >
> > addPass(createBasicAliasAnalysisPass());
> >
> > if (UseCFLAA)
> >
> > addPass(createCFLAliasAnalysisPass());
> >
> >
> > The way to think about this is as follows:
> >
> >
> > if TBAA or ScopedNoAlias *can* return results, you want them to. They
> > are reflecting a-priori knowledge of aliasing.
> >
> >
> > if both of those say "maybe", you want to ask CFL. If CFL
says "maybe"
> > you want to ask "BasicAA".
>
> I could very well be misunderstanding something, but I recall that we add
> BasicAA last so that it will be queried first. We do this because of our
> BasicAA delegation hack: We want a MustAlias from BasicAA to override a
> NoAlias from TBAA, so that we can catch cases of basic local type punning.
> Thus, assuming CFL delegates properly, I fail to see how switching its
> order w.r.t. the metadata-based analysis changes anything.
>
>  -Hal
>
> >
> >
> > It's possible that you could as a compile-time hack flip the last
two
> > to your suggested ordering, but a) it should really only be a compile
> > time hack, and b) we haven't really tested that BasicAA actually
> > delegates rather than directly returning "maybe".
> >
> >
> > Thanks,
> >
> > Ana.
> >
> >
> >
> > From: Daniel Berlin [mailto: dberlin at dberlin.org ]
> > Sent: Friday, January 09, 2015 6:41 PM
> > To: George Burgess IV; Hal Finkel
> > Cc: Ana Pazos; Pazos, Ana; Jiangning Liu; chandlerc
> >
> >
> > Subject: Re: question about enabling cfl-aa and collecting a57 numbers
> >
> >
> > Right. I would not try with -cfl-aa -basicaa as the order, it should
> > be the other way around.
> >
> >
> > If you discover performance regressions with *that*, that would be
> > highly interesting.
> >
> >
> > On Fri Jan 09 2015 at 6:32:28 PM George Burgess IV <
> > george.burgess.iv at gmail.com > wrote:
> >
> >
> >
> > Thanks for the notes and observations Ana! :)
> >
> > >> Hal, what do you think? Does it meant that cfl-aa is more
> > >> conservative (note I set the right order -cfa-aa followed by
> > >> -basicaa)?
> > Currently, yes. The implementation as it is is focused heavily on
> > being fast, and on trying to cover cases that BasicAA can’t easily
> > cover itself. If our end goal is to use this as a replacement for
> > BasicAA, there’s probably quite a few low-hanging fruit for accuracy
> > improvements, and we can start playing with sacrificing speed in
> > pursuit of greater accuracy. On the other hand, if the end goal is to
> > append this pass to the list of AA passes already being performed,
> > then I’d be interested in seeing the difference in perf between
> > `-basicaa` and `-basicaa -cfl-aa`, if any.
> >
> > WRT this issue in particular, I believe CFLAA answers conservatively
> > because:
> > A. it doesn’t take into account constant pointer offsets (i.e. it
> > considers a[0]..a[n] all as the same “address”) B. CFLAA is entirely
> > context insensitive
> >
> > IIRC, BasicAA is somewhat context sensitive, and it does take into
> > account constant pointer offsets, so it can be more accurate in this
> > case. That being said, I don’t have access to the SPEC code, so I can
> > only speculate. :)
> >
> > George
> >
> > > On Jan 9, 2015, at 4:25 PM, Hal Finkel < hfinkel at anl.gov
> wrote:
> > >
> > > Ana,
> > >
> > > Thanks so much for looking into this. I'm adding some
additional
> > > relevant people to the CC line...
> > >
> > > -Hal
> > >
> > > ----- Original Message -----
> > >> From: "Ana Pazos" < apazos at codeaurora.org
>
> > >> To: "Ana Pazos" < apazos at quicinc.com >,
"Hal Finkel" <
> > >> hfinkel at anl.gov >, "Jiangning Liu" <
Jiangning.Liu at arm.com >
> > >> Sent: Friday, January 9, 2015 3:12:04 PM
> > >> Subject: RE: question about enabling cfl-aa and collecting
a57
> > >> numbers
> > >>
> > >> Hi Hal and Jiangning,
> > >>
> > >> I started to look at the effect of cfl-aa on SPEC.
> > >>
> > >> The first observation is that I see more redundant loads not
being
> > >> hoisted out of loops by LICM.
> > >>
> > >> Due to license restrictions, I cannot paste SPEC code here,
but I
> > >> think you have access to it.
> > >>
> > >> I tried to simplify a top function in equake (quake.c smvp)
to show
> > >> the issue. See the attached reduced test case.
> > >>
> > >> If you look at the the source code you will see something
like:
> > >> for {
> > >> ...load v[i][0]...
> > >> while {
> > >> ...load v[i][0]...
> > >> store w[col][0]....
> > >> }
> > >> ...load w[i][0]...
> > >> store w[i][0]...
> > >> }
> > >>
> > >> It should be possible to avoid load &v[i][0],
&v[i][1], &vi[i][2]
> > >> inside the while loop.
> > >>
> > >> In the simplified example that would be this instruction:
> > >> %21 = load double** %arrayidx8, align 8, !tbaa !5
> > >>
> > >> And the pointer for the load instruction:
> > >> %arrayidx8 = getelementptr inbounds double** %v, i64 %idxprom
> > >>
> > >> The decision to hoist it fails at LICM.cpp:189: AliasSet
&AS > > >> CurAST->getAliasSetForPointer(V, Size,
AAInfo).isMod.
> > >>
> > >> With basicaa, only loads are in the alias set, but with
cfl-aa you
> > >> find loads and stores, so isMod returns true.
> > >>
> > >> AliasSet with basicaa:
> > >> AliasSet[0x2b35aa0, 7] may alias, Ref Pointers: (double***
> > >> %arrayidx32, 8), (double** %12, 8), (double** %arrayidx36,
8),
> > >> (double** %arrayidx8, 8), (double** %arrayidx68, 8),
(double**
> > >> %arrayidx77, 8), (double** %arrayidx85, 8)
> > >> $5 = void
> > >>
> > >> AliasSet with cfa-aa:
> > >> AliasSet[0x2b37df0, 20] may alias, Mod/Ref Pointers: (i32*
> > >> %arrayidx30, 4), (double*** %arrayidx32, 8), (double** %12,
8),
> > >> (double* %13, 8), (double** %arrayidx36, 8), (double* %15,
8),
> > >> (double* %arrayidx42, 8), (double* %arrayidx45, 8), (double*
> > >> %arrayidx51, 8), (double* %arrayidx54, 8), (double**
%arrayidx8,
> > >> 8), (double* %21, 8), (double** %arrayidx68, 8), (double*
%23, 8),
> > >> (double* %arrayidx72, 8), (double** %arrayidx77, 8), (double*
%26,
> > >> 8), (double* %arrayidx81, 8), (double** %arrayidx85, 8),
(double*
> > >> %29, 8)
> > >> $8 = void
> > >>
> > >> You can reproduce it with the commands:
> > >> opt -O3 -S -cfl-aa -basicaa -licm -debug-only=licm reduce.ll
-o out
> > >>
> > >> opt -O3 -S - -basicaa -licm -debug-only=licm reduce.ll -o out
LICM
> > >> hoisting to while.body.lr.ph : %21 = load double**
%arrayidx8,
> > >> align 8, !tbaa !5 LICM hoisting to while.body.lr.ph :
%arrayidx72 > > >> getelementptr inbounds double* %11, i64 1 LICM
hoisting to
> > >> while.body.lr.ph : %arrayidx81 = getelementptr inbounds
double*
> > >> %11, i64 2
> > >>
> > >> Hal, what do you think? Does it meant that cfl-aa is more
> > >> conservative (note I set the right order -cfa-aa followed by
> > >> -basicaa)? Could it be an issue with building the AliasSet
tracker?
> > >> Any pointers on how to fix this?
> > >>
> > >> Thanks,
> > >> Ana.
> > >>
> > >>
> > >>
> > >> -----Original Message-----
> > >> From: Jiangning Liu [mailto: Jiangning.Liu at arm.com ]
> > >> Sent: Wednesday, December 24, 2014 6:27 PM
> > >> To: Hal Finkel; Ana Pazos
> > >> Subject: RE: question about enabling cfl-aa and collecting
a57
> > >> numbers
> > >>
> > >> Hi,
> > >>
> > >> I saw big regressions on cortex-A57 for the following SPEC
> > >> benchmarks after using "-mllvm -use-cfl-aa ".
> > >>
> > >> spec.cpu2006.ref.462_libquantum 9.22%
spec.cpu2000.ref.179_art
> > >> 5.32%
> > >> spec.cpu2000.ref.256_bzip2 4.91%
> > >>
> > >> Thanks,
> > >> -Jiangning
> > >>
> > >>> -----Original Message-----
> > >>> From: Hal Finkel [mailto: hfinkel at anl.gov ]
> > >>> Sent: Wednesday, December 24, 2014 3:56 AM
> > >>> To: Ana Pazos
> > >>> Cc: Jiangning Liu
> > >>> Subject: Re: question about enabling cfl-aa and
collecting a57
> > >>> numbers
> > >>>
> > >>> ----- Original Message -----
> > >>>> From: "Ana Pazos" < apazos at
codeaurora.org >
> > >>>> To: "Hal Finkel" < hfinkel at anl.gov
>, "Jiangning Liu"
> > >>>> < Jiangning.Liu at arm.com >
> > >>>> Sent: Monday, December 22, 2014 4:25:07 PM
> > >>>> Subject: question about enabling cfl-aa and
collecting a57
> > >>>> numbers
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> Hi Hal,
> > >>>>
> > >>>>
> > >>>>
> > >>>> Can you clarify to enable cfl-aa from clang we need
both these
> > >>>> flags?
> > >>>>
> > >>>>
> > >>>
> > >>> Hi Ana,
> > >>>
> > >>> Thanks for following-up on this! You only need (-mllvm
> > >>> -use-cfl-aa)
> > >>> to
> > >>> use CFL AA during the main optimization pipeline, and
just this is
> > >>> enough to reproduce the performance regressions as far as
I know.
> > >>> You
> > >>> can also use CFL AA during code generation (-mllvm
> > >>> -use-cfl-aa-in-codegen), which matters only if your using
AA at
> > >>> all for code generation (which AArch64 does only for the
> > >>> Cortex/A53).
> > >>>
> > >>> -Hal
> > >>>
> > >>>>
> > >>>> -mllvm -use-cfl-aa
> > >>>>
> > >>>> -mllvm -cfl-aa-in-codegen
> > >>>>
> > >>>>
> > >>>>
> > >>>> Hi Jiangning,
> > >>>>
> > >>>> Did you collect a57 perf results for SPEC 2000 and
2006 enabling
> > >>>> these two flags?
> > >>>>
> > >>>> Did you notice any correctness and performance
regression?
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Ana.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> Ana Pazos
> > >>>> Qualcomm Innovation Center, Inc.
> > >>>> The Qualcomm Innovation Center, Inc. is a member of
the Code
> > >>>> Aurora Forum, a Linux Foundation Collaborative
Project.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>
> > >>> --
> > >>> Hal Finkel
> > >>> Assistant Computational Scientist
> > >>> Leadership Computing Facility
> > >>> Argonne National Laboratory
> > >>
> > >>
> > >> -- IMPORTANT NOTICE: The contents of this email and any
attachments
> > >> are confidential and may also be privileged. If you are not
the
> > >> intended recipient, please notify the sender immediately and
do not
> > >> disclose the contents to any other person, use it for any
purpose,
> > >> or store or copy the information in any medium. Thank you.
> > >>
> > >> ARM Limited, Registered office 110 Fulbourn Road, Cambridge
CB1
> > >> 9NJ, Registered in England & Wales, Company No: 2557590
ARM
> > >> Holdings plc, Registered office 110 Fulbourn Road, Cambridge
CB1
> > >> 9NJ, Registered in England & Wales, Company No: 2548782
> > >>
> > >
> > > --
> > > Hal Finkel
> > > Assistant Computational Scientist
> > > Leadership Computing Facility
> > > Argonne National Laboratory
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
>
>
>
> <cflaafix.diff>
>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/17ec90ed/attachment.html>
-------------- next part --------------
diff --git a/lib/Analysis/CFLAliasAnalysis.cpp
b/lib/Analysis/CFLAliasAnalysis.cpp
index 88bb84a..95fd377 100644
--- a/lib/Analysis/CFLAliasAnalysis.cpp
+++ b/lib/Analysis/CFLAliasAnalysis.cpp
@@ -227,10 +227,13 @@ public:
     // Comparisons between global variables and other constants should be
     // handled by BasicAA.
     if (isa<Constant>(LocA.Ptr) && isa<Constant>(LocB.Ptr))
{
-      return MayAlias;
+      return AliasAnalysis::alias(LocA, LocB);
     }
+    AliasResult QueryResult = query(LocA, LocB);
+    if (QueryResult == MayAlias)
+      return AliasAnalysis::alias(LocA, LocB);
 
-    return query(LocA, LocB);
+    return QueryResult;
   }
 
   void initializePass() override { InitializeAliasAnalysis(this); }
@@ -295,8 +298,9 @@ public:
   }
 
   void visitSelectInst(SelectInst &Inst) {
-    auto *Condition = Inst.getCondition();
-    Output.push_back(Edge(&Inst, Condition, EdgeType::Assign, AttrNone));
+    // Condition is irrelevant, it is evaluated, but not loaded,
+    // stored, or assigned
+    // The results of the select are assigned
     auto *TrueVal = Inst.getTrueValue();
     Output.push_back(Edge(&Inst, TrueVal, EdgeType::Assign, AttrNone));
     auto *FalseVal = Inst.getFalseValue();
@@ -768,7 +772,10 @@ static Optional<StratifiedAttr>
valueToAttrIndex(Value *Val) {
     return AttrGlobalIndex;
 
   if (auto *Arg = dyn_cast<Argument>(Val))
-    if (!Arg->hasNoAliasAttr())
+    // Only pointer arguments should have the argument attribute,
+    // because things can't escape through scalars without us seeing a
+    // cast, and thus, interaction with them doesn't matter.
+    if (!Arg->hasNoAliasAttr() &&
Arg->getType()->isPointerTy())
       return argNumberToAttrIndex(Arg->getArgNo());
   return NoneType();
 }
@@ -991,10 +998,6 @@ CFLAliasAnalysis::query(const AliasAnalysis::Location
&LocA,
 
   auto SetA = *MaybeA;
   auto SetB = *MaybeB;
-
-  if (SetA.Index == SetB.Index)
-    return AliasAnalysis::PartialAlias;
-
   auto AttrsA = Sets.getLink(SetA.Index).Attrs;
   auto AttrsB = Sets.getLink(SetB.Index).Attrs;
   // Stratified set attributes are used as markets to signify whether a member
@@ -1009,5 +1012,12 @@ CFLAliasAnalysis::query(const AliasAnalysis::Location
&LocA,
   if (AttrsA.any() && AttrsB.any())
     return AliasAnalysis::MayAlias;
 
+  // If we haven't interacted with globals, and they still got
+  // unified, they are partial alias right now.
+  // NOTE: This assumes we don't unify for any reason other than
+  // assignments
+  if (SetA.Index == SetB.Index)
+    return AliasAnalysis::PartialAlias;
+
   return AliasAnalysis::NoAlias;
 }
diff --git a/lib/CodeGen/Passes.cpp b/lib/CodeGen/Passes.cpp
index 28e9f84..82ca024 100644
--- a/lib/CodeGen/Passes.cpp
+++ b/lib/CodeGen/Passes.cpp
@@ -400,10 +400,10 @@ void TargetPassConfig::addIRPasses() {
   // Add TypeBasedAliasAnalysis before BasicAliasAnalysis so that
   // BasicAliasAnalysis wins if they disagree. This is intended to help
   // support "obvious" type-punning idioms.
-  if (UseCFLAA)
-    addPass(createCFLAliasAnalysisPass());
   addPass(createTypeBasedAliasAnalysisPass());
   addPass(createScopedNoAliasAAPass());
+  if (UseCFLAA)
+    addPass(createCFLAliasAnalysisPass());
   addPass(createBasicAliasAnalysisPass());
 
   // Before running any passes, run the verifier to determine if the input
diff --git a/lib/Transforms/IPO/PassManagerBuilder.cpp
b/lib/Transforms/IPO/PassManagerBuilder.cpp
index bb776ef..645686c 100644
--- a/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ b/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -132,10 +132,10 @@
PassManagerBuilder::addInitialAliasAnalysisPasses(PassManagerBase &PM) const
{
   // Add TypeBasedAliasAnalysis before BasicAliasAnalysis so that
   // BasicAliasAnalysis wins if they disagree. This is intended to help
   // support "obvious" type-punning idioms.
-  if (UseCFLAA)
-    PM.add(createCFLAliasAnalysisPass());
   PM.add(createTypeBasedAliasAnalysisPass());
   PM.add(createScopedNoAliasAAPass());
+  if (UseCFLAA)
+    PM.add(createCFLAliasAnalysisPass());
   PM.add(createBasicAliasAnalysisPass());
 }
 
diff --git a/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll
b/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll
index a0195d7..f68edaf 100644
--- a/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll
+++ b/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll
@@ -5,7 +5,8 @@ target datalayout = "e-p:32:32:32"
 
 ; CHECK: 1 partial alias response
 
-define i32 @test(i32* %tab, i32 %indvar) nounwind {
+define i32 @test(i32 %indvar) nounwind {
+  %tab = alloca i32, align 4
   %tmp31 = mul i32 %indvar, -2
   %tmp32 = add i32 %tmp31, 30
   %t.5 = getelementptr i32* %tab, i32 %tmp32
diff --git a/test/Analysis/CFLAliasAnalysis/must-and-partial.ll
b/test/Analysis/CFLAliasAnalysis/must-and-partial.ll
index df7de38..2c66d90 100644
--- a/test/Analysis/CFLAliasAnalysis/must-and-partial.ll
+++ b/test/Analysis/CFLAliasAnalysis/must-and-partial.ll
@@ -7,8 +7,9 @@
 target datalayout =
"e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
 
 ; CHECK: PartialAlias:  i16* %bigbase0, i8* %phi
-define i8 @test0(i8* %base, i1 %x) {
+define i8 @test0(i1 %x) {
 entry:
+  %base = alloca i8, align 4
   %baseplusone = getelementptr i8* %base, i64 1
   br i1 %x, label %red, label %green
 red:
@@ -25,8 +26,9 @@ green:
 }
 
 ; CHECK: PartialAlias:  i16* %bigbase1, i8* %sel
-define i8 @test1(i8* %base, i1 %x) {
+define i8 @test1(i1 %x) {
 entry:
+  %base = alloca i8, align 4
   %baseplusone = getelementptr i8* %base, i64 1
   %sel = select i1 %x, i8* %baseplusone, i8* %base
   store i8 0, i8* %sel
@@ -37,3 +39,15 @@ entry:
   %loaded = load i8* %sel
   ret i8 %loaded
 }
+
+; Arguments should not be partial alias
+; CHECK: MayAlias:  double* %A, double* %Index
+define void @testr2(double* nocapture readonly %A, double* nocapture readonly
%Index) {
+  %arrayidx22 = getelementptr inbounds double* %Index, i64 2
+  %1 = load double* %arrayidx22
+  %arrayidx25 = getelementptr inbounds double* %A, i64 2
+  %2 = load double* %arrayidx25
+  %mul26 = fmul double %1, %2
+  ret void
+}
+

Nick Lewycky

2015-Jan-15 21:26 UTC

head link

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

On 15 January 2015 at 13:10, Daniel Berlin <dberlin at dberlin.org> wrote:
> Yes.
> I've attached an updated patch that does the following:
>
> 1. Fixes the partialalias of globals/arguments
> 2. Enables partialalias for cases where nothing has been unified to a
> global/argument
> 3. Fixes that select was unifying the condition to the other pieces (the
> condition does not need to be processed :P). This was causing unnecessary
> aliasing.
>
Consider this:

void *p = ...;
uintptr_t i = p;
uintptr_t j = 0;
for (int a = 0; a < sizeof(uintptr_t); ++a) {
  j = i >> (sizeof(uintptr_t) - a - 1) ? 1 : 0;
  j <<= 1;
}
void *q = j;

alias(p, q) isn't NoAlias. (Okay, it kinda is in C++, but not in the
equivalent LLVM IR. Please don't make me rewrite my example in LLVM IR.)

Nick

> 4. Adds a regression test to must-and-partial
> 5. Fixes the pass ordering
> 6. Does not alias non-pointer arguments to random things :)
> 7. Fixes the CFL test cases that break with above.
>
>
>
>
> On Thu, Jan 15, 2015 at 11:33 AM, Ana Pazos <apazos at
codeaurora.org> wrote:
>
>> Daniel, don’t we need to fix the order of invoking alias analyses in
>> lib/Transforms/IPO/PassManagerBuilder.cpp as well?
>>
>>
>>
>> Your patch fixed the order in lib/CodeGen/Passes.cpp and the delegation
>> code in lib/Analysis/CFLAliasAnalysis.cpp.
>>
>>
>>
>> Thanks,
>>
>> Ana.
>>
>>
>>
>> *From:* Daniel Berlin [mailto:dberlin at dberlin.org]
>> *Sent:* Wednesday, January 14, 2015 1:10 PM
>> *To:* Ana Pazos
>> *Cc:* George Burgess IV; Nick Lewycky; Jiangning Liu; LLVM Developers
>> Mailing List
>>
>> *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting
>> a57 numbers
>>
>>
>>
>> Oh, sorry, i didn't rebase it when i changed the fix, you would
have had
>> to apply the first on top of the second.
>>
>> Here is one against HEAD
>>
>>
>>
>> On Wed, Jan 14, 2015 at 12:32 PM, Ana Pazos <apazos at
codeaurora.org>
>> wrote:
>>
>> Daniel, your patch does not apply cleanly. Are you on the tip?
>>
>> The code I see there is no line    if (QueryResult == MayAlias||
QueryResult == PartialAlias) to be removed.
>>
>> Thanks,
>>
>> Ana.
>>
>>
>>
>> *From:* George Burgess IV [mailto:george.burgess.iv at gmail.com]
>> *Sent:* Wednesday, January 14, 2015 10:31 AM
>> *To:* Daniel Berlin
>> *Cc:* Nick Lewycky; Ana Pazos; Jiangning Liu; LLVM Developers Mailing
>> List
>> *Subject:* Re: [LLVMdev] question about enabling cfl-aa and collecting
>> a57 numbers
>>
>>
>>
>> Inline
>>
>> - George
>>
>>
>> On Jan 14, 2015, at 10:49 AM, Daniel Berlin <dberlin at
dberlin.org> wrote:
>>
>>
>>
>>
>>
>> On Tue, Jan 13, 2015 at 11:26 PM, Nick Lewycky <nlewycky at
google.com>
>> wrote:
>>
>> On 13 January 2015 at 22:11, Daniel Berlin <dberlin at
dberlin.org> wrote:
>>
>> This is caused by CFLAA returning PartialAlias for a query that BasicAA
>> can prove is NoAlias.
>>
>>
>>
>> One of them is wrong. Which one?
>>
>>
>>
>> CFL-AA.
>>
>>
>>
>> Right now it checks whether two things come out to be the same
stratified
>> info and have the same index, and if so, returns PartialAlias. This is
>> wrong, because it never knows why two things got unified, only that
they
>> did.
>>
>>
>>
>> Therefore, the current check it does where it returns PartialAlias
seems
>> wrong.
>>
>>
>>
>> George, why does it return PartialAlias?
>>
>> It shouldn't -- unless I misinterpreted an earlier discussion, I
was
>> under the impression that we were testing with that line updated to
return
>> MayAlias. We do need to update that in LLVM though; thanks for the fix.
>>
>>
>>
>> IMHO, you can't ever prove any kind of must-alias or partialalias
info
>> with CFL-AA, unless i'm missing something.
>>
>> Agreed.
>>
>>
>>
>>
>>
>> I'm not sure from your description that this is a chaining issue.
>> PartialAlias doesn't chain and isn't supposed to, it's a
final answer just
>> like NoAlias and MustAlias are. You return partial alias when you have
>> proven that the accesses are overlapping, meaning that it is proven to
>> neither be no alias nor must alias.
>>
>> Well, it's still chaining wrong, because it did not change on
MayAlias,
>> but you are right that it is not the issue i identified :)
>>
>>
>>
>> Anyway, updated patch attached.
>>
>>
>>
>>
>>
>> Nick
>>
>>
>>
>> I've attached a patch that fixes it, but truthfully, chaining seems
>> somewhat badly designed IMHO.
>>
>>
>>
>> It should let you pass along the result you've gotten so far, so
that you
>> don't have CFLAA get PartialAlias, chain, and have BasicAA get
MayAlias.
>>
>>
>>
>> (IE it should let it work strictly down the hierarchy).
>>
>>
>>
>> Note that this patch breaks the CFLAliasAnalysis tests because it no
>> longer returns PartialAlias in some cases it is expecting it.
>>
>>
>>
>> I don't have time ATM to fix chaining completely, so if someone
wants to
>> shepherd this patch through, that would be appreciated.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Jan 13, 2015 at 8:58 PM, Daniel Berlin <dberlin at
dberlin.org>
>> wrote:
>>
>> Can you send me actual LLVM IR or a preprocessed source from using -E?
>> I don't have a machine handy that has headers that target that
arch.
>>
>>
>>
>>
>>
>> On Tue Jan 13 2015 at 4:33:29 PM Daniel Berlin <dberlin at
dberlin.org>
>> wrote:
>>
>> Anything other than noalias or mustalias should be getting passed down
>> the stack, so either that is not happening or CFL aa is giving better
>> answers and something does worse with those better answers.  I'll
take a
>> look this evening
>>
>> On Jan 13, 2015 3:58 PM, "Ana Pazos" <apazos at
codeaurora.org> wrote:
>>
>> Hi folks,
>>
>> Moving the discussion to llvm.dev.
>>
>> None of the changes we talked earlier help.
>>
>> Find attached the C source code that you can use to reproduce the
issue.
>>
>> clang --target=aarch64-linux-gnu  -c -mcpu=cortex-a57 -Ofast
>>  -fno-math-errno   test.c -S -o test.s  -mllvm -debug-only=licm
>> LICM hoisting to while.body.lr.ph:   %21 = load double** %arrayidx8,
>> align 8, !tbaa !5
>> LICM hoisting to while.body.lr.ph:   %arrayidx72 = getelementptr
>> inbounds double* %11, i64 1
>> LICM hoisting to while.body.lr.ph:   %arrayidx81 = getelementptr
>> inbounds double* %11, i64 2
>> LICM hoisting to for.body.lr.ph:   %2 = ptrtoint i32* %Index to i64
>>
>> clang --target=aarch64-linux-gnu  -c   -mcpu=cortex-a57  -Ofast
>>  -fno-math-errno   test.c -S -o test-cflaa.s -mllvm -use-cfl-aa -mllvm
>> -debug-only=licm
>> LICM hoisting to for.body.lr.ph:   %2 = ptrtoint i32* %Index to i64
>>
>> Why CFL AA cannot allow hoisting this out:  %21 = load double**
>> %arrayidx8, align 8, !tbaa !5
>>
>> Which leads to this extra load in assembly code: ldr    x14, [x4, x9,
lsl
>> #3]
>>
>> Thanks!
>> Ana.
>>
>> -----Original Message-----
>> From: Hal Finkel [mailto:hfinkel at anl.gov]
>> Sent: Tuesday, January 13, 2015 12:51 AM
>> To: Chandler Carruth
>> Cc: Jiangning Liu; Pazos, Ana; Ana Pazos; Daniel Berlin; George Burgess
IV
>> Subject: Re: question about enabling cfl-aa and collecting a57 numbers
>>
>> ----- Original Message -----
>> > From: "Chandler Carruth" <chandlerc at gmail.com>
>> > To: "Ana Pazos" <apazos at codeaurora.org>,
"Daniel Berlin" <
>> dberlin at dberlin.org>, "George Burgess IV"
>> > <george.burgess.iv at gmail.com>, "Hal Finkel"
<hfinkel at anl.gov>
>> > Cc: "Jiangning Liu" <Jiangning.Liu at arm.com>,
"Ana Pazos"
>> > <apazos at quicinc.com>
>> > Sent: Monday, January 12, 2015 6:43:44 PM
>> > Subject: Re: question about enabling cfl-aa and collecting a57
numbers
>> >
>> > Not quite:
>> >
>> >
>> > On Mon Jan 12 2015 at 4:27:30 PM Ana Pazos < apazos at
codeaurora.org >
>> > wrote:
>> >
>> > Thanks George and Daniel,
>> >
>> >
>> >
>> > If the recommended order is “–basicaa –cfla-aa” it means we should
fix
>> > the trunk code that processes the flags -mllvm
use-cfl-aa-in-codegen
>> > and –mllvm use-cfl-aa, right?
>> >
>> >
>> >
>> > In Transforms/IPO/PassManagerBuilder.cpp and CodeGen/Passes.cpp we
see
>> > this sequence:
>> >
>> >
>> >
>> > if (UseCFLAA)
>> >
>> > PM.add(createCFLAliasAnalysisPass());
>> >
>> > PM.add(createTypeBasedAliasAnalysisPass());
>> >
>> > PM.add(createScopedNoAliasAAPass());
>> >
>> > PM.add(createBasicAliasAnalysisPass());
>> >
>> >
>> >
>> > So are you recommending changing to the sequence below instead?
>> >
>> >
>> > Not quite:
>> >
>> >
>> > addPass(createTypeBasedAliasAnalysisPass());
>> >
>> > addPass(createScopedNoAliasAAPass());
>> > You want it here.
>> >
>> > addPass(createBasicAliasAnalysisPass());
>> >
>> > if (UseCFLAA)
>> >
>> > addPass(createCFLAliasAnalysisPass());
>> >
>> >
>> > The way to think about this is as follows:
>> >
>> >
>> > if TBAA or ScopedNoAlias *can* return results, you want them to.
They
>> > are reflecting a-priori knowledge of aliasing.
>> >
>> >
>> > if both of those say "maybe", you want to ask CFL. If
CFL says "maybe"
>> > you want to ask "BasicAA".
>>
>> I could very well be misunderstanding something, but I recall that we
add
>> BasicAA last so that it will be queried first. We do this because of
our
>> BasicAA delegation hack: We want a MustAlias from BasicAA to override a
>> NoAlias from TBAA, so that we can catch cases of basic local type
punning.
>> Thus, assuming CFL delegates properly, I fail to see how switching its
>> order w.r.t. the metadata-based analysis changes anything.
>>
>>  -Hal
>>
>> >
>> >
>> > It's possible that you could as a compile-time hack flip the
last two
>> > to your suggested ordering, but a) it should really only be a
compile
>> > time hack, and b) we haven't really tested that BasicAA
actually
>> > delegates rather than directly returning "maybe".
>> >
>> >
>> > Thanks,
>> >
>> > Ana.
>> >
>> >
>> >
>> > From: Daniel Berlin [mailto: dberlin at dberlin.org ]
>> > Sent: Friday, January 09, 2015 6:41 PM
>> > To: George Burgess IV; Hal Finkel
>> > Cc: Ana Pazos; Pazos, Ana; Jiangning Liu; chandlerc
>> >
>> >
>> > Subject: Re: question about enabling cfl-aa and collecting a57
numbers
>> >
>> >
>> > Right. I would not try with -cfl-aa -basicaa as the order, it
should
>> > be the other way around.
>> >
>> >
>> > If you discover performance regressions with *that*, that would be
>> > highly interesting.
>> >
>> >
>> > On Fri Jan 09 2015 at 6:32:28 PM George Burgess IV <
>> > george.burgess.iv at gmail.com > wrote:
>> >
>> >
>> >
>> > Thanks for the notes and observations Ana! :)
>> >
>> > >> Hal, what do you think? Does it meant that cfl-aa is more
>> > >> conservative (note I set the right order -cfa-aa followed
by
>> > >> -basicaa)?
>> > Currently, yes. The implementation as it is is focused heavily on
>> > being fast, and on trying to cover cases that BasicAA can’t easily
>> > cover itself. If our end goal is to use this as a replacement for
>> > BasicAA, there’s probably quite a few low-hanging fruit for
accuracy
>> > improvements, and we can start playing with sacrificing speed in
>> > pursuit of greater accuracy. On the other hand, if the end goal is
to
>> > append this pass to the list of AA passes already being performed,
>> > then I’d be interested in seeing the difference in perf between
>> > `-basicaa` and `-basicaa -cfl-aa`, if any.
>> >
>> > WRT this issue in particular, I believe CFLAA answers
conservatively
>> > because:
>> > A. it doesn’t take into account constant pointer offsets (i.e. it
>> > considers a[0]..a[n] all as the same “address”) B. CFLAA is
entirely
>> > context insensitive
>> >
>> > IIRC, BasicAA is somewhat context sensitive, and it does take into
>> > account constant pointer offsets, so it can be more accurate in
this
>> > case. That being said, I don’t have access to the SPEC code, so I
can
>> > only speculate. :)
>> >
>> > George
>> >
>> > > On Jan 9, 2015, at 4:25 PM, Hal Finkel < hfinkel at
anl.gov > wrote:
>> > >
>> > > Ana,
>> > >
>> > > Thanks so much for looking into this. I'm adding some
additional
>> > > relevant people to the CC line...
>> > >
>> > > -Hal
>> > >
>> > > ----- Original Message -----
>> > >> From: "Ana Pazos" < apazos at codeaurora.org
>
>> > >> To: "Ana Pazos" < apazos at quicinc.com
>, "Hal Finkel" <
>> > >> hfinkel at anl.gov >, "Jiangning Liu" <
Jiangning.Liu at arm.com >
>> > >> Sent: Friday, January 9, 2015 3:12:04 PM
>> > >> Subject: RE: question about enabling cfl-aa and
collecting a57
>> > >> numbers
>> > >>
>> > >> Hi Hal and Jiangning,
>> > >>
>> > >> I started to look at the effect of cfl-aa on SPEC.
>> > >>
>> > >> The first observation is that I see more redundant loads
not being
>> > >> hoisted out of loops by LICM.
>> > >>
>> > >> Due to license restrictions, I cannot paste SPEC code
here, but I
>> > >> think you have access to it.
>> > >>
>> > >> I tried to simplify a top function in equake (quake.c
smvp) to show
>> > >> the issue. See the attached reduced test case.
>> > >>
>> > >> If you look at the the source code you will see something
like:
>> > >> for {
>> > >> ...load v[i][0]...
>> > >> while {
>> > >> ...load v[i][0]...
>> > >> store w[col][0]....
>> > >> }
>> > >> ...load w[i][0]...
>> > >> store w[i][0]...
>> > >> }
>> > >>
>> > >> It should be possible to avoid load &v[i][0],
&v[i][1], &vi[i][2]
>> > >> inside the while loop.
>> > >>
>> > >> In the simplified example that would be this instruction:
>> > >> %21 = load double** %arrayidx8, align 8, !tbaa !5
>> > >>
>> > >> And the pointer for the load instruction:
>> > >> %arrayidx8 = getelementptr inbounds double** %v, i64
%idxprom
>> > >>
>> > >> The decision to hoist it fails at LICM.cpp:189: AliasSet
&AS >> > >> CurAST->getAliasSetForPointer(V, Size,
AAInfo).isMod.
>> > >>
>> > >> With basicaa, only loads are in the alias set, but with
cfl-aa you
>> > >> find loads and stores, so isMod returns true.
>> > >>
>> > >> AliasSet with basicaa:
>> > >> AliasSet[0x2b35aa0, 7] may alias, Ref Pointers:
(double***
>> > >> %arrayidx32, 8), (double** %12, 8), (double**
%arrayidx36, 8),
>> > >> (double** %arrayidx8, 8), (double** %arrayidx68, 8),
(double**
>> > >> %arrayidx77, 8), (double** %arrayidx85, 8)
>> > >> $5 = void
>> > >>
>> > >> AliasSet with cfa-aa:
>> > >> AliasSet[0x2b37df0, 20] may alias, Mod/Ref Pointers:
(i32*
>> > >> %arrayidx30, 4), (double*** %arrayidx32, 8), (double**
%12, 8),
>> > >> (double* %13, 8), (double** %arrayidx36, 8), (double*
%15, 8),
>> > >> (double* %arrayidx42, 8), (double* %arrayidx45, 8),
(double*
>> > >> %arrayidx51, 8), (double* %arrayidx54, 8), (double**
%arrayidx8,
>> > >> 8), (double* %21, 8), (double** %arrayidx68, 8), (double*
%23, 8),
>> > >> (double* %arrayidx72, 8), (double** %arrayidx77, 8),
(double* %26,
>> > >> 8), (double* %arrayidx81, 8), (double** %arrayidx85, 8),
(double*
>> > >> %29, 8)
>> > >> $8 = void
>> > >>
>> > >> You can reproduce it with the commands:
>> > >> opt -O3 -S -cfl-aa -basicaa -licm -debug-only=licm
reduce.ll -o out
>> > >>
>> > >> opt -O3 -S - -basicaa -licm -debug-only=licm reduce.ll -o
out LICM
>> > >> hoisting to while.body.lr.ph : %21 = load double**
%arrayidx8,
>> > >> align 8, !tbaa !5 LICM hoisting to while.body.lr.ph :
%arrayidx72 >> > >> getelementptr inbounds double* %11, i64 1
LICM hoisting to
>> > >> while.body.lr.ph : %arrayidx81 = getelementptr inbounds
double*
>> > >> %11, i64 2
>> > >>
>> > >> Hal, what do you think? Does it meant that cfl-aa is more
>> > >> conservative (note I set the right order -cfa-aa followed
by
>> > >> -basicaa)? Could it be an issue with building the
AliasSet tracker?
>> > >> Any pointers on how to fix this?
>> > >>
>> > >> Thanks,
>> > >> Ana.
>> > >>
>> > >>
>> > >>
>> > >> -----Original Message-----
>> > >> From: Jiangning Liu [mailto: Jiangning.Liu at arm.com ]
>> > >> Sent: Wednesday, December 24, 2014 6:27 PM
>> > >> To: Hal Finkel; Ana Pazos
>> > >> Subject: RE: question about enabling cfl-aa and
collecting a57
>> > >> numbers
>> > >>
>> > >> Hi,
>> > >>
>> > >> I saw big regressions on cortex-A57 for the following
SPEC
>> > >> benchmarks after using "-mllvm -use-cfl-aa ".
>> > >>
>> > >> spec.cpu2006.ref.462_libquantum 9.22%
spec.cpu2000.ref.179_art
>> > >> 5.32%
>> > >> spec.cpu2000.ref.256_bzip2 4.91%
>> > >>
>> > >> Thanks,
>> > >> -Jiangning
>> > >>
>> > >>> -----Original Message-----
>> > >>> From: Hal Finkel [mailto: hfinkel at anl.gov ]
>> > >>> Sent: Wednesday, December 24, 2014 3:56 AM
>> > >>> To: Ana Pazos
>> > >>> Cc: Jiangning Liu
>> > >>> Subject: Re: question about enabling cfl-aa and
collecting a57
>> > >>> numbers
>> > >>>
>> > >>> ----- Original Message -----
>> > >>>> From: "Ana Pazos" < apazos at
codeaurora.org >
>> > >>>> To: "Hal Finkel" < hfinkel at
anl.gov >, "Jiangning Liu"
>> > >>>> < Jiangning.Liu at arm.com >
>> > >>>> Sent: Monday, December 22, 2014 4:25:07 PM
>> > >>>> Subject: question about enabling cfl-aa and
collecting a57
>> > >>>> numbers
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> Hi Hal,
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> Can you clarify to enable cfl-aa from clang we
need both these
>> > >>>> flags?
>> > >>>>
>> > >>>>
>> > >>>
>> > >>> Hi Ana,
>> > >>>
>> > >>> Thanks for following-up on this! You only need
(-mllvm
>> > >>> -use-cfl-aa)
>> > >>> to
>> > >>> use CFL AA during the main optimization pipeline, and
just this is
>> > >>> enough to reproduce the performance regressions as
far as I know.
>> > >>> You
>> > >>> can also use CFL AA during code generation (-mllvm
>> > >>> -use-cfl-aa-in-codegen), which matters only if your
using AA at
>> > >>> all for code generation (which AArch64 does only for
the
>> > >>> Cortex/A53).
>> > >>>
>> > >>> -Hal
>> > >>>
>> > >>>>
>> > >>>> -mllvm -use-cfl-aa
>> > >>>>
>> > >>>> -mllvm -cfl-aa-in-codegen
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> Hi Jiangning,
>> > >>>>
>> > >>>> Did you collect a57 perf results for SPEC 2000
and 2006 enabling
>> > >>>> these two flags?
>> > >>>>
>> > >>>> Did you notice any correctness and performance
regression?
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> Thanks,
>> > >>>>
>> > >>>> Ana.
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> Ana Pazos
>> > >>>> Qualcomm Innovation Center, Inc.
>> > >>>> The Qualcomm Innovation Center, Inc. is a member
of the Code
>> > >>>> Aurora Forum, a Linux Foundation Collaborative
Project.
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>
>> > >>> --
>> > >>> Hal Finkel
>> > >>> Assistant Computational Scientist
>> > >>> Leadership Computing Facility
>> > >>> Argonne National Laboratory
>> > >>
>> > >>
>> > >> -- IMPORTANT NOTICE: The contents of this email and any
attachments
>> > >> are confidential and may also be privileged. If you are
not the
>> > >> intended recipient, please notify the sender immediately
and do not
>> > >> disclose the contents to any other person, use it for any
purpose,
>> > >> or store or copy the information in any medium. Thank
you.
>> > >>
>> > >> ARM Limited, Registered office 110 Fulbourn Road,
Cambridge CB1
>> > >> 9NJ, Registered in England & Wales, Company No:
2557590 ARM
>> > >> Holdings plc, Registered office 110 Fulbourn Road,
Cambridge CB1
>> > >> 9NJ, Registered in England & Wales, Company No:
2548782
>> > >>
>> > >
>> > > --
>> > > Hal Finkel
>> > > Assistant Computational Scientist
>> > > Leadership Computing Facility
>> > > Argonne National Laboratory
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>>
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>>
>>
>>
>> <cflaafix.diff>
>>
>>
>>
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/268442a9/attachment.html>

Daniel Berlin

2015-Jan-15 23:05 UTC

head link

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

On Thu, Jan 15, 2015 at 1:26 PM, Nick Lewycky <nlewycky at google.com>
wrote:
> On 15 January 2015 at 13:10, Daniel Berlin <dberlin at dberlin.org>
wrote:
>
>> Yes.
>> I've attached an updated patch that does the following:
>>
>> 1. Fixes the partialalias of globals/arguments
>> 2. Enables partialalias for cases where nothing has been unified to a
>> global/argument
>> 3. Fixes that select was unifying the condition to the other pieces
(the
>> condition does not need to be processed :P). This was causing
unnecessary
>> aliasing.
>>
>
> Consider this:
>
> void *p = ...;
> uintptr_t i = p;
> uintptr_t j = 0;
> for (int a = 0; a < sizeof(uintptr_t); ++a) {
>   j = i >> (sizeof(uintptr_t) - a - 1) ? 1 : 0;
>   j <<= 1;
> }
> void *q = j;
>
> alias(p, q) isn't NoAlias. (Okay, it kinda is in C++, but not in the
> equivalent LLVM IR. Please don't make me rewrite my example in LLVM
IR.)
>
> Agreed :)
But after chatting with you, i think we both agree that this change does
not affect.
I probably should not have said "the condition does not need to be
processed".  It would be more accurate to say "the reference to a
condition
in a select instruction, by itself, does not cause aliasing"

What happens now is:

given %4 = select %1, %2, %3

we do
aliasset(%4) += %1
aliasset(%4) += %2
aliasset(%4) += %3

The first one is unnecessary.
There can be no alias caused simply because it is referenced in condition
of the select.
> We still need to process what %1 refers to (and we do).

To make this empirical, in your example, we get the right answer in CFL-AA.

Interestingly, i'll point out that basic-aa says:
  NoAlias: i8* %p, i8** %q
  NoAlias: i8** %p.addr, i8** %q

(I translated your case as best i can :P)

So you may want to implement it for real if you think it's supposed to be
handled right in basic-aa, because I don't believe it is :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20150115/1b1eacb5/attachment.html>

llvm dev - Jan 2015 - [LLVMdev] question about enabling cfl-aa and collecting a57 numbers

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers