Chawla, Pankaj via llvm-dev
2019-Aug-26 01:19 UTC
[llvm-dev] missing simplification in ScalarEvolution?
Hi Sanjoy, Thanks for the reply! Your approach sounds good to me! I think 1) is legal as address wraparound in unsigned range doesn't make sense given a positive offset, but I am not sure. I think umax will not be added if we can prove the predicate as known. I am not sure whether umax will get simplified if we add nuw to the expressions. -Pankaj -----Original Message----- From: Sanjoy Das <sanjoy at playingwithpointers.com> Sent: Sunday, August 25, 2019 1:53 PM To: Chawla, Pankaj <pankaj.chawla at intel.com> Cc: Philip Reames <listmail at philipreames.com>; llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? I didn't step through this in a debugger, but I suspect we could fix the problem by: 1. Teaching SCEV to infer nuw for (32 + @glob_const)<nsw> (if legal) 2. Generalize ScalarEvolution::isKnownPredicateViaNoOverflow to work with this case That should fold the umax and give us a more precise trip count. -- Sanjoy On Tue, Aug 20, 2019 at 5:36 PM Chawla, Pankaj via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > Thanks for the suggestion but datalayout info did not solve the problem! > > -Pankaj > > -----Original Message----- > From: Philip Reames <listmail at philipreames.com> > Sent: Tuesday, August 20, 2019 5:26 PM > To: Chawla, Pankaj <pankaj.chawla at intel.com>; llvm-dev at lists.llvm.org > Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? > > Try adding a datalayout with pointer size information to your reduced case, and see what happens. Not sure this is your problem, but I've been bitten by this before with hand reduced examples... > > Philip > > On 8/20/19 3:43 PM, Chawla, Pankaj via llvm-dev wrote: > > Hi, > > > > I have this small test case- > > > > %struct1 = type { i32, i32 } > > > > @glob_const = internal constant [4 x %struct1] [%struct1 { i32 4, > > i32 > > 5 }, %struct1 { i32 8, i32 9 }, %struct1 { i32 16, i32 0 }, %struct1 > > { > > i32 32, i32 10 }], align 16 > > > > define void @foo() { > > entry: > > br label %loop > > > > loop: ; preds = %loop, %entry > > %iv = phi %struct1* [ getelementptr inbounds ([4 x %struct1], [4 x %struct1]* @glob_const, i64 0, i64 0), %entry ], [ %iv.inc, %loop ] > > %gep = getelementptr inbounds %struct1, %struct1* %iv, i64 0, i32 0 > > %ld = load i32, i32* %gep, align 8 > > %iv.inc = getelementptr inbounds %struct1, %struct1* %iv, i64 1 > > %cmp = icmp ult %struct1* %iv.inc, getelementptr inbounds ([4 x %struct1], [4 x %struct1]* @glob_const, i64 1, i64 0) > > br i1 %cmp, label %loop, label %exit > > > > exit: > > ret void > > } > > > > I was expecting the backedge taken count of the loop to be evaluated > > to 3 by ScalarEvolution but instead it evaluates to this- > > > > Loop %loop: backedge-taken count is ((-1 + (-1 * @glob_const) + ((8 > > + @glob_const)<nuw><nsw> umax (32 + @glob_const)<nsw>)) /u 8) Loop %loop: max backedge-taken count is 3, actual taken count either this or zero. > > > > > > Can we deduce the final value of the IV (32 + @glob_const) to be greater than the initial value (8 + @glob_const) and simplify the backedge taken count to 3? > > > > Thanks, > > Pankaj > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Sanjoy Das via llvm-dev
2019-Sep-02 15:59 UTC
[llvm-dev] missing simplification in ScalarEvolution?
On Sun, Aug 25, 2019 at 6:19 PM Chawla, Pankaj <pankaj.chawla at intel.com> wrote:> > Hi Sanjoy, > > Thanks for the reply! > Your approach sounds good to me! > > I think 1) is legal as address wraparound in unsigned range doesn't make sense given a positive offset, but I am not sure.Non-inbounds GEPs are just modular arithmetic so it is ok for a non-inbounds GEP to have unsigned / signed wrap. So at least the GEP will have to be an inbounds GEP. And you'll also have to check `isSCEVExprNeverPoison`. -- Sanjoy> I think umax will not be added if we can prove the predicate as known. > I am not sure whether umax will get simplified if we add nuw to the expressions. > > -Pankaj > > -----Original Message----- > From: Sanjoy Das <sanjoy at playingwithpointers.com> > Sent: Sunday, August 25, 2019 1:53 PM > To: Chawla, Pankaj <pankaj.chawla at intel.com> > Cc: Philip Reames <listmail at philipreames.com>; llvm-dev at lists.llvm.org > Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? > > I didn't step through this in a debugger, but I suspect we could fix the problem by: > > 1. Teaching SCEV to infer nuw for (32 + @glob_const)<nsw> (if legal) 2. Generalize ScalarEvolution::isKnownPredicateViaNoOverflow to work with this case > > That should fold the umax and give us a more precise trip count. > > -- Sanjoy > > On Tue, Aug 20, 2019 at 5:36 PM Chawla, Pankaj via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > Thanks for the suggestion but datalayout info did not solve the problem! > > > > -Pankaj > > > > -----Original Message----- > > From: Philip Reames <listmail at philipreames.com> > > Sent: Tuesday, August 20, 2019 5:26 PM > > To: Chawla, Pankaj <pankaj.chawla at intel.com>; llvm-dev at lists.llvm.org > > Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? > > > > Try adding a datalayout with pointer size information to your reduced case, and see what happens. Not sure this is your problem, but I've been bitten by this before with hand reduced examples... > > > > Philip > > > > On 8/20/19 3:43 PM, Chawla, Pankaj via llvm-dev wrote: > > > Hi, > > > > > > I have this small test case- > > > > > > %struct1 = type { i32, i32 } > > > > > > @glob_const = internal constant [4 x %struct1] [%struct1 { i32 4, > > > i32 > > > 5 }, %struct1 { i32 8, i32 9 }, %struct1 { i32 16, i32 0 }, %struct1 > > > { > > > i32 32, i32 10 }], align 16 > > > > > > define void @foo() { > > > entry: > > > br label %loop > > > > > > loop: ; preds = %loop, %entry > > > %iv = phi %struct1* [ getelementptr inbounds ([4 x %struct1], [4 x %struct1]* @glob_const, i64 0, i64 0), %entry ], [ %iv.inc, %loop ] > > > %gep = getelementptr inbounds %struct1, %struct1* %iv, i64 0, i32 0 > > > %ld = load i32, i32* %gep, align 8 > > > %iv.inc = getelementptr inbounds %struct1, %struct1* %iv, i64 1 > > > %cmp = icmp ult %struct1* %iv.inc, getelementptr inbounds ([4 x %struct1], [4 x %struct1]* @glob_const, i64 1, i64 0) > > > br i1 %cmp, label %loop, label %exit > > > > > > exit: > > > ret void > > > } > > > > > > I was expecting the backedge taken count of the loop to be evaluated > > > to 3 by ScalarEvolution but instead it evaluates to this- > > > > > > Loop %loop: backedge-taken count is ((-1 + (-1 * @glob_const) + ((8 > > > + @glob_const)<nuw><nsw> umax (32 + @glob_const)<nsw>)) /u 8) Loop %loop: max backedge-taken count is 3, actual taken count either this or zero. > > > > > > > > > Can we deduce the final value of the IV (32 + @glob_const) to be greater than the initial value (8 + @glob_const) and simplify the backedge taken count to 3? > > > > > > Thanks, > > > Pankaj > > > _______________________________________________ > > > LLVM Developers mailing list > > > llvm-dev at lists.llvm.org > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Chawla, Pankaj via llvm-dev
2019-Sep-03 18:47 UTC
[llvm-dev] missing simplification in ScalarEvolution?
Okay, that makes sense. Thanks for the helpful tips! -----Original Message----- From: Sanjoy Das <sanjoy at playingwithpointers.com> Sent: Monday, September 02, 2019 9:00 AM To: Chawla, Pankaj <pankaj.chawla at intel.com> Cc: Philip Reames <listmail at philipreames.com>; llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? On Sun, Aug 25, 2019 at 6:19 PM Chawla, Pankaj <pankaj.chawla at intel.com> wrote:> > Hi Sanjoy, > > Thanks for the reply! > Your approach sounds good to me! > > I think 1) is legal as address wraparound in unsigned range doesn't make sense given a positive offset, but I am not sure.Non-inbounds GEPs are just modular arithmetic so it is ok for a non-inbounds GEP to have unsigned / signed wrap. So at least the GEP will have to be an inbounds GEP. And you'll also have to check `isSCEVExprNeverPoison`. -- Sanjoy> I think umax will not be added if we can prove the predicate as known. > I am not sure whether umax will get simplified if we add nuw to the expressions. > > -Pankaj > > -----Original Message----- > From: Sanjoy Das <sanjoy at playingwithpointers.com> > Sent: Sunday, August 25, 2019 1:53 PM > To: Chawla, Pankaj <pankaj.chawla at intel.com> > Cc: Philip Reames <listmail at philipreames.com>; llvm-dev at lists.llvm.org > Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? > > I didn't step through this in a debugger, but I suspect we could fix the problem by: > > 1. Teaching SCEV to infer nuw for (32 + @glob_const)<nsw> (if legal) > 2. Generalize ScalarEvolution::isKnownPredicateViaNoOverflow to work > with this case > > That should fold the umax and give us a more precise trip count. > > -- Sanjoy > > On Tue, Aug 20, 2019 at 5:36 PM Chawla, Pankaj via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > Thanks for the suggestion but datalayout info did not solve the problem! > > > > -Pankaj > > > > -----Original Message----- > > From: Philip Reames <listmail at philipreames.com> > > Sent: Tuesday, August 20, 2019 5:26 PM > > To: Chawla, Pankaj <pankaj.chawla at intel.com>; > > llvm-dev at lists.llvm.org > > Subject: Re: [llvm-dev] missing simplification in ScalarEvolution? > > > > Try adding a datalayout with pointer size information to your reduced case, and see what happens. Not sure this is your problem, but I've been bitten by this before with hand reduced examples... > > > > Philip > > > > On 8/20/19 3:43 PM, Chawla, Pankaj via llvm-dev wrote: > > > Hi, > > > > > > I have this small test case- > > > > > > %struct1 = type { i32, i32 } > > > > > > @glob_const = internal constant [4 x %struct1] [%struct1 { i32 4, > > > i32 > > > 5 }, %struct1 { i32 8, i32 9 }, %struct1 { i32 16, i32 0 }, > > > %struct1 { > > > i32 32, i32 10 }], align 16 > > > > > > define void @foo() { > > > entry: > > > br label %loop > > > > > > loop: ; preds = %loop, %entry > > > %iv = phi %struct1* [ getelementptr inbounds ([4 x %struct1], [4 x %struct1]* @glob_const, i64 0, i64 0), %entry ], [ %iv.inc, %loop ] > > > %gep = getelementptr inbounds %struct1, %struct1* %iv, i64 0, i32 0 > > > %ld = load i32, i32* %gep, align 8 > > > %iv.inc = getelementptr inbounds %struct1, %struct1* %iv, i64 1 > > > %cmp = icmp ult %struct1* %iv.inc, getelementptr inbounds ([4 x %struct1], [4 x %struct1]* @glob_const, i64 1, i64 0) > > > br i1 %cmp, label %loop, label %exit > > > > > > exit: > > > ret void > > > } > > > > > > I was expecting the backedge taken count of the loop to be > > > evaluated to 3 by ScalarEvolution but instead it evaluates to > > > this- > > > > > > Loop %loop: backedge-taken count is ((-1 + (-1 * @glob_const) + > > > ((8 > > > + @glob_const)<nuw><nsw> umax (32 + @glob_const)<nsw>)) /u 8) Loop %loop: max backedge-taken count is 3, actual taken count either this or zero. > > > > > > > > > Can we deduce the final value of the IV (32 + @glob_const) to be greater than the initial value (8 + @glob_const) and simplify the backedge taken count to 3? > > > > > > Thanks, > > > Pankaj > > > _______________________________________________ > > > LLVM Developers mailing list > > > llvm-dev at lists.llvm.org > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev