Thanks Sanjoy.> To be pedantic, "var[i<<1]" is not an add recurrence, but "&var[i << > 1]" is an add recurrence. I'll assume that's that you meant.Yes, I meant the same.> I think that is because in C, multiplication is nsw but left shift is > not and so "i << 1" can legitimately sign-overflow but i * 2 cannot > sign-overflow. I'd venture a guess that this lets LLVM transform a > sign-extend of an add-rec to an add-rec sign-extends.I agree here, we want LLVM to transform sign-extend of add-rec to add-rec of sign extend in possible cases. Currently I don’t see LLVM is doing this. i.e.: (sext i32 addrec{2,+,2}<%for.body4> to i64) Will convert it to: addrec{(sext i32 '2' to i64), + , (sext i32 '2' to i64)} <%for.body4> If this looks OK, I'll make changes and come up with a patch. With sign-extend (SCEVSignExtendExpr) similarly we have to take care and handle other casts as well (i.e. SCEVTruncateExpr, SCEVZeroExtendExpr). Regards, Ashutosh -----Original Message----- From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com] Sent: Tuesday, March 31, 2015 9:41 AM To: Nema, Ashutosh Cc: Nick Lewycky; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Cast to SCEVAddRecExpr On Mon, Mar 30, 2015 at 8:47 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote:> Sorry typo in test case, Please ignore previous mail. > > Consider below case: > > for (j=1; j < itr; j++) { > - - - - > for (i=1; i < itr; i++) { > { > temp= var[i << 1]; > - - - - - > } > } > > In the above example, we are unable to get "SCEVAddRecExpr" for "var[i << 1]"To be pedantic, "var[i<<1]" is not an add recurrence, but "&var[i << 1]" is an add recurrence. I'll assume that's that you meant. If I run the following example through opt -analyze -scalar-evolution: define void @x(i1* %c, i64* %ptr) { entry: br label %loop loop: %i = phi i64 [ 0, %entry ], [ %i.inc, %loop ] %i.inc = add i64 %i, 1 %d = shl i64 %i.inc, 1 %gep = getelementptr i64, i64* %ptr, i64 %d %cc = load volatile i1, i1* %c br i1 %cc, label %exit, label %loop exit: ret void } I get the SCEV --> {(16 + %ptr),+,16}<%loop> for %gep. So SCEV /can/ interpret i<<1 as i*2 under some specific circumstances. Going back to your original question:> Instead if I change "var[i << 1]" to "var[i * 2]" then I'm getting "SCEVAddRecExpr".I think that is because in C, multiplication is nsw but left shift is not and so "i << 1" can legitimately sign-overflow but i * 2 cannot sign-overflow. I'd venture a guess that this lets LLVM transform a sign-extend of an add-rec to an add-rec sign-extends. Caveat: I have not looked at the C specification to determine that left shifts are allowed to sign-overflow. I concluded that left shifts are not nsw by looking at the llvm IR emitted by clang (the left shift generated from "i << 1" is not marked nsw). -- Sanjoy
On Wed, Apr 1, 2015 at 3:06 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote:> Thanks Sanjoy. > >> To be pedantic, "var[i<<1]" is not an add recurrence, but "&var[i << >> 1]" is an add recurrence. I'll assume that's that you meant. > Yes, I meant the same. > >> I think that is because in C, multiplication is nsw but left shift is >> not and so "i << 1" can legitimately sign-overflow but i * 2 cannot >> sign-overflow. I'd venture a guess that this lets LLVM transform a >> sign-extend of an add-rec to an add-rec sign-extends. > I agree here, we want LLVM to transform sign-extend of add-rec to > add-rec of sign extend in possible cases. Currently I don’t see LLVM is doing this. > i.e.: > (sext i32 addrec{2,+,2}<%for.body4> to i64) > Will convert it to: > addrec{(sext i32 '2' to i64), + , (sext i32 '2' to i64)} <%for.body4>That transform is not valid because these two expressions are not equivalent if the add recurrence can sign-overflow. For instance, (i8 for simplicity) if you consider {sext i8 2 to i16,+, sext i8 2 to i16} and (sext i8 {2,+,2} to i16) then in the 63rd iteration the first add-recs value is (i16 2) * 63 + (i16 2) = 128 while the second add-rec's value in the same iteration is -128. However, if LLVM can prove the second expression cannot sign-overflow (i.e. the loop will run for < 63 iterations) then transforming (sext i8 {2,+,2} to i16) to {sext i8 2 to i16,+, sext i8 2 to i16} is valid. Add recurrences that have been proven to not overflow are marked as <nsw>, and are printed like {S,+,X}<nsw>. When compiling from C, left shifts operations are not marked nsw (because in C they are allowed to sign overflow) but the equivalent multiplies are marked nsw (because sign-overflow of a multiply is undefined behavior). This lets LLVM prove {2,+,2} is <nsw> when it comes from a multiply and lets it commute a sign extend into the add recurrence. This explains the difference between var[i << 1] and var[i * 2]. -- Sanjoy
> That transform is not valid because these two expressions are not > equivalent if the add recurrence can sign-overflow. For instance, (i8 > for simplicity) if you consider {sext i8 2 to i16,+, sext i8 2 to i16} > and (sext i8 {2,+,2} to i16) then in the 63rd iteration the first > add-recs value is (i16 2) * 63 + (i16 2) = 128 while the second > add-rec's value in the same iteration is -128.Thanks for this detailed explanation Sanjoy.> However, if LLVM can > prove the second expression cannot sign-overflow (i.e. the loop will > run for < 63 iterations) then transforming (sext i8 {2,+,2} to i16) to > {sext i8 2 to i16,+, sext i8 2 to i16} is valid.Yes, but it's difficult to prove in all types of loop.> Add recurrences that > have been proven to not overflow are marked as <nsw>, and are printed > like {S,+,X}<nsw>.Where compiler is sure about *will not overflow* (i.e. var[i * 2] ) will be marked as 'nsw'. Cases where we have power of two will be transformed to shift reduction by Instruction Combiner. i.e. var[i * 2] will be transformed to var[i << 1] but 'nsw' property was not carried forward. I have seen your mail thread on this. If somehow we can preserve 'nsw' property then the transformation I suggested looks valid ? i.e.: (sext i32 Addrec{2,+,2}<nuw><%for.body4> to i64) Transformed To: Addrec{(sext i32 '2' to i64), + , (sext i32 '2' to i64)} <nsw><%for.body4> This will open up optimization opportunities. Regards, Ashutosh -----Original Message----- From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com] Sent: Wednesday, April 01, 2015 10:11 PM To: Nema, Ashutosh Cc: Nick Lewycky; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Cast to SCEVAddRecExpr On Wed, Apr 1, 2015 at 3:06 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote:> Thanks Sanjoy. > >> To be pedantic, "var[i<<1]" is not an add recurrence, but "&var[i << >> 1]" is an add recurrence. I'll assume that's that you meant. > Yes, I meant the same. > >> I think that is because in C, multiplication is nsw but left shift is >> not and so "i << 1" can legitimately sign-overflow but i * 2 cannot >> sign-overflow. I'd venture a guess that this lets LLVM transform a >> sign-extend of an add-rec to an add-rec sign-extends. > I agree here, we want LLVM to transform sign-extend of add-rec to > add-rec of sign extend in possible cases. Currently I don’t see LLVM is doing this. > i.e.: > (sext i32 addrec{2,+,2}<%for.body4> to i64) > Will convert it to: > addrec{(sext i32 '2' to i64), + , (sext i32 '2' to i64)} <%for.body4>That transform is not valid because these two expressions are not equivalent if the add recurrence can sign-overflow. For instance, (i8 for simplicity) if you consider {sext i8 2 to i16,+, sext i8 2 to i16} and (sext i8 {2,+,2} to i16) then in the 63rd iteration the first add-recs value is (i16 2) * 63 + (i16 2) = 128 while the second add-rec's value in the same iteration is -128. However, if LLVM can prove the second expression cannot sign-overflow (i.e. the loop will run for < 63 iterations) then transforming (sext i8 {2,+,2} to i16) to {sext i8 2 to i16,+, sext i8 2 to i16} is valid. Add recurrences that have been proven to not overflow are marked as <nsw>, and are printed like {S,+,X}<nsw>. When compiling from C, left shifts operations are not marked nsw (because in C they are allowed to sign overflow) but the equivalent multiplies are marked nsw (because sign-overflow of a multiply is undefined behavior). This lets LLVM prove {2,+,2} is <nsw> when it comes from a multiply and lets it commute a sign extend into the add recurrence. This explains the difference between var[i << 1] and var[i * 2]. -- Sanjoy