Geoff Berry via llvm-dev
2017-Aug-08 17:22 UTC
[llvm-dev] [ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls
Hi all,

I'm looking into resolving a FIXME in the LoopDataPrefetch (and FalkorMarkStridedAccesses) pass by marking both of these passes as preserving the ScalarEvolution analysis. Unfortunately, when this change is made, LSR will generate different code. One of the root causes seems to be that SCEV will return different nsw/nuw flags for the same Value, depending on what order the SCEVs are computed, due to the fact that the SCEV object unique-ing doesn't take the nsw/nuw flags into account. Since LoopDataPrefetch computes SCEVs in a different order than LSR, the nsw/nuw flags seen by LSR will differ based on whether the SCEVs are preserved from LoopDataPrefetch.

I believe this issue has been discussed before, but I just wanted to check if this is indeed the current expected behavior, and if anyone has any plans/ideas for addressing this issue.

For reference, below is a reduced loop where this problem occurs. The SCEV for %i.07.i will have <nuw> or not depending on whether %idxprom.i was computed before it:

for.body.i:
  %i.07.i = phi i32 [ %inc.i, %for.body.i ], [ 0, %for.body.i.preheader ]
  %idxprom.i = zext i32 %i.07.i to i64
  %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i
  %0 = load i32, i32* %arrayidx.i
  %inc.i = add i32 %i.07.i, 1
  %cmp.i = icmp ult i32 %inc.i, %size
  br i1 %cmp.i, label %for.body.i, label %do_bwe.exit

--
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Friedman, Eli via llvm-dev
2017-Aug-08 17:37 UTC
[llvm-dev] [ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls
On 8/8/2017 10:22 AM, Geoff Berry via llvm-dev wrote:
> Hi all,
>
> I'm looking into resolving a FIXME in the LoopDataPrefetch (and
> FalkorMarkStridedAccesses) pass by marking both of these passes as
> preserving the ScalarEvolution analysis. Unfortunately, when this
> change is made, LSR will generate different code. One of the root
> causes seems to be that SCEV will return different nsw/nuw flags for
> the same Value, depending on what order the SCEVs are computed, due to
> the fact that the SCEV object unique-ing doesn't take the nsw/nuw
> flags into account. Since LoopDataPrefetch computes SCEVs in a
> different order than LSR, the nsw/nuw flags seen by LSR will differ
> based on whether the SCEVs are preserved from LoopDataPrefetch.
>
> I believe this issue has been discussed before, but I just wanted to
> check if this is indeed the current expected behavior, and if anyone
> has any plans/ideas for addressing this issue.

The general issue that SCEV nsw is weird is known... see, for example, https://bugs.llvm.org/show_bug.cgi?id=23527.

> For reference, below is a reduced loop where this problem occurs. The
> SCEV for %i.07.i will have <nuw> or not depending on whether
> %idxprom.i was computed before it:

%idxprom.i, the zext? I'm not sure how you're getting that particular effect. ScalarEvolution::getSCEV for a zext immediately calls getSCEV on its operand.

-Eli

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Geoff Berry via llvm-dev
2017-Aug-08 19:17 UTC
[llvm-dev] [ScalarEvolution][SCEV] no-wrap flags dependent on order of getSCEV() calls
On 8/8/2017 1:37 PM, Friedman, Eli wrote:
> On 8/8/2017 10:22 AM, Geoff Berry via llvm-dev wrote:
>> Hi all,
>>
>> I'm looking into resolving a FIXME in the LoopDataPrefetch (and FalkorMarkStridedAccesses) pass by marking both of these passes as preserving the ScalarEvolution analysis. Unfortunately, when this change is made, LSR will generate different code. One of the root causes seems to be that SCEV will return different nsw/nuw flags for the same Value, depending on what order the SCEVs are computed, due to the fact that the SCEV object unique-ing doesn't take the nsw/nuw flags into account. Since LoopDataPrefetch computes SCEVs in a different order than LSR, the nsw/nuw flags seen by LSR will differ based on whether the SCEVs are preserved from LoopDataPrefetch.
>>
>> I believe this issue has been discussed before, but I just wanted to check if this is indeed the current expected behavior, and if anyone has any plans/ideas for addressing this issue.
>
> The general issue that SCEV nsw is weird is known... see, for example https://bugs.llvm.org/show_bug.cgi?id=23527.
>
>> For reference, below is a reduced loop where this problem occurs. The SCEV for %i.07.i will have <nuw> or not depending on whether %idxprom.i was computed before it:
>
> %idxprom.i, the zext? I'm not sure how you're getting that particular effect. ScalarEvolution::getSCEV for a zext immediately calls getSCEV on its operand.

Here is an abridged record of the getSCEV results as seen by each pass with and without preserving SCEVAnalysis.

In the first case, when the SCEV is invalidated, the SCEV for %i.07.i is computed in LSR as {0,+,1}<%for.body.i>.

In the second case, the SCEV for %i.07.i is computed in LSR the same way, but the SCEV for %idxprom.i from FalkorHWPFFix uniques to the same value, has the nuw flag set, and is still present in the FoldingSet. As a result, {0,+,1}<nuw><%for.body.i> is returned for %i.07.i in LSR instead.

The SCEVs for other values differ too, but I thought I'd start with this one.

********** FalkorHWPFFix **********
Created SCEV for %i.07.i = phi i32 [ %inc.i, %for.inc.i ], [ 0, %for.body.i.preheader ]:
  {0,+,1}<%for.body.i>
Created SCEV for %idxprom.i = zext i32 %i.07.i to i64:
  (zext i32 {0,+,1}<%for.body.i> to i64)
Created SCEV for %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i:
  ((4 * (zext i32 {0,+,1}<%for.body.i> to i64))<nuw><nsw> + %rot)<nsw>
Created SCEV for %inc.i = add i32 %i.07.i, 1:
  {1,+,1}<%for.body.i>
Created SCEV for %idxprom.i = zext i32 %i.07.i to i64:
  {0,+,1}<nuw><%for.body.i>
Created SCEV for %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i:
  {%rot,+,4}<nw><%for.body.i>
********** SCEV invalidated **********
********** LSR **********
Created SCEV for %i.07.i = phi i32 [ %inc.i, %for.inc.i ], [ 0, %for.body.i.preheader ]:
  {0,+,1}<%for.body.i>
Created SCEV for %idxprom.i = zext i32 %i.07.i to i64:
  (zext i32 {0,+,1}<%for.body.i> to i64)
Created SCEV for %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i:
  ((4 * (zext i32 {0,+,1}<%for.body.i> to i64))<nuw><nsw> + %rot)<nsw>
Created SCEV for %inc.i = add i32 %i.07.i, 1:
  {1,+,1}<%for.body.i>
Created SCEV for %cmp.i = icmp ult i32 %inc.i, %size:
  %cmp.i

vs.

********** FalkorHWPFFix **********
Created SCEV for %i.07.i = phi i32 [ %inc.i, %for.inc.i ], [ 0, %for.body.i.preheader ]:
  {0,+,1}<%for.body.i>
Created SCEV for %idxprom.i = zext i32 %i.07.i to i64:
  (zext i32 {0,+,1}<%for.body.i> to i64)
Created SCEV for %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i:
  ((4 * (zext i32 {0,+,1}<%for.body.i> to i64))<nuw><nsw> + %rot)<nsw>
Created SCEV for %inc.i = add i32 %i.07.i, 1:
  {1,+,1}<%for.body.i>
Created SCEV for %idxprom.i = zext i32 %i.07.i to i64:
  {0,+,1}<nuw><%for.body.i>
Created SCEV for %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i:
  {%rot,+,4}<nw><%for.body.i>
********** SCEV preserved **********
********** LSR **********
Created SCEV for %i.07.i = phi i32 [ %inc.i, %for.inc.i ], [ 0, %for.body.i.preheader ]:
  {0,+,1}<nuw><%for.body.i>
         ^^^^^
Created SCEV for %inc.i = add i32 %i.07.i, 1:
  {1,+,1}<nuw><%for.body.i>
Existing SCEV for %idxprom.i = zext i32 %i.07.i to i64:
  {0,+,1}<nuw><%for.body.i>
Existing SCEV for %arrayidx.i = getelementptr inbounds i32, i32* %rot, i64 %idxprom.i:
  {%rot,+,4}<nw><%for.body.i>
Created SCEV for %cmp.i = icmp ult i32 %inc.i, %size:
  %cmp.i
Created SCEV for %lsr.iv = phi i64 [ 0, %for.body.i.preheader ], [ %lsr.iv.next, %for.inc.i ]:
  {0,+,1}<nuw><nsw><%for.body.i>

>
> -Eli
>

--
Geoff Berry
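
[Editor's illustration] The order dependence described in this thread can be reproduced with a toy model. The sketch below is not LLVM code: `ToyAddRec`, `ToyUniquer`, `getAddRec`, and the two driver functions are hypothetical names invented for illustration. It assumes only what the thread states, namely that expression nodes are uniqued on their operands without regard to nsw/nuw flags, so a flag recorded by one query is visible to every later query that resolves to the same node. Clearing the table stands in for invalidating the analysis between passes.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <tuple>

// Toy stand-ins for no-wrap flags; these are NOT LLVM's definitions.
enum ToyFlags : unsigned { FlagAnyWrap = 0, FlagNUW = 1, FlagNSW = 2 };

// A toy {Start,+,Step}<Loop> recurrence.
struct ToyAddRec {
  int Start;
  int Step;
  std::string Loop;
  unsigned NoWrap; // deliberately NOT part of the uniquing key
};

// Toy uniquing table keyed on operands only (assumption taken from the thread).
class ToyUniquer {
  typedef std::tuple<int, int, std::string> Key;
  std::map<Key, std::unique_ptr<ToyAddRec>> Nodes;

public:
  // Returns the single shared node for {Start,+,Step}<Loop>. Any flags the
  // caller has proven are OR-ed into that shared node, so they stick to it.
  const ToyAddRec *getAddRec(int Start, int Step, const std::string &Loop,
                             unsigned Proven = FlagAnyWrap) {
    std::unique_ptr<ToyAddRec> &Slot = Nodes[Key(Start, Step, Loop)];
    if (!Slot)
      Slot.reset(new ToyAddRec{Start, Step, Loop, 0u});
    Slot->NoWrap |= Proven;
    return Slot.get();
  }

  // Models invalidating the analysis: all uniqued nodes are dropped.
  void clear() { Nodes.clear(); }
};

// First pass proves nuw, the table is then invalidated, and the second pass
// asks for the plain recurrence.
unsigned flagsSeenAfterInvalidation() {
  ToyUniquer U;
  U.getAddRec(0, 1, "for.body.i");          // first pass: plain query
  U.getAddRec(0, 1, "for.body.i", FlagNUW); // first pass: nuw proven later
  U.clear();                                // analysis invalidated
  return U.getAddRec(0, 1, "for.body.i")->NoWrap;
}

// Same queries, but the table survives between the two passes.
unsigned flagsSeenWhenPreserved() {
  ToyUniquer U;
  U.getAddRec(0, 1, "for.body.i");
  U.getAddRec(0, 1, "for.body.i", FlagNUW);
  return U.getAddRec(0, 1, "for.body.i")->NoWrap; // same node, flag attached
}
```

In this model, `flagsSeenAfterInvalidation()` returns `FlagAnyWrap` and `flagsSeenWhenPreserved()` returns `FlagNUW`, although the final query is identical in both. That mirrors the difference between the two LSR logs above. Making the flags part of the key would remove the order dependence in the toy, but whether that is the right fix for SCEV is the open question of the thread.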