thr3ads.net - llvm dev - [llvm-dev] [SCEV] Inconsistent SCEV formation for zext [Feb 2018]

If this information is useful, please help other people find it:
Share via:

Chawla, Pankaj via llvm-dev

2018-Feb-11 22:32 UTC

[llvm-dev] [SCEV] Inconsistent SCEV formation for zext

Hi Sanjoy,

Thanks for investigating the issue!

I am more interested in getting the underlying problem fixed rather than making
this particular test case work. I think I have more test cases where this
problem crops up. I would any day prefer consistent results over compile time
savings which can lead to inconsistencies. These inconsistencies require
significant developer time to analyze and fix which just doesn't seem worth
it.

Based on my limited familiarity with ScalarEvolution code I would like to
propose a solution that may work-

Have two levels of caching for data structures for which infinite recursion is
possible (like Value -> SCEV, Value->Predicate,
Loop->BackedgeTakenCount etc).

The first level of caching is the 'PendingCache' and the second one is
the 'FinalCache'. PendingCache is similar to PendingLoopPredicates. It
is used to break infinite recursion. In addition we keep a Boolean class member
'PessimisticMode' (common to all caches).

The lookup logic for the cache is something like this-

If (PendingCache.find(Key)) {
  PessimisticMode = true;
  return Value;
} else if (FinalCache.find(Key)) {
  return Value;
}

Insertion logic is like this-

If (PessimisticMode) {
  PendingCache.insert(Entry);
} else {
  FinalCache.insert(Entry);
}

We need to distinguish top level calls to the external interface like
getSCEV()/getBackedgeTakenCount() etc from the internal recursive calls for
setting/resetting the state correctly. We start with adding the most
conservative result for the value in PendingCache. This would be getUnknown()
for getSCEV().

So getSCEV() may be implemented something like this-

ScalarEvolution::getSCEV(Value *V) {
  return getSCEVImpl(V, true);
}

ScalarEvolution::getSCEVImpl (Value *V, bool  IsTopCall = false) {
  // Look up in cache using logic described above
  If (S = getExistingSCEV())
    return S;

  if (IsTopCall) {
   PessimisticMode = false;
   PendingCache.clear();
   PendingCache.insert(V, getUnknown(V));
  }

  SCEV *S =  createSCEV();

  If(IsTopCall) {
    FinalCache.insert(V, S);
    forgetMemoizedResults(PendingCache);
  } else if (PessimisticMode) {
    PendingCache.insert(V, S);
  } else {
    FinalCache.insert(V, S);
  }

  return S;
}

Implementation in ScalarEvolution.cpp uses getSCEVImpl() instead of getSCEV().

The idea is that we can conclude accurate result for the topmost call and for
values computed before we hit PessimisticMode (recursion on self). We actually
do this in some parts of the codebase. For example, getBackedgeTakenInfo() 
inserts a conservative entry in the map, then replaces it with the real result
and cleans up (possibly inaccurate) results for all the loop header phis. We
just need to apply the approach in a more generic manner.

We may have to handle AddRecs (which are inherently self-recursive) as a special
case in this setup.

Of course, we may be discarding more than necessary in this setup so it may have
compile time implications. A better solution is more that welcome.

Please let me know your thoughts.

Thanks,
Pankaj

-----Original Message-----
From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com] 
Sent: Friday, February 09, 2018 4:46 PM
To: Chawla, Pankaj <pankaj.chawla at intel.com>; Maxim Kazantsev
<max.kazantsev at azul.com>; Serguei Katkov <serguei.katkov at
azul.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [SCEV] Inconsistent SCEV formation for zext

Hi,

+CC Max, Serguei

This looks like a textbook case of why caching is hard.

We first call getZeroExtendExpr on %dec, and this call does end up returning an
add rec.  However, in the process of simplifying the zext, it also calls into
isLoopBackedgeGuardedByCond which in turn calls getZeroExtendExpr(%dec) again. 
However, this second (recursive) time, we don't simplify the zext and cache
a pessimistic value for it.
We don't get the more precise answer the second time because we bail out on
proving the same predicate recursively to avoid infinite loops (see
ScalarEvolution::PendingLoopPredicates).

I don't think there is a nice and easy fix here.  We can try to find some
specific property of the IV we can exploit here to make this work, but the
general problem will remain.

-- Sanjoy

On Thu, Feb 8, 2018 at 2:19 PM, Chawla, Pankaj <pankaj.chawla at
intel.com> wrote:> Hi Sanjoy,
>
>
>
> SCEV is behaving inconsistently when forming SCEV for this zext 
> instruction in the attached test case-
>
> %conv5 = zext i32 %dec to i64
>
>
>
> If we request a SCEV for the instruction, it returns-
>
> (zext i32
{{-1,+,1}<nw><%for.body>,+,-1}<nw><%for.body7> to i64)
>
>
>
> This can be seen by invoking-
>
> $ opt -analyze -scalar-evolution inconsistent-scev-zext.ll
>
>
>
> But when computing the backedge taken count of the containing loop 
> for.body7, it is able to push zext inside the AddRec and forms the 
> following for the same instruction-
>
> {(zext i32 {-1,+,1}<%for.body> to
i64),+,-1}<nw><%for.body7>
>
>
>
> This allows ScalarEvolution to compute a valid backedge taken count 
> for the loop.
>
>
>
> The ‘simplification’ is done by this piece of code inside
> getZeroExtendExpr()-
>
>
>
>             // Cache knowledge of AR NW, which is propagated to this
>
>             // AddRec.  Negative step causes unsigned wrap, but it
>
>             // still can't self-wrap.
>
>             const_cast<SCEVAddRecExpr 
> *>(AR)->setNoWrapFlags(SCEV::FlagNW);
>
>             // Return the expression with the addrec on the outside.
>
>             return getAddRecExpr(
>
>                 getExtendAddRecStart<SCEVZeroExtendExpr>(AR, Ty,
this,
>
>                                                          Depth + 1),
>
>                 getSignExtendExpr(Step, Ty, Depth + 1), L,
>
>                 AR->getNoWrapFlags());
>
>           }
>
>
>
> I believe it is wrong for ScalarEvolution to use a simplified SCEV for 
> internal analysis and return a non-simplified SCEV to the client.
>
>
>
> Thanks,
>
> Pankaj
>
>
>
>

Sanjoy Das via llvm-dev

2018-Feb-20 00:52 UTC

head link

[llvm-dev] [SCEV] Inconsistent SCEV formation for zext

Hi Pankaj,

On Sun, Feb 11, 2018 at 2:32 PM, Chawla, Pankaj <pankaj.chawla at
intel.com> wrote:> Thanks for investigating the issue!
>
> I am more interested in getting the underlying problem fixed rather than
making this particular test case work. I think I have more test cases where this
problem crops up. I would any day prefer consistent results over compile time
savings which can lead to inconsistencies. These inconsistencies require
significant developer time to analyze and fix which just doesn't seem worth
it.
>
> Based on my limited familiarity with ScalarEvolution code I would like to
propose a solution that may work-
>
> Have two levels of caching for data structures for which infinite recursion
is possible (like Value -> SCEV, Value->Predicate,
Loop->BackedgeTakenCount etc).
I'm a bit apprehensive of adding more caching to solve problems
created by caching; but if there is no way out of adding another
cache, how about adding a cache that maps SCEV expressions to their
simplified versions?  Then we could do something like:

  getZeroExtendExpr(S) {
    if (AlreadyPresent = UniqueSCEVs.find(kZeroExtend, S) {
      if (Simplified = SimplifiedSCEVs.find(AlreadyPresent)) {
        return Simplified;
      }
      return AlreadyPresent;
    }

    ...
    // We discovered zext(s) can be simplified to t
    UniqueSCEVs.insert({kZeroExtend, S}, t);
    SimplifiedSCEVs[s] = t;
    return t;
  }
> The first level of caching is the 'PendingCache' and the second one
is the 'FinalCache'. PendingCache is similar to PendingLoopPredicates.
It is used to break infinite recursion. In addition we keep a Boolean class
member 'PessimisticMode' (common to all caches).
>
> The lookup logic for the cache is something like this-
>
> If (PendingCache.find(Key)) {
>   PessimisticMode = true;
>   return Value;
> } else if (FinalCache.find(Key)) {
>   return Value;
> }
>
> Insertion logic is like this-
>
> If (PessimisticMode) {
>   PendingCache.insert(Entry);
> } else {
>   FinalCache.insert(Entry);
> }
>
> We need to distinguish top level calls to the external interface like
getSCEV()/getBackedgeTakenCount() etc from the internal recursive calls for
setting/resetting the state correctly. We start with adding the most
conservative result for the value in PendingCache. This would be getUnknown()
for getSCEV().
>
> So getSCEV() may be implemented something like this-
>
> ScalarEvolution::getSCEV(Value *V) {
>   return getSCEVImpl(V, true);
> }
>
> ScalarEvolution::getSCEVImpl (Value *V, bool  IsTopCall = false) {
>   // Look up in cache using logic described above
>   If (S = getExistingSCEV())
>     return S;
>
>   if (IsTopCall) {
>    PessimisticMode = false;
>    PendingCache.clear();
>    PendingCache.insert(V, getUnknown(V));
Can this be assert(!PessimisticMode && !PendingCache.empty()) since on
a "top call" we could not have hit self recursion yet?
>   }
>
>   SCEV *S =  createSCEV();
>
>   If(IsTopCall) {
>     FinalCache.insert(V, S);
>     forgetMemoizedResults(PendingCache);
I'm not 100% sure, but I suspect this will not work.
forgetMemoizedResults will only remove the Value->SCEV mapping for the
conservative cases, but in the test case you attached
getZeroExtendExpr(S) tries to create a *new* SCEV for %conv5 (since I
suspect the %conv5->mapping was removed by getBackedgeTakenInfo) but
early-exits because zext(%dec) is present in UniqueSCEVs.

-- Sanjoy

Chawla, Pankaj via llvm-dev

2018-Feb-26 19:12 UTC

head link

[llvm-dev] [SCEV] Inconsistent SCEV formation for zext

Hi Sanjoy,
>> I'm a bit apprehensive of adding more caching to solve problems
created by caching; but if there is no way out of adding another cache, how
about adding a cache that maps SCEV expressions to their simplified versions? 
Then we could do something like:
I may be wrong but I think caching is not an issue in itself, but caching in the
presence of self-recursion is.

>>  getZeroExtendExpr(S) {
>>    if (AlreadyPresent = UniqueSCEVs.find(kZeroExtend, S) {
>>      if (Simplified = SimplifiedSCEVs.find(AlreadyPresent)) {
>>        return Simplified;
>>      }
>>      return AlreadyPresent;
>>    }
>>    ...
>>    // We discovered zext(s) can be simplified to t
>>    UniqueSCEVs.insert({kZeroExtend, S}, t);
>>    SimplifiedSCEVs[s] = t;
>>    return t;
>>  }
I am not sure I understand how exactly SimplifiedSCEV is maintained and how
UniqueSCEVs is going to be extended. I think the self-recursion issue is not
related to just simplification of casts. It may be applicable in other scenarios
(like deduction of wrap flags). As an example, I think ScalarEvolution has this
commonly occurring cycle-
 getBackedgeTakenInfo(Loop) -> getSCEV(LoopIV) ->
getBackedgeTakenInfo(Loop)

This is handled correctly because we have some logic to forget loop header phi
entries. Under the new setup, we would not have to 'special case' this
logic. It will be handled by the same 'IsTopCall' approach.
>
> ScalarEvolution::getSCEVImpl (Value *V, bool  IsTopCall = false) {
>   // Look up in cache using logic described above
>   If (S = getExistingSCEV())
>     return S;
>
>   if (IsTopCall) {
>    PessimisticMode = false;
>    PendingCache.clear();
>    PendingCache.insert(V, getUnknown(V));
>> Can this be assert(!PessimisticMode && !PendingCache.empty())
since on a "top call" we could not have hit self recursion yet?

Yes, if we reset them before exiting 'IsTopCall'. I wasn't doing
that :)

>   }
>
>   SCEV *S =  createSCEV();
>
>   If(IsTopCall) {
>     FinalCache.insert(V, S);
>     forgetMemoizedResults(PendingCache);
>> I'm not 100% sure, but I suspect this will not work.
>> forgetMemoizedResults will only remove the Value->SCEV mapping for
the conservative cases, but in the test case you attached
>> getZeroExtendExpr(S) tries to create a *new* SCEV for %conv5 (since I
suspect the %conv5->mapping was removed by getBackedgeTakenInfo) but
early-exits because zext(%dec) is present in UniqueSCEVs.
In the setup I am proposing all the entries including the one for (%dec ->
SCEV) will be 'forgotten' before we exit getBackedgeTakenInfo(Loop)
because the pessimistic mode would be triggered due to self-recursion on %conv5.
The only valid entry is for the topmost call
'getBackedgeTakenInfo(Loop)'.

Thanks,
Pankaj

Apparently Analagous Threads

Search for more apparently analagous threads

llvm dev - Feb 2018 - [SCEV] Inconsistent SCEV formation for zext

[llvm-dev] [SCEV] Inconsistent SCEV formation for zext

[llvm-dev] [SCEV] Inconsistent SCEV formation for zext

[llvm-dev] [SCEV] Inconsistent SCEV formation for zext

Apparently Analagous Threads