thr3ads.net - llvm dev - [llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors [May 2017]

If this information is useful, please help other people find it:
Share via:

Sanjoy Das via llvm-dev

2017-May-01 16:29 UTC

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

Hi,

On Mon, May 1, 2017 at 8:47 AM, Daniel Berlin via llvm-dev
<llvm-dev at lists.llvm.org> wrote:>> Today, the IR requires that if you have multiple edges from A to B
>> (typically with a switch) any phi nodes in B must have an equal number
of
>> entries for A, but that all of them must have the same value.
>
>> This seems rather annoying....
>> 1) It creates multiple uses of values in A for no apparently good
reason
>> 2) It makes updating PHI nodes using sets of predecessors incredibly
hard
>> 3) There is no correspondence between the edges and the PHI node
entries
>> other than number, making most of the code that does handle this very
ad-hoc
>> "count off N repetitions of the same thing" style code, which
is brittle and
>> hard to test effectively.
>
> Having written this code a few times, i'd agree.
>
>>
>> 4) I suspect bugs are quite common given that
PHINode::removeIncomingValue
>> accepts a basic block but only removes the first incoming value,
leaving the
>> others in place.
>
>> I can't see any serious use case for this symmetry either. These
aren't
>> uses (and can't be), and there doesn't seem to be any lost
functionality if
>> we only have a single
>> entry.http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>> It also seems likely to open up more efficient in-memory
representations
>> if it is possible at some point to build maps of these things.
>>
>> Thoughts?
>
>
>
> So would this lead to a case where PHI->getNumOperands() !>
std::distance(pred_begin(phiblock), pred_end(phiblock))
>
> If so, that would seem problematic to handle as well.
> :(
Also, IIUC, today deleting an edge from A to B only requires
removeIncomingValue(A) on B's PHI nodes, but after this change you'll
have to check if you're deleting the last edge or not.

Can (2), (3) and (4) be fixed by changing the API instead of the
deeper IR change you're proposing?

-- Sanjoy

Chandler Carruth via llvm-dev

2017-May-01 17:18 UTC

head link

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

On Mon, May 1, 2017 at 9:30 AM Sanjoy Das via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Also, IIUC, today deleting an edge from A to B only requires
> removeIncomingValue(A) on B's PHI nodes, but after this change
you'll
> have to check if you're deleting the last edge or not.
>
Are there many places where this is a feature rather than a bug though?

I didn't do a survey of every caller of removeIncomingValue, but the 4 or 5
I looked at were all inside of a loop already. Which actually means they
would get more efficient with this change, if in some cases needing to keep
a set around.

This doesn't seem too bad to me, but again, maybe I'm looking at the
wrong
bits of code. Is there specific code you have in mind that wants to remove
one edge (but not one predecessor block)?

>
> Can (2), (3) and (4) be fixed by changing the API instead of the
> deeper IR change you're proposing?
>
Maybe, but I don't see easy ways. Ideas?

>
> -- Sanjoy
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170501/4dd12847/attachment.html>

Sanjoy Das via llvm-dev

2017-May-02 06:34 UTC

head link

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

Hi,

On Mon, May 1, 2017 at 10:18 AM, Chandler Carruth <chandlerc at
google.com> wrote:>> Also, IIUC, today deleting an edge from A to B only requires
>> removeIncomingValue(A) on B's PHI nodes, but after this change
you'll
>> have to check if you're deleting the last edge or not.
>
> Are there many places where this is a feature rather than a bug though?
>
> I didn't do a survey of every caller of removeIncomingValue, but the 4
or 5
> I looked at were all inside of a loop already. Which actually means they
> would get more efficient with this change, if in some cases needing to keep
> a set around.
>
> This doesn't seem too bad to me, but again, maybe I'm looking at
the wrong
> bits of code. Is there specific code you have in mind that wants to remove
> one edge (but not one predecessor block)?
I think most calls to BasicBlock::removePredecessor would need
auditing.  For instance, in llvm::ConstantFoldTerminator when we
transform:

BB:
  br true, A, B

to

BB:
  br A

then we only need to call B->removePredecessor(BB) which, in turn,
only needs to PhiInB->removeIncomingValue(BB).  With your
representation we'll have to worry if A == B.

Another example is JumpThreading:

// If the terminator is branching on an undef, we can pick any of the
// successors to branch to.  Let GetBestDestForJumpOnUndef decide.
if (isa<UndefValue>(Condition)) {
  unsigned BestSucc = GetBestDestForJumpOnUndef(BB);

  // Fold the branch/switch.
  TerminatorInst *BBTerm = BB->getTerminator();
  for (unsigned i = 0, e = BBTerm->getNumSuccessors(); i != e; ++i) {
    if (i == BestSucc) continue;
    BBTerm->getSuccessor(i)->removePredecessor(BB, true);
  }
  // ..
}

where we'll have to worry about if BestSucc'th successor is the same
as any other successor or not with your representation change.

I didn't look beyond these two.
>> Can (2), (3) and (4) be fixed by changing the API instead of the
>> deeper IR change you're proposing?
>
> Maybe, but I don't see easy ways. Ideas?
For (2), will it help if there was a setIncomingValueForBlock(BB, V)
that DTRT with setting (all) the incoming value(s) for BB to V?

For (4) perhaps we could add a removeAllIncomingValuesForBlock helper?

I'm not sure what exactly you meant by (3).

-- Sanjoy

Evgeny Astigeevich via llvm-dev

2017-May-02 12:36 UTC

head link

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

Hi,

I have got experience of low-level manipulations of Phi-nodes whilst I was
implementing an aggressive jump threading optimization for switches.
Chandler's proposal will simplify code.

Cases I had:

1. A->B->C, A has multi-edges to B. A is redirected to C.

I had to write the code something like this:

int NumEdgesToB = changeteTerminatorTarget(A->getTerminator(), B, C);
for (phi: all Phi-nodes in C) {
   V = get incoming value to C from B
   for (int i = 0; i < NumEdgesToB; ++i) {
       phi->addIncoming(V, A);
   }
}

I also had to update phi-nodes in B like this:

   for (int i = 0; i < NumEdgesToB; ++i) {
       PhiInB-> removeIncomingValue (B, DontDeletePHIIfEmpty);
   }

2.  I needed to analyze all values incoming to B. As there can be multi-edges I
had to write the code something like this:

SmallPtrSet<BasicBlock *, 8> SeenBlocks;
for (unsigned I = 0, E = CurPhi->getNumIncomingValues(); I != E; ++I) {
    if (!SeenBlocks.insert(CurPhi->getIncomingBlock(I)).second) {
      continue;
    }
   ...
}

3. I  copied A into A'. I needed correct phi-nodes in successors of A. I had
to write code like this:

      int VCount = 0;
      for (unsigned Id = 0, IdE = Phi->getNumIncomingValues(); Id != IdE;
++Id) {
        if (Phi->getIncomingBlock(Id) != OriginalBlock)
          continue;
        ++VCount;
      }

      for (; VCount > 0; --VCount)
        Phi->addIncoming(V, BlockCopy);

4. In order not to deal with duplicated incoming values I had to simplify a
phi-nodes Def-Use graph by making some copy of it and removing redundant 
incoming values.

So in all places where we modify/access a phi-node we have to take into account
redundant  incoming values and preserve this feature.
> > So would this lead to a case where PHI->getNumOperands() !> >
std::distance(pred_begin(phiblock), pred_end(phiblock))
I cannot imagine when this kind of code could be needed. When we work with
phi-nodes we almost work with DFG they are part of. The code you provided is an
attempt of mixing DFG and CFG. I don't this is good.
As Chandler wrote connections among phi-node values and edges are not expressed
in any way.
> Also, IIUC, today deleting an edge from A to B only requires
> removeIncomingValue(A) on B's PHI nodes, but after this change
you'll have
> to check if you're deleting the last edge or not.
All cases I've seen required to add/remove all multi-edges. It's
interesting to see real-life cases when not all edges are changed.

Thanks,
Evgeny Astigeevich
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> Sanjoy Das via llvm-dev
> Sent: Monday, May 01, 2017 5:30 PM
> To: Daniel Berlin
> Cc: llvm-dev
> Subject: Re: [llvm-dev] RFC: Stop using redundant PHI node entries for
multi-
> edge predecessors
> 
> Hi,
> 
> On Mon, May 1, 2017 at 8:47 AM, Daniel Berlin via llvm-dev <llvm-
> dev at lists.llvm.org> wrote:
> >> Today, the IR requires that if you have multiple edges from A to B
> >> (typically with a switch) any phi nodes in B must have an equal
> >> number of entries for A, but that all of them must have the same
value.
> >
> >> This seems rather annoying....
> >> 1) It creates multiple uses of values in A for no apparently good
> >> reason
> >> 2) It makes updating PHI nodes using sets of predecessors
incredibly
> >> hard
> >> 3) There is no correspondence between the edges and the PHI node
> >> entries other than number, making most of the code that does
handle
> >> this very ad-hoc "count off N repetitions of the same
thing" style
> >> code, which is brittle and hard to test effectively.
> >
> > Having written this code a few times, i'd agree.
> >
> >>
> >> 4) I suspect bugs are quite common given that
> >> PHINode::removeIncomingValue accepts a basic block but only
removes
> >> the first incoming value, leaving the others in place.
> >
> >> I can't see any serious use case for this symmetry either.
These
> >> aren't uses (and can't be), and there doesn't seem to
be any lost
> >> functionality if we only have a single
> >> entry.http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >>
> >> It also seems likely to open up more efficient in-memory
> >> representations if it is possible at some point to build maps of
these
> things.
> >>
> >> Thoughts?
> >
> >
> >
> > So would this lead to a case where PHI->getNumOperands() !> >
std::distance(pred_begin(phiblock), pred_end(phiblock))
> >
> > If so, that would seem problematic to handle as well.
> > :(
> 
> Also, IIUC, today deleting an edge from A to B only requires
> removeIncomingValue(A) on B's PHI nodes, but after this change
you'll have
> to check if you're deleting the last edge or not.
> 
> Can (2), (3) and (4) be fixed by changing the API instead of the deeper IR
> change you're proposing?
> 
> -- Sanjoy
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Daniel Berlin via llvm-dev

2017-May-02 13:52 UTC

head link

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

On Tue, May 2, 2017 at 5:36 AM, Evgeny Astigeevich <
Evgeny.Astigeevich at arm.com> wrote:
> Hi,
>
> I have got experience of low-level manipulations of Phi-nodes whilst I was
> implementing an aggressive jump threading optimization for switches.
>
And i've done it a lot in GVN and NewGVN

> Chandler's proposal will simplify code.
With no offense to him, i strongly doubt this :)

>
>
Cases I had:>
> 1. A->B->C, A has multi-edges to B. A is redirected to C.
>
> I had to write the code something like this:
>
> int NumEdgesToB = changeteTerminatorTarget(A->getTerminator(), B, C);
> for (phi: all Phi-nodes in C) {
>    V = get incoming value to C from B
>    for (int i = 0; i < NumEdgesToB; ++i) {
>        phi->addIncoming(V, A);
>    }
> }
>
Yup.

>
> I also had to update phi-nodes in B like this:
>
>    for (int i = 0; i < NumEdgesToB; ++i) {
>        PhiInB-> removeIncomingValue (B, DontDeletePHIIfEmpty);
>    }
>
> Yup
This seems perfectly normal to me, and you would need it regardless of what
happens to switch statements :)

br i1 undef, bb2, bb2
etc


> 2.  I needed to analyze all values incoming to B. As there can be
> multi-edges I had to write the code something like this:
>
> SmallPtrSet<BasicBlock *, 8> SeenBlocks;
> for (unsigned I = 0, E = CurPhi->getNumIncomingValues(); I != E; ++I) {
>     if (!SeenBlocks.insert(CurPhi->getIncomingBlock(I)).second) {
>       continue;
>     }
>    ...
> }
>
> Errr, this is not safe in generalYou actually *must* know whether there are single or multiple edges into a
block to know  whether it is safe to propagate a value.

> 4. In order not to deal with duplicated incoming values I had to simplify
> a phi-nodes Def-Use graph by making some copy of it and removing redundant
> incoming values.
>
I'd love to understand why.
You actually need to deal with them anyway.

The only thing chandler's representation saves is space at the cost of time.
The analysis code should generally have to do precisely the same thing it
did before.

IE given:

switch (a)
{
  case 32: br foo
  case 64: br foo

}

Right now you would end up with:

phi({a, 1}, {a, 1})


You would know that, looking at this phi, it is unsafe to propagate into
the value.

If it is was phi({a, 1}), you cannot tell this, you'd also have to "mix
the
dfg and cfg" as you say, to tell that there are multiple edges.

> So in all places where we modify/access a phi-node we have to take into
> account redundant  incoming values and preserve this feature.
>Yes, and you will still have to do much the same analysis, just you'll end
up doing it forwards from the switch, instead of also being able to do it
backwards from the phi.

>
> > > So would this lead to a case where PHI->getNumOperands() !>
> > std::distance(pred_begin(phiblock), pred_end(phiblock))
>
> I cannot imagine when this kind of code could be needed.

As i said, you often need to know how many edges exist into a block to know
whether it is safe to propagate along a path.

> When we work with phi-nodes we almost work with DFG they are part of. The
> code you provided is an attempt of mixing DFG and CFG.

Except that PHI nodes exist precisely for this purpose?
> I don't this is good.
>
I'm going to strongly disagree.

As Chandler wrote connections among phi-node values and edges are
not> expressed in any way.
>
> ????Sure it is.

There is *always* a correspondence between phi nodes and incoming edges.
That is why phi nodes exist.
To be able to say "when i came from this edge, i have this value".

If you mean "the operands of the switch are not uses of the phi",
that's
also true in every other compiler too.

This is even solvable if you like, by a bit of indirection, so that there
is a 1:1 mapping from case successor to a unique phi operand.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170502/8a955345/attachment.html>

llvm dev - May 2017 - RFC: Stop using redundant PHI node entries for multi-edge predecessors

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors

[llvm-dev] RFC: Stop using redundant PHI node entries for multi-edge predecessors