Björn Pettersson A via llvm-dev
2021-Sep-21 13:09 UTC
[llvm-dev] DominatorTree, JumpThreading and EarlyCSE non-determinism
Hello llvm-dev, Recently I've been debugging a non-determinism problem. I've managed to narrow it down to running: opt -passes='function(jump-threading,print<domtree>,early-cse)'>From debug printouts (also adding -debug to the cmd line) it looks like jump-threading is doing doing dominator tree updates in a non-deterministic order (when comparing different runs).I randomly get either of these two DominatorTrees in the print<domtree> printouts: DominatorTree for function: g =============================-------------------------------- Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. [1] %entry {4294967295,4294967295} [0] [2] %for.cond1 {4294967295,4294967295} [1] [3] %if.then {4294967295,4294967295} [2] [4] %for.cond5.preheader {4294967295,4294967295} [3] [5] %cleanup {4294967295,4294967295} [4] [6] %cleanup16 {4294967295,4294967295} [5] [7] %unreachable {4294967295,4294967295} [6] [7] %for.end21 {4294967295,4294967295} [6] [5] %for.body7 {4294967295,4294967295} [4] [6] %for.inc {4294967295,4294967295} [5] [5] %return {4294967295,4294967295} [4] [3] %cleanup16.thread {4294967295,4294967295} [2] [3] %for.inc19 {4294967295,4294967295} [2] [2] %for.cond {4294967295,4294967295} [1] Roots: %entry DominatorTree for function: g =============================-------------------------------- Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. [1] %entry {4294967295,4294967295} [0] [2] %for.cond1 {4294967295,4294967295} [1] [3] %for.inc19 {4294967295,4294967295} [2] [3] %if.then {4294967295,4294967295} [2] [4] %for.cond5.preheader {4294967295,4294967295} [3] [5] %cleanup {4294967295,4294967295} [4] [6] %cleanup16 {4294967295,4294967295} [5] [7] %unreachable {4294967295,4294967295} [6] [7] %for.end21 {4294967295,4294967295} [6] [5] %for.body7 {4294967295,4294967295} [4] [6] %for.inc {4294967295,4294967295} [5] [5] %return {4294967295,4294967295} [4] [3] %cleanup16.thread {4294967295,4294967295} [2] [2] %for.cond {4294967295,4294967295} [1] Roots: %entry I think both trees are correct, the nodes are just in a different order. This can also be seen if I add invalidate<domtree> before print<domtree> to force a recalculation. Then it will look like this instead (same content, but the level 3 nodes printed in yet another order): DominatorTree for function: g =============================-------------------------------- Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. [1] %entry {4294967295,4294967295} [0] [2] %for.cond1 {4294967295,4294967295} [1] [3] %cleanup16.thread {4294967295,4294967295} [2] [3] %for.inc19 {4294967295,4294967295} [2] [3] %if.then {4294967295,4294967295} [2] [4] %for.cond5.preheader {4294967295,4294967295} [3] [5] %cleanup {4294967295,4294967295} [4] [6] %cleanup16 {4294967295,4294967295} [5] [7] %unreachable {4294967295,4294967295} [6] [7] %for.end21 {4294967295,4294967295} [6] [5] %return {4294967295,4294967295} [4] [5] %for.body7 {4294967295,4294967295} [4] [6] %for.inc {4294967295,4294967295} [5] [2] %for.cond {4294967295,4294967295} [1] Roots: %entry *** Question one *** Maybe it is OK from a correctness point-of-view that JumpThreading isn't producing the same node order every time (given same input), even though it might be a bit confusing when debugging. Or should this be seen as a bug? Next problem/question is related to EarlyCSE. It looks like the output from EarlyCSE depends on the node order in the dominator tree. So depending on if JumpThreading has produced the first or second of the DominatorTree structures above we might end up eliminating one more/less PHI node during EarlyCSE. *** Question two *** Should be seen as a bug? Or are passes in general sensitive to how analysis information is produced (such as the node order in a dominator tree) and thus being allowed to produce better/worse code depending on such things? To summarize: JumpThreading is non-deterministic but produces semantically "equivalent" results. EarlyCSE is deterministic, but the result depends on order of nodes in the DominatorTree. Those two things together gives a non-deterministic IR result. And I figure that should be seen as a bug. Just not sure if the fault is in DominatorTreeUpdater, JumpThreading or EarlyCSE (or if it should be seen as multiple faults). Regards, Björn
Roman Lebedev via llvm-dev
2021-Sep-21 13:14 UTC
[llvm-dev] DominatorTree, JumpThreading and EarlyCSE non-determinism
Does JumpThreading produce exactly the same IR in all of the situations? But even if it does, this does seem like a bug. Roman On Tue, Sep 21, 2021 at 4:10 PM Björn Pettersson A via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > Hello llvm-dev, > > Recently I've been debugging a non-determinism problem. > > I've managed to narrow it down to running: > > opt -passes='function(jump-threading,print<domtree>,early-cse)' > > > From debug printouts (also adding -debug to the cmd line) it looks like jump-threading is doing doing dominator tree updates in a non-deterministic order (when comparing different runs). > > I randomly get either of these two DominatorTrees in the print<domtree> printouts: > > DominatorTree for function: g > =============================-------------------------------- > Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. > [1] %entry {4294967295,4294967295} [0] > [2] %for.cond1 {4294967295,4294967295} [1] > [3] %if.then {4294967295,4294967295} [2] > [4] %for.cond5.preheader {4294967295,4294967295} [3] > [5] %cleanup {4294967295,4294967295} [4] > [6] %cleanup16 {4294967295,4294967295} [5] > [7] %unreachable {4294967295,4294967295} [6] > [7] %for.end21 {4294967295,4294967295} [6] > [5] %for.body7 {4294967295,4294967295} [4] > [6] %for.inc {4294967295,4294967295} [5] > [5] %return {4294967295,4294967295} [4] > [3] %cleanup16.thread {4294967295,4294967295} [2] > [3] %for.inc19 {4294967295,4294967295} [2] > [2] %for.cond {4294967295,4294967295} [1] > Roots: %entry > > DominatorTree for function: g > =============================-------------------------------- > Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. > [1] %entry {4294967295,4294967295} [0] > [2] %for.cond1 {4294967295,4294967295} [1] > [3] %for.inc19 {4294967295,4294967295} [2] > [3] %if.then {4294967295,4294967295} [2] > [4] %for.cond5.preheader {4294967295,4294967295} [3] > [5] %cleanup {4294967295,4294967295} [4] > [6] %cleanup16 {4294967295,4294967295} [5] > [7] %unreachable {4294967295,4294967295} [6] > [7] %for.end21 {4294967295,4294967295} [6] > [5] %for.body7 {4294967295,4294967295} [4] > [6] %for.inc {4294967295,4294967295} [5] > [5] %return {4294967295,4294967295} [4] > [3] %cleanup16.thread {4294967295,4294967295} [2] > [2] %for.cond {4294967295,4294967295} [1] > Roots: %entry > > > I think both trees are correct, the nodes are just in a different order. This can also be seen if I add invalidate<domtree> before print<domtree> to force a recalculation. Then it will look like this instead (same content, but the level 3 nodes printed in yet another order): > > DominatorTree for function: g > =============================-------------------------------- > Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. > [1] %entry {4294967295,4294967295} [0] > [2] %for.cond1 {4294967295,4294967295} [1] > [3] %cleanup16.thread {4294967295,4294967295} [2] > [3] %for.inc19 {4294967295,4294967295} [2] > [3] %if.then {4294967295,4294967295} [2] > [4] %for.cond5.preheader {4294967295,4294967295} [3] > [5] %cleanup {4294967295,4294967295} [4] > [6] %cleanup16 {4294967295,4294967295} [5] > [7] %unreachable {4294967295,4294967295} [6] > [7] %for.end21 {4294967295,4294967295} [6] > [5] %return {4294967295,4294967295} [4] > [5] %for.body7 {4294967295,4294967295} [4] > [6] %for.inc {4294967295,4294967295} [5] > [2] %for.cond {4294967295,4294967295} [1] > Roots: %entry > > > *** Question one *** > Maybe it is OK from a correctness point-of-view that JumpThreading isn't producing the same node order every time (given same input), even though it might be a bit confusing when debugging. Or should this be seen as a bug? > > > Next problem/question is related to EarlyCSE. It looks like the output from EarlyCSE depends on the node order in the dominator tree. So depending on if JumpThreading has produced the first or second of the DominatorTree structures above we might end up eliminating one more/less PHI node during EarlyCSE. > > *** Question two *** > Should be seen as a bug? Or are passes in general sensitive to how analysis information is produced (such as the node order in a dominator tree) and thus being allowed to produce better/worse code depending on such things? > > > To summarize: > JumpThreading is non-deterministic but produces semantically "equivalent" results. > EarlyCSE is deterministic, but the result depends on order of nodes in the DominatorTree. > Those two things together gives a non-deterministic IR result. And I figure that should be seen as a bug. Just not sure if the fault is in DominatorTreeUpdater, JumpThreading or EarlyCSE (or if it should be seen as multiple faults). > > Regards, > Björn > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev