Jonas Paulsson via llvm-dev
2016-Feb-03  10:41 UTC
[llvm-dev] [buildSchedGraph] memory dependencies
Hi,
(This only concerns MISNeedChainEdge(), and is separate from D8705)
I found out that the MIScheduler (pre-ra) could not handle a simple test 
case (test/CodeGen/SystemZ/alias-01.ll), with 16 independent load / add 
/ stores.
The buildSchedGraph() put too many edges between memory accesses, because
1) There was no implementation of areMemAccessesTriviallyDisjoint() for 
SystemZ.
2) Type Based Alias Analysis (TBAA) was not used.
I have gotten rid of the edges on an experimental level, and would now 
like some help and feedback:
1): It seems common for targets to check for the same virtual address 
register and then figure out via offsets/sizes if they overlap. When 
faced with the task of doing this, I realized that the 
MachineMemOperands might do the job. I tried the following and wonder 
why this is not done already, perhaps as a default implementation of  
areMemAccessesTriviallyDisjoint(). Is it because memory operands may be 
missing, or is it because MachineCombiner may give the register based 
analysis an advantage? This is a check in the case of the *same Value*. 
In this case the Value is an argument, which is unsafe against others, 
but I am thinking it should at least be safe against itself...
diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp 
b/lib/CodeGen/ScheduleDAGInstrs.cpp
index 00a0b0f..cd48f51 100644
--- a/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA, 
const MachineFrameInfo *MFI,
    if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
      return true;
+  // If mem-operands show that the same address Value is used by both
+  // ("normal") instructions, simply check offsets and sizes of the
+  // accesses.
+  MachineMemOperand *MMOa = *MIa->memoperands_begin();
+  MachineMemOperand *MMOb = *MIb->memoperands_begin();
+  const Value *VALa = MMOa->getValue();
+  const Value *VALb = MMOb->getValue();
+  if (VALa == VALb &&
+      !MIa->hasUnmodeledSideEffects() &&
!MIb->hasUnmodeledSideEffects() &&
+      !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef())
{
+    int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
+    int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
+    int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
+    int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
+    int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
+    if (LowOffset + LowWidth <= HighOffset)
+      return false;
+  }
+
    if (isUnsafeMemoryObject(MIa, MFI, DL) || isUnsafeMemoryObject(MIb, 
MFI, DL))
      return true;
2) The TBAA tags should separate the loads from the stores. In the MF I see
...
%vreg32<def> = L %vreg0, 56, %noreg; 
mem:LD4[%src1(align=64)+56](align=8)(tbaa=!1) GR32Bit:%vreg32 
ADDR64Bit:%vreg0
...
ST %vreg33, %vreg1, 60, %noreg; 
mem:ST4[%dest(align=64)+60](align=4)(tbaa=!4) GR32Bit:%vreg33 
ADDR64Bit:%vreg1
...
Since the tbba tags are part of the MachineMemOperands, it is easy to 
just continue the patch (above isUnsafeMemoryObject() checks) with
+  AAMDNodes AATagsA = MMOa->getAAInfo(), AATagsB = MMOb->getAAInfo();
+  if (AATagsA->TBAA && AATagsB->TBAA && AATagsA !=
AATagsB)
+    return false;
I see that there is a TBAA method that is intended to give this type of 
result. So I wonder why this is not used?
Would it be correct to implement SystemZs 
areMemAccessesTriviallyDisjoint() with the above code? (I am suspecting 
that perhaps my AAMDNodes check is incomplete)
/Jonas
test case:
define void @f1(<16 x i32> *%src1, <16 x float> *%dest) {
; CHECK-LABEL: f1:
; CHECK-NOT: %r15
; CHECK: br %r14
   %val = load <16 x i32> , <16 x i32> *%src1, !tbaa !1
   %add = add <16 x i32> %val, %val
   %res = bitcast <16 x i32> %add to <16 x float>
   store <16 x float> %res, <16 x float> *%dest, !tbaa !2
   ret void
}
!0 = !{ !"root" }
!1 = !{ !"set1", !0 }
!2 = !{ !"set2", !0 }
Hal Finkel via llvm-dev
2016-Feb-04  01:51 UTC
[llvm-dev] [buildSchedGraph] memory dependencies
----- Original Message -----> From: "Jonas Paulsson via llvm-dev" <llvm-dev at lists.llvm.org> > To: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Wednesday, February 3, 2016 4:41:33 AM > Subject: [llvm-dev] [buildSchedGraph] memory dependencies > > Hi, > > (This only concerns MISNeedChainEdge(), and is separate from D8705) > > I found out that the MIScheduler (pre-ra) could not handle a simple > test > case (test/CodeGen/SystemZ/alias-01.ll), with 16 independent load / > add > / stores. > > The buildSchedGraph() put too many edges between memory accesses, > because > 1) There was no implementation of areMemAccessesTriviallyDisjoint() > for > SystemZ. > 2) Type Based Alias Analysis (TBAA) was not used. > > I have gotten rid of the edges on an experimental level, and would > now > like some help and feedback: > > 1): It seems common for targets to check for the same virtual address > register and then figure out via offsets/sizes if they overlap. When > faced with the task of doing this, I realized that the > MachineMemOperands might do the job. I tried the following and wonder > why this is not done already, perhaps as a default implementation of > areMemAccessesTriviallyDisjoint(). Is it because memory operands may > be > missing, or is it because MachineCombiner may give the register based > analysis an advantage? This is a check in the case of the *same > Value*. > In this case the Value is an argument, which is unsafe against > others, > but I am thinking it should at least be safe against itself... > > diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp > b/lib/CodeGen/ScheduleDAGInstrs.cpp > index 00a0b0f..cd48f51 100644 > --- a/lib/CodeGen/ScheduleDAGInstrs.cpp > +++ b/lib/CodeGen/ScheduleDAGInstrs.cpp > @@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA, > const MachineFrameInfo *MFI, > if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand()) > return true; > > + // If mem-operands show that the same address Value is used by > both > + // ("normal") instructions, simply check offsets and sizes of the > + // accesses. > + MachineMemOperand *MMOa = *MIa->memoperands_begin(); > + MachineMemOperand *MMOb = *MIb->memoperands_begin(); > + const Value *VALa = MMOa->getValue(); > + const Value *VALb = MMOb->getValue(); > + if (VALa == VALb && > + !MIa->hasUnmodeledSideEffects() && > !MIb->hasUnmodeledSideEffects() && > + !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef()) { > + int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset(); > + int WidthA = MMOa->getSize(), WidthB = MMOb->getSize(); > + int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB; > + int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA; > + int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB; > + if (LowOffset + LowWidth <= HighOffset) > + return false; > + } > + > if (isUnsafeMemoryObject(MIa, MFI, DL) || > isUnsafeMemoryObject(MIb, > MFI, DL)) > return true; >This looks reasonable to me.> > 2) The TBAA tags should separate the loads from the stores. In the MF > I see > ... > %vreg32<def> = L %vreg0, 56, %noreg; > mem:LD4[%src1(align=64)+56](align=8)(tbaa=!1) GR32Bit:%vreg32 > ADDR64Bit:%vreg0 > ... > ST %vreg33, %vreg1, 60, %noreg; > mem:ST4[%dest(align=64)+60](align=4)(tbaa=!4) GR32Bit:%vreg33 > ADDR64Bit:%vreg1 > ... > > Since the tbba tags are part of the MachineMemOperands, it is easy to > just continue the patch (above isUnsafeMemoryObject() checks) with > > + AAMDNodes AATagsA = MMOa->getAAInfo(), AATagsB > MMOb->getAAInfo(); > + if (AATagsA->TBAA && AATagsB->TBAA && AATagsA != AATagsB) > + return false; > > I see that there is a TBAA method that is intended to give this type > of > result. So I wonder why this is not used? > > Would it be correct to implement SystemZs > areMemAccessesTriviallyDisjoint() with the above code? (I am > suspecting > that perhaps my AAMDNodes check is incomplete)I don't understand this. The whole point of allowing backends use AA is to let the existing AA infrastructure take care of this. As I recall, SystemZ enabled AA in the backend, so why do you need anything special here? -Hal> > /Jonas > > test case: > > define void @f1(<16 x i32> *%src1, <16 x float> *%dest) { > ; CHECK-LABEL: f1: > ; CHECK-NOT: %r15 > ; CHECK: br %r14 > %val = load <16 x i32> , <16 x i32> *%src1, !tbaa !1 > %add = add <16 x i32> %val, %val > %res = bitcast <16 x i32> %add to <16 x float> > store <16 x float> %res, <16 x float> *%dest, !tbaa !2 > ret void > } > > !0 = !{ !"root" } > !1 = !{ !"set1", !0 } > !2 = !{ !"set2", !0 } > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Jonas Paulsson via llvm-dev
2016-Feb-04  13:46 UTC
[llvm-dev] [buildSchedGraph] memory dependencies
On 2016-02-04 02:51, Hal Finkel wrote:> > diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp > b/lib/CodeGen/ScheduleDAGInstrs.cpp > index 00a0b0f..cd48f51 100644 > --- a/lib/CodeGen/ScheduleDAGInstrs.cpp > +++ b/lib/CodeGen/ScheduleDAGInstrs.cpp > @@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA, > const MachineFrameInfo *MFI, > if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand()) > return true; > > + // If mem-operands show that the same address Value is used by > both > + // ("normal") instructions, simply check offsets and sizes of the > + // accesses. > + MachineMemOperand *MMOa = *MIa->memoperands_begin(); > + MachineMemOperand *MMOb = *MIb->memoperands_begin(); > + const Value *VALa = MMOa->getValue(); > + const Value *VALb = MMOb->getValue(); > + if (VALa == VALb && > + !MIa->hasUnmodeledSideEffects() && > !MIb->hasUnmodeledSideEffects() && > + !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef()) { > + int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset(); > + int WidthA = MMOa->getSize(), WidthB = MMOb->getSize(); > + int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB; > + int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA; > + int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB; > + if (LowOffset + LowWidth <= HighOffset) > + return false; > + } > + > if (isUnsafeMemoryObject(MIa, MFI, DL) || > isUnsafeMemoryObject(MIb, > MFI, DL)) > return true; > > This looks reasonable to me.OK - I will try this then for SystemZ as a starting point, and I also need to do handle PseudoSourceValues, I guess. It would be great to try this on a target that has done the register analysis, and see if that is worth the effort... Any idea how to do that?> > + AAMDNodes AATagsA = MMOa->getAAInfo(), AATagsB > MMOb->getAAInfo(); > + if (AATagsA->TBAA && AATagsB->TBAA && AATagsA != AATagsB) > + return false; > > I see that there is a TBAA method that is intended to give this type > of > result. So I wonder why this is not used? > > Would it be correct to implement SystemZs > areMemAccessesTriviallyDisjoint() with the above code? (I am > suspecting > that perhaps my AAMDNodes check is incomplete) > I don't understand this. The whole point of allowing backends use AA is to let the existing AA infrastructure take care of this. As I recall, SystemZ enabled AA in the backend, so why do you need anything special here? > > -HalI think I realized what was wrong: The AA was never called in the first place, because the old isUnsafeMemoryObject() rejected the MI, because the value could not be identified (argument). With the new patch applied (D8705), AA was called and it works as you expected. thanks, Jonas
Seemingly Similar Threads
- How to avoid multiple registers definitions in customInserter.
- pre-RA scheduling/live register analysis optimization (handle move) forcing spill of registers
- [LLVMdev] problem with inlining pass
- [LLVMdev] problem with inlining pass
- [LLVMdev] .debug_info section size in arm executable