thr3ads.net - llvm dev - [LLVMdev] : Predication on SIMD architectures and LLVM [Nov 2012]

If this information is useful, please help other people find it:
Share via:

Bjorn De Sutter

2012-Oct-31 20:13 UTC

[LLVMdev] : Predication on SIMD architectures and LLVM

Hi all,

I am working on a CGRA backend (something like a 2D VLIW), and we also
absolutely need predication. I extended the IfConversion pass to allow it to be
executed multiple times and to predicate already predicated code. This is
necessary to predicate code with nested conditional statements. At this point,
we support or, and, and conditional predicates (see Scott Mahlke's papers on
this issue) as supported by the OpenIMPACT compiler (=Trimaran). If anyone is
interested, I can show some of the code. It is rather ad-hoc, however, so it is
not at all ready for integration in the trunk (I think).

The problem we are still facing is that this predication works post instruction
selection and post register allocation. This is problematic because some of the
earlier optimizations such as loop unrolling should ideally be applied on
if-converted code, on which it is easier to judge the opportunities for, e.g.,
modulo scheduling and initiation interval constraints (such as ResMII, RecMII).

In my view, the ideal would be to have very generic, full (OpenIMPACT-like)
predication support throughout LLVM, with the option of enabling/skipping early
if-conversion just like one can enable or disable aggressive inlining.

Best,

Bjorn De Sutter
Computer Systems Lab
Ghent University

Tom Stellard

2012-Nov-01 14:06 UTC

head link

[LLVMdev] : Predication on SIMD architectures and LLVM

On Wed, Oct 31, 2012 at 09:13:43PM +0100, Bjorn De Sutter
wrote:> Hi all,
> 
> I am working on a CGRA backend (something like a 2D VLIW), and we also
absolutely need predication. I extended the IfConversion pass to allow it to be
executed multiple times and to predicate already predicated code. This is
necessary to predicate code with nested conditional statements. At this point,
we support or, and, and conditional predicates (see Scott Mahlke's papers on
this issue) as supported by the OpenIMPACT compiler (=Trimaran). If anyone is
interested, I can show some of the code. It is rather ad-hoc, however, so it is
not at all ready for integration in the trunk (I think).
>
I would be interested in looking at the code, since Southern Islands
GPUs also require predicates for nested conditions.

-Tom
> The problem we are still facing is that this predication works post
instruction selection and post register allocation. This is problematic because
some of the earlier optimizations such as loop unrolling should ideally be
applied on if-converted code, on which it is easier to judge the opportunities
for, e.g., modulo scheduling and initiation interval constraints (such as
ResMII, RecMII).
> 
> In my view, the ideal would be to have very generic, full (OpenIMPACT-like)
predication support throughout LLVM, with the option of enabling/skipping early
if-conversion just like one can enable or disable aggressive inlining.
> 
> Best,
> 
> Bjorn De Sutter
> Computer Systems Lab
> Ghent University
> 
> 
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Bjorn De Sutter

2012-Nov-01 15:15 UTC

head link

[LLVMdev] : Predication on SIMD architectures and LLVM

Hi Tom,

here is the main part of the patch. It requires some additional helper methods
in the backend, but the meaning of those will be clear from this example I
think.

By the way, this patch has only been tested on our own backend, of which we know
where it can and cannot insert predicates. For one thing, we only insert
predicates for if-conversion, in which case the predicate registers are only
used locally. My code might be buggy under different assumptions.

Best,

Bjorn De Sutter
Computer Systems Lab
Ghent University

Index: IfConversion.cpp
==================================================================---
IfConversion.cpp	(revision 167230)
+++ IfConversion.cpp	(working copy)
@@ -64,6 +64,9 @@
 STATISTIC(NumDupBBs,       "Number of duplicated blocks");
 STATISTIC(NumUnpred,       "Number of true blocks of diamonds
unpredicated");
 
+//#undef DEBUG
+//#define DEBUG(x) x
+
 namespace {
   class IfConverter : public MachineFunctionPass {
     enum IfcvtKind {
@@ -654,6 +657,9 @@
   BBI.ExtraCost = 0;
   BBI.ExtraCost2 = 0;
   BBI.ClobbersPred = false;
+
+  std::vector<MachineOperand> PredUses; // predicates used in the block
ADDED BY BJORN
+  std::vector<MachineOperand> PredDefs; // predicates certainly defined
in the block ADDED BY BJORN
   for (MachineBasicBlock::iterator I = BBI.BB->begin(), E =
BBI.BB->end();
        I != E; ++I) {
     if (I->isDebugValue())
@@ -662,7 +668,8 @@
     if (I->isNotDuplicable())
       BBI.CannotBeCopied = true;
 
-    bool isPredicated = TII->isPredicated(I);
+    PredUses.clear(); 
+    bool isPredicated = TII->isPredicatedRecordUse(I,PredUses); // add
guarding predicate to PredUses ADDED BY BJORN
     bool isCondBr = BBI.IsBrAnalyzable && I->isConditionalBranch();
 
     if (!isCondBr) {
@@ -675,11 +682,53 @@
           BBI.ExtraCost += NumCycles-1;
         BBI.ExtraCost2 += ExtraPredCost;
       } else if (!AlreadyPredicated) {
-        // FIXME: This instruction is already predicated before the
-        // if-conversion pass. It's probably something like a conditional
move.
-        // Mark this block unpredicable for now.
-        BBI.IsUnpredicable = true;
-        return;
+		  if (PredUses.empty()) { // TEST ADDED BY BJORN
+			 // ORIGINAL CODE:
+			 
+			 // FIXME: This instruction is already predicated before the
+			 // if-conversion pass. It's probably something like a conditional move.
+			 // Mark this block unpredicable for now.
+			 BBI.IsUnpredicable = true;
+			 return;
+		  }
+		  else { // ELSE CASE ADDED BY BJORN
+			 // this is to support nested conditionals 
+			 // rationale: if a guarding predicate (of an ordinary instruction) is
defined
+			 //            by an instruction in the block to be predicated itself,
+			 //            it will suffice to predicate the predicate defining
instruction
+			 //            using the correct type of or, and, or cond predicate
+			 //            and to leave the guarding predicate unchanged
+			 // example code fragment ( with dest operand on the right, guard left of
the |):
+			 //            cmpeq_ct R1, R2, P1
+			 //            P1 | add R1,R3,R4
+			 // can be predicated with a guarding predicate (suppose P2) as follows: 
+			 //            P2 | cmpeq_ut R1, R2, P1
+			 //            P1 | add R1, R3, R4
+			 // where _ct and _ut have the meaning depicted in Table 1 of the paper
titled
+			 // "Accurate and Efficient Predicate Analysis with Binary Decision
Diagrams" by
+			 // Sias, Hwo, and August in Micro 2000.
+			 
+			 // this is not a complete fix yet, as we cannot handle predicate
+			 // generating instructions that are guarded with the same predicate
+			 // but in the backend we have an assertion that will alert us if this
situation
+			 // ever occurs in our code base, so for the time being this is fine
+			 for (unsigned i = 0; i < PredUses.size() ; i++){
+				
+				unsigned used_reg = PredUses[i].getReg();
+				bool use_after_def = false;
+				for (unsigned j = 0; j < PredDefs.size() ; j++) {
+				  unsigned defined_reg = PredDefs[j].getReg();
+				  if (used_reg == defined_reg) {
+					 use_after_def = true;
+					 break;
+				  }
+				}
+				if (!use_after_def) {
+				  BBI.IsUnpredicable = true;
+				  return;
+				}
+			 }
+		  }
       }
     }
 
@@ -693,13 +742,13 @@
 
       // Predicate may have been modified, the subsequent (currently)
       // unpredicated instructions cannot be correctly predicated.
-      BBI.IsUnpredicable = true;
-      return;
+		//      BBI.IsUnpredicable = true; BY BJORN
+		//      return; BY BJORN
     }
 
     // FIXME: Make use of PredDefs? e.g. ADDC, SUBC sets predicates but are
     // still potentially predicable.
-    std::vector<MachineOperand> PredDefs;
+	 //    std::vector<MachineOperand> PredDefs; // BY BJORN
     if (TII->DefinesPredicate(I, PredDefs))
       BBI.ClobbersPred = true;
 
@@ -736,7 +785,7 @@
         return false;
     }
     if (TII->ReverseBranchCondition(RevPred) ||
-        !TII->SubsumesPredicate(Cond, RevPred))
+   		 (!TII->SubsumesPredicate(Cond, RevPred) &&
!TII->CanPredicateAlreadyPredicated(Cond,RevPred)))
       return false;
   }
 
@@ -1116,6 +1165,15 @@
     if (TII->ReverseBranchCondition(Cond))
       llvm_unreachable("Unable to reverse branch condition!");
 
+  // added by BJORN
+  bool HasEarlyExit = CvtBBI->FalseBB != NULL;
+  if (HasEarlyExit) {
+      SmallVector<MachineOperand, 4> RevCond(CvtBBI->BrCond.begin(),
+                                             CvtBBI->BrCond.end());
+      if (TII->ReverseBranchCondition(RevCond))
+       return false;
+  }
+
   if (Kind == ICTriangleRev || Kind == ICTriangleFRev) {
     if (ReverseBranchCondition(*CvtBBI)) {
       // BB has been changed, modify its predecessors (except for this
@@ -1140,7 +1198,7 @@
   InitPredRedefs(CvtBBI->BB, Redefs, TRI);
   InitPredRedefs(NextBBI->BB, Redefs, TRI);
 
-  bool HasEarlyExit = CvtBBI->FalseBB != NULL;
+  //  bool HasEarlyExit = CvtBBI->FalseBB != NULL; // BY BJORN
   if (CvtBBI->BB->pred_size() > 1) {
     BBI.NonPredSize -= TII->RemoveBranch(*BBI.BB);
     // Copy instructions in the true block, predicate them, and add them to







On 01 Nov 2012, at 15:06, Tom Stellard <tom at stellard.net> wrote:
> On Wed, Oct 31, 2012 at 09:13:43PM +0100, Bjorn De Sutter wrote:
>> Hi all,
>> 
>> I am working on a CGRA backend (something like a 2D VLIW), and we also
absolutely need predication. I extended the IfConversion pass to allow it to be
executed multiple times and to predicate already predicated code. This is
necessary to predicate code with nested conditional statements. At this point,
we support or, and, and conditional predicates (see Scott Mahlke's papers on
this issue) as supported by the OpenIMPACT compiler (=Trimaran). If anyone is
interested, I can show some of the code. It is rather ad-hoc, however, so it is
not at all ready for integration in the trunk (I think).
>> 
> 
> I would be interested in looking at the code, since Southern Islands
> GPUs also require predicates for nested conditions.
> 
> -Tom
> 
>> The problem we are still facing is that this predication works post
instruction selection and post register allocation. This is problematic because
some of the earlier optimizations such as loop unrolling should ideally be
applied on if-converted code, on which it is easier to judge the opportunities
for, e.g., modulo scheduling and initiation interval constraints (such as
ResMII, RecMII).
>> 
>> In my view, the ideal would be to have very generic, full
(OpenIMPACT-like) predication support throughout LLVM, with the option of
enabling/skipping early if-conversion just like one can enable or disable
aggressive inlining.
>> 
>> Best,
>> 
>> Bjorn De Sutter
>> Computer Systems Lab
>> Ghent University
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Andrew Trick

2012-Nov-10 02:50 UTC

head link

[LLVMdev] : Predication on SIMD architectures and LLVM

On Oct 31, 2012, at 1:13 PM, Bjorn De Sutter <bjorn.desutter at
elis.ugent.be> wrote:
> Hi all,
> 
> I am working on a CGRA backend (something like a 2D VLIW), and we also
absolutely need predication. I extended the IfConversion pass to allow it to be
executed multiple times and to predicate already predicated code. This is
necessary to predicate code with nested conditional statements. At this point,
we support or, and, and conditional predicates (see Scott Mahlke's papers on
this issue) as supported by the OpenIMPACT compiler (=Trimaran). If anyone is
interested, I can show some of the code. It is rather ad-hoc, however, so it is
not at all ready for integration in the trunk (I think).
> 
> The problem we are still facing is that this predication works post
instruction selection and post register allocation. This is problematic because
some of the earlier optimizations such as loop unrolling should ideally be
applied on if-converted code, on which it is easier to judge the opportunities
for, e.g., modulo scheduling and initiation interval constraints (such as
ResMII, RecMII).
> 
> In my view, the ideal would be to have very generic, full (OpenIMPACT-like)
predication support throughout LLVM, with the option of enabling/skipping early
if-conversion just like one can enable or disable aggressive inlining.

In theory, MachineInstrs can be predicated before register allocation (in SSA),
but the machine code will be full of false dependencies and liveness will not be
predicate-aware. (Predicated instructions would need implicit use operand for
any virtual register defs). You would basically lose reaching defs after
predication.

You could implement a loop unroller for SSA MachineInstrs. It's conceptually
similar to early tail duplication.

"Predicating" at IR level would have to take the form of an analysis
that indicates which blocks *will* be predicated. You would have to leave the
original CFG and phi nodes in tact to preserve control dependence and dataflow.
Then you just have the problem of preserving that analysis across any changes to
the CFG.

-Andy

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Nov 2012 - [LLVMdev] : Predication on SIMD architectures and LLVM

[LLVMdev] : Predication on SIMD architectures and LLVM

[LLVMdev] : Predication on SIMD architectures and LLVM

[LLVMdev] : Predication on SIMD architectures and LLVM

[LLVMdev] : Predication on SIMD architectures and LLVM

Possibly Parallel Threads