thr3ads.net - llvm dev - [LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics [Dec 2008]

If this information is useful, please help other people find it:
Share via:

Zoltan Varga

2008-Dec-09 01:27 UTC

[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics

Hi,

The attached patch implements sub.ovf/mul.ovf intrinsics similarly to
the recently added add.ovf intrinsics. These are useful for
implementing some vm instructions like sub.ovf/mul.ovf in .NET IL
efficiently.  sub.ovf is supported in target independent lowering and
on x86, while mul.ovf is only supported in the x86 backend.

                                        Please review

                                                    Zoltan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: llvm-ovf.diff
Type: text/x-diff
Size: 36435 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20081209/7a4bbba0/attachment.diff>

Bill Wendling

2008-Dec-09 06:07 UTC

head link

[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics

On Dec 8, 2008, at 5:27 PM, Zoltan Varga wrote:
> The attached patch implements sub.ovf/mul.ovf intrinsics similarly to
> the recently added add.ovf intrinsics. These are useful for
> implementing some vm instructions like sub.ovf/mul.ovf in .NET IL
> efficiently.  sub.ovf is supported in target independent lowering and
> on x86, while mul.ovf is only supported in the x86 backend.
>Hi Zoltan,

Thanks for the patch. A few comments:


Index: include/llvm/CodeGen/SelectionDAGNodes.h
==================================================================---
include/llvm/CodeGen/SelectionDAGNodes.h	(revision 60734)
+++ include/llvm/CodeGen/SelectionDAGNodes.h	(working copy)
@@ -259,6 +259,12 @@
      // These nodes are generated from the llvm.[su]add.with.overflow  
intrinsics.
      SADDO, UADDO,

+	// Same for subtraction
+	SSUBO, USUBO,
+
+	// Same for multiplication
+	SMULO, UMULO,

No tabs please.

+SDValue X86TargetLowering::LowerXALUO(SDValue Op, SelectionDAG &DAG) {
+  // Lower the "add/sub/mul with overflow" instruction into a regular
ins plus
+  // a "setcc" instruction that checks the overflow flag. The  
"brcond" lowering
    // looks for this combo and may remove the "setcc" instruction if
the "setcc"
    // has only one use.
    SDNode *N = Op.getNode();
    SDValue LHS = N->getOperand(0);
    SDValue RHS = N->getOperand(1);
+  unsigned BaseOp;
+  bool Signed;

Please initialize BaseOp and Signed. Instead of "Signed", please make
it "X86::COND_O" or "X86::COND_C". That saves having to use
the
ternary operator in the DAG.getConstant() call.


Index: lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
==================================================================---
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	(revision 60734)
+++ lib/CodeGen/SelectionDAG/LegalizeDAG.cpp	(working copy)
@@ -4224,7 +4225,7 @@
        SDValue LHS = LegalizeOp(Node->getOperand(0));
        SDValue RHS = LegalizeOp(Node->getOperand(1));

-      SDValue Sum = DAG.getNode(ISD::ADD, LHS.getValueType(), LHS,  
RHS);
+      SDValue Sum = DAG.getNode(Node->getOpcode () == ISD::SADDO ?  
ISD::ADD : ISD::SUB, LHS.getValueType(), LHS, RHS);

Watch out for 80-column violations.


@@ -4233,16 +4234,19 @@
        //   RHSSign -> RHS >= 0
        //   SumSign -> Sum >= 0
        //
+	  //   Add:
        //   Overflow -> (LHSSign == RHSSign) && (LHSSign != SumSign)
+	  //   Sub:
+      //   Overflow -> (LHSSign != RHSSign) && (LHSSign != SumSign)


Tabs.

        //
        SDValue LHSSign = DAG.getSetCC(OType, LHS, Zero, ISD::SETGE);
        SDValue RHSSign = DAG.getSetCC(OType, RHS, Zero, ISD::SETGE);
-      SDValue SignsEq = DAG.getSetCC(OType, LHSSign, RHSSign,  
ISD::SETEQ);
+      SDValue SignsMatch = DAG.getSetCC(OType, LHSSign, RHSSign, Node- 
 >getOpcode () == ISD::SADDO ? ISD::SETEQ : ISD::SETNE);


80-columns and strange spacing. :-)

@@ -4270,9 +4275,9 @@
        SDValue LHS = LegalizeOp(Node->getOperand(0));
        SDValue RHS = LegalizeOp(Node->getOperand(1));

-      SDValue Sum = DAG.getNode(ISD::ADD, LHS.getValueType(), LHS,  
RHS);
+      SDValue Sum = DAG.getNode(Node->getOpcode () == ISD::UADDO ?  
ISD::ADD : ISD::SUB, LHS.getValueType(), LHS, RHS);
        MVT OType = Node->getValueType(1);
-      SDValue Cmp = DAG.getSetCC(OType, Sum, LHS, ISD::SETULT);
+      SDValue Cmp = DAG.getSetCC(OType, Sum, LHS, Node->getOpcode ()  
== ISD::UADDO ? ISD::SETULT : ISD::SETUGT);


80-columns


@@ -4288,7 +4293,24 @@

      break;
    }
+  case ISD::SMULO:
+  case ISD::UMULO: {
+    MVT VT = Node->getValueType(0);
+    // This is hard to implement without access to the cpu condition  
codes
+    switch (TLI.getOperationAction(Node->getOpcode(), VT)) {
+    default: assert(0 && "This action is not supported at
all!");
+    case TargetLowering::Custom:
+      Result = TLI.LowerOperation(Op, DAG);
+      if (Result.getNode()) break;
+      // Fall Thru
+    case TargetLowering::Legal:
+      assert(0 && "Target independent lowering is not supported
for
SMULO/UMULO!");	
+    break;
+    }
+    break;
    }

It would be better to implement a target-independent check for  
overflow for the "Legal" case (like how SADDO does). Hacker's
Delight
has some hints on how to do this. It's not easy for the signed case,  
but is do-able.


Index: lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp
==================================================================---
lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp	(revision 60734)
+++ lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp	(working copy)
@@ -4113,6 +4113,40 @@
      return 0;
    }

+  case Intrinsic::usub_with_overflow:
+  case Intrinsic::ssub_with_overflow: {
+    SDValue Op1 = getValue(I.getOperand(1));
+    SDValue Op2 = getValue(I.getOperand(2));
+
+    MVT ValueVTs[] = { Op1.getValueType(), MVT::i1 };
+    SDValue Ops[] = { Op1, Op2 };
+
+    SDValue Result +      DAG.getNode((Intrinsic ==
Intrinsic::ssub_with_overflow) ?
+                    ISD::SSUBO : ISD::USUBO,
+                  DAG.getVTList(&ValueVTs[0], 2), &Ops[0], 2);
+
+    setValue(&I, Result);
+    return 0;
+  }
+
+  case Intrinsic::umul_with_overflow:
+  case Intrinsic::smul_with_overflow: {
+    SDValue Op1 = getValue(I.getOperand(1));
+    SDValue Op2 = getValue(I.getOperand(2));
+
+    MVT ValueVTs[] = { Op1.getValueType(), MVT::i1 };
+    SDValue Ops[] = { Op1, Op2 };
+
+    SDValue Result +      DAG.getNode((Intrinsic ==
Intrinsic::smul_with_overflow) ?
+                    ISD::SMULO : ISD::UMULO,
+                  DAG.getVTList(&ValueVTs[0], 2), &Ops[0], 2);
+
+    setValue(&I, Result);
+    return 0;
+  }
+


This code is essentially the same for all of the [us]*_with_overflow  
intrinsics. Could you factor it out into its own helper function?

Otherwise, the patch looks pretty good. Thanks!
-bw

Reasonably Related Threads

Search for more seemingly similar threads

llvm dev - Dec 2008 - [LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics

[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics

[LLVMdev] [PATH] Add sub.ovf/mul.ovf intrinsics

Reasonably Related Threads