Kazushi Marukawa via llvm-dev
2019-Mar-13 00:24 UTC
[llvm-dev] llvm combines "ADD frameindex, constant" to OR
Hi all, I've been working on a backend of our architecture and noticed llvm performs following combining although one of operands is FrameIndex. Combining: t114: i64 = add FrameIndex:i64<0>, Constant:i64<56> Creating new node: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56> ... into: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56> This caused problem if frame pointer points 0x60000038 at run time. I checked DAGCombiner::visitADD. It folds ADD to OR by following code without considering about FrameIndex. This haveNoCommonBitsSet says it's safe since FrameIndex(0) is 0. // fold (a+b) -> (a|b) iff a and b share no bits. if ((!LegalOperations || TLI.isOperationLegal(ISD::OR, VT)) && DAG.haveNoCommonBitsSet(N0, N1)) return DAG.getNode(ISD::OR, DL, VT, N0, N1); I checked visitADD more and find that it performs some kind of undo like bellow if the input is (+ (+ FI <const>) <const>). // Undo the add -> or combine to merge constant offsets from a frame index. if (N0.getOpcode() == ISD::OR && isa<FrameIndexSDNode>(N0.getOperand(0)) && isa<ConstantSDNode>(N0.getOperand(1)) && DAG.haveNoCommonBitsSet(N0.getOperand(0), N0.getOperand(1))) { SDValue Add0 = DAG.getNode(ISD::ADD, DL, VT, N1, N0.getOperand(1)); return DAG.getNode(ISD::ADD, DL, VT, N0.getOperand(0), Add0); } I think this code may work fine on an architecture using FI + A + B. However, our architecture seems to use only FI + A and this ADD is converted to OR permanently. As a result, generated code use OR and it causes run time error depends on the address of frame pointer. I also checked visitOR, but I cannot find any undoing there. My question is: 1. Is there any way to control this to avoid OR folding, like TargetLowering::preferNotCombineToOrFrameIndex? (or should we add new?) 2. How to optimize FrameIndex? I desire DAGCombiner use alignment information on data, stack frame, or something. Best Regards, -- Kazushi
Tim Northover via llvm-dev
2019-Mar-13 10:09 UTC
[llvm-dev] llvm combines "ADD frameindex, constant" to OR
Hi, On Wed, 13 Mar 2019 at 00:25, Kazushi Marukawa via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > Hi all, > > I've been working on a backend of our architecture and noticed llvm performs > following combining although one of operands is FrameIndex. > > Combining: t114: i64 = add FrameIndex:i64<0>, Constant:i64<56> > Creating new node: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56> > ... into: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56> > > This caused problem if frame pointer points 0x60000038 at run time. > > > I checked DAGCombiner::visitADD. It folds ADD to OR by following code > without considering about FrameIndex. This haveNoCommonBitsSet says > it's safe since FrameIndex(0) is 0.This sounds like your fundamental problem to me. The generic code path is computeKnownBits -> computeKnownBitsForFrameIndex -> InferPtrAlignment. If your stack slot can legitimately end up at an offset only aligned to 0x8 then that's going wrong somewhere and you should probably work out why. I don't really have the details to say for sure, but unless you've overridden computeKnownBitsForFrameIndex a reasonably likely scenario is that the alloca for that local variable requested 16-bit alignment but your sp itself is only routinely aligned to 8 bytes (or less) and you actually need to realign sp manually in XYZFrameLowering to support that variable. The ISD::OR thing can be annoying, but shouldn't affect correctness.> // Undo the add -> or combine to merge constant offsets from a frame index. > if (N0.getOpcode() == ISD::OR && > isa<FrameIndexSDNode>(N0.getOperand(0)) && > isa<ConstantSDNode>(N0.getOperand(1)) && > DAG.haveNoCommonBitsSet(N0.getOperand(0), N0.getOperand(1))) { > SDValue Add0 = DAG.getNode(ISD::ADD, DL, VT, N1, N0.getOperand(1)); > return DAG.getNode(ISD::ADD, DL, VT, N0.getOperand(0), Add0); > } > > I think this code may work fine on an architecture using FI + A + B.It's target independent, and again shouldn't affect correctness. The entire transformation is (FI | A) + B -> (FI + (A+B)). Useful when it triggers, but not much else.> 1. Is there any way to control this to avoid OR folding, > like TargetLowering::preferNotCombineToOrFrameIndex? > (or should we add new?)There isn't as far as I know. I've heard it justified as canonicalization before, and not been entirely convinced myself. But it's generally pretty easy to cope with in the backend because SelectionDAG::isBaseWithConstantOffset deals with it correctly.> 2. How to optimize FrameIndex? I desire DAGCombiner use alignment > information on data, stack frame, or something.As I said above, I think it should already use that and something has gone wrong there in your case. Cheers. Tim.
Kazushi Marukawa via llvm-dev
2019-Mar-13 23:30 UTC
[llvm-dev] llvm combines "ADD frameindex, constant" to OR
Hi Tim, Thank you very much for explaining the computeKnownBitsForFrameIndex and related functions. I even don't notice them. As you say, I think my backend has fundamental problems at that area. I'll check it out.> > 2. How to optimize FrameIndex? I desire DAGCombiner use alignment > > information on data, stack frame, or something. > > As I said above, I think it should already use that and something has > gone wrong there in your case.That's true. The computeKnownBitsForFrameIndex and InferPtrAlignment already use that. It won't work for my case, though. For example, a vector sized data like v8sd says 64 bytes alignment in InferPtrAlignment, although I allocate v8sd using 8 bytes alignment on stack. This differences have caused the problem. I'll check details. Best Regards, -- Kazushi> -----Original Message----- > From: Tim Northover [mailto:t.p.northover at gmail.com] > Sent: Wednesday, March 13, 2019 7:09 PM > To: Marukawa Kazushi(丸川 一志) <kaz-marukawa at xr.jp.nec.com> > Cc: llvm-dev at lists.llvm.org > Subject: Re: [llvm-dev] llvm combines "ADD frameindex, constant" to OR > > Hi, > > On Wed, 13 Mar 2019 at 00:25, Kazushi Marukawa via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > > Hi all, > > > > I've been working on a backend of our architecture and noticed llvm performs > > following combining although one of operands is FrameIndex. > > > > Combining: t114: i64 = add FrameIndex:i64<0>, Constant:i64<56> > > Creating new node: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56> > > ... into: t121: i64 = or FrameIndex:i64<0>, Constant:i64<56> > > > > This caused problem if frame pointer points 0x60000038 at run time. > > > > > > I checked DAGCombiner::visitADD. It folds ADD to OR by following code > > without considering about FrameIndex. This haveNoCommonBitsSet says > > it's safe since FrameIndex(0) is 0. > > This sounds like your fundamental problem to me. The generic code path > is computeKnownBits -> computeKnownBitsForFrameIndex -> > InferPtrAlignment. If your stack slot can legitimately end up at an > offset only aligned to 0x8 then that's going wrong somewhere and you > should probably work out why. > > I don't really have the details to say for sure, but unless you've > overridden computeKnownBitsForFrameIndex a reasonably likely scenario > is that the alloca for that local variable requested 16-bit alignment > but your sp itself is only routinely aligned to 8 bytes (or less) and > you actually need to realign sp manually in XYZFrameLowering to > support that variable. > > The ISD::OR thing can be annoying, but shouldn't affect correctness. > > > // Undo the add -> or combine to merge constant offsets from a frame index. > > if (N0.getOpcode() == ISD::OR && > > isa<FrameIndexSDNode>(N0.getOperand(0)) && > > isa<ConstantSDNode>(N0.getOperand(1)) && > > DAG.haveNoCommonBitsSet(N0.getOperand(0), N0.getOperand(1))) { > > SDValue Add0 = DAG.getNode(ISD::ADD, DL, VT, N1, N0.getOperand(1)); > > return DAG.getNode(ISD::ADD, DL, VT, N0.getOperand(0), Add0); > > } > > > > I think this code may work fine on an architecture using FI + A + B. > > It's target independent, and again shouldn't affect correctness. The > entire transformation is (FI | A) + B -> (FI + (A+B)). Useful when it > triggers, but not much else. > > > 1. Is there any way to control this to avoid OR folding, > > like TargetLowering::preferNotCombineToOrFrameIndex? > > (or should we add new?) > > There isn't as far as I know. I've heard it justified as > canonicalization before, and not been entirely convinced myself. But > it's generally pretty easy to cope with in the backend because > SelectionDAG::isBaseWithConstantOffset deals with it correctly. > > > 2. How to optimize FrameIndex? I desire DAGCombiner use alignment > > information on data, stack frame, or something. > > As I said above, I think it should already use that and something has > gone wrong there in your case. > > Cheers. > > Tim.