Owen Anderson
2012-Jul-16  18:44 UTC
[LLVMdev] RFC: LLVM incubation, or requirements for committing new backends
Tom,

I think it might be productive to fork this thread to discuss making the requirements for upstreaming a new LLVM target more explicit and open. I'd also like to gauge interest in an idea I've discussed privately with a few community members, namely the concept of having a semi-official "incubation" system whereby proposed backends could get a trial run before becoming part of LLVM mainline.

The proposed system would be something like this: a proposed contribution would receive a branch on llvm.org, and have six months (or some other predetermined length of time) to demonstrate that it and its developers are ready for mainline integration. At the end of their term, incubated projects would be evaluated on the following criteria, and either integrated to mainline, judged to be more appropriate as an external project, or given an extension and "needs improvement" feedback on specific criteria.

* Active maintainership - Backends bit rot quickly, and unmaintained backends are a large maintenance burden on everyone else. We need a core of developers who are going to actively maintain any candidate backend on mainline. That last point is critical: a code drop every six months is not an acceptable level of maintenance for a mainline target.

* Contributions to core - Mainlining a new backend adds the expectation that mainline LLVM developers will invest the time and energy to keep your backend building and working (see test plan, below). However, that expectation of extra work doesn't come for nothing: we expect you to contribute back fixes and improvements that you find, and to work with other community members to coordinate projects as appropriate. When looking at a new backend, I should expect to see few-to-no diffs outside of lib/Target/YourBackend, and a few other places (Triple.cpp, for example). All other changes should already be upstreamed.
* Test plan - If you're going to expect us to maintain and fix your code, then you need to have a good answer to how to test it. This includes, but is not limited to, a good set of regression tests that are comprehensible to normal developers (so we can fix them when they fail due to mainline change), and continuous testing in the form of buildbots or other infrastructure (so we can know when a patch breaks your backend).

* Up to date with mainline - All mainline backends must work with top-of-tree LLVM, all of the time. A candidate for inclusion must be developed at, or close to, mainline. In practice, that probably means updating at least once a week, possibly more.

* LLVM coding standards - While small deviations can be fixed after mainlining, gross violations of the LLVM code standards and conventions must be fixed prior to integration.

---

So, what would the community think of implementing such a system?

--Owen

On Jul 16, 2012, at 10:57 AM, Chandler Carruth <chandlerc at google.com> wrote:
> Tom, I have to ask that you revert this.
>
> As we discussed a long time ago, and as I explained in great detail to the Intel folks working on x32 support[1], we simply cannot accept really significant additions to the codebase without active, trusted maintainers who have an established track record contributing and maintaining LLVM's code. For the reasons why this is so important, I would read the x32 email.
>
> [1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120709/146241.html
>
> I can't emphasize this enough: established maintainers with a track record. I really do know how high a barrier to entry this is, it's excruciating. We've gone through this multiple times. However, without this, the project and the codebase simply cannot scale.
>
> Next, there is a further problem here: this patch went in without review. That is unacceptable for a contribution of this magnitude.
> I realize that in the past there may be examples where this rule has not been applied well, but that does not invalidate it or exempt you from it. I'm particularly frustrated because you *knew* this was a requirement and committed anyways.
>
> Finally, there are really deep problems with your contribution as posed. I'll touch on a few of them here, but by no means should you consider this an exhaustive list:
> 1) You must have an AsmPrinter. You must properly use and support the MC layer. This layer is no longer experimental or poorly supported; every new backend should be expected to implement proper MC support.
> 2) You need to write tests that follow the prevailing LLVM style. This includes a textual input and output, with FileCheck to manage detailed and robust assertions.
> 3) You need to consistently leverage the modern elements of the target independent code generator. I haven't done a deep study of this backend, but a cursory look indicates that you're not properly integrating with some of the latest additions to the target independent pipeline. Adding a new backend that doesn't support them greatly magnifies the cost of making changes to this common infrastructure.
> 4) You must bring the code up to the coding standards and style of LLVM. I don't know why people find this so challenging. Look at recently added LLVM code in the backend, look at the patterns and style it follows, and *exactly* replicate it. You're not even close, considering the patch contained 'or' instead of '||'.
> 5) High quality documentation about the backend, the target platform, and your plans here.
> 6) An active build bot to help developers without access to your platform debug issues.
>
> I realize that not every backend meets this bar. That doesn't imply that your backend doesn't need to meet this bar. We have to raise the quality bar if we're going to keep LLVM moving forward at a rapid pace. Currently, we aren't doing that, and it is costing the project a great deal.
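A note on point 4, for readers who have not met the 'or' spelling: standard C++ defines `or` as an alternative token for `||`, so code that uses it compiles and behaves identically. The objection in the review is about consistency with existing LLVM code, not correctness. The sketch below illustrates the equivalence; both function names are hypothetical and are not taken from the AMDGPU patch.

```cpp
#include <cassert>

// Hypothetical predicate written with the alternative token `or`.
// The compiler treats `or` exactly like `||`, including short-circuiting.
bool isZeroOrNegativeAlt(int Value) {
  return Value == 0 or Value < 0;
}

// The same predicate using `||`, the spelling the review asks for.
bool isZeroOrNegative(int Value) {
  return Value == 0 || Value < 0;
}
```

Because the two forms are interchangeable to the compiler, switching between them is a purely mechanical change, which is presumably why the review treats the mismatch as a sign that the surrounding code style was not studied.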
>
> ===
>
> On a separate note, I truly understand that getting a review for a patch of this magnitude is hard. You are not alone in wanting a review and not getting it. However, submitting without review does not solve anything; it merely takes more time from reviewers to deal with the problems in a rush, and makes every single reviewer less inclined to actually review your patch thoroughly.
>
> You need to specifically motivate people to review your patch. There is no other way you can get it into the tree. There are many ways to do this:
> 1) Make its code so excellent in quality and familiar in style to the potential reviewers that they actually enjoy it.
> 2) Work tirelessly on fixing and improving the core LLVM infrastructure so that potential reviewers are grateful and motivated to keep you active in the project.
> 3) Talk to developers in the community, showcase amazing things that this backend will do or let you do when it is in the tree.
> 4) More of 1, 2, and 3.
>
> The only magic I know of is to submit more patches. To submit so many patches that other developers simply cannot ignore your presence and will have to review your code. The currency of patches does actually work in this project, but you haven't yet invested enough.
>
> ===
>
> I truly hope you don't take this to mean that I (or others in the LLVM project) am uninterested in this backend eventually being in the tree. We are interested, but it's not ready yet. We need you (or others) to be much more active in maintaining things in LLVM. We need the quality of the code and implementation and testing to go up. We need it to go through proper code review. That is the context in which we are interested.
> > -Chandler > > > > > On Mon, Jul 16, 2012 at 7:17 AM, Tom Stellard <thomas.stellard at amd.com> wrote: > Author: tstellar > Date: Mon Jul 16 09:17:08 2012 > New Revision: 160270 > > URL: http://llvm.org/viewvc/llvm-project?rev=160270&view=rev > Log: > AMDGPU: Add core backend files for R600/SI codegen v6 > > Added: > llvm/trunk/lib/Target/AMDGPU/ > llvm/trunk/lib/Target/AMDGPU/AMDGPU.h > llvm/trunk/lib/Target/AMDGPU/AMDGPU.td > llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp > llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp > llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h > llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp > llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h > llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td > llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td > llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td > llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp > llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h > llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td > llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h > llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp > llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h > llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp > llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h > llvm/trunk/lib/Target/AMDGPU/AMDIL.h > llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp > llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h > llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp > llvm/trunk/lib/Target/AMDGPU/AMDILBase.td > llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILCallingConv.td > llvm/trunk/lib/Target/AMDGPU/AMDILCodeEmitter.h > llvm/trunk/lib/Target/AMDGPU/AMDILDevice.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILDevice.h > llvm/trunk/lib/Target/AMDGPU/AMDILDeviceInfo.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILDeviceInfo.h > llvm/trunk/lib/Target/AMDGPU/AMDILDevices.h > llvm/trunk/lib/Target/AMDGPU/AMDILEnumeratedTypes.td > llvm/trunk/lib/Target/AMDGPU/AMDILEvergreenDevice.cpp > 
llvm/trunk/lib/Target/AMDGPU/AMDILEvergreenDevice.h > llvm/trunk/lib/Target/AMDGPU/AMDILFormats.td > llvm/trunk/lib/Target/AMDGPU/AMDILFrameLowering.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILFrameLowering.h > llvm/trunk/lib/Target/AMDGPU/AMDILISelDAGToDAG.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILISelLowering.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILISelLowering.h > llvm/trunk/lib/Target/AMDGPU/AMDILInstrInfo.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILInstrInfo.h > llvm/trunk/lib/Target/AMDGPU/AMDILInstrInfo.td > llvm/trunk/lib/Target/AMDGPU/AMDILInstructions.td > llvm/trunk/lib/Target/AMDGPU/AMDILIntrinsicInfo.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILIntrinsicInfo.h > llvm/trunk/lib/Target/AMDGPU/AMDILIntrinsics.td > llvm/trunk/lib/Target/AMDGPU/AMDILMultiClass.td > llvm/trunk/lib/Target/AMDGPU/AMDILNIDevice.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILNIDevice.h > llvm/trunk/lib/Target/AMDGPU/AMDILNodes.td > llvm/trunk/lib/Target/AMDGPU/AMDILOperands.td > llvm/trunk/lib/Target/AMDGPU/AMDILPatterns.td > llvm/trunk/lib/Target/AMDGPU/AMDILPeepholeOptimizer.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILProfiles.td > llvm/trunk/lib/Target/AMDGPU/AMDILRegisterInfo.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILRegisterInfo.h > llvm/trunk/lib/Target/AMDGPU/AMDILRegisterInfo.td > llvm/trunk/lib/Target/AMDGPU/AMDILSIDevice.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILSIDevice.h > llvm/trunk/lib/Target/AMDGPU/AMDILSubtarget.cpp > llvm/trunk/lib/Target/AMDGPU/AMDILSubtarget.h > llvm/trunk/lib/Target/AMDGPU/AMDILTokenDesc.td > llvm/trunk/lib/Target/AMDGPU/AMDILUtilityFunctions.h > llvm/trunk/lib/Target/AMDGPU/AMDILVersion.td > llvm/trunk/lib/Target/AMDGPU/CMakeLists.txt > llvm/trunk/lib/Target/AMDGPU/GENERATED_FILES > llvm/trunk/lib/Target/AMDGPU/LLVMBuild.txt > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/ > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.h > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp > 
llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/LLVMBuild.txt > llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/Makefile > llvm/trunk/lib/Target/AMDGPU/Makefile > llvm/trunk/lib/Target/AMDGPU/Processors.td > llvm/trunk/lib/Target/AMDGPU/R600CodeEmitter.cpp > llvm/trunk/lib/Target/AMDGPU/R600GenRegisterInfo.pl > llvm/trunk/lib/Target/AMDGPU/R600HwRegInfo.include > llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp > llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.h > llvm/trunk/lib/Target/AMDGPU/R600InstrInfo.cpp > llvm/trunk/lib/Target/AMDGPU/R600InstrInfo.h > llvm/trunk/lib/Target/AMDGPU/R600Instructions.td > llvm/trunk/lib/Target/AMDGPU/R600Intrinsics.td > llvm/trunk/lib/Target/AMDGPU/R600KernelParameters.cpp > llvm/trunk/lib/Target/AMDGPU/R600MachineFunctionInfo.cpp > llvm/trunk/lib/Target/AMDGPU/R600MachineFunctionInfo.h > llvm/trunk/lib/Target/AMDGPU/R600RegisterInfo.cpp > llvm/trunk/lib/Target/AMDGPU/R600RegisterInfo.h > llvm/trunk/lib/Target/AMDGPU/R600RegisterInfo.td > llvm/trunk/lib/Target/AMDGPU/R600Schedule.td > llvm/trunk/lib/Target/AMDGPU/SIAssignInterpRegs.cpp > llvm/trunk/lib/Target/AMDGPU/SICodeEmitter.cpp > llvm/trunk/lib/Target/AMDGPU/SIGenRegisterInfo.pl > llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp > llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h > llvm/trunk/lib/Target/AMDGPU/SIInstrFormats.td > llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp > llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.h > llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.td > llvm/trunk/lib/Target/AMDGPU/SIInstructions.td > llvm/trunk/lib/Target/AMDGPU/SIIntrinsics.td > llvm/trunk/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp > llvm/trunk/lib/Target/AMDGPU/SIMachineFunctionInfo.h > llvm/trunk/lib/Target/AMDGPU/SIRegisterInfo.cpp > llvm/trunk/lib/Target/AMDGPU/SIRegisterInfo.h > llvm/trunk/lib/Target/AMDGPU/SIRegisterInfo.td > llvm/trunk/lib/Target/AMDGPU/SISchedule.td > 
llvm/trunk/lib/Target/AMDGPU/TargetInfo/ > llvm/trunk/lib/Target/AMDGPU/TargetInfo/AMDGPUTargetInfo.cpp > llvm/trunk/lib/Target/AMDGPU/TargetInfo/CMakeLists.txt > llvm/trunk/lib/Target/AMDGPU/TargetInfo/LLVMBuild.txt > llvm/trunk/lib/Target/AMDGPU/TargetInfo/Makefile > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPU.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPU.h?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPU.h (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPU.h Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,35 @@ > +//===-- AMDGPU.h - MachineFunction passes hw codegen --------------*- C++ -*-=// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > + > +#ifndef AMDGPU_H > +#define AMDGPU_H > + > +#include "AMDGPUTargetMachine.h" > +#include "llvm/Support/TargetRegistry.h" > +#include "llvm/Target/TargetMachine.h" > + > +namespace llvm { > + > +class FunctionPass; > +class AMDGPUTargetMachine; > + > +// R600 Passes > +FunctionPass* createR600KernelParametersPass(const TargetData* TD); > +FunctionPass *createR600CodeEmitterPass(formatted_raw_ostream &OS); > + > +// SI Passes > +FunctionPass *createSIAssignInterpRegsPass(TargetMachine &tm); > +FunctionPass *createSICodeEmitterPass(formatted_raw_ostream &OS); > + > +// Passes common to R600 and SI > +FunctionPass *createAMDGPUConvertToISAPass(TargetMachine &tm); > + > +} // End namespace llvm > + > +#endif // AMDGPU_H > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPU.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPU.td?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPU.td (added) 
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPU.td Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,21 @@ > +//===-- AMDIL.td - AMDIL Tablegen files --*- tablegen -*-------------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//==-----------------------------------------------------------------------===// > + > +// Include AMDIL TD files > +include "AMDILBase.td" > +include "AMDILVersion.td" > + > +// Include AMDGPU TD files > +include "R600Schedule.td" > +include "SISchedule.td" > +include "Processors.td" > +include "AMDGPUInstrInfo.td" > +include "AMDGPUIntrinsics.td" > +include "AMDGPURegisterInfo.td" > +include "AMDGPUInstructions.td" > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,63 @@ > +//===-- AMDGPUConvertToISA.cpp - Lower AMDIL to HW ISA --------------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This pass lowers AMDIL machine instructions to the appropriate hardware > +// instructions. 
> +// > +//===----------------------------------------------------------------------===// > + > +#include "AMDGPU.h" > +#include "AMDGPUInstrInfo.h" > +#include "llvm/CodeGen/MachineFunctionPass.h" > + > +#include <stdio.h> > +using namespace llvm; > + > +namespace { > + > +class AMDGPUConvertToISAPass : public MachineFunctionPass { > + > +private: > + static char ID; > + TargetMachine &TM; > + > +public: > + AMDGPUConvertToISAPass(TargetMachine &tm) : > + MachineFunctionPass(ID), TM(tm) { } > + > + virtual bool runOnMachineFunction(MachineFunction &MF); > + > + virtual const char *getPassName() const {return "AMDGPU Convert to ISA";} > + > +}; > + > +} // End anonymous namespace > + > +char AMDGPUConvertToISAPass::ID = 0; > + > +FunctionPass *llvm::createAMDGPUConvertToISAPass(TargetMachine &tm) { > + return new AMDGPUConvertToISAPass(tm); > +} > + > +bool AMDGPUConvertToISAPass::runOnMachineFunction(MachineFunction &MF) > +{ > + const AMDGPUInstrInfo * TII = > + static_cast<const AMDGPUInstrInfo*>(TM.getInstrInfo()); > + > + for (MachineFunction::iterator BB = MF.begin(), BB_E = MF.end(); > + BB != BB_E; ++BB) { > + MachineBasicBlock &MBB = *BB; > + for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end(); > + I != E; ++I) { > + MachineInstr &MI = *I; > + TII->convertToISA(MI, MF, MBB.findDebugLoc(I)); > + } > + } > + return false; > +} > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,393 @@ > +//===-- AMDGPUISelLowering.cpp - AMDGPU Common DAG lowering functions -----===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open 
Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This is the parent TargetLowering class for hardware code gen targets. > +// > +//===----------------------------------------------------------------------===// > + > +#include "AMDGPUISelLowering.h" > +#include "AMDILIntrinsicInfo.h" > +#include "AMDGPUUtil.h" > +#include "llvm/CodeGen/MachineRegisterInfo.h" > + > +using namespace llvm; > + > +AMDGPUTargetLowering::AMDGPUTargetLowering(TargetMachine &TM) : > + AMDILTargetLowering(TM) > +{ > + // We need to custom lower some of the intrinsics > + setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom); > + > + setOperationAction(ISD::SELECT_CC, MVT::f32, Custom); > + setOperationAction(ISD::SELECT_CC, MVT::i32, Custom); > + > + // Library functions. These default to Expand, but we have instructions > + // for them. > + setOperationAction(ISD::FCEIL, MVT::f32, Legal); > + setOperationAction(ISD::FEXP2, MVT::f32, Legal); > + setOperationAction(ISD::FRINT, MVT::f32, Legal); > + > + setOperationAction(ISD::UDIV, MVT::i32, Expand); > + setOperationAction(ISD::UDIVREM, MVT::i32, Custom); > + setOperationAction(ISD::UREM, MVT::i32, Expand); > +} > + > +SDValue AMDGPUTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) > + const > +{ > + switch (Op.getOpcode()) { > + default: return AMDILTargetLowering::LowerOperation(Op, DAG); > + case ISD::INTRINSIC_WO_CHAIN: return LowerINTRINSIC_WO_CHAIN(Op, DAG); > + case ISD::SELECT_CC: return LowerSELECT_CC(Op, DAG); > + case ISD::UDIVREM: return LowerUDIVREM(Op, DAG); > + } > +} > + > +SDValue AMDGPUTargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op, > + SelectionDAG &DAG) const > +{ > + unsigned IntrinsicID = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue(); > + DebugLoc DL = Op.getDebugLoc(); > + EVT VT = Op.getValueType(); > + > + switch (IntrinsicID) { > + default: return Op; > + case 
AMDGPUIntrinsic::AMDIL_abs: > + return LowerIntrinsicIABS(Op, DAG); > + case AMDGPUIntrinsic::AMDIL_exp: > + return DAG.getNode(ISD::FEXP2, DL, VT, Op.getOperand(1)); > + case AMDGPUIntrinsic::AMDIL_fabs: > + return DAG.getNode(ISD::FABS, DL, VT, Op.getOperand(1)); > + case AMDGPUIntrinsic::AMDGPU_lrp: > + return LowerIntrinsicLRP(Op, DAG); > + case AMDGPUIntrinsic::AMDIL_fraction: > + return DAG.getNode(AMDGPUISD::FRACT, DL, VT, Op.getOperand(1)); > + case AMDGPUIntrinsic::AMDIL_mad: > + return DAG.getNode(AMDILISD::MAD, DL, VT, Op.getOperand(1), > + Op.getOperand(2), Op.getOperand(3)); > + case AMDGPUIntrinsic::AMDIL_max: > + return DAG.getNode(AMDGPUISD::FMAX, DL, VT, Op.getOperand(1), > + Op.getOperand(2)); > + case AMDGPUIntrinsic::AMDGPU_imax: > + return DAG.getNode(AMDGPUISD::SMAX, DL, VT, Op.getOperand(1), > + Op.getOperand(2)); > + case AMDGPUIntrinsic::AMDGPU_umax: > + return DAG.getNode(AMDGPUISD::UMAX, DL, VT, Op.getOperand(1), > + Op.getOperand(2)); > + case AMDGPUIntrinsic::AMDIL_min: > + return DAG.getNode(AMDGPUISD::FMIN, DL, VT, Op.getOperand(1), > + Op.getOperand(2)); > + case AMDGPUIntrinsic::AMDGPU_imin: > + return DAG.getNode(AMDGPUISD::SMIN, DL, VT, Op.getOperand(1), > + Op.getOperand(2)); > + case AMDGPUIntrinsic::AMDGPU_umin: > + return DAG.getNode(AMDGPUISD::UMIN, DL, VT, Op.getOperand(1), > + Op.getOperand(2)); > + case AMDGPUIntrinsic::AMDIL_round_nearest: > + return DAG.getNode(ISD::FRINT, DL, VT, Op.getOperand(1)); > + case AMDGPUIntrinsic::AMDIL_round_posinf: > + return DAG.getNode(ISD::FCEIL, DL, VT, Op.getOperand(1)); > + } > +} > + > +///IABS(a) = SMAX(sub(0, a), a) > +SDValue AMDGPUTargetLowering::LowerIntrinsicIABS(SDValue Op, > + SelectionDAG &DAG) const > +{ > + > + DebugLoc DL = Op.getDebugLoc(); > + EVT VT = Op.getValueType(); > + SDValue Neg = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, VT), > + Op.getOperand(1)); > + > + return DAG.getNode(AMDGPUISD::SMAX, DL, VT, Neg, Op.getOperand(1)); > +} > + > +/// Linear 
Interpolation > +/// LRP(a, b, c) = muladd(a, b, (1 - a) * c) > +SDValue AMDGPUTargetLowering::LowerIntrinsicLRP(SDValue Op, > + SelectionDAG &DAG) const > +{ > + DebugLoc DL = Op.getDebugLoc(); > + EVT VT = Op.getValueType(); > + SDValue OneSubA = DAG.getNode(ISD::FSUB, DL, VT, > + DAG.getConstantFP(1.0f, MVT::f32), > + Op.getOperand(1)); > + SDValue OneSubAC = DAG.getNode(ISD::FMUL, DL, VT, OneSubA, > + Op.getOperand(3)); > + return DAG.getNode(AMDILISD::MAD, DL, VT, Op.getOperand(1), > + Op.getOperand(2), > + OneSubAC); > +} > + > +SDValue AMDGPUTargetLowering::LowerSELECT_CC(SDValue Op, > + SelectionDAG &DAG) const > +{ > + DebugLoc DL = Op.getDebugLoc(); > + EVT VT = Op.getValueType(); > + > + SDValue LHS = Op.getOperand(0); > + SDValue RHS = Op.getOperand(1); > + SDValue True = Op.getOperand(2); > + SDValue False = Op.getOperand(3); > + SDValue CC = Op.getOperand(4); > + ISD::CondCode CCOpcode = cast<CondCodeSDNode>(CC)->get(); > + SDValue Temp; > + > + // LHS and RHS are guaranteed to be the same value type > + EVT CompareVT = LHS.getValueType(); > + > + // We need all the operands of SELECT_CC to have the same value type, so if > + // necessary we need to convert LHS and RHS to be the same type True and > + // False. True and False are guaranteed to have the same type as this > + // SELECT_CC node. > + > + if (CompareVT != VT) { > + ISD::NodeType ConversionOp = ISD::DELETED_NODE; > + if (VT == MVT::f32 && CompareVT == MVT::i32) { > + if (isUnsignedIntSetCC(CCOpcode)) { > + ConversionOp = ISD::UINT_TO_FP; > + } else { > + ConversionOp = ISD::SINT_TO_FP; > + } > + } else if (VT == MVT::i32 && CompareVT == MVT::f32) { > + ConversionOp = ISD::FP_TO_SINT; > + } else { > + // I don't think there will be any other type pairings. 
> + assert(!"Unhandled operand type parings in SELECT_CC"); > + } > + // XXX Check the value of LHS and RHS and avoid creating sequences like > + // (FTOI (ITOF)) > + LHS = DAG.getNode(ConversionOp, DL, VT, LHS); > + RHS = DAG.getNode(ConversionOp, DL, VT, RHS); > + } > + > + // If True is a hardware TRUE value and False is a hardware FALSE value or > + // vice-versa we can handle this with a native instruction (SET* instructions). > + if ((isHWTrueValue(True) && isHWFalseValue(False))) { > + return DAG.getNode(ISD::SELECT_CC, DL, VT, LHS, RHS, True, False, CC); > + } > + > + // XXX If True is a hardware TRUE value and False is a hardware FALSE value, > + // we can handle this with a native instruction, but we need to swap true > + // and false and change the conditional. > + if (isHWTrueValue(False) && isHWFalseValue(True)) { > + } > + > + // XXX Check if we can lower this to a SELECT or if it is supported by a native > + // operation. (The code below does this but we don't have the Instruction > + // selection patterns to do this yet. > +#if 0 > + if (isZero(LHS) || isZero(RHS)) { > + SDValue Cond = (isZero(LHS) ? RHS : LHS); > + bool SwapTF = false; > + switch (CCOpcode) { > + case ISD::SETOEQ: > + case ISD::SETUEQ: > + case ISD::SETEQ: > + SwapTF = true; > + // Fall through > + case ISD::SETONE: > + case ISD::SETUNE: > + case ISD::SETNE: > + // We can lower to select > + if (SwapTF) { > + Temp = True; > + True = False; > + False = Temp; > + } > + // CNDE > + return DAG.getNode(ISD::SELECT, DL, VT, Cond, True, False); > + default: > + // Supported by a native operation (CNDGE, CNDGT) > + return DAG.getNode(ISD::SELECT_CC, DL, VT, LHS, RHS, True, False, CC); > + } > + } > +#endif > + > + // If we make it this for it means we have no native instructions to handle > + // this SELECT_CC, so we must lower it. 
> + SDValue HWTrue, HWFalse; > + > + if (VT == MVT::f32) { > + HWTrue = DAG.getConstantFP(1.0f, VT); > + HWFalse = DAG.getConstantFP(0.0f, VT); > + } else if (VT == MVT::i32) { > + HWTrue = DAG.getConstant(-1, VT); > + HWFalse = DAG.getConstant(0, VT); > + } > + else { > + assert(!"Unhandled value type in LowerSELECT_CC"); > + } > + > + // Lower this unsupported SELECT_CC into a combination of two supported > + // SELECT_CC operations. > + SDValue Cond = DAG.getNode(ISD::SELECT_CC, DL, VT, LHS, RHS, HWTrue, HWFalse, CC); > + > + return DAG.getNode(ISD::SELECT, DL, VT, Cond, True, False); > +} > + > + > +SDValue AMDGPUTargetLowering::LowerUDIVREM(SDValue Op, > + SelectionDAG &DAG) const > +{ > + DebugLoc DL = Op.getDebugLoc(); > + EVT VT = Op.getValueType(); > + > + SDValue Num = Op.getOperand(0); > + SDValue Den = Op.getOperand(1); > + > + SmallVector<SDValue, 8> Results; > + > + // RCP = URECIP(Den) = 2^32 / Den + e > + // e is rounding error. > + SDValue RCP = DAG.getNode(AMDGPUISD::URECIP, DL, VT, Den); > + > + // RCP_LO = umulo(RCP, Den) */ > + SDValue RCP_LO = DAG.getNode(ISD::UMULO, DL, VT, RCP, Den); > + > + // RCP_HI = mulhu (RCP, Den) */ > + SDValue RCP_HI = DAG.getNode(ISD::MULHU, DL, VT, RCP, Den); > + > + // NEG_RCP_LO = -RCP_LO > + SDValue NEG_RCP_LO = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, VT), > + RCP_LO); > + > + // ABS_RCP_LO = (RCP_HI == 0 ? NEG_RCP_LO : RCP_LO) > + SDValue ABS_RCP_LO = DAG.getSelectCC(DL, RCP_HI, DAG.getConstant(0, VT), > + NEG_RCP_LO, RCP_LO, > + ISD::SETEQ); > + // Calculate the rounding error from the URECIP instruction > + // E = mulhu(ABS_RCP_LO, RCP) > + SDValue E = DAG.getNode(ISD::MULHU, DL, VT, ABS_RCP_LO, RCP); > + > + // RCP_A_E = RCP + E > + SDValue RCP_A_E = DAG.getNode(ISD::ADD, DL, VT, RCP, E); > + > + // RCP_S_E = RCP - E > + SDValue RCP_S_E = DAG.getNode(ISD::SUB, DL, VT, RCP, E); > + > + // Tmp0 = (RCP_HI == 0 ? 
RCP_A_E : RCP_SUB_E) > + SDValue Tmp0 = DAG.getSelectCC(DL, RCP_HI, DAG.getConstant(0, VT), > + RCP_A_E, RCP_S_E, > + ISD::SETEQ); > + // Quotient = mulhu(Tmp0, Num) > + SDValue Quotient = DAG.getNode(ISD::MULHU, DL, VT, Tmp0, Num); > + > + // Num_S_Remainder = Quotient * Den > + SDValue Num_S_Remainder = DAG.getNode(ISD::UMULO, DL, VT, Quotient, Den); > + > + // Remainder = Num - Num_S_Remainder > + SDValue Remainder = DAG.getNode(ISD::SUB, DL, VT, Num, Num_S_Remainder); > + > + // Remainder_GE_Den = (Remainder >= Den ? -1 : 0) > + SDValue Remainder_GE_Den = DAG.getSelectCC(DL, Remainder, Den, > + DAG.getConstant(-1, VT), > + DAG.getConstant(0, VT), > + ISD::SETGE); > + // Remainder_GE_Zero = (Remainder >= 0 ? -1 : 0) > + SDValue Remainder_GE_Zero = DAG.getSelectCC(DL, Remainder, > + DAG.getConstant(0, VT), > + DAG.getConstant(-1, VT), > + DAG.getConstant(0, VT), > + ISD::SETGE); > + // Tmp1 = Remainder_GE_Den & Remainder_GE_Zero > + SDValue Tmp1 = DAG.getNode(ISD::AND, DL, VT, Remainder_GE_Den, > + Remainder_GE_Zero); > + > + // Calculate Division result: > + > + // Quotient_A_One = Quotient + 1 > + SDValue Quotient_A_One = DAG.getNode(ISD::ADD, DL, VT, Quotient, > + DAG.getConstant(1, VT)); > + > + // Quotient_S_One = Quotient - 1 > + SDValue Quotient_S_One = DAG.getNode(ISD::SUB, DL, VT, Quotient, > + DAG.getConstant(1, VT)); > + > + // Div = (Tmp1 == 0 ? Quotient : Quotient_A_One) > + SDValue Div = DAG.getSelectCC(DL, Tmp1, DAG.getConstant(0, VT), > + Quotient, Quotient_A_One, ISD::SETEQ); > + > + // Div = (Remainder_GE_Zero == 0 ? 
Quotient_S_One : Div) > + Div = DAG.getSelectCC(DL, Remainder_GE_Zero, DAG.getConstant(0, VT), > + Quotient_S_One, Div, ISD::SETEQ); > + > + // Calculate Rem result: > + > + // Remainder_S_Den = Remainder - Den > + SDValue Remainder_S_Den = DAG.getNode(ISD::SUB, DL, VT, Remainder, Den); > + > + // Remainder_A_Den = Remainder + Den > + SDValue Remainder_A_Den = DAG.getNode(ISD::ADD, DL, VT, Remainder, Den); > + > + // Rem = (Tmp1 == 0 ? Remainder : Remainder_S_Den) > + SDValue Rem = DAG.getSelectCC(DL, Tmp1, DAG.getConstant(0, VT), > + Remainder, Remainder_S_Den, ISD::SETEQ); > + > + // Rem = (Remainder_GE_Zero == 0 ? Remainder_A_Den : Rem) > + Rem = DAG.getSelectCC(DL, Remainder_GE_Zero, DAG.getConstant(0, VT), > + Remainder_A_Den, Rem, ISD::SETEQ); > + > + DAG.ReplaceAllUsesWith(Op.getValue(0).getNode(), &Div); > + DAG.ReplaceAllUsesWith(Op.getValue(1).getNode(), &Rem); > + > + return Op; > +} > + > +//===----------------------------------------------------------------------===// > +// Helper functions > +//===----------------------------------------------------------------------===// > + > +bool AMDGPUTargetLowering::isHWTrueValue(SDValue Op) const > +{ > + if (ConstantFPSDNode * CFP = dyn_cast<ConstantFPSDNode>(Op)) { > + return CFP->isExactlyValue(1.0); > + } > + if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op)) { > + return C->isAllOnesValue(); > + } > + return false; > +} > + > +bool AMDGPUTargetLowering::isHWFalseValue(SDValue Op) const > +{ > + if (ConstantFPSDNode * CFP = dyn_cast<ConstantFPSDNode>(Op)) { > + return CFP->getValueAPF().isZero(); > + } > + if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op)) { > + return C->isNullValue(); > + } > + return false; > +} > + > +void AMDGPUTargetLowering::addLiveIn(MachineInstr * MI, > + MachineFunction * MF, MachineRegisterInfo & MRI, > + const TargetInstrInfo * TII, unsigned reg) const > +{ > + AMDGPU::utilAddLiveIn(MF, MRI, TII, reg, MI->getOperand(0).getReg()); > +} > + > +#define NODE_NAME_CASE(node) 
case AMDGPUISD::node: return #node;
> +
> +const char* AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const
> +{
> +  switch (Opcode) {
> +  default: return AMDILTargetLowering::getTargetNodeName(Opcode);
> +
> +  NODE_NAME_CASE(FRACT)
> +  NODE_NAME_CASE(FMAX)
> +  NODE_NAME_CASE(SMAX)
> +  NODE_NAME_CASE(UMAX)
> +  NODE_NAME_CASE(FMIN)
> +  NODE_NAME_CASE(SMIN)
> +  NODE_NAME_CASE(UMIN)
> +  NODE_NAME_CASE(URECIP)
> +  }
> +}
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h?rev=160270&view=auto
> =============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,77 @@
> +//===-- AMDGPUISelLowering.h - AMDGPU Lowering Interface --------*- C++ -*-===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains the interface defintiion of the TargetLowering class
> +// that is common to all AMD GPUs.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPUISELLOWERING_H
> +#define AMDGPUISELLOWERING_H
> +
> +#include "AMDILISelLowering.h"
> +
> +namespace llvm {
> +
> +class AMDGPUTargetLowering : public AMDILTargetLowering
> +{
> +private:
> +  SDValue LowerINTRINSIC_WO_CHAIN(SDValue Op, SelectionDAG &DAG) const;
> +  SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;
> +  SDValue LowerUDIVREM(SDValue Op, SelectionDAG &DAG) const;
> +
> +protected:
> +
> +  /// addLiveIn - This functions adds reg to the live in list of the entry block
> +  /// and emits a copy from reg to MI.getOperand(0).
> + /// > + // Some registers are loaded with values before the program > + /// begins to execute. The loading of these values is modeled with pseudo > + /// instructions which are lowered using this function. > + void addLiveIn(MachineInstr * MI, MachineFunction * MF, > + MachineRegisterInfo & MRI, const TargetInstrInfo * TII, > + unsigned reg) const; > + > + bool isHWTrueValue(SDValue Op) const; > + bool isHWFalseValue(SDValue Op) const; > + > +public: > + AMDGPUTargetLowering(TargetMachine &TM); > + > + virtual SDValue LowerOperation(SDValue Op, SelectionDAG &DAG) const; > + SDValue LowerIntrinsicIABS(SDValue Op, SelectionDAG &DAG) const; > + SDValue LowerIntrinsicLRP(SDValue Op, SelectionDAG &DAG) const; > + virtual const char* getTargetNodeName(unsigned Opcode) const; > + > +}; > + > +namespace AMDGPUISD > +{ > + > +enum > +{ > + AMDGPU_FIRST = AMDILISD::LAST_ISD_NUMBER, > + BITALIGN, > + FRACT, > + FMAX, > + SMAX, > + UMAX, > + FMIN, > + SMIN, > + UMIN, > + URECIP, > + LAST_AMDGPU_ISD_NUMBER > +}; > + > + > +} // End namespace AMDGPUISD > + > +} // End namespace llvm > + > +#endif // AMDGPUISELLOWERING_H > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,46 @@ > +//===-- AMDGPUInstrInfo.cpp - Base class for AMD GPU InstrInfo ------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. 
> +// > +//===----------------------------------------------------------------------===// > +// > +// This file contains the implementation of the TargetInstrInfo class that is > +// common to all AMD GPUs. > +// > +//===----------------------------------------------------------------------===// > + > +#include "AMDGPUInstrInfo.h" > +#include "AMDGPURegisterInfo.h" > +#include "AMDGPUTargetMachine.h" > +#include "AMDIL.h" > +#include "llvm/CodeGen/MachineRegisterInfo.h" > + > +using namespace llvm; > + > +AMDGPUInstrInfo::AMDGPUInstrInfo(AMDGPUTargetMachine &tm) > + : AMDILInstrInfo(tm) { } > + > +void AMDGPUInstrInfo::convertToISA(MachineInstr & MI, MachineFunction &MF, > + DebugLoc DL) const > +{ > + MachineRegisterInfo &MRI = MF.getRegInfo(); > + const AMDGPURegisterInfo & RI = getRegisterInfo(); > + > + for (unsigned i = 0; i < MI.getNumOperands(); i++) { > + MachineOperand &MO = MI.getOperand(i); > + // Convert dst regclass to one that is supported by the ISA > + if (MO.isReg() && MO.isDef()) { > + if (TargetRegisterInfo::isVirtualRegister(MO.getReg())) { > + const TargetRegisterClass * oldRegClass = MRI.getRegClass(MO.getReg()); > + const TargetRegisterClass * newRegClass = RI.getISARegClass(oldRegClass); > + > + assert(newRegClass); > + > + MRI.setRegClass(MO.getReg(), newRegClass); > + } > + } > + } > +} > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,46 @@ > +//===-- AMDGPUInstrInfo.h - AMDGPU Instruction Information ------*- C++ -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. 
> +// > +//===----------------------------------------------------------------------===// > +// > +// This file contains the definition of a TargetInstrInfo class that is common > +// to all AMD GPUs. > +// > +//===----------------------------------------------------------------------===// > + > +#ifndef AMDGPUINSTRUCTIONINFO_H_ > +#define AMDGPUINSTRUCTIONINFO_H_ > + > +#include "AMDGPURegisterInfo.h" > +#include "AMDILInstrInfo.h" > + > +#include <map> > + > +namespace llvm { > + > +class AMDGPUTargetMachine; > +class MachineFunction; > +class MachineInstr; > +class MachineInstrBuilder; > + > +class AMDGPUInstrInfo : public AMDILInstrInfo { > + > +public: > + explicit AMDGPUInstrInfo(AMDGPUTargetMachine &tm); > + > + virtual const AMDGPURegisterInfo &getRegisterInfo() const = 0; > + > + /// convertToISA - Convert the AMDIL MachineInstr to a supported ISA > + /// MachineInstr > + virtual void convertToISA(MachineInstr & MI, MachineFunction &MF, > + DebugLoc DL) const; > + > +}; > + > +} // End llvm namespace > + > +#endif // AMDGPUINSTRINFO_H_ > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,69 @@ > +//===-- AMDGPUInstrInfo.td - AMDGPU DAG nodes --------------*- tablegen -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This file contains DAG node defintions for the AMDGPU target. 
> +// > +//===----------------------------------------------------------------------===// > + > +//===----------------------------------------------------------------------===// > +// AMDGPU DAG Profiles > +//===----------------------------------------------------------------------===// > + > +def AMDGPUDTIntTernaryOp : SDTypeProfile<1, 3, [ > + SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisInt<0>, SDTCisInt<3> > +]>; > + > +//===----------------------------------------------------------------------===// > +// AMDGPU DAG Nodes > +// > + > +// out = ((a << 32) | b) >> c) > +// > +// Can be used to optimize rtol: > +// rotl(a, b) = bitalign(a, a, 32 - b) > +def AMDGPUbitalign : SDNode<"AMDGPUISD::BITALIGN", AMDGPUDTIntTernaryOp>; > + > +// out = a - floor(a) > +def AMDGPUfract : SDNode<"AMDGPUISD::FRACT", SDTFPUnaryOp>; > + > +// out = max(a, b) a and b are floats > +def AMDGPUfmax : SDNode<"AMDGPUISD::FMAX", SDTFPBinOp, > + [SDNPCommutative, SDNPAssociative] > +>; > + > +// out = max(a, b) a and b are signed ints > +def AMDGPUsmax : SDNode<"AMDGPUISD::SMAX", SDTIntBinOp, > + [SDNPCommutative, SDNPAssociative] > +>; > + > +// out = max(a, b) a and b are unsigned ints > +def AMDGPUumax : SDNode<"AMDGPUISD::UMAX", SDTIntBinOp, > + [SDNPCommutative, SDNPAssociative] > +>; > + > +// out = min(a, b) a and b are floats > +def AMDGPUfmin : SDNode<"AMDGPUISD::FMIN", SDTFPBinOp, > + [SDNPCommutative, SDNPAssociative] > +>; > + > +// out = min(a, b) a snd b are signed ints > +def AMDGPUsmin : SDNode<"AMDGPUISD::SMIN", SDTIntBinOp, > + [SDNPCommutative, SDNPAssociative] > +>; > + > +// out = min(a, b) a and b are unsigned ints > +def AMDGPUumin : SDNode<"AMDGPUISD::UMIN", SDTIntBinOp, > + [SDNPCommutative, SDNPAssociative] > +>; > + > +// urecip - This operation is a helper for integer division, it returns the > +// result of 1 / a as a fractional unsigned integer. 
> +// out = (2^32 / a) + e > +// e is rounding error > +def AMDGPUurecip : SDNode<"AMDGPUISD::URECIP", SDTIntUnaryOp>; > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,123 @@ > +//===-- AMDGPUInstructions.td - Common instruction defs ---*- tablegen -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This file contains instruction defs that are common to all hw codegen > +// targets. > +// > +//===----------------------------------------------------------------------===// > + > +class AMDGPUInst <dag outs, dag ins, string asm, list<dag> pattern> : Instruction { > + field bits<16> AMDILOp = 0; > + field bits<3> Gen = 0; > + > + let Namespace = "AMDGPU"; > + let OutOperandList = outs; > + let InOperandList = ins; > + let AsmString = asm; > + let Pattern = pattern; > + let Itinerary = NullALU; > + let TSFlags{42-40} = Gen; > + let TSFlags{63-48} = AMDILOp; > +} > + > +class AMDGPUShaderInst <dag outs, dag ins, string asm, list<dag> pattern> > + : AMDGPUInst<outs, ins, asm, pattern> { > + > + field bits<32> Inst = 0xffffffff; > + > +} > + > +class Constants { > +int TWO_PI = 0x40c90fdb; > +int PI = 0x40490fdb; > +int TWO_PI_INV = 0x3e22f983; > +} > +def CONST : Constants; > + > +def FP_ZERO : PatLeaf < > + (fpimm), > + [{return N->getValueAPF().isZero();}] > +>; > + > +def FP_ONE : PatLeaf < > + (fpimm), > + [{return N->isExactlyValue(1.0);}] > +>; > + > +let isCodeGenOnly = 1, 
isPseudo = 1, usesCustomInserter = 1 in {
> +
> +class CLAMP <RegisterClass rc> : AMDGPUShaderInst <
> +  (outs rc:$dst),
> +  (ins rc:$src0),
> +  "CLAMP $dst, $src0",
> +  [(set rc:$dst, (int_AMDIL_clamp rc:$src0, (f32 FP_ZERO), (f32 FP_ONE)))]
> +>;
> +
> +class FABS <RegisterClass rc> : AMDGPUShaderInst <
> +  (outs rc:$dst),
> +  (ins rc:$src0),
> +  "FABS $dst, $src0",
> +  [(set rc:$dst, (fabs rc:$src0))]
> +>;
> +
> +class FNEG <RegisterClass rc> : AMDGPUShaderInst <
> +  (outs rc:$dst),
> +  (ins rc:$src0),
> +  "FNEG $dst, $src0",
> +  [(set rc:$dst, (fneg rc:$src0))]
> +>;
> +
> +} // End isCodeGenOnly = 1, isPseudo = 1, hasCustomInserter = 1
> +
> +/* Generic helper patterns for intrinsics */
> +/* -------------------------------------- */
> +
> +class POW_Common <AMDGPUInst log_ieee, AMDGPUInst exp_ieee, AMDGPUInst mul,
> +                  RegisterClass rc> : Pat <
> +  (int_AMDGPU_pow rc:$src0, rc:$src1),
> +  (exp_ieee (mul rc:$src1, (log_ieee rc:$src0)))
> +>;
> +
> +/* Other helper patterns */
> +/* --------------------- */
> +
> +/* Extract element pattern */
> +class Extract_Element <ValueType sub_type, ValueType vec_type,
> +                       RegisterClass vec_class, int sub_idx,
> +                       SubRegIndex sub_reg>: Pat<
> +  (sub_type (vector_extract (vec_type vec_class:$src), sub_idx)),
> +  (EXTRACT_SUBREG vec_class:$src, sub_reg)
> +>;
> +
> +/* Insert element pattern */
> +class Insert_Element <ValueType elem_type, ValueType vec_type,
> +                      RegisterClass elem_class, RegisterClass vec_class,
> +                      int sub_idx, SubRegIndex sub_reg> : Pat <
> +
> +  (vec_type (vector_insert (vec_type vec_class:$vec),
> +                           (elem_type elem_class:$elem), sub_idx)),
> +  (INSERT_SUBREG vec_class:$vec, elem_class:$elem, sub_reg)
> +>;
> +
> +// Vector Build pattern
> +class Vector_Build <ValueType vecType, RegisterClass elemClass> : Pat <
> +  (IL_vbuild elemClass:$src),
> +  (INSERT_SUBREG (vecType (IMPLICIT_DEF)), elemClass:$src, sel_x)
> +>;
> +
> +// bitconvert pattern
> +class BitConvert <ValueType dt, ValueType st,
RegisterClass rc> : Pat < > + (dt (bitconvert (st rc:$src0))), > + (dt rc:$src0) > +>; > + > +include "R600Instructions.td" > + > +include "SIInstrInfo.td" > + > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,64 @@ > +//===-- AMDGPUIntrinsics.td - Common intrinsics -*- tablegen -*-----------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This file defines intrinsics that are used by all hw codegen targets. > +// > +//===----------------------------------------------------------------------===// > + > +let TargetPrefix = "AMDGPU", isTarget = 1 in { > + > + def int_AMDGPU_load_const : Intrinsic<[llvm_float_ty], [llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_load_imm : Intrinsic<[llvm_v4f32_ty], [llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_reserve_reg : Intrinsic<[], [llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_store_output : Intrinsic<[], [llvm_float_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_swizzle : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty], [IntrNoMem]>; > + > + def int_AMDGPU_arl : Intrinsic<[llvm_i32_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_cndlt : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_cos : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_div : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_dp4 : 
Intrinsic<[llvm_float_ty], [llvm_v4f32_ty, llvm_v4f32_ty], [IntrNoMem]>; > + def int_AMDGPU_floor : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_kill : Intrinsic<[], [llvm_float_ty], []>; > + def int_AMDGPU_kilp : Intrinsic<[], [], []>; > + def int_AMDGPU_lrp : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_mul : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_pow : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_rcp : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_rsq : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_seq : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_sgt : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_sge : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_sin : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_sle : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_sne : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_ssg : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_mullit : Intrinsic<[llvm_v4f32_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_tex : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_txb : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_txf : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_txq : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def 
int_AMDGPU_txd : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_txl : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_trunc : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>; > + def int_AMDGPU_ddx : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_ddy : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_imax : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_imin : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_umax : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_umin : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>; > + def int_AMDGPU_cube : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty], [IntrNoMem]>; > +} > + > +let TargetPrefix = "TGSI", isTarget = 1 in { > + > + def int_TGSI_lit_z : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty],[]>; > +} > + > +include "SIIntrinsics.td" > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,24 @@ > +//===-- AMDGPURegisterInfo.cpp - AMDGPU Register Information -------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. 
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Parent TargetRegisterInfo class common to all hw codegen targets.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPURegisterInfo.h"
> +#include "AMDGPUTargetMachine.h"
> +
> +using namespace llvm;
> +
> +AMDGPURegisterInfo::AMDGPURegisterInfo(AMDGPUTargetMachine &tm,
> +    const TargetInstrInfo &tii)
> +: AMDILRegisterInfo(tm, tii),
> +  TM(tm),
> +  TII(tii)
> +  { }
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h?rev=160270&view=auto
> =============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,42 @@
> +//===-- AMDGPURegisterInfo.h - AMDGPURegisterInfo Interface -*- C++ -*-----===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains the TargetRegisterInfo interface that is implemented
> +// by all hw codegen targets.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPUREGISTERINFO_H_
> +#define AMDGPUREGISTERINFO_H_
> +
> +#include "AMDILRegisterInfo.h"
> +
> +namespace llvm {
> +
> +class AMDGPUTargetMachine;
> +class TargetInstrInfo;
> +
> +struct AMDGPURegisterInfo : public AMDILRegisterInfo
> +{
> +  AMDGPUTargetMachine &TM;
> +  const TargetInstrInfo &TII;
> +
> +  AMDGPURegisterInfo(AMDGPUTargetMachine &tm, const TargetInstrInfo &tii);
> +
> +  virtual BitVector getReservedRegs(const MachineFunction &MF) const = 0;
> +
> +  /// getISARegClass - rc is an AMDIL reg class. This function returns the
> +  /// ISA reg class that is equivalent to the given AMDIL reg class.
> +  virtual const TargetRegisterClass *
> +    getISARegClass(const TargetRegisterClass * rc) const = 0;
> +};
> +
> +} // End namespace llvm
> +
> +#endif // AMDIDSAREGISTERINFO_H_
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td?rev=160270&view=auto
> =============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,22 @@
> +//===-- AMDGPURegisterInfo.td - AMDGPU register info -------*- tablegen -*-===//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Tablegen register definitions common to all hw codegen targets.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +let Namespace = "AMDGPU" in {
> +  def sel_x : SubRegIndex;
> +  def sel_y : SubRegIndex;
> +  def sel_z : SubRegIndex;
> +  def sel_w : SubRegIndex;
> +}
> +
> +include "R600RegisterInfo.td"
> +include "SIRegisterInfo.td"
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h?rev=160270&view=auto
> =============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,36 @@
> +//=====-- AMDGPUSubtarget.h - Define Subtarget for the AMDIL ---*- C++ -*-====//
> +//
> +// The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//==-----------------------------------------------------------------------===//
> +//
> +// This file declares the AMDGPU specific subclass of TargetSubtarget.
> +// > +//===----------------------------------------------------------------------===// > + > +#ifndef _AMDGPUSUBTARGET_H_ > +#define _AMDGPUSUBTARGET_H_ > +#include "AMDILSubtarget.h" > + > +namespace llvm { > + > +class AMDGPUSubtarget : public AMDILSubtarget > +{ > + InstrItineraryData InstrItins; > + > +public: > + AMDGPUSubtarget(StringRef TT, StringRef CPU, StringRef FS) : > + AMDILSubtarget(TT, CPU, FS) > + { > + InstrItins = getInstrItineraryForCPU(CPU); > + } > + > + const InstrItineraryData &getInstrItineraryData() const { return InstrItins; } > +}; > + > +} // End namespace llvm > + > +#endif // AMDGPUSUBTARGET_H_ > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,162 @@ > +//===-- AMDGPUTargetMachine.cpp - TargetMachine for hw codegen targets-----===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// The AMDGPU target machine contains all of the hardware specific information > +// needed to emit code for R600 and SI GPUs. 
> +// > +//===----------------------------------------------------------------------===// > + > +#include "AMDGPUTargetMachine.h" > +#include "AMDGPU.h" > +#include "R600ISelLowering.h" > +#include "R600InstrInfo.h" > +#include "SIISelLowering.h" > +#include "SIInstrInfo.h" > +#include "llvm/Analysis/Passes.h" > +#include "llvm/Analysis/Verifier.h" > +#include "llvm/CodeGen/MachineFunctionAnalysis.h" > +#include "llvm/CodeGen/MachineModuleInfo.h" > +#include "llvm/CodeGen/Passes.h" > +#include "llvm/MC/MCAsmInfo.h" > +#include "llvm/PassManager.h" > +#include "llvm/Support/TargetRegistry.h" > +#include "llvm/Support/raw_os_ostream.h" > +#include "llvm/Transforms/IPO.h" > +#include "llvm/Transforms/Scalar.h" > + > +using namespace llvm; > + > +extern "C" void LLVMInitializeAMDGPUTarget() { > + // Register the target > + RegisterTargetMachine<AMDGPUTargetMachine> X(TheAMDGPUTarget); > +} > + > +AMDGPUTargetMachine::AMDGPUTargetMachine(const Target &T, StringRef TT, > + StringRef CPU, StringRef FS, > + TargetOptions Options, > + Reloc::Model RM, CodeModel::Model CM, > + CodeGenOpt::Level OptLevel > +) > +: > + LLVMTargetMachine(T, TT, CPU, FS, Options, RM, CM, OptLevel), > + Subtarget(TT, CPU, FS), > + DataLayout(Subtarget.getDataLayout()), > + FrameLowering(TargetFrameLowering::StackGrowsUp, > + Subtarget.device()->getStackAlignment(), 0), > + IntrinsicInfo(this), > + InstrItins(&Subtarget.getInstrItineraryData()), > + mDump(false) > + > +{ > + // TLInfo uses InstrInfo so it must be initialized after. 
> + if (Subtarget.device()->getGeneration() <= AMDILDeviceInfo::HD6XXX) { > + InstrInfo = new R600InstrInfo(*this); > + TLInfo = new R600TargetLowering(*this); > + } else { > + InstrInfo = new SIInstrInfo(*this); > + TLInfo = new SITargetLowering(*this); > + } > +} > + > +AMDGPUTargetMachine::~AMDGPUTargetMachine() > +{ > +} > + > +bool AMDGPUTargetMachine::addPassesToEmitFile(PassManagerBase &PM, > + formatted_raw_ostream &Out, > + CodeGenFileType FileType, > + bool DisableVerify, > + AnalysisID StartAfter, > + AnalysisID StopAfter) { > + // XXX: Hack here addPassesToEmitFile will fail, but this is Ok since we are > + // only using it to access addPassesToGenerateCode() > + bool fail = LLVMTargetMachine::addPassesToEmitFile(PM, Out, FileType, > + DisableVerify); > + assert(fail); > + > + const AMDILSubtarget &STM = getSubtarget<AMDILSubtarget>(); > + std::string gpu = STM.getDeviceName(); > + if (gpu == "SI") { > + PM.add(createSICodeEmitterPass(Out)); > + } else if (Subtarget.device()->getGeneration() <= AMDILDeviceInfo::HD6XXX) { > + PM.add(createR600CodeEmitterPass(Out)); > + } else { > + abort(); > + return true; > + } > + PM.add(createGCInfoDeleter()); > + > + return false; > +} > + > +namespace { > +class AMDGPUPassConfig : public TargetPassConfig { > +public: > + AMDGPUPassConfig(AMDGPUTargetMachine *TM, PassManagerBase &PM) > + : TargetPassConfig(TM, PM) {} > + > + AMDGPUTargetMachine &getAMDGPUTargetMachine() const { > + return getTM<AMDGPUTargetMachine>(); > + } > + > + virtual bool addPreISel(); > + virtual bool addInstSelector(); > + virtual bool addPreRegAlloc(); > + virtual bool addPostRegAlloc(); > + virtual bool addPreSched2(); > + virtual bool addPreEmitPass(); > +}; > +} // End of anonymous namespace > + > +TargetPassConfig *AMDGPUTargetMachine::createPassConfig(PassManagerBase &PM) { > + return new AMDGPUPassConfig(this, PM); > +} > + > +bool > +AMDGPUPassConfig::addPreISel() > +{ > + const AMDILSubtarget &ST = 
TM->getSubtarget<AMDILSubtarget>(); > + if (ST.device()->getGeneration() <= AMDILDeviceInfo::HD6XXX) { > + addPass(createR600KernelParametersPass( > + getAMDGPUTargetMachine().getTargetData())); > + } > + return false; > +} > + > +bool AMDGPUPassConfig::addInstSelector() { > + addPass(createAMDILPeepholeOpt(*TM)); > + addPass(createAMDILISelDag(getAMDGPUTargetMachine())); > + return false; > +} > + > +bool AMDGPUPassConfig::addPreRegAlloc() { > + const AMDILSubtarget &ST = TM->getSubtarget<AMDILSubtarget>(); > + > + if (ST.device()->getGeneration() > AMDILDeviceInfo::HD6XXX) { > + addPass(createSIAssignInterpRegsPass(*TM)); > + } > + addPass(createAMDGPUConvertToISAPass(*TM)); > + return false; > +} > + > +bool AMDGPUPassConfig::addPostRegAlloc() { > + return false; > +} > + > +bool AMDGPUPassConfig::addPreSched2() { > + return false; > +} > + > +bool AMDGPUPassConfig::addPreEmitPass() { > + addPass(createAMDILCFGPreparationPass(*TM)); > + addPass(createAMDILCFGStructurizerPass(*TM)); > + > + return false; > +} > + > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,76 @@ > +//===-- AMDGPUTargetMachine.h - AMDGPU TargetMachine Interface --*- C++ -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// The AMDGPU TargetMachine interface definition for hw codgen targets. 
> +// > +//===----------------------------------------------------------------------===// > + > +#ifndef AMDGPU_TARGET_MACHINE_H > +#define AMDGPU_TARGET_MACHINE_H > + > +#include "AMDGPUInstrInfo.h" > +#include "AMDGPUSubtarget.h" > +#include "AMDILFrameLowering.h" > +#include "AMDILIntrinsicInfo.h" > +#include "R600ISelLowering.h" > +#include "llvm/ADT/OwningPtr.h" > +#include "llvm/Target/TargetData.h" > + > +namespace llvm { > + > +MCAsmInfo* createMCAsmInfo(const Target &T, StringRef TT); > + > +class AMDGPUTargetMachine : public LLVMTargetMachine { > + > + AMDGPUSubtarget Subtarget; > + const TargetData DataLayout; > + AMDILFrameLowering FrameLowering; > + AMDILIntrinsicInfo IntrinsicInfo; > + const AMDGPUInstrInfo * InstrInfo; > + AMDGPUTargetLowering * TLInfo; > + const InstrItineraryData* InstrItins; > + bool mDump; > + > +public: > + AMDGPUTargetMachine(const Target &T, StringRef TT, StringRef FS, > + StringRef CPU, > + TargetOptions Options, > + Reloc::Model RM, CodeModel::Model CM, > + CodeGenOpt::Level OL); > + ~AMDGPUTargetMachine(); > + virtual const AMDILFrameLowering* getFrameLowering() const { > + return &FrameLowering; > + } > + virtual const AMDILIntrinsicInfo* getIntrinsicInfo() const { > + return &IntrinsicInfo; > + } > + virtual const AMDGPUInstrInfo *getInstrInfo() const {return InstrInfo;} > + virtual const AMDGPUSubtarget *getSubtargetImpl() const {return &Subtarget; } > + virtual const AMDGPURegisterInfo *getRegisterInfo() const { > + return &InstrInfo->getRegisterInfo(); > + } > + virtual AMDGPUTargetLowering * getTargetLowering() const { > + return TLInfo; > + } > + virtual const InstrItineraryData* getInstrItineraryData() const { > + return InstrItins; > + } > + virtual const TargetData* getTargetData() const { return &DataLayout; } > + virtual TargetPassConfig *createPassConfig(PassManagerBase &PM); > + virtual bool addPassesToEmitFile(PassManagerBase &PM, > + formatted_raw_ostream &Out, > + CodeGenFileType FileType, > + bool 
DisableVerify, > + AnalysisID StartAfter = 0, > + AnalysisID StopAfter = 0); > +}; > + > +} // End namespace llvm > + > +#endif // AMDGPU_TARGET_MACHINE_H > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,139 @@ > +//===-- AMDGPUUtil.cpp - AMDGPU Utility functions -------------------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// Common utility functions used by hw codegen targets > +// > +//===----------------------------------------------------------------------===// > + > +#include "AMDGPUUtil.h" > +#include "AMDGPURegisterInfo.h" > +#include "AMDIL.h" > +#include "llvm/CodeGen/MachineFunction.h" > +#include "llvm/CodeGen/MachineInstrBuilder.h" > +#include "llvm/CodeGen/MachineRegisterInfo.h" > +#include "llvm/Target/TargetInstrInfo.h" > +#include "llvm/Target/TargetMachine.h" > +#include "llvm/Target/TargetRegisterInfo.h" > + > +using namespace llvm; > + > +// Some instructions act as place holders to emulate operations that the GPU > +// hardware does automatically. This function can be used to check if > +// an opcode falls into this category. 
> +bool AMDGPU::isPlaceHolderOpcode(unsigned opcode) > +{ > + switch (opcode) { > + default: return false; > + case AMDGPU::RETURN: > + case AMDGPU::LOAD_INPUT: > + case AMDGPU::LAST: > + case AMDGPU::MASK_WRITE: > + case AMDGPU::RESERVE_REG: > + return true; > + } > +} > + > +bool AMDGPU::isTransOp(unsigned opcode) > +{ > + switch(opcode) { > + default: return false; > + > + case AMDGPU::COS_r600: > + case AMDGPU::COS_eg: > + case AMDGPU::MULLIT: > + case AMDGPU::MUL_LIT_r600: > + case AMDGPU::MUL_LIT_eg: > + case AMDGPU::EXP_IEEE_r600: > + case AMDGPU::EXP_IEEE_eg: > + case AMDGPU::LOG_CLAMPED_r600: > + case AMDGPU::LOG_IEEE_r600: > + case AMDGPU::LOG_CLAMPED_eg: > + case AMDGPU::LOG_IEEE_eg: > + return true; > + } > +} > + > +bool AMDGPU::isTexOp(unsigned opcode) > +{ > + switch(opcode) { > + default: return false; > + case AMDGPU::TEX_LD: > + case AMDGPU::TEX_GET_TEXTURE_RESINFO: > + case AMDGPU::TEX_SAMPLE: > + case AMDGPU::TEX_SAMPLE_C: > + case AMDGPU::TEX_SAMPLE_L: > + case AMDGPU::TEX_SAMPLE_C_L: > + case AMDGPU::TEX_SAMPLE_LB: > + case AMDGPU::TEX_SAMPLE_C_LB: > + case AMDGPU::TEX_SAMPLE_G: > + case AMDGPU::TEX_SAMPLE_C_G: > + case AMDGPU::TEX_GET_GRADIENTS_H: > + case AMDGPU::TEX_GET_GRADIENTS_V: > + case AMDGPU::TEX_SET_GRADIENTS_H: > + case AMDGPU::TEX_SET_GRADIENTS_V: > + return true; > + } > +} > + > +bool AMDGPU::isReductionOp(unsigned opcode) > +{ > + switch(opcode) { > + default: return false; > + case AMDGPU::DOT4_r600: > + case AMDGPU::DOT4_eg: > + return true; > + } > +} > + > +bool AMDGPU::isCubeOp(unsigned opcode) > +{ > + switch(opcode) { > + default: return false; > + case AMDGPU::CUBE_r600: > + case AMDGPU::CUBE_eg: > + return true; > + } > +} > + > + > +bool AMDGPU::isFCOp(unsigned opcode) > +{ > + switch(opcode) { > + default: return false; > + case AMDGPU::BREAK_LOGICALZ_f32: > + case AMDGPU::BREAK_LOGICALNZ_i32: > + case AMDGPU::BREAK_LOGICALZ_i32: > + case AMDGPU::BREAK_LOGICALNZ_f32: > + case AMDGPU::CONTINUE_LOGICALNZ_f32: > + case 
AMDGPU::IF_LOGICALNZ_i32: > + case AMDGPU::IF_LOGICALZ_f32: > + case AMDGPU::ELSE: > + case AMDGPU::ENDIF: > + case AMDGPU::ENDLOOP: > + case AMDGPU::IF_LOGICALNZ_f32: > + case AMDGPU::WHILELOOP: > + return true; > + } > +} > + > +void AMDGPU::utilAddLiveIn(MachineFunction * MF, > + MachineRegisterInfo & MRI, > + const TargetInstrInfo * TII, > + unsigned physReg, unsigned virtReg) > +{ > + if (!MRI.isLiveIn(physReg)) { > + MRI.addLiveIn(physReg, virtReg); > + MF->front().addLiveIn(physReg); > + BuildMI(MF->front(), MF->front().begin(), DebugLoc(), > + TII->get(TargetOpcode::COPY), virtReg) > + .addReg(physReg); > + } else { > + MRI.replaceRegWith(virtReg, MRI.getLiveInVirtReg(physReg)); > + } > +} > > Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,46 @@ > +//===-- AMDGPUUtil.h - AMDGPU Utility function declarations -----*- C++ -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// Declarations for utility functions common to all hw codegen targets. 
> +// > +//===----------------------------------------------------------------------===// > + > +#ifndef AMDGPU_UTIL_H > +#define AMDGPU_UTIL_H > + > +namespace llvm { > + > +class MachineFunction; > +class MachineRegisterInfo; > +class TargetInstrInfo; > + > +namespace AMDGPU { > + > +bool isPlaceHolderOpcode(unsigned opcode); > + > +bool isTransOp(unsigned opcode); > +bool isTexOp(unsigned opcode); > +bool isReductionOp(unsigned opcode); > +bool isCubeOp(unsigned opcode); > +bool isFCOp(unsigned opcode); > + > +// XXX: Move these to AMDGPUInstrInfo.h > +#define MO_FLAG_CLAMP (1 << 0) > +#define MO_FLAG_NEG (1 << 1) > +#define MO_FLAG_ABS (1 << 2) > +#define MO_FLAG_MASK (1 << 3) > + > +void utilAddLiveIn(MachineFunction * MF, MachineRegisterInfo & MRI, > + const TargetInstrInfo * TII, unsigned physReg, unsigned virtReg); > + > +} // End namespace AMDGPU > + > +} // End namespace llvm > + > +#endif // AMDGPU_UTIL_H > > Added: llvm/trunk/lib/Target/AMDGPU/AMDIL.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDIL.h?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDIL.h (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDIL.h Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,251 @@ > +//===-- AMDIL.h - Top-level interface for AMDIL representation --*- C++ -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//==-----------------------------------------------------------------------===// > +// > +// This file contains the entry points for global functions defined in the LLVM > +// AMDIL back-end. 
> +// > +//===----------------------------------------------------------------------===// > + > +#ifndef AMDIL_H_ > +#define AMDIL_H_ > + > +#include "llvm/CodeGen/MachineFunction.h" > +#include "llvm/Target/TargetMachine.h" > + > +#define AMDIL_MAJOR_VERSION 2 > +#define AMDIL_MINOR_VERSION 0 > +#define AMDIL_REVISION_NUMBER 74 > +#define ARENA_SEGMENT_RESERVED_UAVS 12 > +#define DEFAULT_ARENA_UAV_ID 8 > +#define DEFAULT_RAW_UAV_ID 7 > +#define GLOBAL_RETURN_RAW_UAV_ID 11 > +#define HW_MAX_NUM_CB 8 > +#define MAX_NUM_UNIQUE_UAVS 8 > +#define OPENCL_MAX_NUM_ATOMIC_COUNTERS 8 > +#define OPENCL_MAX_READ_IMAGES 128 > +#define OPENCL_MAX_WRITE_IMAGES 8 > +#define OPENCL_MAX_SAMPLERS 16 > + > +// The next two values can never be zero, as zero is the ID that is > +// used to assert against. > +#define DEFAULT_LDS_ID 1 > +#define DEFAULT_GDS_ID 1 > +#define DEFAULT_SCRATCH_ID 1 > +#define DEFAULT_VEC_SLOTS 8 > + > +// SC->CAL version matchings. > +#define CAL_VERSION_SC_150 1700 > +#define CAL_VERSION_SC_149 1700 > +#define CAL_VERSION_SC_148 1525 > +#define CAL_VERSION_SC_147 1525 > +#define CAL_VERSION_SC_146 1525 > +#define CAL_VERSION_SC_145 1451 > +#define CAL_VERSION_SC_144 1451 > +#define CAL_VERSION_SC_143 1441 > +#define CAL_VERSION_SC_142 1441 > +#define CAL_VERSION_SC_141 1420 > +#define CAL_VERSION_SC_140 1400 > +#define CAL_VERSION_SC_139 1387 > +#define CAL_VERSION_SC_138 1387 > +#define CAL_APPEND_BUFFER_SUPPORT 1340 > +#define CAL_VERSION_SC_137 1331 > +#define CAL_VERSION_SC_136 982 > +#define CAL_VERSION_SC_135 950 > +#define CAL_VERSION_GLOBAL_RETURN_BUFFER 990 > + > +#define OCL_DEVICE_RV710 0x0001 > +#define OCL_DEVICE_RV730 0x0002 > +#define OCL_DEVICE_RV770 0x0004 > +#define OCL_DEVICE_CEDAR 0x0008 > +#define OCL_DEVICE_REDWOOD 0x0010 > +#define OCL_DEVICE_JUNIPER 0x0020 > +#define OCL_DEVICE_CYPRESS 0x0040 > +#define OCL_DEVICE_CAICOS 0x0080 > +#define OCL_DEVICE_TURKS 0x0100 > +#define OCL_DEVICE_BARTS 0x0200 > +#define OCL_DEVICE_CAYMAN 0x0400 > 
+#define OCL_DEVICE_ALL 0x3FFF > + > +/// The number of function IDs that are reserved for > +/// internal compiler usage. > +const unsigned int RESERVED_FUNCS = 1024; > + > +#define AMDIL_OPT_LEVEL_DECL > +#define AMDIL_OPT_LEVEL_VAR > +#define AMDIL_OPT_LEVEL_VAR_NO_COMMA > + > +namespace llvm { > +class AMDILInstrPrinter; > +class FunctionPass; > +class MCAsmInfo; > +class raw_ostream; > +class Target; > +class TargetMachine; > + > +/// Instruction selection passes. > +FunctionPass* > + createAMDILISelDag(TargetMachine &TM AMDIL_OPT_LEVEL_DECL); > +FunctionPass* > + createAMDILPeepholeOpt(TargetMachine &TM AMDIL_OPT_LEVEL_DECL); > + > +/// Pre emit passes. > +FunctionPass* > + createAMDILCFGPreparationPass(TargetMachine &TM AMDIL_OPT_LEVEL_DECL); > +FunctionPass* > + createAMDILCFGStructurizerPass(TargetMachine &TM AMDIL_OPT_LEVEL_DECL); > + > +extern Target TheAMDILTarget; > +extern Target TheAMDGPUTarget; > +} // end namespace llvm; > + > +#define GET_REGINFO_ENUM > +#include "AMDGPUGenRegisterInfo.inc" > +#define GET_INSTRINFO_ENUM > +#include "AMDGPUGenInstrInfo.inc" > + > +/// Include device information enumerations > +#include "AMDILDeviceInfo.h" > + > +namespace llvm { > +/// OpenCL uses address spaces to differentiate between > +/// various memory regions on the hardware. On the CPU > +/// all of the address spaces point to the same memory, > +/// however on the GPU, each address space points to > +/// a separate piece of memory that is unique from other > +/// memory locations. > +namespace AMDILAS { > +enum AddressSpaces { > + PRIVATE_ADDRESS = 0, // Address space for private memory. > + GLOBAL_ADDRESS = 1, // Address space for global memory (RAT0, VTX0). > + CONSTANT_ADDRESS = 2, // Address space for constant memory. > + LOCAL_ADDRESS = 3, // Address space for local memory. > + REGION_ADDRESS = 4, // Address space for region memory. > + ADDRESS_NONE = 5, // Address space for unknown memory.
> + PARAM_D_ADDRESS = 6, // Address space for directly addressable parameter memory (CONST0) > + PARAM_I_ADDRESS = 7, // Address space for indirectly addressable parameter memory (VTX1) > + USER_SGPR_ADDRESS = 8, // Address space for USER_SGPRS on SI > + LAST_ADDRESS = 9 > +}; > + > +// This union/struct combination is an easy way to read out the > +// exact bits that are needed. > +typedef union ResourceRec { > + struct { > +#ifdef __BIG_ENDIAN__ > + unsigned short isImage : 1; // Reserved for future use/llvm. > + unsigned short ResourceID : 10; // Flag to specify the resource ID for > + // the op. > + unsigned short HardwareInst : 1; // Flag to specify that this instruction > + // is a hardware instruction. > + unsigned short ConflictPtr : 1; // Flag to specify that the pointer has a > + // conflict. > + unsigned short ByteStore : 1; // Flag to specify if the op is a byte > + // store op. > + unsigned short PointerPath : 1; // Flag to specify if the op is on the > + // pointer path. > + unsigned short CacheableRead : 1; // Flag to specify if the read is > + // cacheable. > +#else > + unsigned short CacheableRead : 1; // Flag to specify if the read is > + // cacheable. > + unsigned short PointerPath : 1; // Flag to specify if the op is on the > + // pointer path. > + unsigned short ByteStore : 1; // Flag to specify if the op is a byte > + // store op. > + unsigned short ConflictPtr : 1; // Flag to specify that the pointer has > + // a conflict. > + unsigned short HardwareInst : 1; // Flag to specify that this instruction > + // is a hardware instruction. > + unsigned short ResourceID : 10; // Flag to specify the resource ID for > + // the op. > + unsigned short isImage : 1; // Reserved for future use. > +#endif > + } bits; > + unsigned short u16all; > +} InstrResEnc; > + > +} // namespace AMDILAS > + > +// Enums corresponding to AMDIL condition codes for IL. These > +// values must be kept in sync with the ones in the .td file.
> +namespace AMDILCC { > +enum CondCodes { > + // AMDIL specific condition codes. These correspond to the IL_CC_* > + // in AMDILInstrInfo.td and must be kept in the same order. > + IL_CC_D_EQ = 0, // DEQ instruction. > + IL_CC_D_GE = 1, // DGE instruction. > + IL_CC_D_LT = 2, // DLT instruction. > + IL_CC_D_NE = 3, // DNE instruction. > + IL_CC_F_EQ = 4, // EQ instruction. > + IL_CC_F_GE = 5, // GE instruction. > + IL_CC_F_LT = 6, // LT instruction. > + IL_CC_F_NE = 7, // NE instruction. > + IL_CC_I_EQ = 8, // IEQ instruction. > + IL_CC_I_GE = 9, // IGE instruction. > + IL_CC_I_LT = 10, // ILT instruction. > + IL_CC_I_NE = 11, // INE instruction. > + IL_CC_U_GE = 12, // UGE instruction. > + IL_CC_U_LT = 13, // ULT instruction. > + // Pseudo IL Comparison instructions here. > + IL_CC_F_GT = 14, // GT instruction. > + IL_CC_U_GT = 15, > + IL_CC_I_GT = 16, > + IL_CC_D_GT = 17, > + IL_CC_F_LE = 18, // LE instruction > + IL_CC_U_LE = 19, > + IL_CC_I_LE = 20, > + IL_CC_D_LE = 21, > + IL_CC_F_UNE = 22, > + IL_CC_F_UEQ = 23, > + IL_CC_F_ULT = 24, > + IL_CC_F_UGT = 25, > + IL_CC_F_ULE = 26, > + IL_CC_F_UGE = 27, > + IL_CC_F_ONE = 28, > + IL_CC_F_OEQ = 29, > + IL_CC_F_OLT = 30, > + IL_CC_F_OGT = 31, > + IL_CC_F_OLE = 32, > + IL_CC_F_OGE = 33, > + IL_CC_D_UNE = 34, > + IL_CC_D_UEQ = 35, > + IL_CC_D_ULT = 36, > + IL_CC_D_UGT = 37, > + IL_CC_D_ULE = 38, > + IL_CC_D_UGE = 39, > + IL_CC_D_ONE = 40, > + IL_CC_D_OEQ = 41, > + IL_CC_D_OLT = 42, > + IL_CC_D_OGT = 43, > + IL_CC_D_OLE = 44, > + IL_CC_D_OGE = 45, > + IL_CC_U_EQ = 46, > + IL_CC_U_NE = 47, > + IL_CC_F_O = 48, > + IL_CC_D_O = 49, > + IL_CC_F_UO = 50, > + IL_CC_D_UO = 51, > + IL_CC_L_LE = 52, > + IL_CC_L_GE = 53, > + IL_CC_L_EQ = 54, > + IL_CC_L_NE = 55, > + IL_CC_L_LT = 56, > + IL_CC_L_GT = 57, > + IL_CC_UL_LE = 58, > + IL_CC_UL_GE = 59, > + IL_CC_UL_EQ = 60, > + IL_CC_UL_NE = 61, > + IL_CC_UL_LT = 62, > + IL_CC_UL_GT = 63, > + COND_ERROR = 64 > +}; > + > +} // end namespace AMDILCC > +} // end namespace llvm > +#endif //
AMDIL_H_ > > Added: llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,128 @@ > +//===-- AMDIL7XXDevice.cpp - Device Info for 7XX GPUs ---------------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//==-----------------------------------------------------------------------===// > +#include "AMDIL7XXDevice.h" > +#include "AMDILDevice.h" > + > +using namespace llvm; > + > +AMDIL7XXDevice::AMDIL7XXDevice(AMDILSubtarget *ST) : AMDILDevice(ST) > +{ > + setCaps(); > + std::string name = mSTM->getDeviceName(); > + if (name == "rv710") { > + mDeviceFlag = OCL_DEVICE_RV710; > + } else if (name == "rv730") { > + mDeviceFlag = OCL_DEVICE_RV730; > + } else { > + mDeviceFlag = OCL_DEVICE_RV770; > + } > +} > + > +AMDIL7XXDevice::~AMDIL7XXDevice() > +{ > +} > + > +void AMDIL7XXDevice::setCaps() > +{ > + mSWBits.set(AMDILDeviceInfo::LocalMem); > +} > + > +size_t AMDIL7XXDevice::getMaxLDSSize() const > +{ > + if (usesHardware(AMDILDeviceInfo::LocalMem)) { > + return MAX_LDS_SIZE_700; > + } > + return 0; > +} > + > +size_t AMDIL7XXDevice::getWavefrontSize() const > +{ > + return AMDILDevice::HalfWavefrontSize; > +} > + > +uint32_t AMDIL7XXDevice::getGeneration() const > +{ > + return AMDILDeviceInfo::HD4XXX; > +} > + > +uint32_t AMDIL7XXDevice::getResourceID(uint32_t DeviceID) const > +{ > + switch (DeviceID) { > + default: > + assert(0 && "ID type passed in is unknown!"); > + break; > + case GLOBAL_ID: > + case CONSTANT_ID: > + case RAW_UAV_ID: > + case ARENA_UAV_ID: > + break; > + case LDS_ID: > + if 
(usesHardware(AMDILDeviceInfo::LocalMem)) { > + return DEFAULT_LDS_ID; > + } > + break; > + case SCRATCH_ID: > + if (usesHardware(AMDILDeviceInfo::PrivateMem)) { > + return DEFAULT_SCRATCH_ID; > + } > + break; > + case GDS_ID: > + assert(0 && "GDS UAV ID is not supported on this chip"); > + if (usesHardware(AMDILDeviceInfo::RegionMem)) { > + return DEFAULT_GDS_ID; > + } > + break; > + }; > + > + return 0; > +} > + > +uint32_t AMDIL7XXDevice::getMaxNumUAVs() const > +{ > + return 1; > +} > + > +AMDIL770Device::AMDIL770Device(AMDILSubtarget *ST): AMDIL7XXDevice(ST) > +{ > + setCaps(); > +} > + > +AMDIL770Device::~AMDIL770Device() > +{ > +} > + > +void AMDIL770Device::setCaps() > +{ > + if (mSTM->isOverride(AMDILDeviceInfo::DoubleOps)) { > + mSWBits.set(AMDILDeviceInfo::FMA); > + mHWBits.set(AMDILDeviceInfo::DoubleOps); > + } > + mSWBits.set(AMDILDeviceInfo::BarrierDetect); > + mHWBits.reset(AMDILDeviceInfo::LongOps); > + mSWBits.set(AMDILDeviceInfo::LongOps); > + mSWBits.set(AMDILDeviceInfo::LocalMem); > +} > + > +size_t AMDIL770Device::getWavefrontSize() const > +{ > + return AMDILDevice::WavefrontSize; > +} > + > +AMDIL710Device::AMDIL710Device(AMDILSubtarget *ST) : AMDIL7XXDevice(ST) > +{ > +} > + > +AMDIL710Device::~AMDIL710Device() > +{ > +} > + > +size_t AMDIL710Device::getWavefrontSize() const > +{ > + return AMDILDevice::QuarterWavefrontSize; > +} > > Added: llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,71 @@ > +//==-- AMDIL7XXDevice.h - Define 7XX Device Device for AMDIL ---*- C++ -*--===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// 
License. See LICENSE.TXT for details. > +// > +//==-----------------------------------------------------------------------===// > +// > +// Interface for the subtarget data classes. > +// > +//===----------------------------------------------------------------------===// > +// This file will define the interface that each generation needs to > +// implement in order to correctly answer queries on the capabilities of the > +// specific hardware. > +//===----------------------------------------------------------------------===// > +#ifndef _AMDIL7XXDEVICEIMPL_H_ > +#define _AMDIL7XXDEVICEIMPL_H_ > +#include "AMDILDevice.h" > +#include "AMDILSubtarget.h" > + > +namespace llvm { > +class AMDILSubtarget; > + > +//===----------------------------------------------------------------------===// > +// 7XX generation of devices and their respective sub classes > +//===----------------------------------------------------------------------===// > + > +// The AMDIL7XXDevice class represents the generic 7XX device. All 7XX > +// devices are derived from this class. The AMDIL7XX device will only > +// support the minimal features that are required to be considered OpenCL 1.0 > +// compliant and nothing more. > +class AMDIL7XXDevice : public AMDILDevice { > +public: > + AMDIL7XXDevice(AMDILSubtarget *ST); > + virtual ~AMDIL7XXDevice(); > + virtual size_t getMaxLDSSize() const; > + virtual size_t getWavefrontSize() const; > + virtual uint32_t getGeneration() const; > + virtual uint32_t getResourceID(uint32_t DeviceID) const; > + virtual uint32_t getMaxNumUAVs() const; > + > +protected: > + virtual void setCaps(); > +}; // AMDIL7XXDevice > + > +// The AMDIL770Device class represents the RV770 chip and its > +// derivative cards. The difference between this device and the base > +// class is that this device adds support for double precision > +// and has a larger wavefront size.
> +class AMDIL770Device : public AMDIL7XXDevice { > +public: > + AMDIL770Device(AMDILSubtarget *ST); > + virtual ~AMDIL770Device(); > + virtual size_t getWavefrontSize() const; > +private: > + virtual void setCaps(); > +}; // AMDIL770Device > + > +// The AMDIL710Device class derives from the 7XX base class, but this > +// class is a smaller derivative, so we need to overload some of the > +// functions in order to correctly specify this information. > +class AMDIL710Device : public AMDIL7XXDevice { > +public: > + AMDIL710Device(AMDILSubtarget *ST); > + virtual ~AMDIL710Device(); > + virtual size_t getWavefrontSize() const; > +}; // AMDIL710Device > + > +} // namespace llvm > +#endif // _AMDILDEVICEIMPL_H_ > > Added: llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,93 @@ > +//===------ AMDILAlgorithms.tpp - AMDIL Template Algorithms Header --------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// > +// This file provides templates algorithms that extend the STL algorithms, but > +// are useful for the AMDIL backend > +// > +//===----------------------------------------------------------------------===// > + > +// A template function that loops through the iterators and passes the second > +// argument along with each iterator to the function. 
If the function returns > +// true, then the current iterator is invalidated and it moves back, before > +// moving forward to the next iterator, otherwise it moves forward without > +// issue. This is based on the for_each STL function, but allows a reference to > +// the second argument > +template<class InputIterator, class Function, typename Arg> > +Function binaryForEach(InputIterator First, InputIterator Last, Function F, > + Arg &Second) > +{ > + for ( ; First!=Last; ++First ) { > + F(*First, Second); > + } > + return F; > +} > + > +template<class InputIterator, class Function, typename Arg> > +Function safeBinaryForEach(InputIterator First, InputIterator Last, Function F, > + Arg &Second) > +{ > + for ( ; First!=Last; ++First ) { > + if (F(*First, Second)) { > + --First; > + } > + } > + return F; > +} > + > +// A template function that has two levels of looping before calling the > +// function with the passed in argument. See binaryForEach for further > +// explanation > +template<class InputIterator, class Function, typename Arg> > +Function binaryNestedForEach(InputIterator First, InputIterator Last, > + Function F, Arg &Second) > +{ > + for ( ; First != Last; ++First) { > + binaryForEach(First->begin(), First->end(), F, Second); > + } > + return F; > +} > +template<class InputIterator, class Function, typename Arg> > +Function safeBinaryNestedForEach(InputIterator First, InputIterator Last, > + Function F, Arg &Second) > +{ > + for ( ; First != Last; ++First) { > + safeBinaryForEach(First->begin(), First->end(), F, Second); > + } > + return F; > +} > + > +// Unlike the STL, a pointer to the iterator itself is passed in with the 'safe' > +// versions of these functions This allows the function to handle situations > +// such as invalidated iterators > +template<class InputIterator, class Function> > +Function safeForEach(InputIterator First, InputIterator Last, Function F) > +{ > + for ( ; First!=Last; ++First ) F(&First) > + ; // Do nothing. 
> + return F; > +} > + > +// A template function that has two levels of looping before calling the > +// function with a pointer to the current iterator. See binaryForEach for > +// further explanation > +template<class InputIterator, class SecondIterator, class Function> > +Function safeNestedForEach(InputIterator First, InputIterator Last, > + SecondIterator S, Function F) > +{ > + for ( ; First != Last; ++First) { > + SecondIterator sf, sl; > + for (sf = First->begin(), sl = First->end(); > + sf != sl; ) { > + if (!F(&sf)) { > + ++sf; > + } > + } > + } > + return F; > +} > > Added: llvm/trunk/lib/Target/AMDGPU/AMDILBase.td > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDILBase.td?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDILBase.td (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDILBase.td Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,113 @@ > +//===- AMDIL.td - AMDIL Target Machine -------------*- tablegen -*-===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//===----------------------------------------------------------------------===// > +// Target-independent interfaces which we are implementing > +//===----------------------------------------------------------------------===// > + > +include "llvm/Target/Target.td" > + > +// Dummy Instruction itineraries for pseudo instructions > +def ALU_NULL : FuncUnit; > +def NullALU : InstrItinClass; > + > +//===----------------------------------------------------------------------===// > +// AMDIL Subtarget features. 
> +//===----------------------------------------------------------------------===// > +def FeatureFP64 : SubtargetFeature<"fp64", > + "CapsOverride[AMDILDeviceInfo::DoubleOps]", > + "true", > + "Enable 64bit double precision operations">; > +def FeatureByteAddress : SubtargetFeature<"byte_addressable_store", > + "CapsOverride[AMDILDeviceInfo::ByteStores]", > + "true", > + "Enable byte addressable stores">; > +def FeatureBarrierDetect : SubtargetFeature<"barrier_detect", > + "CapsOverride[AMDILDeviceInfo::BarrierDetect]", > + "true", > + "Enable duplicate barrier detection(HD5XXX or later).">; > +def FeatureImages : SubtargetFeature<"images", > + "CapsOverride[AMDILDeviceInfo::Images]", > + "true", > + "Enable image functions">; > +def FeatureMultiUAV : SubtargetFeature<"multi_uav", > + "CapsOverride[AMDILDeviceInfo::MultiUAV]", > + "true", > + "Generate multiple UAV code(HD5XXX family or later)">; > +def FeatureMacroDB : SubtargetFeature<"macrodb", > + "CapsOverride[AMDILDeviceInfo::MacroDB]", > + "true", > + "Use internal macrodb, instead of macrodb in driver">; > +def FeatureNoAlias : SubtargetFeature<"noalias", > + "CapsOverride[AMDILDeviceInfo::NoAlias]", > + "true", > + "assert that all kernel argument pointers are not aliased">; > +def FeatureNoInline : SubtargetFeature<"no-inline", > + "CapsOverride[AMDILDeviceInfo::NoInline]", > + "true", > + "specify whether to not inline functions">; > + > +def Feature64BitPtr : SubtargetFeature<"64BitPtr", > + "mIs64bit", > + "false", > + "Specify if 64bit addressing should be used.">; > + > +def Feature32on64BitPtr : SubtargetFeature<"64on32BitPtr", > + "mIs32on64bit", > + "false", > + "Specify if 64bit sized pointers with 32bit addressing should be used.">; > +def FeatureDebug : SubtargetFeature<"debug", > + "CapsOverride[AMDILDeviceInfo::Debug]", > + "true", > + "Debug mode is enabled, so disable hardware accelerated address spaces.">; > +def FeatureDumpCode : SubtargetFeature <"DumpCode", > + "mDumpCode", > + "true", 
> + "Dump MachineInstrs in the CodeEmitter">; > + > + > +//===----------------------------------------------------------------------===// > +// Register File, Calling Conv, Instruction Descriptions > +//===----------------------------------------------------------------------===// > + > + > +include "AMDILRegisterInfo.td" > +include "AMDILCallingConv.td" > +include "AMDILInstrInfo.td" > + > +def AMDILInstrInfo : InstrInfo {} > + > +//===----------------------------------------------------------------------===// > +// AMDIL processors supported. > +//===----------------------------------------------------------------------===// > +//include "Processors.td" > + > +//===----------------------------------------------------------------------===// > +// Declare the target which we are implementing > +//===----------------------------------------------------------------------===// > +def AMDILAsmWriter : AsmWriter { > + string AsmWriterClassName = "AsmPrinter"; > + int Variant = 0; > +} > + > +def AMDILAsmParser : AsmParser { > + string AsmParserClassName = "AsmParser"; > + int Variant = 0; > + > + string CommentDelimiter = ";"; > + > + string RegisterPrefix = "r"; > + > +} > + > + > +def AMDIL : Target { > + // Pull in Instruction Info: > + let InstructionSet = AMDILInstrInfo; > + let AssemblyWriters = [AMDILAsmWriter]; > + let AssemblyParsers = [AMDILAsmParser]; > +} > > Added: llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp?rev=160270&view=auto > =============================================================================> --- llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp (added) > +++ llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp Mon Jul 16 09:17:08 2012 > @@ -0,0 +1,3236 @@ > +//===-- AMDILCFGStructurizer.cpp - CFG Structurizer -----------------------===// > +// > +// The LLVM Compiler Infrastructure > +// > +// This file is distributed under the 
University of Illinois Open Source > +// License. See LICENSE.TXT for details. > +// > +//==-----------------------------------------------------------------------===// > + > +#define DEBUGME 0 > +#define DEBUG_TYPE "structcfg" > + > +#include "AMDIL.h" > +#include "AMDILInstrInfo.h" > +#include "AMDILRegisterInfo.h" > +#include "AMDILUtilityFunctions.h" > +#include "llvm/ADT/SCCIterator.h" > +#include "llvm/ADT/SmallVector.h" > +#include "llvm/ADT/Statistic.h" > +#include "llvm/Analysis/DominatorInternals.h" > +#include "llvm/Analysis/Dominators.h" > +#include "llvm/CodeGen/MachineDominators.h" > +#include "llvm/CodeGen/MachineDominators.h" > +#include "llvm/CodeGen/MachineFunction.h" > +#include "llvm/CodeGen/MachineFunctionAnalysis.h" > +#include "llvm/CodeGen/MachineFunctionPass.h" > +#include "llvm/CodeGen/MachineFunctionPass.h" > +#include "llvm/CodeGen/MachineInstrBuilder.h" > +#include "llvm/CodeGen/MachineJumpTableInfo.h" > +#include "llvm/CodeGen/MachineLoopInfo.h" > +#include "llvm/CodeGen/MachineRegisterInfo.h" > +#include "llvm/Target/TargetInstrInfo.h" > + > +#define FirstNonDebugInstr(A) A->begin() > +using namespace llvm; > + > +// TODO: move-begin. > + > +//===----------------------------------------------------------------------===// > +// > +// Statistics for CFGStructurizer. 
> +// > +//===----------------------------------------------------------------------===// > + > +STATISTIC(numSerialPatternMatch, "CFGStructurizer number of serial pattern " > + "matched"); > +STATISTIC(numIfPatternMatch, "CFGStructurizer number of if pattern " > + "matched"); > +STATISTIC(numLoopbreakPatternMatch, "CFGStructurizer number of loop-break " > + "pattern matched"); > +STATISTIC(numLoopcontPatternMatch, "CFGStructurizer number of loop-continue " > + "pattern matched"); > +STATISTIC(numLoopPatternMatch, "CFGStructurizer number of loop pattern " > + "matched"); > +STATISTIC(numClonedBlock, "CFGStructurizer cloned blocks"); > +STATISTIC(numClonedInstr, "CFGStructurizer cloned instructions"); > + > +//===----------------------------------------------------------------------===// > +// > +// Miscellaneous utility for CFGStructurizer. > +// > +//===----------------------------------------------------------------------===// > +namespace llvmCFGStruct > +{ > +#define SHOWNEWINSTR(i) \ > + if (DEBUGME) errs() << "New instr: " << *i << "\n" > + > +#define SHOWNEWBLK(b, msg) \ > +if (DEBUGME) { \ > + errs() << msg << "BB" << b->getNumber() << "size " << b->size(); \ > + errs() << "\n"; \ > +} > + > +#define SHOWBLK_DETAIL(b, msg) \ > +if (DEBUGME) { \ > + if (b) { \ > + errs() << msg << "BB" << b->getNumber() << "size " << b->size(); \ > + b->print(errs()); \ > + errs() << "\n"; \ > + } \ > +} > + > +#define INVALIDSCCNUM -1 > +#define INVALIDREGNUM 0 > + > +template<class LoopinfoT> > +void PrintLoopinfo(const LoopinfoT &LoopInfo, llvm::raw_ostream &OS) { > + for (typename LoopinfoT::iterator iter = LoopInfo.begin(), > + iterEnd = LoopInfo.end(); > + iter != iterEnd; ++iter) { > + (*iter)->print(OS, 0); > + } > +} > + > +template<class NodeT> > +void ReverseVector(SmallVector<NodeT *, DEFAULT_VEC_SLOTS> &Src) { > + size_t sz = Src.size(); > + for (size_t i = 0; i < sz/2; ++i) { > + NodeT *t = Src[i]; > + Src[i] = Src[sz - i - 1]; > + Src[sz - i - 1] = t; > + } > 
+} > + > +} //end namespace llvmCFGStruct > + > + > +//===----------------------------------------------------------------------===// > +// > +// MachinePostDominatorTree > +// > +//===----------------------------------------------------------------------===// > + > +namespace llvm { > + > +/// PostDominatorTree Class - Concrete subclass of DominatorTree that is used > +/// to compute the a post-dominator tree. > +/// > +struct MachinePostDominatorTree : public MachineFunctionPass { > + static char ID; // Pass identification, replacement for typeid > + DominatorTreeBase<MachineBasicBlock> *DT; > + MachinePostDominatorTree() : MachineFunctionPass(ID) > + { > + DT = new DominatorTreeBase<MachineBasicBlock>(true); //true indicate > + // postdominator > + } > + > + ~MachinePostDominatorTree(); > + > + virtual bool runOnMachineFunction(MachineFunction &MF); > + > + virtual void getAnalysisUsage(AnalysisUsage &AU) const { > + AU.setPreservesAll(); > + MachineFunctionPass::getAnalysisUsage(AU); > + } > + > + inline const std::vector<MachineBasicBlock *> &getRoots() const { > + return DT->getRoots(); > + } > + > + inline MachineDomTreeNode *getRootNode() const { > + return DT->getRootNode(); > + } > + > + inline MachineDomTreeNode *operator[](MachineBasicBlock *BB) const { > + return DT->getNode(BB); > + } > + > + inline MachineDomTreeNode *getNode(MachineBasicBlock *BB) const { > + return DT->getNode(BB); > + } > + > + inline bool dominates(MachineDomTreeNode *A, MachineDomTreeNode *B) const { > + return DT->dominates(A, B); > + } > + > + inline bool dominates(MachineBasicBlock *A, MachineBasicBlock *B) const { > + return DT->dominates(A, B); > + } > + > + inline bool > + properlyDominates(const MachineDomTreeNode *A, MachineDomTreeNode *B) const { > + return DT->properlyDominates(A, B); > + } > + > + inline bool > + properlyDominates(MachineBasicBlock *A, MachineBasicBlock *B) const { > + return DT->properlyDominates(A, B); > + } > + > + inline MachineBasicBlock * > + 
findNearestCommonDominator(MachineBasicBlock *A, MachineBasicBlock *B) { > + return DT->findNearestCommonDominator(A, B); > + } > + > + virtual void print(llvm::raw_ostream &OS, const Module *M = 0) const { > + DT->print(OS); > + } > +}; > +} //end of namespace llvm > + > +char MachinePostDominatorTree::ID = 0; > +static RegisterPass<MachinePostDominatorTree> > +machinePostDominatorTreePass("machinepostdomtree", > + "MachinePostDominator Tree Construction", > + true, true); > + > +//const PassInfo *const llvm::MachinePostDominatorsID > +//= &machinePostDominatorTreePass; > + > +bool MachinePostDominatorTree::runOnMachineFunction(MachineFunction &F) { > + DT->recalculate(F); > + //DEBUG(DT->dump()); > + return false; > +} > + > +MachinePostDominatorTree::~MachinePostDominatorTree() { > + delete DT; > +} > + > +//===----------------------------------------------------------------------===// > +// > +// supporting data structure for CFGStructurizer > +// > +//===----------------------------------------------------------------------===// > + > +namespace llvmCFGStruct > +{ > +template<class PassT> > +struct CFGStructTraits { > +}; > + > +template <class InstrT> > +class BlockInformation { > +public: > + bool isRetired; > + int sccNum; > + //SmallVector<InstrT*, DEFAULT_VEC_SLOTS> succInstr; > + //Instructions defining the corresponding successor. > + BlockInformation() : isRetired(false), sccNum(INVALIDSCCNUM) {} > +}; > + > +template <class BlockT, class InstrT, class RegiT> > +class LandInformation { > +public: > + BlockT *landBlk; > + std::set<RegiT> breakInitRegs; //Registers that need to "reg = 0", before > + //WHILELOOP(thisloop) init before entering > + //thisloop. > + std::set<RegiT> contInitRegs; //Registers that need to "reg = 0", after > + //WHILELOOP(thisloop) init after entering > + //thisloop. > + std::set<RegiT> endbranchInitRegs; //Init before entering this loop, at loop > + //land block, branch cond on this reg. 
> + std::set<RegiT> breakOnRegs; //registers that need to "if (reg) break > + //endif" after ENDLOOP(thisloop) break > + //outerLoopOf(thisLoop). > + std::set<RegiT> contOnRegs; //registers that need to "if (reg) continue > + //endif" after ENDLOOP(thisloop) continue on > + //outerLoopOf(thisLoop). > + LandInformation() : landBlk(NULL) {} > +}; > + > +} //end of namespace llvmCFGStruct > + > +//===----------------------------------------------------------------------===// > +// > +// CFGStructurizer > +// > +//===----------------------------------------------------------------------===// > + > +namespace llvmCFGStruct > +{ > +// bixia TODO: port it to BasicBlock, not just MachineBasicBlock. > +template<class PassT> > +class CFGStructurizer > +{ > +public: > + typedef enum { > + Not_SinglePath = 0, > + SinglePath_InPath = 1, > + SinglePath_NotInPath = 2 > + } PathToKind; > + > +public: > + typedef typename PassT::InstructionType InstrT; > + typedef typename PassT::FunctionType FuncT; > + typedef typename PassT::DominatortreeType DomTreeT; > + typedef typename PassT::PostDominatortreeType PostDomTreeT; > + typedef typename PassT::DomTreeNodeType DomTreeNodeT; > + typedef typename PassT::LoopinfoType LoopInfoT; > + > + typedef GraphTraits<FuncT *> FuncGTraits; > + //typedef FuncGTraits::nodes_iterator BlockIterator; > + typedef typename FuncT::iterator BlockIterator; > + > + typedef typename FuncGTraits::NodeType BlockT; > + typedef GraphTraits<BlockT *> BlockGTraits; > + typedef GraphTraits<Inverse<BlockT *> > InvBlockGTraits; > + //typedef BlockGTraits::succ_iterator InstructionIterator; > + typedef typename BlockT::iterator InstrIterator; > + > + typedef CFGStructTraits<PassT> CFGTraits; > + typedef BlockInformation<InstrT> BlockInfo; > + typedef std::map<BlockT *, BlockInfo *> BlockInfoMap; > + > + typedef int RegiT; > + typedef typename PassT::LoopType LoopT; > + typedef LandInformation<BlockT, InstrT, RegiT> LoopLandInfo; > + typedef std::map<LoopT *, 
LoopLandInfo *> LoopLandInfoMap; > + //landing info for loop break > + typedef SmallVector<BlockT *, 32> BlockTSmallerVector; > + > +public: > + CFGStructurizer(); > + ~CFGStructurizer(); > + > + /// Perform the CFG structurization > + bool run(FuncT &Func, PassT &Pass, const AMDILRegisterInfo *tri); > + > + /// Perform the CFG preparation > + bool prepare(FuncT &Func, PassT &Pass, const AMDILRegisterInfo *tri); > + > +private: > + void orderBlocks(); > + void printOrderedBlocks(llvm::raw_ostream &OS); > + int patternMatch(BlockT *CurBlock); > + int patternMatchGroup(BlockT *CurBlock); > + > + int serialPatternMatch(BlockT *CurBlock); > + int ifPatternMatch(BlockT *CurBlock); > + int switchPatternMatch(BlockT *CurBlock); > + int loopendPatternMatch(BlockT *CurBlock); > + int loopPatternMatch(BlockT *CurBlock); > + > + int loopbreakPatternMatch(LoopT *LoopRep, BlockT *LoopHeader); > + int loopcontPatternMatch(LoopT *LoopRep, BlockT *LoopHeader); > + //int loopWithoutBreak(BlockT *); > + > + void handleLoopbreak (BlockT *ExitingBlock, LoopT *ExitingLoop, > + BlockT *ExitBlock, LoopT *exitLoop, BlockT *landBlock); > + void handleLoopcontBlock(BlockT *ContingBlock, LoopT *contingLoop, > + BlockT *ContBlock, LoopT *contLoop); > + bool isSameloopDetachedContbreak(BlockT *Src1Block, BlockT *Src2Block); > + int handleJumpintoIf(BlockT *HeadBlock, BlockT *TrueBlock, > + BlockT *F... > > [Message clipped]
Anton Korobeynikov
2012-Jul-16  18:50 UTC
[LLVMdev] [llvm-commits] RFC: LLVM incubation, or requirements for committing new backends
> So, what would the community think of implementing such a system?

+1 :)

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
Tom Stellard
2012-Jul-16  20:21 UTC
[LLVMdev] [llvm-commits] RFC: LLVM incubation, or requirements for committing new backends
On Mon, Jul 16, 2012 at 11:44:25AM -0700, Owen Anderson wrote:
> Tom,
>
> I think it might be productive to fork this thread to discuss making the requirements for upstreaming a new LLVM target more explicit and open. I'd also like to gauge interest in an idea I've discussed privately with a few community members, namely the concept of having a semi-official "incubation" system whereby proposed backends could get a trial run before becoming part of LLVM mainline.
>
> The proposed system would be something like this: a proposed contribution would receive a branch on llvm.org, and have six months (or some other predetermined length of time) to demonstrate that it and its developers are ready for mainline integration. At the end of their term, incubated projects would be evaluated on the following criteria, and either integrated to mainline, judged to be more appropriate as an external project, or given an extension and "needs improvement" feedback on specific criteria.
>
> * Active maintainership - Backends bit rot quickly, and unmaintained backends are large maintenance burden on everyone else. We need a core of developers who are going to actively maintain any candidate backend on mainline. That last point is critical: a code drop every six months is not an acceptable level of maintenance for a mainline target.
>
> * Contributions to core - Mainlining a new backend adds the expectation that mainline LLVM developers will invest the time and energy to keep your backend building and working (see test plan, below). However, that expectation of extra work doesn't come for nothing: we expect you to contribute back fixes and improvements that you find, and to work with other community members to coordinate projects as appropriate. When looking at a new backend, I should expect to see few-to-no diffs outside of lib/Target/YourBackend, and a few other places (Triple.cpp, for example). All other changes should already be upstreamed.
>
> * Test plan - If you're going to expect us to maintain and fix your code, then you need to have a good answer to how to test it. This includes, but is not limited to, a good set of regression tests that are comprehensible to normal developers (so we can fix them when they fail due to mainline change), and continuous testing in the form of buildbots or other infrastructure (so we can know when a patch breaks your backend).
>
> * Up to date with mainline - All mainline backends must work with top-of-tree LLVM, all of the time. A candidate for inclusion must be developed at, or close to, mainline. In practice, that probably means updating at least once a week, possibly more.
>
> * LLVM coding standards - While small deviations can be fixed after mainlining, gross violations of the LLVM code standards and conventions must be fixed prior to integration.
>
> ---
>
> So, what would the community think of implementing such a system?
>
> --Owen
>

Hi Owen,

I am in favor of this. I think having specific criteria and timelines will be beneficial for both maintainers and reviewers.

However, instead of having a separate branch, what do you think about adding the backend to the main tree, but not building it by default? This would make it easier for the backend maintainer to keep it up to date and also make it easier for users to test it. At the same time, the backend maintainer would still be responsible for updating it for changes to the LLVM core API, so other developers wouldn't need to worry about breaking the "backend-in-training".

-Tom
Villmow, Micah
2012-Jul-17  14:53 UTC
[LLVMdev] [llvm-commits] RFC: LLVM incubation, or requirements for committing new backends
Owen/Chandler/etc.,

While I have no issue with having a more complete and documented method of submitting backends, the problem is that the barrier to entry for some backends is being significantly raised, where it did not exist in the past. In the past, AMD has reported to LLVM issues that we found during internal development, along with patches in some cases. Some have been fixed, but others are unique to our backends and are still not in mainline. Many times the response to attempts to get them fixed has been to get the backend into the mainline development tree first. Now that we are finally able to push our backends out publicly, we are getting pushback that we need to contribute in other places first.

While I completely agree with four of the five points below, the requirement that we contribute to core for things that are outside the scope of the backend seems overly onerous. Tom's AMDGPU backend is not the only backend AMD would like to push into mainline: our production AMDIL backend is in the pipeline to be added, and AMD has announced that the HSA foundations compiler will also be open sourced (which is a third backend from AMD, plus more; see http://www.phoronix.com/scan.php?page=news_item&px=MTEyMzQ). I would just hate to see these get delayed because of barriers to entry that seem artificial or out of scope of the proposed/required changes.

So I guess the issue is, what else is required of the backends to make them acceptable for LLVM?
Micah

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Tom Stellard
> Sent: Monday, July 16, 2012 1:22 PM
> To: Owen Anderson
> Cc: llvm-commits at cs.uiuc.edu LLVM; LLVM at dcs-maillist.cs.uiuc.edu; Developers Mailing List
> Subject: Re: [LLVMdev] [llvm-commits] RFC: LLVM incubation, or requirements for committing new backends
>
> On Mon, Jul 16, 2012 at 11:44:25AM -0700, Owen Anderson wrote:
> > Tom,
> >
> > I think it might be productive to fork this thread to discuss making the requirements for upstreaming a new LLVM target more explicit and open. I'd also like to gauge interest in an idea I've discussed privately with a few community members, namely the concept of having a semi-official "incubation" system whereby proposed backends could get a trial run before becoming part of LLVM mainline.
> >
> > The proposed system would be something like this: a proposed contribution would receive a branch on llvm.org, and have six months (or some other predetermined length of time) to demonstrate that it and its developers are ready for mainline integration. At the end of their term, incubated projects would be evaluated on the following criteria, and either integrated to mainline, judged to be more appropriate as an external project, or given an extension and "needs improvement" feedback on specific criteria.
> >
> > * Active maintainership - Backends bit rot quickly, and unmaintained backends are large maintenance burden on everyone else. We need a core of developers who are going to actively maintain any candidate backend on mainline. That last point is critical: a code drop every six months is not an acceptable level of maintenance for a mainline target.
> >
> > * Contributions to core - Mainlining a new backend adds the expectation that mainline LLVM developers will invest the time and energy to keep your backend building and working (see test plan, below). However, that expectation of extra work doesn't come for nothing: we expect you to contribute back fixes and improvements that you find, and to work with other community members to coordinate projects as appropriate. When looking at a new backend, I should expect to see few-to-no diffs outside of lib/Target/YourBackend, and a few other places (Triple.cpp, for example). All other changes should already be upstreamed.
> >
> > * Test plan - If you're going to expect us to maintain and fix your code, then you need to have a good answer to how to test it. This includes, but is not limited to, a good set of regression tests that are comprehensible to normal developers (so we can fix them when they fail due to mainline change), and continuous testing in the form of buildbots or other infrastructure (so we can know when a patch breaks your backend).
> >
> > * Up to date with mainline - All mainline backends must work with top-of-tree LLVM, all of the time. A candidate for inclusion must be developed at, or close to, mainline. In practice, that probably means updating at least once a week, possibly more.
> >
> > * LLVM coding standards - While small deviations can be fixed after mainlining, gross violations of the LLVM code standards and conventions must be fixed prior to integration.
> >
> > ---
> >
> > So, what would the community think of implementing such a system?
> >
> > --Owen
> >
>
> Hi Owen,
>
> I am in favor of this. I think having specific criteria and time lines will be beneficial for both maintainers and reviewers.
>
> However, instead of having a separate branch, what do you think about adding the backend to the main tree, but not building it by default. This would make it easier for the backend maintainer to keep it up to date and also make it easier for users to test it. At the same time, the backend maintainer would still be responsible for updating it for changes to the LLVM core API, so other developers wouldn't need to worry about breaking the "backend-in-training".
>
> -Tom