thr3ads.net - search: "armisd"

Displaying 20 results from an estimated 27 matches for "armisd".

2010 Nov 12

[LLVMdev] Simple NEON optimization

On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote: > I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together. Here's what you need to do (all of this code is in ARMISelLowering.cpp): Hi Bob, I thought so... I'll get cracked and see if I can generate som...

2010 Nov 12

[LLVMdev] Simple NEON optimization

...uld I put this as a special case in NEON lowering or make it as > part of an optimization pass? Which classes should I look first? I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together. Here's what you need to do (all of this code is in ARMISelLowering.cpp): 0. (You don't actually need to do anything, but I'm just mentioning...

2010 Nov 12

[LLVMdev] Simple NEON optimization

Hi folks, me again. I want to implement a simple optimization in a NEON case I've seen these days, mostly as an exercise, but it also simplifies (just a bit) the generated code. The case is simple: uint32x2_t x, res; res = vceq_u32(x, vcreate_u32(0)); This will generate the following code: ; zero d16 vmov.i32 d16, #0x0 ; load a
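To make the case above concrete, here is a minimal scalar model of a lane-wise compare-equal, written as a sketch rather than as the backend's actual code. The type U32x2 and both function names are hypothetical stand-ins (they are not NEON intrinsics or LLVM APIs). The only point it shows is that comparing against a materialised zero vector gives the same result as comparing against zero directly, which is presumably why the zeroing vmov can be folded away.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for uint32x2_t: two 32-bit lanes.
using U32x2 = std::array<std::uint32_t, 2>;

// Lane-wise compare-equal: a lane becomes all ones when the inputs
// match and zero otherwise.
inline U32x2 vceq_model(const U32x2 &a, const U32x2 &b) {
    U32x2 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = (a[i] == b[i]) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Folded form: compare each lane against zero directly, so no zero
// vector has to be built first.
inline U32x2 vceq_zero_model(const U32x2 &a) {
    U32x2 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = (a[i] == 0u) ? 0xFFFFFFFFu : 0u;
    return r;
}
```

For any input x, vceq_model(x, {0, 0}) and vceq_zero_model(x) agree, so the two-step sequence and the folded one are interchangeable in this model.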

2009 Feb 17

[LLVMdev] ARM backend playing with alternative jump table implementations

....data .LJTI9_0_0: .long .LBB9_2 .long .LBB9_5 .long .LBB9_7 .long .LBB9_4 .long .LBB9_8 .text The code for the lowering lives mostly in SDValue ARMTargetLowering::LowerBR_JT with some more heavy lifting done by ARMISD::WrapperJT My attempts at this are marked in the code below. My problem is to come up with the right item/value to put into the constant pool. SDValue ARMTargetLowering::LowerBR_JT(SDValue Op, SelectionDAG &DAG) { SDValue Chain = Op.getOperand(0); SDValue Table = Op.getOperand(1); SDValu...

2010 Nov 12

[LLVMdev] Simple NEON optimization

...12, 2010, at 10:42 AM, Renato Golin wrote: > On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote: >> I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together. Here's what you need to do (all of this code is in ARMISelLowering.cpp): > > Hi Bob, > > I thought so... I'll get cracked and see i...

2019 Jan 04

Potential bug in SelectionDAGLegalize::ConvertNodeToLibcall()?

+ Eli Friedman as he often has very insightful comments regarding back end changes. On Fri, Jan 4, 2019 at 9:03 AM Nemanja Ivanovic <nemanja.i.ibm at gmail.com> wrote: > The changes seem fine to me. I don't think this is excessively intrusive > and it accomplishes what is needed by targets whose call lowering can > introduce illegal types. > Adding Justin Bogner as the

2013 Jul 01

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

...G combine. > > Let me know if there is another, better supported, approach for this kind of problem. > > ** Motivating Example ** > The motivating example comes from the lowering of vector code on armv7. > More specifically, the build_vector node is lowered to a target specific ARMISD::build_vector where all the parameters are bitcasted to floating point types. > > This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. > > Att...

2013 Jul 01

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

...if there is another, better supported, approach for this kind >> of problem. >> >> ** Motivating Example ** >> The motivating example comes from the lowering of vector code on armv7. >> More specifically, the build_vector node is lowered to a target specific >> ARMISD::build_vector where all the parameters are bitcasted to floating >> point types. >> >> This works well, unless the inserted bitcasts survive until instruction >> selection. In that case, they incur moves between integer unit and floating >> point unit that may result i...

2013 Jul 01

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

...g will be eliminated during DAG combine. Let me know if there is another, better supported, approach for this kind of problem. ** Motivating Example ** The motivating example comes from the lowering of vector code on armv7. More specifically, the build_vector node is lowered to a target specific ARMISD::build_vector where all the parameters are bitcasted to floating point types. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. Attached motivating_exa...

2013 Feb 02

[LLVMdev] Moving return value registers from MRI to return instructions

...e way. I'll be updating the in-tree targets. Other targets need to make three changes: 1. The XXXretflag SDNode needs to be variadic like the call SDNodes are: --- a/lib/Target/ARM/ARMInstrInfo.td +++ b/lib/Target/ARM/ARMInstrInfo.td @@ -117,7 +117,7 @@ def ARMcall_nolink : SDNode<"ARMISD::CALL_NOLINK", SDT_ARMcall, SDNPVariadic]>; def ARMretflag : SDNode<"ARMISD::RET_FLAG", SDTNone, - [SDNPHasChain, SDNPOptInGlue]>; + [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]...

2013 Jul 01

[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?

...e. > > Let me know if there is another, better supported, approach for this kind > of problem. > > ** Motivating Example ** > The motivating example comes from the lowering of vector code on armv7. > More specifically, the build_vector node is lowered to a target specific > ARMISD::build_vector where all the parameters are bitcasted to floating > point types. > > This works well, unless the inserted bitcasts survive until instruction > selection. In that case, they incur moves between integer unit and floating > point unit that may result in inefficient code....

2009 Jun 03

[LLVMdev] patch for llc/ARM: added mechanism to move switch tables from .text -> .data; also cleanup and documentation

...RMConstantPoolValue(".T", Num, + ARMCP::CPDataSegmentJumpTable); + const SDValue CPAddr = DAG.getTargetConstantPool(CPV, PTy, 4); + + // An ARM idiosyncrasy: wrap each constant pool entry before accessing it + const SDValue Wrapper = DAG.getNode(ARMISD::Wrapper, dl, MVT::i32, CPAddr); + + // Load Table start from constant pool + const SDValue Table = DAG.getLoad(PTy, dl, DAG.getEntryNode(), Wrapper, NULL, 0); + + // table entries are 4 bytes, so multiply index by 4 + const SDValue ScaledIndex = DAG.getNode(ISD::MUL, dl, PTy, Index, DAG.getCons...
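The index scaling in the patch excerpt above can be illustrated outside the SelectionDAG with a small sketch. It assumes only what the excerpt's comments state: table entries are 4 bytes wide, so the byte offset of an entry is the index times 4. The function name load_table_entry is hypothetical and does not come from the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fetch one 4-byte jump-table entry from a table that lives in data.
// table_base models the table start address loaded from the constant
// pool; index is the switch index.
inline std::uint32_t load_table_entry(const unsigned char *table_base,
                                      std::uint32_t index) {
    // Entries are 4 bytes each, so scale the index to a byte offset.
    const std::size_t scaled_index = static_cast<std::size_t>(index) * 4;
    std::uint32_t entry = 0;
    // memcpy avoids alignment and aliasing assumptions about the raw bytes.
    std::memcpy(&entry, table_base + scaled_index, sizeof entry);
    return entry;
}
```

In the real lowering the loaded entry would be a branch target; here it is just returned as a value so the address arithmetic can be checked in isolation.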

2010 Jan 15

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

Hi, On ARMv6T2 this turns cttz into rbit, clz instead of the 4 instruction sequence it is now. I'm not sure if adding RBIT to ARMISD and doing this optimization in the legalize pass is the best option, but the only better way I could think of doing it was to add a bitreverse intrinsic to llvm ir, which itself might not be the best option since bitreverse probably isn't too common. Other targets that I know of that could pot...
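The identity behind the rbit, clz sequence is that counting trailing zeros of a word equals counting leading zeros of its bit-reversed form. The sketch below models that in portable C++; the helper names are hypothetical, and the loops stand in for what the hardware instructions compute rather than reproducing the patch itself.

```cpp
#include <cstdint>

// Reverse the bit order of a 32-bit word (models what rbit computes).
inline std::uint32_t reverse_bits32(std::uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

// Count leading zero bits; yields 32 for a zero input (models clz).
inline unsigned count_leading_zeros32(std::uint32_t x) {
    unsigned n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
         mask >>= 1)
        ++n;
    return n;
}

// Trailing-zero count expressed as the two-step rbit, clz sequence.
inline unsigned cttz_via_rbit_clz(std::uint32_t x) {
    return count_leading_zeros32(reverse_bits32(x));
}
```

For example, 8 has its only set bit at position 3; after reversal that bit sits at position 28, so three leading zeros precede it and the result is 3.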

2009 Jun 11

[LLVMdev] patch for llc/ARM: added mechanism to move switch tables from .text -> .data; also cleanup and documentation

On Jun 8, 2009, at 2:42 PM, robert muth wrote: > On Sun, Jun 7, 2009 at 11:53 PM, Evan Cheng <evan.cheng at apple.com> > wrote: >> >> On Jun 7, 2009, at 6:59 AM, robert muth wrote: >> >>> On Sat, Jun 6, 2009 at 4:51 PM, Evan Cheng<evan.cheng at apple.com> >>> wrote: >>>> +cl::opt<std::string>

2009 Jun 08

[LLVMdev] patch for llc/ARM: added mechanism to move switch tables from .text -> .data; also cleanup and documentation

On Sun, Jun 7, 2009 at 11:53 PM, Evan Cheng <evan.cheng at apple.com> wrote: > > On Jun 7, 2009, at 6:59 AM, robert muth wrote: > >> On Sat, Jun 6, 2009 at 4:51 PM, Evan Cheng<evan.cheng at apple.com> >> wrote: >>> +cl::opt<std::string> FlagJumpTableSection("jumptable-section", >>> +

2010 Nov 12

[LLVMdev] Simple NEON optimization

...10:42 AM, Renato Golin wrote: > >> On 12 November 2010 17:52, Bob Wilson <bob.wilson at apple.com> wrote: >>> I recommend implementing this as a target-specific DAG combine optimization. We already have target-specific DAG nodes for the relevant NEON comparison operations (ARMISD::VCEQ, etc. -- see ARMISelLowering.h) as well as the vmov (ARMISD::VMOVIMM). You just need to teach the DAG combiner how to fold them together. Here's what you need to do (all of this code is in ARMISelLowering.cpp): >> >> Hi Bob, >> >> I thought so... I'll get c...

2010 Jan 15

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

...Chris Lattner <clattner at apple.com> wrote: > > On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > >> Hi, >> >> On ARMv6T2 this turns cttz into rbit, clz instead of the 4 >> instruction sequence it is now. >> >> I'm not sure if adding RBIT to ARMISD and doing this optimization in >> the legalize pass is the best option, but the only better way I >> could think of doing it was to add a bitreverse intrinsic to llvm >> ir, which itself might not be the best option since bitreverse >> probably isn't too common. > >...

2012 Feb 17

[LLVMdev] ARM/Thumb2/ISEL Need help tracing down a failing match: (HOW?)

...e78210, 0x1e78310<LD4[ConstantPool]> [ID=10] Initial Opcode index to 24435 ...... Morphed node: 0x1e7adf0: i32,ch = LDRi12 0x1e78210, 0x1e78010, 0x1e7aef0, 0x1e7b0f0, 0x1e4c030<Mem:LD4[ConstantPool]> ISEL: Match complete! ISEL: Starting pattern match on root node: 0x1e78210: i32 = ARMISD::Wrapper 0x1e77f10 [ID=9] Initial Opcode index to 49796 OpcodeSwitch from 49799 to 49891 Skipped scope entry (due to false predicate) at index 49896, continuing at 49914 Morphed node: 0x1e78210: i32 = MOVi32imm 0x1e77f10 ISEL: Match complete! Here is the failing case in Thumb2 mode ISE...

2010 Jan 15

[LLVMdev] [PATCH] Emit rbit, clz on ARM for __builtin_ctz

On Jan 14, 2010, at 10:13 PM, David Conrad wrote: > Hi, > > On ARMv6T2 this turns cttz into rbit, clz instead of the 4 > instruction sequence it is now. > > I'm not sure if adding RBIT to ARMISD and doing this optimization in > the legalize pass is the best option, but the only better way I > could think of doing it was to add a bitreverse intrinsic to llvm > ir, which itself might not be the best option since bitreverse > probably isn't too common. I haven't l...

2009 Jun 24

[LLVMdev] patch for llc/ARM: added mechanism to move switch tables from .text -> .data; also cleanup and documentation
