similar to: [ARM] Register pressure with -mthumb forces register reload before each call

Displaying 20 results from an estimated 200 matches similar to: "[ARM] Register pressure with -mthumb forces register reload before each call"

2020 Apr 15
4
[ARM] Register pressure with -mthumb forces register reload before each call
Hi, I have attached WIP patch for adding foldMemoryOperand to Thumb1InstrInfo. For the following case: void f(int x, int y, int z) { void bar(int, int, int); bar(x, y, z); bar(x, z, y); bar(y, x, z); bar(y, y, x); } it calls foldMemoryOperand twice, and thus converts two calls from blx to bl. callMI->dump() shows the function name "bar" correctly, however in generated
2020 Apr 07
2
[ARM] Register pressure with -mthumb forces register reload before each call
If I'm understanding what's going on in this test correctly, what's happening is: * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6) * The function has three arguments, so those three plus the register we need to hold the
2020 Apr 15
2
[ARM] Register pressure with -mthumb forces register reload before each call
On Wed, 15 Apr 2020 at 03:36, John Brawn <John.Brawn at arm.com> wrote: > > > Could you please point out what am I doing wrong in the patch ? > > It's because you're getting the function name by doing > callee->getName().str().c_str() > The str() call generates a temporary copy of the name which ceases to exist outside of this expression > causing the
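The lifetime problem described in this reply can be sketched in plain C++ without any LLVM headers. This is a minimal illustration and not the patch itself: `Callee`, `getNameCopy`, and `nameMatches` are hypothetical names, with `getNameCopy` standing in for any call that returns a `std::string` by value (as `str()` does in the quoted expression).

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for an object whose name can be copied out as a
// std::string by value.
struct Callee {
    std::string name;
    std::string getNameCopy() const { return name; }  // returns a temporary
};

// BROKEN pattern (left as a comment on purpose):
//   const char *p = callee.getNameCopy().c_str();
//   // The temporary std::string is destroyed at the end of that statement,
//   // so reading through p afterwards is undefined behaviour.

// One possible fix: bind the copy to a named std::string that outlives
// every use of the raw pointer.
bool nameMatches(const Callee &callee, const char *expected) {
    std::string owned = callee.getNameCopy();  // lives until end of function
    const char *p = owned.c_str();             // valid while 'owned' is alive
    return std::strcmp(p, expected) == 0;
}
```

If the pointer has to outlive the current function, the string needs an owner with a longer lifetime than a local variable; which owner is appropriate depends on the surrounding code and is not shown here.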
2012 Oct 25
0
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
When examining the debug output of regalloc, it seems that joining a 32-bit reg also joins its 128-bit parent reg. If I look at the instruction %vreg34<def> = COPY %vreg6:sel_y; R600_Reg32:%vreg34 R600_Reg128:%vreg6, it gets joined to 928B%vreg34<def> = COPY %vreg48:sel_y; when vreg6 and vreg48 are joined. That's right. But joining the following copy
2012 Oct 25
2
[LLVMdev] RegisterCoalescing Pass seems to ignore part of CFG.
> > PHIElim and TwoAddress passes leave SSA form. > Maybe I missed something in your code but %vreg48 seems to be there > after PHI elimination. PHIElim tags those kinds of registers as being > PHIJoin regs, updating LiveVariables pass, so the regcoalescer is aware > of them (some SSA info is still alive but the reg coalescer will > invalidate that information after
2011 Apr 29
1
[LLVMdev] [Patch] Thumb BLXr doesn't set the register operand
The tBLXr description in ARMInstrThumb.td is not complete. It doesn't set the register operand. -- // koan-sin tan -------------- next part -------------- A non-text attachment was scrubbed... Name: tBLXr.diff Type:
2012 Oct 20
2
[LLVMdev] RegisterCoalescing pass crashes with ImplicitDef registers
Hi, below is an output of "llc -march=r600 -mcpu=cayman -print-before-all -debug-only=regalloc file.shader" command from llvm3.2svn. The register coalescing pass crashes when joining vreg12:sel_z with vreg13 registers, because it tries to access the interval liveness of vreg13... which is undefined. I don't know if it's a bug of the pass, or if my backend should do something
2010 Sep 05
2
[LLVMdev] Possible missed optimization?
On Sep 4, 2010, at 5:40 PM, Eli Friedman wrote: > If you want to take a look at this yourself, the issue is easy to > reproduce with Thumb1: Thanks, Eli. Nice catch! This IR: target triple = "thumbv5-u-u" define arm_aapcscc i64 @foo(i64 %a, i64 %b) nounwind readnone { entry: %xor = xor i64 %a, 18 ; <i64> [#uses=1] %xor2 = xor i64 %xor, %b
2019 Apr 14
2
[A bug?] Failed to use BuildMI to add R7 - R12 registers for tADDi8 and tPUSH of ARM
Hi Craig, Thanks for the information. Can you point to the source that specifies tGPR to be R0 - R7? I tried to search in ARMInstrThumb.td but couldn’t find it. Thanks, - Jie On Apr 14, 2019, at 15:28, Craig Topper <craig.topper at gmail.com> wrote: I believe there is probably a separate instruction in LLVM for thumb2 add. Probably starting with t2
2020 Jun 16
2
[ARM] Thumb code-gen for 8-bit imm arguments results in extra reg copies
Hi, For the following test-case: void foo(unsigned, unsigned); void f() { foo(10, 20); foo(10, 20); } clang --target=arm-linux-gnueabi -mthumb -O2 generates: push {r4, r5, r7, lr} movs r4, #10 movs r5, #20 movs r0, r4 movs r1, r5 bl foo movs r0, r4 movs r1, r5 bl foo pop {r4,
2012 May 14
0
[LLVMdev] getMinimalPhysRegClass
Reed, On May 14, 2012, at 3:45 PM, reed kotler <rkotler at mips.com> wrote: > On 05/14/2012 02:42 PM, Jakob Stoklund Olesen wrote: >> On May 14, 2012, at 2:28 PM, reed kotler wrote: >> >>> I'm not using getMinimalPhysRegClass. Some target independent code is using it. >> Probably PEI. >> >>> It makes trouble for us and I would like to
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
The code has changed a lot over the years. Looks like at some point of time the assumption was broken. calculateCallsInformation() may have eliminated the pseudo set up instructions already. // If call frames are not being included as part of the stack frame, and
2013 Sep 26
1
[LLVMdev] Register scavenger and SP/FP adjustments
Thanks, I'll look into that. Still, the case where the function does not call anything remains---in such a situation there are no ADJCALLSTACK pseudos, so regardless of what that function you pointed at does, there won't be any target-independent information about the SP adjustment by the time the frame index elimination runs. Would it make sense to have ADJCALLSTACK pseudos every
2013 Sep 26
2
[LLVMdev] Register scavenger and SP/FP adjustments
Consider this example: --- ex.ll --- declare void @bar() ; Function Attrs: nounwind optsize define void @main() { entry: %hin = alloca [256 x i32], align 4 %xin = alloca [256 x i32], align 4 call void @bar() ret void } ------------- Freshly built llc: llc -O2 -march=x86 < ex.ll -print-before-all # *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: #
2010 Sep 14
2
[LLVMdev] Thumb categorizing TST wrongly
I see strangeness on Thumb TST (tTST) predicate 'isCompare' It is true for regular ARM, false for Thumb: (gdb) p MI->dump() TSTri %reg16397, 3, pred:14, pred:%reg0, %CPSR<imp-def>; GPR:%reg16397 $24 = void (gdb) p MI->getDesc().isCompare() $25 = true (gdb) p MI->dump() tTST %reg16396, %reg16397, pred:14, pred:%reg0, %CPSR<imp-def>; tGPR:%reg16396,16397
2012 May 14
3
[LLVMdev] getMinimalPhysRegClass
On 05/14/2012 02:42 PM, Jakob Stoklund Olesen wrote: > On May 14, 2012, at 2:28 PM, reed kotler wrote: > >> I'm not using getMinimalPhysRegClass. Some target independent code is using it. > Probably PEI. > >> It makes trouble for us and I would like to submit a patch to make it a virtual function so that I can override it and make it meaningful for Mips, as long as this
2010 Sep 14
0
[LLVMdev] Thumb categorizing TST wrongly
On Sep 14, 2010, at 12:09 PM, Gabor Greif wrote: > I see strangeness on Thumb TST (tTST) predicate 'isCompare' > > It is true for regular ARM, false for Thumb: > > (gdb) p MI->dump() > TSTri %reg16397, 3, pred:14, pred:%reg0, %CPSR<imp-def>; GPR:% > reg16397 > $24 = void > (gdb) p MI->getDesc().isCompare() > $25 = true > > > (gdb)
2011 Aug 16
2
[LLVMdev] Tying an instruction to a specific set of registers
Jim, Thanks for the hints. Does LLVM allow allocation of the same register across register classes? For example, in the ARM backend, can an instruction write to R0 when it is part of register class tGPR, but then use R0 in the next instruction as a source register from the rGPR class? If LLVM can do this, then this will work. Micah > -----Original Message----- > From: Jim Grosbach
2012 May 14
0
[LLVMdev] getMinimalPhysRegClass
On May 14, 2012, at 2:28 PM, reed kotler wrote: > I'm not using getMinimalPhysRegClass. Some target independent code is using it. Probably PEI. > It makes trouble for us and I would like to submit a patch to make it a virtual function so that I can override it and make it meaningful for Mips, as long as this method still exists. > > I want to add another register class for
2012 Jan 25
0
[LLVMdev] mips16
On Jan 24, 2012, at 1:46 AM, Reed Kotler wrote: > Mips16 is a mode of the Mips32 (or Mips64) processor. For the most part, > it is a compressed form of the MIPS32 instruction set, though not all > instructions are supported. Most of the same opcodes and formats are > present though sometimes with some restriction. (The micro mips > architecture is a true 16 bit compressed form