search for: tbx_ret

Displaying 16 results from an estimated 16 matches for "tbx_ret".

2010 Sep 05
2
[LLVMdev] Possible missed optimization?
...d:14, pred:%reg0 88L %reg16391<def> = COPY %reg16390<kill> 96L %reg16391<def>, %CPSR<def,dead> = tEOR %reg16391, %reg16389<kill>, pred:14, pred:%reg0 108L %R0<def> = COPY %reg16391<kill> 116L %R1<def> = COPY %reg16388<kill> 128L tBX_RET %R0<imp-use,kill>, %R1<imp-use,kill> and after: 44L %R1<def>, %CPSR<def,dead> = tEOR %R1, %R3<kill>, pred:14, pred:%reg0 56L %reg16389<def> = COPY %R0<kill> 64L %reg16389<def>, %CPSR<def,dead> = tEOR %reg16389, %R2<kill>, pre...
2010 Sep 05
0
[LLVMdev] Possible missed optimization?
On Sat, Sep 4, 2010 at 1:31 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > > On Sep 4, 2010, at 11:21 AM, Borja Ferrer wrote: > >> I've noticed this pattern happening with other operators as well, but used xor in this example. As I said before, I tried different register allocation orders, but it always produces the same result. GCC is emitting longer
2010 Sep 04
3
[LLVMdev] Possible missed optimization?
On Sep 4, 2010, at 11:21 AM, Borja Ferrer wrote: > I've noticed this pattern happening with other operators as well, but used xor in this example. As I said before, I tried different register allocation orders, but it always produces the same result. GCC is emitting longer code, but since LLVM is so much nearer to the optimal code sequence I wanted to reach it. In LLVM, copies are
2009 Sep 24
0
[LLVMdev] Missing isBarrier on ARM/THUMB return instructions
isBarrier is not defined on the BX_RET and tBX_RET instructions, and the Machine Instructions Verifier (-verify-machineinstrs) gives errors about that. Is it normal that isBarrier is not defined on these instructions?
2020 Mar 31
2
[ARM] Register pressure with -mthumb forces register reload before each call
...q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp 352B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp 368B tBX_RET 14, $noreg # End machine code for function uECC_shared_secret. ********** SIMPLE REGISTER COALESCING ********** ********** Function: uECC_shared_secret ********** JOINING INTERVALS *********** entry: 16B %2:tgpr = COPY $r2 Considering merging %2 with $r2 Can only merge into reserved registers....
2013 Sep 26
2
[LLVMdev] Register scavenger and SP/FP adjustments
...used in the frame setup: # *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: # Machine code for function main: Post SSA Frame Objects: fi#0: size=1024, align=4, at location [SP] fi#1: size=1024, align=4, at location [SP] BB#0: derived from LLVM BB %entry tBX_RET pred:14, pred:%noreg # End machine code for function main. before replace frame indices # Machine code for function main: Post SSA Frame Objects: fi#0: size=1024, align=4, at location [SP-1032] fi#1: size=1024, align=4, at location [SP-2056] fi#2: size=4, align=4, at location [SP-4] f...
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
...*** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***: > # Machine code for function main: Post SSA > Frame Objects: > fi#0: size=1024, align=4, at location [SP] > fi#1: size=1024, align=4, at location [SP] > > BB#0: derived from LLVM BB %entry > tBX_RET pred:14, pred:%noreg > > # End machine code for function main. > > before replace frame indices > # Machine code for function main: Post SSA > Frame Objects: > fi#0: size=1024, align=4, at location [SP-1032] > fi#1: size=1024, align=4, at location [SP-2056] > fi#2: s...
2013 Sep 26
1
[LLVMdev] Register scavenger and SP/FP adjustments
.../Epilogue Insertion & Frame Finalization ***: >> # Machine code for function main: Post SSA >> Frame Objects: >> fi#0: size=1024, align=4, at location [SP] >> fi#1: size=1024, align=4, at location [SP] >> >> BB#0: derived from LLVM BB %entry >> tBX_RET pred:14, pred:%noreg >> >> # End machine code for function main. >> >> before replace frame indices >> # Machine code for function main: Post SSA >> Frame Objects: >> fi#0: size=1024, align=4, at location [SP-1032] >> fi#1: size=1024, align=4, at lo...
2020 Apr 07
2
[ARM] Register pressure with -mthumb forces register reload before each call
If I'm understanding what's going on in this test correctly, what's happening is: * ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize * In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6) * The function has three arguments, so those three plus the register we need to hold the
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
...COPY_TO_REGCLASS t16, TargetConstant:i32<1> <~~~~~~~~~~~~~ PROBLEM HERE t12: f32 = VCVTBHS t20, TargetConstant:i32<14>, Register:i32 %noreg t7: i32 = VMOVRS t12, TargetConstant:i32<14>, Register:i32 %noreg t9: ch,glue = CopyToReg t0, Register:i32 %r0, t7 t10: ch = tBX_RET TargetConstant:i32<14>, Register:i32 %noreg, Register:i32 %r0, t9, t9:1 It all goes wrong from here, because an f16 value is produced and it is not a legal type, etc. The reason this happens must be that VCVTBHS, which is used in the f16_to_fp rewrite rule, is specified to consume an f16...
2020 Apr 15
4
[ARM] Register pressure with -mthumb forces register reload before each call
...q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp 448B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp 464B tBX_RET 14, $noreg # End machine code for function f. ********** SIMPLE REGISTER COALESCING ********** ********** Function: f ********** JOINING INTERVALS *********** entry: 16B %2:tgpr = COPY $r2 Considering merging %2 with $r2 Can only merge into reserved registers. 32B %1:tgpr = COPY $r1 Considerin...
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good. I am custom lowering the bitcasts; that's now the only place where FP_TO_FP16 and FP16_TO_FP nodes are created, to avoid inefficient code generation. I will double check whether I can achieve the same without using these nodes (because I really would like to get rid of them completely). Cheers, Sjoerd.
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
...COPY_TO_REGCLASS t16, TargetConstant:i32<1> <~~~~~~~~~~~~~ PROBLEM HERE t12: f32 = VCVTBHS t20, TargetConstant:i32<14>, Register:i32 %noreg t7: i32 = VMOVRS t12, TargetConstant:i32<14>, Register:i32 %noreg t9: ch,glue = CopyToReg t0, Register:i32 %r0, t7 t10: ch = tBX_RET TargetConstant:i32<14>, Register:i32 %noreg, Register:i32 %r0, t9, t9:1 It all goes wrong from here, because an f16 value is produced and it is not a legal type, etc. The reason this happens must be that VCVTBHS, which is used in the f16_to_fp rewrite rule, is specified to consume an f16...
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
CallFrameSetupOpcode is a pseudo opcode like X86::ADJCALLSTACKDOWN64. That means the code is expected to be called before the pseudo instructions are eliminated. I don't know why that's not the case for you. A quick look at the PEI code indicates the pseudos should not have been removed by the time replaceFrameIndices is run. Evan On Sep 25, 2013, at 8:57 AM, Krzysztof
2013 Sep 25
2
[LLVMdev] Register scavenger and SP/FP adjustments
Hi All, I'm dealing with a problem where the spill/restore instructions inserted during scavenging span an adjustment of the SP/FP register. The result is that despite the base register (SP/FP) being changed between the spill and the restore, both store and load use the same immediate offset. I see code in the PEI (replaceFrameIndices) that is supposed to track the SP/FP adjustment:
2015 Jan 11
3
[LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets
...wering::emitEpilogue(MachineFunction &MF, - MachineBasicBlock &MBB) const { + MachineBasicBlock &MBB) const { MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr(); assert((MBBI->getOpcode() == ARM::tBX_RET || - MBBI->getOpcode() == ARM::tPOP_RET) && - "Can only insert epilog into returning blocks"); + MBBI->getOpcode() == ARM::tPOP_RET || + MBBI->getOpcode() == ARM::TCRETURNri) + && "Can only insert epilog into return...