Displaying 16 results from an estimated 16 matches for "tbx_ret".
2010 Sep 05
2
[LLVMdev] Possible missed optimization?
...d:14, pred:%reg0
88L %reg16391<def> = COPY %reg16390<kill>
96L %reg16391<def>, %CPSR<def,dead> = tEOR %reg16391, %reg16389<kill>, pred:14, pred:%reg0
108L %R0<def> = COPY %reg16391<kill>
116L %R1<def> = COPY %reg16388<kill>
128L tBX_RET %R0<imp-use,kill>, %R1<imp-use,kill>
and after:
44L %R1<def>, %CPSR<def,dead> = tEOR %R1, %R3<kill>, pred:14, pred:%reg0
56L %reg16389<def> = COPY %R0<kill>
64L %reg16389<def>, %CPSR<def,dead> = tEOR %reg16389, %R2<kill>, pre...
2010 Sep 05
0
[LLVMdev] Possible missed optimization?
On Sat, Sep 4, 2010 at 1:31 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Sep 4, 2010, at 11:21 AM, Borja Ferrer wrote:
>
>> I've noticed this pattern with other operators as well, but used xor in this example. As I said before, I tried different register allocation orders, but it always produces the same result. GCC is emitting longer
2010 Sep 04
3
[LLVMdev] Possible missed optimization?
On Sep 4, 2010, at 11:21 AM, Borja Ferrer wrote:
> I've noticed this pattern with other operators as well, but used xor in this example. As I said before, I tried different register allocation orders, but it always produces the same result. GCC is emitting longer code, but since LLVM is so close to the optimal code sequence, I wanted to reach it.
In LLVM, copies are
2009 Sep 24
0
[LLVMdev] Missing isBarrier on ARM/THUMB return instructions
isBarrier is not defined on the BX_RET and tBX_RET instructions, and the
Machine Instructions Verifier (-verify-machineinstrs) gives errors about
that.
Is it normal that isBarrier is not defined on these instructions?
2020 Mar 31
2
[ARM] Register pressure with -mthumb forces register reload before each call
...q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
352B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
368B tBX_RET 14, $noreg
# End machine code for function uECC_shared_secret.
********** SIMPLE REGISTER COALESCING **********
********** Function: uECC_shared_secret
********** JOINING INTERVALS ***********
entry:
16B %2:tgpr = COPY $r2
Considering merging %2 with $r2
Can only merge into reserved registers....
2013 Sep 26
2
[LLVMdev] Register scavenger and SP/FP adjustments
...used in the
frame setup:
# *** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***:
# Machine code for function main: Post SSA
Frame Objects:
fi#0: size=1024, align=4, at location [SP]
fi#1: size=1024, align=4, at location [SP]
BB#0: derived from LLVM BB %entry
tBX_RET pred:14, pred:%noreg
# End machine code for function main.
before replace frame indices
# Machine code for function main: Post SSA
Frame Objects:
fi#0: size=1024, align=4, at location [SP-1032]
fi#1: size=1024, align=4, at location [SP-2056]
fi#2: size=4, align=4, at location [SP-4]
f...
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
...*** IR Dump Before Prologue/Epilogue Insertion & Frame Finalization ***:
> # Machine code for function main: Post SSA
> Frame Objects:
> fi#0: size=1024, align=4, at location [SP]
> fi#1: size=1024, align=4, at location [SP]
>
> BB#0: derived from LLVM BB %entry
> tBX_RET pred:14, pred:%noreg
>
> # End machine code for function main.
>
> before replace frame indices
> # Machine code for function main: Post SSA
> Frame Objects:
> fi#0: size=1024, align=4, at location [SP-1032]
> fi#1: size=1024, align=4, at location [SP-2056]
> fi#2: s...
2013 Sep 26
1
[LLVMdev] Register scavenger and SP/FP adjustments
.../Epilogue Insertion & Frame Finalization ***:
>> # Machine code for function main: Post SSA
>> Frame Objects:
>> fi#0: size=1024, align=4, at location [SP]
>> fi#1: size=1024, align=4, at location [SP]
>>
>> BB#0: derived from LLVM BB %entry
>> tBX_RET pred:14, pred:%noreg
>>
>> # End machine code for function main.
>>
>> before replace frame indices
>> # Machine code for function main: Post SSA
>> Frame Objects:
>> fi#0: size=1024, align=4, at location [SP-1032]
>> fi#1: size=1024, align=4, at lo...
2020 Apr 07
2
[ARM] Register pressure with -mthumb forces register reload before each call
If I'm understanding what's going on in this test correctly, what's happening is:
* ARMTargetLowering::LowerCall prefers indirect calls when a function is called at least 3 times in minsize
* In thumb 1 (without -fno-omit-frame-pointer) we have effectively only 3 callee-saved registers (r4-r6)
* The function has three arguments, so those three plus the register we need to hold the
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
...COPY_TO_REGCLASS t16, TargetConstant:i32<1> <~~~~~~~~~~~~~ PROBLEM HERE
t12: f32 = VCVTBHS t20, TargetConstant:i32<14>, Register:i32 %noreg
t7: i32 = VMOVRS t12, TargetConstant:i32<14>, Register:i32 %noreg
t9: ch,glue = CopyToReg t0, Register:i32 %r0, t7
t10: ch = tBX_RET TargetConstant:i32<14>, Register:i32 %noreg, Register:i32 %r0, t9, t9:1
It goes all wrong from here, because an f16 value is produced and it is not a
legal type etc.
This must happen because VCVTBHS, which is used in
the f16_to_fp rewrite rule, is specified to consume an f16...
2020 Apr 15
4
[ARM] Register pressure with -mthumb forces register reload before each call
...q6 $q7 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $s16 $s17 $s18 $s19 $s20 $s21 $s22 $s23 $s24 $s25 $s26 $s27 and 35 more...>, implicit-def dead $lr, implicit $sp, implicit $r0, implicit $r1, implicit $r2, implicit-def $sp
448B ADJCALLSTACKUP 0, 0, 14, $noreg, implicit-def dead $sp, implicit $sp
464B tBX_RET 14, $noreg
# End machine code for function f.
********** SIMPLE REGISTER COALESCING **********
********** Function: f
********** JOINING INTERVALS ***********
entry:
16B %2:tgpr = COPY $r2
Considering merging %2 with $r2
Can only merge into reserved registers.
32B %1:tgpr = COPY $r1
Considerin...
2017 Dec 06
2
[RFC] Half-Precision Support in the Arm Backends
Thanks a lot for the suggestions! I will look into using vld1/vst1, sounds good.
I am custom lowering the bitcasts, that's now the only place where FP_TO_FP16
and FP16_TO_FP nodes are created to avoid inefficient code generation. I will
double check if I can't achieve the same without using these nodes (because I
really would like to get completely rid of them).
Cheers,
Sjoerd.
2013 Sep 26
0
[LLVMdev] Register scavenger and SP/FP adjustments
CallFrameSetupOpcode is a pseudo opcode like X86::ADJCALLSTACKDOWN64. That means the code is expected to be called before the pseudo instructions are eliminated. I don't know why that's not the case for you. A quick look at the PEI code indicates the pseudos should not have been removed by the time replaceFrameIndices is run.
Evan
On Sep 25, 2013, at 8:57 AM, Krzysztof
2013 Sep 25
2
[LLVMdev] Register scavenger and SP/FP adjustments
Hi All,
I'm dealing with a problem where the spill/restore instructions inserted
during scavenging span an adjustment of the SP/FP register. The result
is that despite the base register (SP/FP) being changed between the
spill and the restore, both store and load use the same immediate offset.
I see code in the PEI (replaceFrameIndices) that is supposed to track
the SP/FP adjustment:
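As a toy illustration of the mismatch described in this message (this is not the PEI code the message refers to), the sketch below walks a short instruction list, tracks how far SP has moved, and resolves each slot access against the SP value live at that point. All names here (`Inst`, `Kind`, `resolveOffsets`) are hypothetical and do not come from LLVM; the model assumes a downward-growing stack and SP-relative addressing only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model; none of these names come from LLVM.
enum class Kind { AdjustSP, Spill, Reload };

struct Inst {
    Kind kind;
    // AdjustSP: bytes subtracted from SP (negative means added back).
    // Spill/Reload: slot offset from the SP value after the prologue.
    std::int64_t value;
    // Resolved SP-relative immediate; only meaningful for Spill/Reload.
    std::int64_t imm;
};

// Walks the instructions in order, tracking how far SP has moved, and
// rewrites each slot access relative to the SP value live at that point.
void resolveOffsets(std::vector<Inst>& insts) {
    std::int64_t spAdj = 0;
    for (Inst& inst : insts) {
        if (inst.kind == Kind::AdjustSP) {
            spAdj += inst.value;
        } else {
            inst.imm = inst.value + spAdj;
        }
    }
    // Assumption of this sketch: every adjustment is undone by the end.
    assert(spAdj == 0 && "SP adjustments should balance");
}
```

For a spill to the slot at offset 8, followed by an SP decrement of 16 and then the reload, this resolves the spill to immediate 8 and the reload to immediate 24. Using 8 for both, as in the situation described above, would make the reload address a different location than the one the spill wrote.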
2015 Jan 11
3
[LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets
...wering::emitEpilogue(MachineFunction &MF,
- MachineBasicBlock &MBB) const {
+ MachineBasicBlock &MBB) const {
MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
assert((MBBI->getOpcode() == ARM::tBX_RET ||
- MBBI->getOpcode() == ARM::tPOP_RET) &&
- "Can only insert epilog into returning blocks");
+ MBBI->getOpcode() == ARM::tPOP_RET ||
+ MBBI->getOpcode() == ARM::TCRETURNri)
+ && "Can only insert epilog into return...