search for: r10

Displaying 20 results from an estimated 1410 matches for "r10".

Did you mean: 10
2017 Jan 09
4
Tweaking the Register Allocator's spill placement
...e'll call them FXLV). In one important kernel (snippet below), register allocation needs to spill values resulting from FXLV. The spiller is unaware of FXLV's latency, and thus naively inserts those spills immediately after the FXLV, incurring huge and unnecessary data stalls. FXLV r10, 0(r3.0) SV r10, 0(r63.1) # spill to stack slot FXLV r10, 16(r3.0) SV r10, 16(r63.1) # spill to stack slot FXLV r10, 32(r3.0) SV r10, 32(r63.1) # spill to stack slot FXLV r10, 48(r3.0) SV r10, 48(r63.1) # spill to stack slot ... Note also how the register allocator u...
2008 Oct 31
3
[LLVMdev] nested function's static link gets clobbered
Fellow developers, I'm parallelizing loops to be called by pthread. The thread body that I pass to pthread_create looks like define i8* @loop1({ i32*, i32* }* nest %parent_frame, i8* %arg) parent_frame is pointer to shared variables in original function 0x00007f0de11c41f0: mov (%r10),%rax 0x00007f0de11c41f3: cmpl $0x63,(%rax) 0x00007f0de11c41f6: jg 0x7f0de11c420c 0x00007f0de11c41fc: mov 0x8(%r10),%rax 0x00007f0de11c4200: incl (%rax) 0x00007f0de11c4202: mov (%r10),%rax 0x00007f0de11c4205: incl (%rax) 0x00007f0de11c4207: jmpq 0x7f0de...
2007 Dec 02
2
Optimised qmf_synth and iir_mem16
...mla r6, r0, r14,r7 @ mem[1] = mem[2] - den[1]*y[i] ldrsh r0, [r1, #6] mla r7, r4, r14,r8 @ mem[2] = mem[3] - den[2]*y[i] ldrsh r4, [r1, #8] mla r8, r0, r14,r9 @ mem[3] = mem[4] - den[3]*y[i] ldrsh r0, [r1, #10] mla r9, r4, r14,r10 @ mem[4] = mem[5] - den[4]*y[i] ldrsh r4, [r1, #12] mla r10, r0, r14,r11 @ mem[5] = mem[6] - den[5]*y[i] ldrsh r0, [r1, #14] mla r11, r4, r14,r12 @ mem[6] = mem[7] - den[6]*y[i] subs r3, r3, #1 mul r12, r0, r14 @ mem[7] = -...
2008 Nov 01
0
[LLVMdev] nested function's static link gets clobbered
...;m parallelizing loops to be called by pthread. The thread body that I pass > to pthread_create looks like > > define i8* @loop1({ i32*, i32* }* nest %parent_frame, i8* %arg) > parent_frame is pointer to shared variables in original function > > 0x00007f0de11c41f0: mov (%r10),%rax > 0x00007f0de11c41f3: cmpl $0x63,(%rax) > 0x00007f0de11c41f6: jg 0x7f0de11c420c > 0x00007f0de11c41fc: mov 0x8(%r10),%rax > 0x00007f0de11c4200: incl (%rax) > 0x00007f0de11c4202: mov (%r10),%rax > 0x00007f0de11c4205: incl (%rax) > 0x0...
2006 Jun 26
0
[klibc 25/43] ia64 support for klibc
...ype \ +name (void) \ +{ \ + register long _r8 asm ("r8"); \ + register long _r10 asm ("r10"); \ + register long _r15 asm ("r15") = __NR_##name; \ + long _retval; \ + __asm __volatile (__IA64_BREAK \ +...
2009 Sep 24
6
[patch 1/2] grub-0.97: btrfs support for a singe device configuration
2010 Sep 21
1
[LLVMdev] Possible missed optimization on function calling?
...a); extern int mdiv(int a, int b); int foo(int a, int b) { int a4 = mdiv(mcos(a), msin(b)); return a4; } I noticed this while testing it for the backend i'm currently developing, but it produces exactly the same code for other targets: march = msp430: push.w r11 push.w r10 push.w r9 push.w r8 mov.w r14, r11 mov.w r15, r10 ; store a mov.w r13, r15 mov.w r12, r14 ; pass b call #msin mov.w r15, r9 mov.w r14, r8 ; store msin(b) mov.w r10, r15 mov.w r11, r14 ; pass a call #mcos m...
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly the latest optimization broke completely everything. The asm code isn't gas compliant. the libFLAC linker script has a typo, disabling the asm optimization and/or altivec won't let a correct build anyway. Instant fixes for the asm stuff: sed -i -e"s:;:\#:" on the lpc_asm.s to load address instead of addis+ori you could use lis and la and PLEASE use the @l(register)
2004 Sep 10
1
altivec lpc_restore_signal
...0xfffffc00) mtspr 256,r31 ; declare VRs in vrsave cmplw cr0,r8,r4 ; i<data_len bc 4,0,L1400 ; load coefficients into v0-v7 and initial history into v8-v15 li r31,0xf and r31,r8,r31 ; r31: data%4 li r11,16 subf r31,r31,r11 ; r31: 4-(data%4) slwi r31,r31,3 ; convert to bits for vsro li r10,-4 stw r31,-4(r9) lvewx v0,r10,r9 vspltisb v18,-1 vsro v18,v18,v0 ; v18: mask vector li r31,0x8 lvsl v0,0,r31 vsldoi v0,v0,v0,12 li r31,0xc lvsl v1,0,r31 vspltisb v2,0 vspltisb v3,-1 vmrglw v2,v2,v3 vsel v0,v1,v0,v2 ; v0: reversal permutation vector add r10,r5,r6 lvsl v17,0,r5 ; v1...
2013 Jul 23
2
[LLVMdev] Question on optimizeThumb2JumpTables
...continue; I am trying to figure out why the restriction of LeaMI->getOperand(0).getReg() != BaseReg is there. It seems this is overly restrictive. For example, here is a case where it succeeds: 8944B BB#53: derived from LLVM BB %172 Live Ins: %R4 %R6 %D8 %Q5 %R9 %R7 %R8 %R10 %R5 %R11 Predecessors according to CFG: BB#52 8976B %R1<def> = t2LEApcrelJT <jt#2>, 2, pred:14, pred:%noreg 8992B %R1<def> = t2ADDrs %R1<kill>, %R10, 18, pred:14, pred:%noreg, opt:%noreg 9004B %LR<def> = t2MOVi 1, pred:14,...
2012 Feb 13
0
[PATCH 05/14] arm: implement exception and hypercall entries.
...); + DEFINE(OFFSET_VCPU_R6, offsetof(struct vcpu_guest_context, r6)); + DEFINE(OFFSET_VCPU_R7, offsetof(struct vcpu_guest_context, r7)); + DEFINE(OFFSET_VCPU_R8, offsetof(struct vcpu_guest_context, r8)); + DEFINE(OFFSET_VCPU_R9, offsetof(struct vcpu_guest_context, r9)); + DEFINE(OFFSET_VCPU_R10, offsetof(struct vcpu_guest_context, r10)); + DEFINE(OFFSET_VCPU_R11, offsetof(struct vcpu_guest_context, r11)); + DEFINE(OFFSET_VCPU_R12, offsetof(struct vcpu_guest_context, r12)); + DEFINE(OFFSET_VCPU_R13, offsetof(struct vcpu_guest_context, r13)); + DEFINE(OFFSET_VCPU_R14, offsetof(str...
2014 Feb 08
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
On Fri, 7 Feb 2014, Timothy B. Terriberry wrote: > Martin Storsjo wrote: >> This is required in order to build using the built-in assembler >> in clang. > > These patches break the gcc build (with "Error: bad instruction"). Ah, right, sorry about that. > Documentation I've seen is contradictory on which order ({cond}{size} or > {size}{cond}) is correct.
2013 Oct 22
1
[LLVMdev] System call miscompilation using the fast register allocator
...ain(i32 %argc, i8** nocapture %argv) unnamed_addr nounwind uwtable { entry: %val = alloca i32, align 4 store i32 1, i32* %val, align 4 %0 = ptrtoint i32* %val to i64 call void asm sideeffect "", "{r8}"(i64 4) nounwind call void asm sideeffect "", "{r10}"(i64 %0) nounwind call void asm sideeffect "", "{rdx}"(i64 3) nounwind call void asm sideeffect "", "{rsi}"(i64 1) nounwind call void asm sideeffect "", "{rdi}"(i64 -1) nounwind %1 = call i64 asm sideeffect "",...
2012 Nov 01
2
[LLVMdev] Undef registers in dependency graph
Hi, I see that currently physical register uses marked as "undef" can still cause dependencies. Is this intentional? SU(9): %D5<def,undef> = LDrid %R0, 0, %R10<imp-def>, %R11<imp-def> # preds left : 0 # succs left : 11 # rdefs left : 0 Latency : 1 Depth : 0 Height : 0 Successors: ... val SU(14): Latency=1 val SU(14): Latency=1 val SU(14): Latency=1 ....
2014 Nov 05
2
[LLVMdev] Stackmaps: caller-save-registers passed as deopt args
> On Oct 31, 2014, at 5:28 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Hi Kevin, > > Thank you for starting this discussion! Yes, sorry for being unresponsive for a few days. Sanjoy summarized the issues perfectly. > I think the distinction is really between whether the live values are > "live on call" or "live on return".
2013 Jul 29
0
[LLVMdev] Question on optimizeThumb2JumpTables
...restriction of > LeaMI->getOperand(0).getReg() != BaseReg is there. It seems this is overly > restrictive. For example, here is a case where it succeeds:**** > > ** ** > > 8944B BB#53: derived from LLVM BB %172**** > > Live Ins: %R4 %R6 %D8 %Q5 %R9 %R7 %R8 %R10 %R5 %R11**** > > Predecessors according to CFG: BB#52**** > > 8976B %R1<def> = t2LEApcrelJT <jt#2>, 2, pred:14, pred:%noreg*** > * > > 8992B %R1<def> = t2ADDrs %R1<kill>, %R10, 18, pred:14, > pred:%noreg, opt:%noreg****...
2017 Oct 13
2
Machine Scheduler on Power PC: Latency Limit and Register Pressure
...by those loads we are forced to spill. Here is the final assembly. -- # BB#0: # %entry std r30, -16(r1) # 8-byte Folded Spill ld r5, 0(r3) ld r6, 0(r4) ld r7, 8(r3) ld r8, 8(r4) ld r9, 16(r3) ld r10, 16(r4) ld r11, 24(r3) ld r0, 32(r3) ld r12, 24(r4) ld r30, 32(r4) ld r3, 40(r3) ld r4, 40(r4) divd r5, r5, r6 divd r6, r7, r8 divd r7, r9, r10 divd r9, r0, r30 divd r4, r3, r4 divd r8, r11, r12...
2003 Oct 14
1
Token.c appears to have a bug.
...he sign bit for the character that resulted from the cast. Is this really what is intended? Or should there be parenthesis around (n >> 8) to make sure that it happens before the most significant part of "n" is discarded? The actual assembly code generated is: LDL R10, n ; R10, 16(FP) SLL R10, 56, R10 SRA R10, 63, R10 STQ R10, temp_byte ; R10, 8(FP) -John wb8tyw@qsl.network Personal Opinion Only
2013 Jul 29
1
[LLVMdev] Question on optimizeThumb2JumpTables
...figure out why the restriction of LeaMI->getOperand(0).getReg() != BaseReg is there. It seems this is overly restrictive. For example, here is a case where it succeeds: > > > > 8944B BB#53: derived from LLVM BB %172 > > Live Ins: %R4 %R6 %D8 %Q5 %R9 %R7 %R8 %R10 %R5 %R11 > > Predecessors according to CFG: BB#52 > > 8976B %R1<def> = t2LEApcrelJT <jt#2>, 2, pred:14, pred:%noreg > > 8992B %R1<def> = t2ADDrs %R1<kill>, %R10, 18, pred:14, pred:%noreg, opt:%noreg > > 9004B...
2006 Jun 26
0
[klibc 23/43] cris support for klibc
.../* + * In 2's complement arithmetric, -x == (~x + 1), so + * -{h,l} = (~{h,l} + {0,1) + * -{h,l} = {~h,~l} + {0,1} + * -{h,l} = {~h + cy, ~l + 1} + * ... where cy = (l == 0) + * -{h,l} = {~h + cy, -l} + */ + + .text + .balign 4 + .type __negdi2, at function + .globl __negdi2 +__negdi2: + neg.d $r10,$r10 + seq $r12 + not $r11 + ret + add.d $r12,$r11 + + .size __negdi2, .-__negdi2 diff --git a/usr/klibc/arch/cris/crt0.S b/usr/klibc/arch/cris/crt0.S new file mode 100644 index 0000000..22cb9b4 --- /dev/null +++ b/usr/klibc/arch/cris/crt0.S @@ -0,0 +1,27 @@ +# +# arch/cris/crt0.S +# +# Does arch...