search for: edi

Displaying 20 results from an estimated 1881 matches for "edi".

2014 Jan 18
2
[LLVMdev] Scheduling quirks
...<=== ...I get the following result: ===> .file "test.cpp" .text .globl _Z13test_registeri .align 16, 0x90 .type _Z13test_registeri,@function _Z13test_registeri: # @_Z13test_registeri .cfi_startproc # BB#0: # %entry movl %edi, %eax sarl $2, %eax xorl %edi, %eax movl %eax, %ecx sarl $3, %ecx xorl %eax, %ecx movl %ecx, %edx sarl $4, %edx xorl %ecx, %edx movl %edx, %eax sarl $5, %eax xorl %edx, %eax retq .Ltmp0: .size _Z13test_registeri, .Ltmp0-_Z13test_registeri .cfi_endproc .globl _Z14test_scheduleri .al...
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...var_34 = dword ptr -34h push rbp push r15 push r14 push r13 push r12 push rbx sub rsp, 18h mov ebx, 0FFFFFFFFh cmp edi, 2 jnz loc_100000F29 mov rdi, [rsi+8] ; char * xor r14d, r14d xor esi, esi ; char ** mov edx, 0Ah ; int call _strtol mov r15, rax...
2018 Nov 20
2
A pattern for portable __builtin_add_overflow()
...nsigned types this is easy: int uaddo_native(unsigned a, unsigned b, unsigned* s) { return __builtin_add_overflow(a, b, s); } int uaddo_portable(unsigned a, unsigned b, unsigned* s) { *s = a + b; return *s < a; } We get exactly the same assembly: uaddo_native: # @uaddo_native xor eax, eax add edi, esi setb al mov dword ptr [rdx], edi ret uaddo_portable: # @uaddo_portable xor eax, eax add edi, esi setb al mov dword ptr [rdx], edi ret But with signed types it is not so easy. I tried 2 versions, but the result is quite far away from the optimal assembly. int saddo_native(int a, int b, int*...
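The unsigned pattern quoted above relies on unsigned wraparound being well defined in C. The signed versions from the post are cut off in this excerpt, so the sketch below is not the author's code: `saddo_precheck` is a hypothetical name for one common portable approach, which tests the operands against `INT_MAX` and `INT_MIN` before adding so that no signed overflow ever occurs. Unlike `__builtin_add_overflow`, this sketch leaves `*s` untouched when it reports overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Unsigned case, as in the post: the wrapped sum is smaller than an
   operand exactly when the addition carried out of the top bit. */
int uaddo_portable(unsigned a, unsigned b, unsigned *s) {
    assert(s != NULL);
    *s = a + b;
    return *s < a;
}

/* Signed case (hypothetical helper, not taken from the thread):
   decide whether a + b would overflow before performing the add,
   so the addition itself never has undefined behaviour. */
int saddo_precheck(int a, int b, int *s) {
    assert(s != NULL);
    int overflow = (b > 0 && a > INT_MAX - b) ||
                   (b < 0 && a < INT_MIN - b);
    if (!overflow) {
        *s = a + b; /* safe: the result is known to fit in int */
    }
    return overflow;
}
```

Whether a compiler folds the pre-check into a single `add` followed by `seto` depends on the compiler and version; comparing the generated assembly, as the post does for the unsigned case, is the way to find out.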
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...push r15 >> push r14 >> push r13 >> push r12 >> push rbx >> sub rsp, 18h >> mov ebx, 0FFFFFFFFh >> cmp edi, 2 >> jnz loc_100000F29 >> mov rdi, [rsi+8] ; char * >> xor r14d, r14d >> xor esi, esi ; char ** >> mov edx, 0Ah ; int >> call...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...push r14 >>>> push r13 >>>> push r12 >>>> push rbx >>>> sub rsp, 18h >>>> mov ebx, 0FFFFFFFFh >>>> cmp edi, 2 >>>> jnz loc_100000F29 >>>> mov rdi, [rsi+8] ; char * >>>> xor r14d, r14d >>>> xor esi, esi ; char ** >>>> mov edx, 0Ah...
2004 Sep 10
2
An assembly optimization and fix
I have optimized the FLAC__fixed_compute_best_predictor_asm_ia32_mmx_cmov function and fixed a bug when data_len == 0. Now the function is about 50% faster and flac -5 is about 5% faster on my box. I have tested it thoroughly and I think it can go into flac 1.0.4. -- Miroslav Lichvar -------------- next part -------------- --- src/libFLAC/ia32/fixed_asm....
2012 Mar 28
2
[LLVMdev] Suboptimal code due to excessive spilling
...{ s += sum(&x[i], 18); p[i] = 5; // xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx } return s; } ====== Output A ====== ====================== foo: # @foo .Ltmp12: .cfi_startproc # BB#0: pushl %ebx .Ltmp13: .cfi_def_cfa_offset 8 pushl %edi .Ltmp14: .cfi_def_cfa_offset 12 pushl %esi .Ltmp15: .cfi_def_cfa_offset 16 subl $88, %esp .Ltmp16: .cfi_def_cfa_offset 104 .Ltmp17: .cfi_offset %esi, -16 .Ltmp18: .cfi_offset %edi, -12 .Ltmp19: .cfi_offset %ebx, -8 pxor %xmm0, %xmm0 movl 112(%esp), %eax testl %eax, %eax je .LBB1_3 # BB#...
2011 Jul 11
4
[LLVMdev] RegAllocFast uses too much stack
...foo(0); foo(1); foo(2); } This doesn't just spill out all the registers to the stack before each call, we also set up 0, 1 and 2 into regs first, then spill them and don't even get a chance to reuse stack slots. That's just bad: pushq %rax movl $2, %edi movl $1, %eax movl $0, %ecx movl %edi, 4(%rsp) # 4-byte Spill movl %ecx, %edi movl %eax, (%rsp) # 4-byte Spill callq foo movl (%rsp), %edi # 4-byte Reload callq foo movl...
2015 Nov 21
2
Recent -Os code size regressions
...%cl,%cl 28f: 0f 89 34 01 00 00 jns 3c9 <t_run_test+0x3c9> After r252152: Note that the OR $0x408 and OR $0x810 come now in reverse order. 35d: 81 c9 08 04 00 00 or $0x408,%ecx 363: 89 4c 24 28 mov %ecx,0x28(%esp) 367: 89 df mov %ebx,%edi 369: 83 e7 10 and $0x10,%edi 36c: 89 7c 24 20 mov %edi,0x20(%esp) 370: 0f 45 d1 cmovne %ecx,%edx 373: 89 d7 mov %edx,%edi 375: 81 cf 10 08 00 00 or $0x810,%edi 37b: 89 7c 24 14 mov %edi,0x14(%esp) 37f: 89 d9...
2018 May 09
3
Ignored branch predictor hints
...fine likely(x) __builtin_expect((x),1) // switch like char * b(int e) { if (likely(e == 0)) return "0"; else if (e == 1) return "1"; else return "f"; } GCC correctly prefers the first case: b(int): mov eax, OFFSET FLAT:.LC0 test edi, edi jne .L7 ret But Clang seems to ignore __builtin_expect hints in this case. b(int): # @b(int) cmp edi, 1 mov eax, offset .L.str.1 mov ecx, offset .L.str.2 cmove rcx, rax test edi, edi mov eax, offset .L.str cmovne rax, rcx ret https://godbolt.org/g/tuAVT7 -------------- ne...
2012 Apr 05
0
[LLVMdev] Suboptimal code due to excessive spilling
...{ s += sum(&x[i], 18); p[i] = 5; // xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx } return s; } ====== Output A ====== ====================== foo: # @foo .Ltmp12: .cfi_startproc # BB#0: pushl %ebx .Ltmp13: .cfi_def_cfa_offset 8 pushl %edi .Ltmp14: .cfi_def_cfa_offset 12 pushl %esi .Ltmp15: .cfi_def_cfa_offset 16 subl $88, %esp .Ltmp16: .cfi_def_cfa_offset 104 .Ltmp17: .cfi_offset %esi, -16 .Ltmp18: .cfi_offset %edi, -12 .Ltmp19: .cfi_offset %ebx, -8 pxor %xmm0, %xmm0 movl 112(%esp), %eax testl %eax, %eax je .LBB1_3 # BB#...
2009 Mar 11
4
[LLVMdev] Bug in X86CompilationCallback_SSE
...%ebp) 0xb74b98e9 <X86CompilationCallback_SSE+9>: lea 0x17(%esp),%esi 0xb74b98ed <X86CompilationCallback_SSE+13>: and $0xfffffff0,%esi 0xb74b98f0 <X86CompilationCallback_SSE+16>: mov %ebx,-0xc(%ebp) 0xb74b98f3 <X86CompilationCallback_SSE+19>: mov %edi,-0x4(%ebp) 0xb74b98f6 <X86CompilationCallback_SSE+22>: lea 0x40(%esi),%edi 0xb74b98f9 <X86CompilationCallback_SSE+25>: call 0xb7315577 <__i686.get_pc_thunk.bx> 0xb74b98fe <X86CompilationCallback_SSE+30>: add $0x76d71e,%ebx 0xb74b9904 <X86CompilationCal...
2007 Apr 30
0
[LLVMdev] Boostrap Failure -- Expected Differences?
...> ./loop-iv.o differs > ./loop.o differs > ./loop-unroll.o differs > ./loop-unswitch.o differs > ./modulo-sched.o differs > ./optabs.o differs > ./opts.o differs > ./params.o differs > ./passes.o differs > ./postreload-gcse.o differs > ./postreload.o differs > ./predict.o differs > ./pretty-print.o differs > ./print-rtl.o differs > ./print-tree.o differs > ./profile.o differs > ./real.o differs > ./recog.o differs > ./regclass.o differs > ./regmove.o differs > ./regrename.o differs > ./reg-stack.o differs > ./reload1.o differs ...
2007 Apr 27
2
[LLVMdev] Boostrap Failure -- Expected Differences?
...o differs ./loop-init.o differs ./loop-invariant.o differs ./loop-iv.o differs ./loop.o differs ./loop-unroll.o differs ./loop-unswitch.o differs ./modulo-sched.o differs ./optabs.o differs ./opts.o differs ./params.o differs ./passes.o differs ./postreload-gcse.o differs ./postreload.o differs ./predict.o differs ./pretty-print.o differs ./print-rtl.o differs ./print-tree.o differs ./profile.o differs ./real.o differs ./recog.o differs ./regclass.o differs ./regmove.o differs ./regrename.o differs ./reg-stack.o differs ./reload1.o differs ./reload.o differs ./resource.o differs ./rtlanal.o diffe...
2017 Aug 15
3
How to debug instruction selection
Hi there, I am trying to JIT compile some bitcode and am seeing the following error: LLVM ERROR: Cannot select: 0x28ec830: ch,glue = X86ISD::CALL 0x28ec7c0, 0x28ef900, Register:i32 %EDI, Register:i8 %AL, RegisterMask:Untyped, 0x28ec7c0:1 0x28ef900: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (i8*, ...)* @_ZN5FooBr7xprintfEPKcz> 0 0x28ec520: i32 = TargetGlobalAddress<void (i8*, ...)* @_ZN5FooBr7xprintfEPKcz> 0 0x28ec670: i32 = Register %EDI 0x28ec750: i...
2004 Sep 10
3
patch
...ion_asm_ia32 - ;[esp + 24] == autoc[] - ;[esp + 20] == lag - ;[esp + 16] == data_len - ;[esp + 12] == data[] + ;[esp + 28] == autoc[] + ;[esp + 24] == lag + ;[esp + 20] == data_len + ;[esp + 16] == data[] ;ASSERT(lag > 0) ;ASSERT(lag <= 33) @@ -71,21 +71,22 @@ .begin: push esi push edi + push ebx ; for(coeff = 0; coeff < lag; coeff++) ; autoc[coeff] = 0.0; - mov edi, [esp + 24] ; edi == autoc - mov ecx, [esp + 20] ; ecx = # of dwords (=lag) of 0 to write + mov edi, [esp + 28] ; edi == autoc + mov ecx, [esp + 24] ; ecx = # of dwords (=lag) of 0 to write xor eax...
2009 Jun 21
1
Incidence Function Model in R help
...nction Model in R" and can't get past some error with my glm arguments. I'm getting through > attach(amphimedon_compressa) > plot(x.crd,y.crd,asp=1,xlab="Easting",ylab="Northing",pch=21,col=p+1,bg=5*p) > d<-dist(cbind(x.crd,y.crd)) > alpha<-1 > edis<-as.matrix(exp(-alpha*d)) > diag(edis)<-0 > edis<-sweep(edis,2,A,"*") > S<-rowSums(edis[,p>0]) > mod<-glm(p~offset(2*log(S))+log(A),family=binomial) before I get the error message Error: NA/NaN/Inf in foreign function call (arg 4) which is an argument con...
2017 Feb 13
4
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
...ed;" > >> +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" > >> +"__raw_callee_save___kvm_vcpu_is_preempted:" > >> +FRAME_BEGIN > >> +"push %rdi;" > >> +"push %rdx;" > >> +"movslq %edi, %rdi;" > >> +"movq $steal_time+16, %rax;" > >> +"movq __per_cpu_offset(,%rdi,8), %rdx;" > >> +"cmpb $0, (%rdx,%rax);" Could we not put the $steal_time+16 displacement as an immediate in the cmpb and save a whole register here?...
2017 Feb 13
4
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
...ed;" > >> +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" > >> +"__raw_callee_save___kvm_vcpu_is_preempted:" > >> +FRAME_BEGIN > >> +"push %rdi;" > >> +"push %rdx;" > >> +"movslq %edi, %rdi;" > >> +"movq $steal_time+16, %rax;" > >> +"movq __per_cpu_offset(,%rdi,8), %rdx;" > >> +"cmpb $0, (%rdx,%rax);" Could we not put the $steal_time+16 displacement as an immediate in the cmpb and save a whole register here?...
2008 May 27
3
[LLVMdev] Float compare-for-equality and select optimization opportunity
...int t; t = a; a = b; b = c; c = t; } This is the resulting x86 assembly code: movss xmm0,dword ptr [ecx+4] ucomiss xmm0,dword ptr [ecx+8] sete al setnp dl test dl,al mov edx,edi cmovne edx,ecx cmovne ecx,esi cmovne esi,edi While I'm pleasantly surprised that my branch does get turned into several select operations as intended (cmov - conditional move - in x86), I'm confused why it uses the ucomiss instruction (unordered compare and set flag...
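A short note on the `sete`/`setnp` pair in the assembly above, offered as background rather than as the thread's own answer: under IEEE 754 any comparison involving NaN is unordered, and C's `==` must then be false. `ucomiss` signals an unordered result by setting ZF, PF and CF together, so ZF alone cannot tell "equal" from "unordered"; requiring PF to be clear as well does. The minimal C sketch below makes the same predicate explicit. The function names are hypothetical, and it assumes the compiler is not run with options such as `-ffast-math` that relax NaN handling.

```c
#include <assert.h>
#include <math.h>

/* Plain float equality: false whenever either operand is NaN. */
int float_equal(float a, float b) {
    return a == b;
}

/* The same predicate spelled out as "ordered AND equal", mirroring
   the setnp (ordered) and sete (equal) pair in the assembly. */
int float_equal_explicit(float a, float b) {
    return !isunordered(a, b) && a == b;
}

/* Small self-check of the NaN behaviour described above. */
void float_equal_selfcheck(void) {
    assert(float_equal(1.5f, 1.5f));
    assert(!float_equal(NAN, NAN));
    assert(float_equal_explicit(NAN, 1.0f) == float_equal(NAN, 1.0f));
}
```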