Displaying 20 results from an estimated 25 matches for "r11d".

2020 May 22
2
[PATCH] Optimized assembler version of md5_process() for x86-64
...' + # D is 'edx' + + cmp %rdi, %rsi # cmp end with ptr + je 1f # jmp if ptr == end + + # BEGIN of loop over 16-word blocks +2: # save old values of A, B, C, D + mov %eax, %r8d + mov %ebx, %r9d + mov %ecx, %r14d + mov %edx, %r15d + mov 0*4(%rsi), %r10d /* (NEXT STEP) X[0] */ + mov %edx, %r11d /* (NEXT STEP) z' = %edx */ + xor %ecx, %r11d /* y ^ ... */ + lea -680876936(%eax,%r10d),%eax /* Const + dst + ... */ + and %ebx, %r11d /* x & ... */ + xor %edx, %r11d /* z ^ ... */ + mov 1*4(%rsi),%r10d /* (NEXT STEP) X[1] */ + add %r11d, %eax /* dst += ... */ + rol $7, %eax /* dst <<...
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...edi, 4 ; size_t call _calloc lea edx, [r15-1] movsxd r8, edx mov ecx, r15d add ecx, 0FFFFFFFEh js loc_100000DFA test r15d, r15d mov r11d, [rax+r8*4] jle loc_100000EAE mov ecx, r15d add ecx, 0FFFFFFFEh mov [rsp+48h+var_34], ecx movsxd rcx, ecx lea rcx, [rax+rcx*4] mov [rsp+48h+var_40], rcx...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...>> lea edx, [r15-1] >> movsxd r8, edx >> mov ecx, r15d >> add ecx, 0FFFFFFFEh >> js loc_100000DFA >> test r15d, r15d >> mov r11d, [rax+r8*4] >> jle loc_100000EAE >> mov ecx, r15d >> add ecx, 0FFFFFFFEh >> mov [rsp+48h+var_34], ecx >> movsxd rcx, ecx >> lea rcx, [rax+rcx*4] >...
2015 Sep 01
2
[RFC] New pass: LoopExitValues
...< Size; ++Outer) for (int Inner = 0; Inner < Size; ++Inner) Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val; } With LoopExitValues ------------------------------- matrix_mul: testl %edi, %edi je .LBB0_5 xorl %r9d, %r9d xorl %r8d, %r8d .LBB0_2: xorl %r11d, %r11d .LBB0_3: movl %r9d, %r10d movl (%rdx,%r10,4), %eax imull %ecx, %eax movl %eax, (%rsi,%r10,4) incl %r11d incl %r9d cmpl %r11d, %edi jne .LBB0_3 incl %r8d cmpl %edi, %r8d jne .LBB0_2 .LBB0_5: retq Without LoopExitValues: ----------------------...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...>>>> movsxd r8, edx >>>> mov ecx, r15d >>>> add ecx, 0FFFFFFFEh >>>> js loc_100000DFA >>>> test r15d, r15d >>>> mov r11d, [rax+r8*4] >>>> jle loc_100000EAE >>>> mov ecx, r15d >>>> add ecx, 0FFFFFFFEh >>>> mov [rsp+48h+var_34], ecx >>>> movsxd rcx, ecx >>>...
2015 Aug 31
2
[RFC] New pass: LoopExitValues
Hello LLVM, This is a proposal for a new pass that improves performance and code size in some nested loop situations. The pass is target independent. From the description in the file header: This optimization finds loop exit values reevaluated after the loop execution and replaces them by the corresponding exit values if they are available. Such sequences can arise after the
2019 Nov 08
2
Register Dataflow Analysis on X86
Do you know whether it has been fixed on the 8.0.1 release? Scott On Fri, Nov 8, 2019 at 9:45 AM Krzysztof Parzyszek <kparzysz at quicinc.com> wrote: The one blocking issue that existed in the past has been fixed. I haven’t had time to do any work on it lately, but I’m not aware of any fundamental problems that would make it not work on x86. --
2019 Dec 23
2
Register Dataflow Analysis on X86
Hi Scott, That #1073741833 is a register mask. They are treated as aggregate registers (essentially sets of registers), so if it includes R9D and R11D, it will be treated as being aliased with both. These separate defs are there because they reach disjoint registers. -- Krzysztof Parzyszek kparzysz at quicinc.com AI tools development From: Scott Douglas Constable <sdconsta at syr.edu> Sent: Monda...
2005 Mar 23
3
[PATCH] promised MMX patches rc1
Hello, Here is my first speedup patch. About 10-11%. No IDCT yet. Please feel free to comment on my code or, even better, think about improvements. :) I believe my routines are not so bad; maybe one day they will be even faster. What needs to be optimized is the loop filter function. I have no idea how to do it right now. It does not leave much space for parallel stuff, copying memory from lot of
2020 Jan 10
2
Register Dataflow Analysis on X86
...ints? Thanks, Scott On Mon, Dec 23, 2019 at 12:46 PM Krzysztof Parzyszek <kparzysz at quicinc.com> wrote: Hi Scott, That #1073741833 is a register mask. They are treated as aggregate registers (essentially sets of registers), so if it includes R9D and R11D, it will be treated as being aliased with both. These separate defs are there because they reach disjoint registers. -- Krzysztof Parzyszek kparzysz at quicinc.com AI tools development From: Scott Douglas Constable <sdconsta at syr.edu<mailto:sdcon...
2013 Sep 12
1
[LLVMdev] bug in X86 disasm code?
...IT \ ENTRY(EAX) \ ENTRY(ECX) \ ENTRY(EDX) \ ENTRY(EBX) \ ENTRY(sib) \ ENTRY(EBP) \ ENTRY(ESI) \ ENTRY(EDI) \ ENTRY(R8D) \ ENTRY(R9D) \ ENTRY(R10D) \ ENTRY(R11D) \ ENTRY(R12D) \ ENTRY(R13D) \ ENTRY(R14D) \ ENTRY(R15D) the ENTRY(sib) looks suspicious. that should be ENTRY(ESP), no? thanks. J
2016 Jun 25
3
Tail call optimization is getting affected due to local function related optimization with IPRA
...as per regmasks collected by RegUsageInfoCollector pass. Function Name : bitrv2 Clobbered Registers: AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W R12W R13W R14W R15W However, caller of bitrv2, makewt has callee saved registers as per CC, but this code results in segmentation fault when compiled with O1 because makewt has value of *ip in R14 register and that is stored and restored by makewt at beginning...
2016 Jun 25
0
Tail call optimization is getting affected due to local function related optimization with IPRA
...oCollector pass. > > Function Name : bitrv2 > Clobbered Registers: > AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI > ESP RAX > RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B > R9B R10B > R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W > R10W R11W > R12W R13W R14W R15W > > However, caller of bitrv2, makewt has callee saved registers as per CC, > but this > code results in segmentation fault when compiled with O1 because makewt > has value > of *ip in R14 register and that is st...
2016 Jun 26
3
Tail call optimization is getting affected due to local function related optimization with IPRA
...> Function Name : bitrv2 >> Clobbered Registers: >> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI >> ESP RAX >> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B >> R9B R10B >> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W >> R10W R11W >> R12W R13W R14W R15W >> >> However, caller of bitrv2, makewt has callee saved registers as per CC, >> but this >> code results in segmentation fault when compiled with O1 because makewt >> has value >> of *i...
2016 Jun 28
2
Tail call optimization is getting affected due to local function related optimization with IPRA
...gUsageInfoCollector pass. > > Function Name : bitrv2 > Clobbered Registers: > AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX > RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B > R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W > R12W R13W R14W R15W > > However, caller of bitrv2, makewt has callee saved registers as per CC, but this > code results in segmentation fault when compiled with O1 because makewt has value > of *ip in R14 register and that is stored and resto...
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...# Child Loop BB0_2 Depth 2 >>>>> # Child Loop BB0_3 Depth >>>>> 3 >>>>> # Child Loop BB0_5 Depth >>>>> 3 >>>>> xor r11d, r11d >>>>> .p2align 4, 0x90 >>>>> .LBB0_2: # %.preheader >>>>> # Parent Loop BB0_1 Depth=1 >>>>> # => This Loop Header: D...
2016 Jun 27
0
Tail call optimization is getting affected due to local function related optimization with IPRA
...2 >>> Clobbered Registers: >>> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI >>> ESP RAX >>> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B >>> R9B R10B >>> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W >>> R10W R11W >>> R12W R13W R14W R15W >>> >>> However, caller of bitrv2, makewt has callee saved registers as per CC, >>> but this >>> code results in segmentation fault when compiled with O1 because makewt >>>...
2016 Jun 28
0
Tail call optimization is getting affected due to local function related optimization with IPRA
...ered Registers: >>>> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI >>>> ESP RAX >>>> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B >>>> R9B R10B >>>> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W >>>> R10W R11W >>>> R12W R13W R14W R15W >>>> >>>> However, caller of bitrv2, makewt has callee saved registers as per CC, >>>> but this >>>> code results in segmentation fault when compiled with O1...
2016 Jun 28
2
Tail call optimization is getting affected due to local function related optimization with IPRA
...> Clobbered Registers: >>>>>> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX >>>>>> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B >>>>>> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W >>>>>> R12W R13W R14W R15W >>>>>> >>>>>> However, caller of bitrv2, makewt has callee saved registers as per CC, but this >>>>>> code results in segmentation fault when compiled with O1 b...
2007 Apr 18
1
[Bridge] bridge at start up
...x0000000000000004 <invoke+4>: callq *%esi > 0x0000000000000006 <invoke+6>: add $0x8,%rsp > 0x000000000000000a <invoke+10>: retq > > gcc-3.4.4: > 0x0000000000000000 <invoke+0>: mov %rsi,%r11 > 0x0000000000000003 <invoke+3>: jmpq *%r11d > > Regards > Patrick > > > ------------------------------ > > Message: 3 > Date: Thu, 27 Jan 2005 15:24:50 -0800 > From: "David S. Miller" <davem@davemloft.net> > Subject: [Bridge] Re: [PATCH/RFC] Reduce call chain length in > ne...