search for: r10d

Displaying 20 results from an estimated 37 matches for "r10d".

2020 May 22
2
[PATCH] Optimized assembler version of md5_process() for x86-64
...+ # B is 'ebx' + # C is 'ecx' + # D is 'edx' + + cmp %rdi, %rsi # cmp end with ptr + je 1f # jmp if ptr == end + + # BEGIN of loop over 16-word blocks +2: # save old values of A, B, C, D + mov %eax, %r8d + mov %ebx, %r9d + mov %ecx, %r14d + mov %edx, %r15d + mov 0*4(%rsi), %r10d /* (NEXT STEP) X[0] */ + mov %edx, %r11d /* (NEXT STEP) z' = %edx */ + xor %ecx, %r11d /* y ^ ... */ + lea -680876936(%eax,%r10d),%eax /* Const + dst + ... */ + and %ebx, %r11d /* x & ... */ + xor %edx, %r11d /* z ^ ... */ + mov 1*4(%rsi),%r10d /* (NEXT STEP) X[1] */ + add %r11d, %eax /* ds...
2015 Feb 13
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...; CODE XREF: _main+BA j cmp r15d, 1 mov esi, 0 mov r9, [rsp+48h+var_48] mov r12d, 1 jle short loc_100000DF0 loc_100000D99: ; CODE XREF: _main+15E j mov r10d, [rax+rsi*4] mov ecx, 0FFFFFFFFh mov edi, 1 mov r13, r9 nop word ptr [rax+rax+00h] loc_100000DB0: ; CODE XREF: _main+14F j xor ebx, ebx mov ebp, r10d...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...> mov esi, 0 >> mov r9, [rsp+48h+var_48] >> mov r12d, 1 >> jle short loc_100000DF0 >> >> loc_100000D99: ; CODE XREF: _main+15E j >> mov r10d, [rax+rsi*4] >> mov ecx, 0FFFFFFFFh >> mov edi, 1 >> mov r13, r9 >> nop word ptr [rax+rax+00h] >> >> loc_100000DB0: ; CODE XREF: _main+14F j >>...
2015 Feb 14
2
[LLVMdev] trunk's optimizer generates slower code than 3.5
...> mov r9, [rsp+48h+var_48] >>>> mov r12d, 1 >>>> jle short loc_100000DF0 >>>> >>>> loc_100000D99: ; CODE XREF: _main+15E j >>>> mov r10d, [rax+rsi*4] >>>> mov ecx, 0FFFFFFFFh >>>> mov edi, 1 >>>> mov r13, r9 >>>> nop word ptr [rax+rax+00h] >>>> >>>> loc_100000DB0:...
2015 Sep 01
2
[RFC] New pass: LoopExitValues
...= 0; Inner < Size; ++Inner) Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val; } With LoopExitValues ------------------------------- matrix_mul: testl %edi, %edi je .LBB0_5 xorl %r9d, %r9d xorl %r8d, %r8d .LBB0_2: xorl %r11d, %r11d .LBB0_3: movl %r9d, %r10d movl (%rdx,%r10,4), %eax imull %ecx, %eax movl %eax, (%rsi,%r10,4) incl %r11d incl %r9d cmpl %r11d, %edi jne .LBB0_3 incl %r8d cmpl %edi, %r8d jne .LBB0_2 .LBB0_5: retq Without LoopExitValues: ----------------------------------- matrix_mul: pushq %...
2015 Aug 31
2
[RFC] New pass: LoopExitValues
Hello LLVM, This is a proposal for a new pass that improves performance and code size in some nested loop situations. The pass is target independent. From the description in the file header: This optimization finds loop exit values reevaluated after the loop execution and replaces them by the corresponding exit values if they are available. Such sequences can arise after the
2016 Aug 04
2
XRay: Demo on x86_64/Linux almost done; some questions.
....rogatch at gmail.com> wrote: > > Hi Dean, > > I have a question about the following piece of code in compiler-rt/trunk/lib/xray/xray_trampoline_x86.S : > movq _ZN6__xray19XRayPatchedFunctionE(%rip), %rax > testq %rax, %rax > je .Ltmp0 > > // assume that %r10d has the function id. > movl %r10d, %edi > xor %esi,%esi > callq *%rax > What happens if someone unsets the handler function (i.e. calls __xray_remove_handler() ) or changes the handler (i.e. calls __xray_set_handler() with a different pointer to function) between "movq _Z...
2013 Sep 12
1
[LLVMdev] bug in X86 disasm code?
...er.h #define EA_BASES_32BIT \ ENTRY(EAX) \ ENTRY(ECX) \ ENTRY(EDX) \ ENTRY(EBX) \ ENTRY(sib) \ ENTRY(EBP) \ ENTRY(ESI) \ ENTRY(EDI) \ ENTRY(R8D) \ ENTRY(R9D) \ ENTRY(R10D) \ ENTRY(R11D) \ ENTRY(R12D) \ ENTRY(R13D) \ ENTRY(R14D) \ ENTRY(R15D) the ENTRY(sib) looks suspicious. that should be ENTRY(ESP), no? thanks. J
2016 Jun 25
3
Tail call optimization is getting affected due to local function related optimization with IPRA
...ster as per regmasks collected by RegUsageInfoCollector pass. Function Name : bitrv2 Clobbered Registers: AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W R12W R13W R14W R15W However caller of bitrv2, makewt has callee saved registers as per CC, but this code results in segmentation fault when compiled with O1 because makewt has value of *ip in R14 register and that is stored and restored by makewt at begi...
2016 Jul 30
1
XRay: Demo on x86_64/Linux almost done; some questions.
> On 30 Jul 2016, at 05:07, Serge Rogatch via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Thanks for pointing this out, Tim. Then maybe this approach is not the best choice for x86, though ideally measuring is needed, it is just that on ARM the current x86 approach is not applicable because ARM doesn't have a single return instruction (such as RETQ on x86_64), furthermore,
2016 Jun 25
0
Tail call optimization is getting affected due to local function related optimization with IPRA
...geInfoCollector pass. > > Function Name : bitrv2 > Clobbered Registers: > AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI > ESP RAX > RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B > R9B R10B > R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W > R10W R11W > R12W R13W R14W R15W > > However caller of bitrv2, makewt has callee saved registers as per CC, > but this > code results in segmentation fault when compiled with O1 because makewt > has value > of *ip in R14 register and that...
2016 Jun 26
3
Tail call optimization is getting affected due to local function related optimization with IPRA
...>> Function Name : bitrv2 >> Clobbered Registers: >> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI >> ESP RAX >> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B >> R9B R10B >> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W >> R10W R11W >> R12W R13W R14W R15W >> >> However caller of bitrv2, makewt has callee saved registers as per CC, >> but this >> code results in segmentation fault when compiled with O1 because makewt >> has value >>...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
..., %rdi movq %r10, %rsi shrq $32, %rsi movq %rax, %rdx shlq $6, %rdx leaq 48(%rdx,%r9), %rdx .align 16, 0x90 .LBB0_1: vmovd %r8d, %xmm0 vpbroadcastd %xmm0, %xmm0 vmovd %edi, %xmm1 vpbroadcastd %xmm1, %xmm1 vmovd %r10d, %xmm2 vpbroadcastd %xmm2, %xmm2 vmovd %esi, %xmm3 vpbroadcastd %xmm3, %xmm3 vmovdqa32 %xmm0, -48(%rdx) vmovdqa32 %xmm1, -32(%rdx) vmovdqa32 %xmm2, -16(%rdx) vmovdqa32 %xmm3, (%rdx) addq $1, %rax addq $64, %rdx cmpq %rc...
2016 Jul 29
2
XRay: Demo on x86_64/Linux almost done; some questions.
Thanks for pointing this out, Tim. Then maybe this approach is not the best choice for x86, though ideally measuring is needed, it is just that on ARM the current x86 approach is not applicable because ARM doesn't have a single return instruction (such as RETQ on x86_64), furthermore, the return instructions on ARM can be conditional. I have another question: what happens if the instrumented
2016 Jun 28
2
Tail call optimization is getting affected due to local function related optimization with IPRA
...by RegUsageInfoCollector pass. > > Function Name : bitrv2 > Clobbered Registers: > AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX > RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B > R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W > R12W R13W R14W R15W > > However caller of bitrv2, makewt has callee saved registers as per CC, but this > code results in segmentation fault when compiled with O1 because makewt has value > of *ip in R14 register and that is stored and...
2016 Jun 23
2
AVX512 instruction generated when JIT compiling for an avx2 architecture
...%rdx > shlq $6, %rdx > leaq 48(%rdx,%r9), %rdx > .align 16, 0x90 > .LBB0_1: > vmovd %r8d, %xmm0 > vpbroadcastd %xmm0, %xmm0 > vmovd %edi, %xmm1 > vpbroadcastd %xmm1, %xmm1 > vmovd %r10d, %xmm2 > vpbroadcastd %xmm2, %xmm2 > vmovd %esi, %xmm3 > vpbroadcastd %xmm3, %xmm3 > vmovdqa32 %xmm0, -48(%rdx) > vmovdqa32 %xmm1, -32(%rdx) > vmovdqa32 %xmm2, -16(%rdx) > vmovdqa32 %xmm3, (%rdx) >...
2016 Aug 05
2
XRay: Demo on x86_64/Linux almost done; some questions.
...>> > I have a question about the following piece of code in >> compiler-rt/trunk/lib/xray/xray_trampoline_x86.S : >> > movq _ZN6__xray19XRayPatchedFunctionE(%rip), %rax >> > testq %rax, %rax >> > je .Ltmp0 >> > >> > // assume that %r10d has the function id. >> > movl %r10d, %edi >> > xor %esi,%esi >> > callq *%rax >> > What happens if someone unsets the handler function (i.e. calls >> __xray_remove_handler() ) or changes the handler (i.e. calls >> __xray_set_handler() with a...
2016 Jun 27
0
Tail call optimization is getting affected due to local function related optimization with IPRA
...bitrv2 >>> Clobbered Registers: >>> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI >>> ESP RAX >>> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B >>> R9B R10B >>> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W >>> R10W R11W >>> R12W R13W R14W R15W >>> >>> However caller of bitrv2, makewt has callee saved registers as per CC, >>> but this >>> code results in segmentation fault when compiled with O1 because makewt >...
2016 Apr 29
2
RFC: XRay -- A Function Call Tracing System
TL;DR: At Google we use a call tracing system called XRay which inserts patchable instrumentation points into function entries and exits. If the community is interested, we'd like to contribute this system to the LLVM project. Many more details are contained in the whitepaper at: https://storage.googleapis.com/xray-downloads/whitepaper/XRayAFunctionCallTracingSystem.pdf Who's
2016 Jun 28
0
Tail call optimization is getting affected due to local function related optimization with IPRA
...Clobbered Registers: >>>> AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI >>>> ESP RAX >>>> RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B >>>> R9B R10B >>>> R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W >>>> R10W R11W >>>> R12W R13W R14W R15W >>>> >>>> However caller of bitrv2, makewt has callee saved registers as per CC, >>>> but this >>>> code results in segmentation fault when compiled wit...