thr3ads.net - search: "rdi"

2020 Sep 01

2

Vector evolution?

...following codegen: 0000000000000160 <_Z4fct6PDv4_f>: 160: 31 c0 xor %eax,%eax 162: c4 e2 79 18 05 00 00 vbroadcastss 0x0(%rip),%xmm0 # 16b <_Z4fct6PDv4_f+0xb> 169: 00 00 16b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 170: c5 f8 59 0c 07 vmulps (%rdi,%rax,1),%xmm0,%xmm1 175: c5 f8 29 0c 07 vmovaps %xmm1,(%rdi,%rax,1) 17a: c5 f8 59 4c 07 10 vmulps 0x10(%rdi,%rax,1),%xmm0,%xmm1 180: c5 f8 29 4c 07 10 vmovaps %xmm1,0x10(%rdi,%rax,1) 186: c5 f8 59 4c 07 20 vmulps 0x20(%rdi,%rax,1),%xmm0,%xmm1 18c: c5 f8 29 4c 07 20 vmovaps %...

2011 Jun 23

1

[LLVMdev] llvm compilation of libc?

..." -DPACKAGE_URL=\"\" -I. -DMISSING_SYSCALL_NAMES -fno-builtin -DMISSING_SYSCALL_NAMES -fno-builtin -O2 -c -o lib_a-memcpy.o `test -f 'memcpy.S' || echo './'`memcpy.S /tmp/cc-qoxxpO.s:51:3: error: invalid instruction mnemonic 'movntiq' movntiq % rax, (% rdi) ^ /tmp/cc-qoxxpO.s:52:3: error: invalid instruction mnemonic 'movntiq' movntiq % r8 , 8 (% rdi) ^ /tmp/cc-qoxxpO.s:53:3: error: invalid instruction mnemonic 'movntiq' movntiq % r9 , 16 (% rdi) ^ /tmp/cc-qoxxpO.s:54:3: error: invalid instruction mnemonic 'movntiq'...

2014 Jul 23

4

[LLVMdev] the clang 3.5 loop optimizer seems to jump in unintentional for simple loops

...ay); delete[] array; return dummy; } ---- compiled with gcc 4.9.1 and clang 3.5 with clang3.5 + #define ITER the_func contains masses of code the code in main is also sometimes different (not just inlined) to the_func clang -DITER -O2 clang -DITER -O3 gives: the_func: leaq 12(%rdi), %rcx leaq 4(%rdi), %rax cmpq %rax, %rcx cmovaq %rcx, %rax movq %rdi, %rsi notq %rsi addq %rax, %rsi shrq $2, %rsi incq %rsi xorl %edx, %edx movabsq $9223372036854775800, %rax # imm = 0x7FFFFFFFFFFFFFF8...

2018 Nov 25

3

BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()

...but >> produces 0x67452301EFCDAB89 >> >> >> And compiled for x86-64 this yields the following code (see >> <https://godbolt.org/z/uM9nvN>): >> >> __bswapsi2: # @__bswapsi2 >> mov eax, edi >> shr eax, 24 >> mov rcx, rdi >> shr rcx, 8 >> and ecx, 65280 >> or rax, rcx >> mov rcx, rdi >> shl rcx, 8 >> and ecx, 16711680 >> or rax, rcx >> and rdi, 255 >> shl rdi, 24 >> or rax, rdi >> ret...

2014 Oct 27

4

[LLVMdev] Problem in X86 backend

...uble writing an instruction in the X86 backend. I made a new intrinsic and I wrote a custom inserter for my intrinsic in the X86 backend. Everything works fine, except for one instruction that I can't find how to write. I want to add this instruction to one of my machine basic blocks: mov [rdi], 0 How can I achieve that with the LLVM API? I tried several things, but none worked :( Cheers

2018 Sep 11

2

Byte-wide stores aren't coalesced if interspersed with other stores

Andres: FWIW, codegen will do the merge if you turn on global alias analysis for it "-combiner-global-alias-analysis". That said, we should be able to do this merging earlier. -Nirav On Mon, Sep 10, 2018 at 8:33 PM, Andres Freund via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > On 2018-09-10 13:42:21 -0700, Andres Freund wrote: > > I have, in postgres,

2017 Feb 13

4

[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function

...ext;" > >> +".global __raw_callee_save___kvm_vcpu_is_preempted;" > >> +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;" > >> +"__raw_callee_save___kvm_vcpu_is_preempted:" > >> +FRAME_BEGIN > >> +"push %rdi;" > >> +"push %rdx;" > >> +"movslq %edi, %rdi;" > >> +"movq $steal_time+16, %rax;" > >> +"movq __per_cpu_offset(,%rdi,8), %rdx;" > >> +"cmpb $0, (%rdx,%rax);" Could we not put the $steal_t...

2020 Aug 17

3

Code generation option for wide integers on x86_64?

Is there an existing option in X86_64 target code generator to emit a loop for the following code: define i4096 @add(i4096 %a, i4096 %b) alwaysinline { %c = add i4096 %a, %b ret i4096 %c } instead of: movq %rdi, %rax addq 96(%rsp), %rsi adcq 104(%rsp), %rdx movq %rdx, 8(%rdi) movq %rsi, (%rdi) adcq 112(%rsp), %rcx movq %rcx, 16(%rdi) adcq 120(%rsp), %r8 movq %r8, 24(%rdi) adcq 128(%rsp), %r9 movq %r9, 32(%rdi) movq 8(%rsp), %rcx adcq 136(%rsp), %...
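The loop form the question asks about can be sketched by hand in C. This is only an illustration of the carry-propagating shape, not anything the X86_64 code generator is known to emit; the function name `wide_add` and the little-endian array of 64-bit limbs are assumptions made for this example (an i4096 value would be 64 such limbs).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds two n-limb integers stored least-significant
 * limb first, writes the n-limb sum to out, and returns the final carry. */
uint64_t wide_add(uint64_t *out, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t bi = b[i];
        uint64_t s = a[i] + carry;   /* add the incoming carry (0 or 1) */
        uint64_t c1 = s < carry;     /* wrapped only if a[i] was all ones */
        s += bi;
        uint64_t c2 = s < bi;        /* wrapped while adding b[i] */
        out[i] = s;
        carry = c1 | c2;             /* at most one of the two can be set */
    }
    return carry;
}
```

Each iteration corresponds to one addq/adcq step of the unrolled sequence quoted above, so the loop trades code size for a loop-carried dependency on the carry.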

2014 Oct 29

2

[LLVMdev] Problem in X86 backend

...aldini Julien <julien.rinaldini at heig-vd.ch> wrote: > > Hum, in fact, I'm still a bit lost ;) > > It seems to work in -O0, but in -O1, -O2 and -O3, I got this error (+ the dump of the function): > > # Machine code for function foo: Post SSA > Function Live Ins: %RDI in %vreg7 > > BB#0: derived from LLVM BB %entry > Live Ins: %RDI > %vreg7<def> = COPY %RDI; GR64:%vreg7 > %vreg1<def> = MOV64rm %vreg7, 1, %noreg, 8, %noreg; mem:LD8[%args.03](tbaa=<badref>) GR64:%vreg1,%vreg7 > TEST64rr %vreg1, %vreg1, %...

2018 Nov 25

3

BUGS in code generated for target i386 compiling __bswapdi3, and for target x86-64 compiling __bswapsi2()

...result for the input value 0x0123456789ABCDEF is 0xEFCDAB8967452301, but the compiled code produces 0x67452301EFCDAB89 And compiled for x86-64 this yields the following code (see <https://godbolt.org/z/uM9nvN>): __bswapsi2: # @__bswapsi2 mov eax, edi shr eax, 24 mov rcx, rdi shr rcx, 8 and ecx, 65280 or rax, rcx mov rcx, rdi shl rcx, 8 and ecx, 16711680 or rax, rcx and rdi, 255 shl rdi, 24 or rax, rdi ret __bswapdi2: # @__bswapdi2 bswap rdi mov rax, rdi ret Both are correct, but __bswaps...
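The two 64-bit values quoted in this snippet can be reproduced with a small portable C sketch. The function names are hypothetical, and the third function is only an assumption about arithmetic that yields the wrong value; the snippet does not say what the miscompiled i386 code actually does.

```c
#include <stdint.h>

/* Reference 32-bit byte swap using the same masks as the quoted assembly
 * (65280 = 0xFF00, 16711680 = 0xFF0000). */
uint32_t bswap32_ref(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Reference 64-bit byte swap: swap the bytes of each half, then
 * exchange the two halves. */
uint64_t bswap64_ref(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    return ((uint64_t)bswap32_ref(lo) << 32) | bswap32_ref(hi);
}

/* Illustration only: swapping the bytes of each half without exchanging
 * the halves gives the incorrect value quoted in the report. */
uint64_t bswap64_halves_not_exchanged(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    return ((uint64_t)bswap32_ref(hi) << 32) | bswap32_ref(lo);
}
```

For the input 0x0123456789ABCDEF, `bswap64_ref` returns 0xEFCDAB8967452301 and `bswap64_halves_not_exchanged` returns 0x67452301EFCDAB89.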

2011 Jun 23

0

[LLVMdev] llvm compilation of libc?

I would recommend Newlib. It's easy to configure and compile using Clang. http://sourceware.org/newlib/ - xi On Jun 23, 2011, at 2:07 AM, Gregory Malecha wrote: > Hello, > > I'm wondering if anyone had any success (even a small amount) compiling any variant of libc to llvm bitcode? > > -- > gregory malecha > _______________________________________________ >

2011 Jun 23

3

[LLVMdev] llvm compilation of libc?

Hello, I'm wondering if anyone had any success (even a small amount) compiling any variant of libc to llvm bitcode? -- gregory malecha -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20110622/b5fad437/attachment.html>

2018 Sep 11

2

Byte-wide stores aren't coalesced if interspersed with other stores

...hat allow it to its job. > > In the case at hand, with a manual 64bit store (this is on a 64bit > target), llvm then combines 8 byte-wide stores into one. > > > Without -combiner-global-alias-analysis it generates: > > movb $0, 1(%rdx) > movl 4(%rsi,%rdi), %ebx > movq %rbx, 8(%rcx) > movb $0, 2(%rdx) > movl 8(%rsi,%rdi), %ebx > movq %rbx, 16(%rcx) > movb $0, 3(%rdx) > movl 12(%rsi,%rdi), %ebx > movq %rbx, 24(%rcx) > movb $0, 4(%rdx) >...

2017 Mar 15

2

[LLD] Linking static library does not resolve symbols as gold/ld

...nc>: > 13832: 55 push %rbp > 13833: 48 89 e5 mov %rsp,%rbp > 13836: 53 push %rbx > 13837: 48 83 ec 18 sub $0x18,%rsp > 1383b: 48 89 7d e8 mov %rdi,-0x18(%rbp) > 1383f: 48 8b 45 e8 mov -0x18(%rbp),%rax > 13843: 48 89 c7 mov %rax,%rdi > -> 13846: e8 00 00 00 00 callq 1384b <func+0x19> > 1384b: 48 8b 45 e8 mov -0x18(%rbp),%rax > >...

2016 Oct 30

4

[Bug 98506] New: Pagefault in gf100_vm_flush

https://bugs.freedesktop.org/show_bug.cgi?id=98506 Bug ID: 98506 Summary: Pagefault in gf100_vm_flush Product: xorg Version: git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Driver/nouveau Assignee: nouveau at lists.freedesktop.org

2017 Mar 15

2

[LLD] Linking static library does not resolve symbols as gold/ld

...()>: > 13832: 55 push %rbp > 13833: 48 89 e5 mov %rsp,%rbp > 13836: 53 push %rbx > 13837: 48 83 ec 18 sub $0x18,%rsp > 1383b: 48 89 7d e8 mov %rdi,-0x18(%rbp) > 1383f: 48 8b 45 e8 mov -0x18(%rbp),%rax > 13843: 48 89 c7 mov %rax,%rdi > 13846: e8 00 00 00 00 callq 1384b <func()+0x19> > 13847: R_X86_64_PLT32 std::vector<record, &g...

2010 Aug 26

2

[LLVMdev] What does this error mean: psuedo instructions should be removed before code emission?

...r message to print it out to >>> errs()). >>> >>> It basically means that a pseudo wasn't lowered to something that >>> the jit can output before the jit was run. Is this on ToT? >> >> Insn before the error: TCRETURNri64 %RAX<kill>, 0, %RDI<kill>, >> %RAX<imp-def,dead>, %RDI<imp-def,dead>, %RSP<imp-use>, ... > > Odd. I thought TCReturn was being lowered. At any rate can you > file a bug with the .ll file that causes this? It should be getting lowered in emitEpilogue. Must be a bug somewhe...

2012 Jun 30

2

[LLVMdev] gcc bug?..segfault problem with getElfArchType.

...ectFile19createELFObjectFileEPNS_12MemoryBufferE+4>: push %rbx 0x00000000004cc869 <_ZN4llvm6object10ObjectFile19createELFObjectFileEPNS_12MemoryBufferE+5>: sub $0x38,%rsp 0x00000000004cc86d <_ZN4llvm6object10ObjectFile19createELFObjectFileEPNS_12MemoryBufferE+9>: mov %rdi,-0x38(%rbp) std::pair<unsigned char, unsigned char> Ident = getElfArchType(Object); 0x00000000004cc871 <_ZN4llvm6object10ObjectFile19createELFObjectFileEPNS_12MemoryBufferE+13>: mov -0x38(%rbp),%rax 0x00000000004cc875 <_ZN4llvm6object10ObjectFile19createELFObjectFileEPNS_12...

2012 May 24

4

[LLVMdev] use AVX automatically if present

...id } $ llc -o - avx.ll .file "avx.ll" .text .globl _fun1 .align 16, 0x90 .type _fun1,@function _fun1: # @_fun1 .cfi_startproc # BB#0: # %_L1 movaps (%rdi), %xmm0 movaps 16(%rdi), %xmm1 addps (%rsi), %xmm0 addps 16(%rsi), %xmm1 movaps %xmm1, 16(%rdi) movaps %xmm0, (%rdi) ret .Ltmp0: .size _fun1, .Ltmp0-_fun1 .cfi_endproc .section ".note.GNU-stack"...
