thr3ads.net - search: "lbb1

2014 Mar 12

2

[LLVMdev] Autovectorization questions

...n; ++i) A[i*7] += B[i*k]; } I replaced "int *A"/"int *B" with "double *A"/"double *B" and then compiled the sample with $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S -fslp-vectorize-aggressive and the loop body looks like: .LBB1_2: # %for.body # =>This Inner Loop Header: Depth=1 cltq vmovsd (%rsi,%rax,8), %xmm0 movq %r9, %r10 sarq $32, %r10 vaddsd (%rdi,%r10,8), %xmm0, %xmm0 vmovsd %xmm0, (%rdi,%...

[LLVMdev] Suboptimal code due to excessive spilling

2012 Mar 28

2

...p16: .cfi_def_cfa_offset 104 .Ltmp17: .cfi_offset %esi, -16 .Ltmp18: .cfi_offset %edi, -12 .Ltmp19: .cfi_offset %ebx, -8 pxor %xmm0, %xmm0 movl 112(%esp), %eax testl %eax, %eax je .LBB1_3 # BB#1: xorl %ebx, %ebx movl 108(%esp), %ecx movl 104(%esp), %edx xorl %esi, %esi .align 16, 0x90 .LBB1_2: # %.lr.ph.i # =>This Inner Loop Header: Depth=1 movsd (%edx,%ebx,8), %xmm2 addsd .LCPI1_0, %xmm2 movsd 16(%edx,%ebx,8), %xmm1 movsd %xmm1, (%esp) # 8-byte Spill movl %ebx, %edi addl $1, %edi addsd (%edx,%edi...

[LLVMdev] Suboptimal code due to excessive spilling

2012 Apr 05

0

...p16: .cfi_def_cfa_offset 104 .Ltmp17: .cfi_offset %esi, -16 .Ltmp18: .cfi_offset %edi, -12 .Ltmp19: .cfi_offset %ebx, -8 pxor %xmm0, %xmm0 movl 112(%esp), %eax testl %eax, %eax je .LBB1_3 # BB#1: xorl %ebx, %ebx movl 108(%esp), %ecx movl 104(%esp), %edx xorl %esi, %esi .align 16, 0x90 .LBB1_2: # %.lr.ph.i # =>This Inner Loop Header: Depth=1 movsd (%edx,%ebx,8), %xmm2 addsd .LCPI1_0, %xmm2 movsd 16(%edx,%ebx,8), %xmm1 movsd %xmm1, (%esp) # 8-byte Spill movl %ebx, %edi addl $1, %edi addsd (%edx,%edi...

[LLVMdev] Autovectorization questions

2014 Mar 12

4

...A"/"int *B" with "double *A"/"double *B" and then compiled the sample with >> >> $> ./clang -Ofast -ffast-math test.c -std=c99 -march=core-avx2 -S -o bb.S -fslp-vectorize-aggressive >> >> and the loop body looks like: >> >> .LBB1_2: # %for.body >> # =>This Inner Loop Header: Depth=1 >> cltq >> vmovsd (%rsi,%rax,8), %xmm0 >> movq %r9, %r10 >> sarq $32, %r10 >> vaddsd (%rd...

[LLVMdev] Reproducible testcase for r100044

2010 Oct 26

0

Attached is a .ll with a reproducible test case for the bug addressed by r100044 -- "Fix a nasty dangling-pointer heisenbug that could generate wrong code pretty much anywhere AFAICT." Doing llvm-as < sunkaddr.ll | llc with an llc after r100044 will generate .LBB1_2: # %if-false-block movl $1, 16(%rdi) movl 120(%rdi), %eax ret while one before will generate: .LBB1_2: # %if-false-block movl $1, 16(%rdi) movl $1, %eax ret There are some FileCheck direct...

[LLVMdev] Tight overlapping loops and performance

2009 Mar 02

0

On Mon, Mar 2, 2009 at 2:45 PM, Jonathan Turner <probata at hotmail.com> wrote: > For which version of gcc? I should mention I'm on OS X and using the LLVM > SVN. gcc 4.3. It's also possible this is processor-sensitive. >> First, try looking at the generated code... the code LLVM generates is >> probably not what you're expecting. I'm getting the

[LLVMdev] llvm register reload/spilling around calls

2010 Oct 20

0

On Oct 20, 2010, at 7:46 AM, Roland Scheidegger wrote: > On 20.10.2010 05:00, Jakob Stoklund Olesen wrote: >> Look in X86InstrControl.td. The call instructions are all prefixed >> by: >> >> let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11, FP0, FP1, FP2, >> FP3, FP4, FP5, FP6, ST0, ST1, MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, >> XMM0, XMM1, XMM2, XMM3,

[LLVMdev] llvm register reload/spilling around calls

2010 Oct 20

1

...------------===// We should investigate an instruction sinking pass. Consider this silly example in pic mode: #include <assert.h> void foo(int x) { assert(x); //... } we compile this to: _foo: subl $28, %esp call "L1$pb" "L1$pb": popl %eax cmpl $0, 32(%esp) je LBB1_2 # cond_true LBB1_1: # return # ... addl $28, %esp ret LBB1_2: # cond_true ... The PIC base computation (call+popl) is only used on one path through the code, but is currently always computed in the entry block. It would be better to sink the picbase computation down into the block for the asse...

[LLVMdev] Tight overlapping loops and performance

2009 Mar 03

3

...%esp movl $1999, %eax xorl %ecx, %ecx movl $1999, %edx .align 4,0x90 LBB1_1: ## loopto cmpl $1, %eax leal -1(%eax), %eax cmove %edx, %eax incl %ecx cmpl $999999999, %ecx jne LBB1_1 ## loopto LBB1_2: ## bb1 movl %eax, 4(%esp) movl $LC, (%esp) call _printf xorl %eax, %eax addl $12, %esp ret .section __TEXT,__cstring,cstring_literals LC: ## LC .asciz "Timeout: %i\n" .su...

[LLVMdev] Tight overlapping loops and performance

2009 Mar 02

3

> Date: Mon, 2 Mar 2009 13:41:45 -0800 > From: eli.friedman at gmail.com > To: llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Tight overlapping loops and performance > > Hmm, on my computer, I get around 2.5 seconds with both gcc -O3 and > llvm-gcc -O3 (using llvm-gcc from svn). Not sure what you're doing > differently; I wouldn't be surprised if it's

[LLVMdev] Alias analysis and instruction level parallelism

2008 Apr 03

2

Dan Gohman wrote: > I think this is trickier than it sounds; the reason GEPs are lowered > is to > allow strength-reduction and other things to do transformations on them. > It would require those passes to know how to update the mapping. Yes, I do appreciate the amount of work involved, and I am very open to other suggestions. What the backend really needs to know is what loads

[LLVMdev] llvm register reload/spilling around calls

2010 Oct 20

2

On 20.10.2010 05:00, Jakob Stoklund Olesen wrote: > On Oct 19, 2010, at 6:37 PM, Roland Scheidegger wrote: > >> Thanks for giving it a look! >> >> On 19.10.2010 23:21, Jakob Stoklund Olesen wrote: >>> On Oct 19, 2010, at 11:40 AM, Roland Scheidegger wrote: >>> >>>> So I saw that the code is doing lots of register >>>>

[LLVMdev] Problems compiling llvm-gcc4 frontend on x86_64

2007 May 26

0

...> __do_global_dtors_aux: > subq $8, %rsp > movq %rbp, (%rsp) > movq %rsp, %rbp > cmpb $0, completed.4705(%rip) > jne .LBB1_4 #UnifiedReturnBlock > .LBB1_1: #bb9.preheader > movq p.4704(%rip), %rax > movq (%rax), %rax > cmpq $0, %rax > je .LBB1_3 #bb16 > .LBB1_2: #bb > addq $4, p.4704(%rip) > call *%rax > movq p.4704(%rip), %rax > movq (%rax), %rax > cmpq $0, %rax > jne .LBB1_2 #bb > .LBB1_3: #bb16 > movb $1, completed.4705(%rip) > movq %rbp, %rsp > popq %rbp > ret > .LBB1_4: #UnifiedReturnBlock > movq %rbp...

[LLVMdev] Exception handling question

2010 Jan 22

0

...at function f: # @f .Leh_func_begin1: # BB#0: # %e subq $8, %rsp .Llabel4: .Llabel1: callq g .Llabel2: # BB#1: # %c addq $8, %rsp ret .LBB1_2: # %u .Llabel3: addq $8, %rsp ret .size f, .-f .Leh_func_end1: .section .gcc_except_table,"a",@progbits .align 4 GCC_except_table1: .byte 0 # Padding .byte 0...

[LLVMdev] Problems compiling llvm-gcc4 frontend on x86_64

2007 May 26

1

...rsp, %rbp > > cmpb $0, completed.4705(%rip) > > jne .LBB1_4 #UnifiedReturnBlock > > .LBB1_1: #bb9.preheader > > movq p.4704(%rip), %rax > > movq (%rax), %rax > > cmpq $0, %rax > > je .LBB1_3 #bb16 > > .LBB1_2: #bb > > addq $4, p.4704(%rip) > > call *%rax > > movq p.4704(%rip), %rax > > movq (%rax), %rax > > cmpq $0, %rax > > jne .LBB1_2 #bb > > .LBB1_3: #bb16 > > movb $1, completed.4705(%rip) > >...

[LLVMdev] Problems compiling llvm-gcc4 frontend on x86_64

2007 May 25

3

Hi all, I've run into problems compiling the llvm-gcc frontend on x86_64. Is this not supported, or am I making an error somewhere? The procedure I followed was: 1. Download LLVM 2.0 source as a tarball (from a few days ago, during the testing phase). 2. Download the llvm-gcc4 source today, as a tarball. 3. Extract both. 4. Configure LLVM as: ../src/configure --prefix=`pwd`../install

[LLVMdev] Exception handling question

2010 Jan 22

2

...func_begin1: > # BB#0: # %e > subq $8, %rsp > .Llabel4: > .Llabel1: > callq g > .Llabel2: > # BB#1: # %c > addq $8, %rsp > ret > .LBB1_2: > # %u > .Llabel3: > addq $8, %rsp > ret > .size f, .-f > .Leh_func_end1: > > .section .gcc_except_table,"a",@progbits > .align 4 > GCC_except_table...

[LLVMdev] Two labels around one instruction in Codegen

2007 Nov 06

1

...code (Llabel1 was supposed to be before the {cltd, idivl}, and Llabel2, which was supposed to come after, is not generated) test: .Leh_func_begin1: .Llabel4: movl $2, %eax movl 4(%esp), %ecx cltd idivl %ecx .Llabel1: .LBB1_1: # continue ret .LBB1_2: # unwindblock Thanks Duncan, Nicolas

[LLVMdev] regression: double spaced asm output for thumb-2

2009 Jul 17

2

...e- spaced, for example: fstd d13, [sp, #+40] fstd d12, [sp, #+32] fstd d11, [sp, #+24] fstd d10, [sp, #+16] fstd d9, [sp, #+8] fstd d8, [sp] sub sp, sp, #872 mov r4, r1 mov r5, r0 cmp r0, #2 ble LBB1_133 @ entry.bb2.i_crit_edge LBB1_1: @ bb.i ldr r0, [r4, #+8] bl L_atoi$stub LBB1_2: @ bb2.i str r0, [sp, #+80] cmp r5, #1 ble LBB1_134 @ bb2.i.dealwithargs.exit_crit_edge LBB1_3: @ bb3.i ldr r0, [r4, #+4] bl L_atoi$stub LBB1_4: @ dealwithargs.exit mov r4, r0 str r4, [sp, #+64] mov r0, #32 ldr r5, [sp, #+80] mov r1, r5 bl L___divsi3$stub str r0, [sp, #+72] mov r...

[LLVMdev] Miscompilation on MingW32

2008 Jun 11

0

...%eax call __alloca movl %esp, %ebx movl %esi, %eax call __alloca movl %esp, -16(%ebp) movl %esi, %eax call __alloca movl 8(%ebp), %eax movl %eax, (%edi) movl %eax, (%ebx) movl (%edi), %eax addl %eax, %eax addl %eax, %eax movl %eax, (%esp) <=== should be 8(%esp) or -40(%ebp) ? LBB1_2: # return movl -16(%ebp), %eax movl (%eax), %eax leal -12(%ebp), %esp popl %esi popl %edi popl %ebx popl %ebp ret .align 16 .globl _main .def _main; .scl 2; .type 32; .endef _main: pushl %ebp movl %esp, %ebp subl $8, %esp call ___main movl $1, (%esp) call _tmp addl $8, %esp po...
