thr3ads.net - search: "lbb1

2006 Jul 09

2

[LLVMdev] Critical edges

...creating infinite loops. Could someone help me fixing the code below? It is creating assembly like this one below. Block LBB1_9 was inserted to break the critical edge between blocks LBB1_3 and LBB1_8. But it changes the semantics of the original program, because, before, LBB1_8 was falling through LBB1_4, and now it is falling on LBB1_9.

LBB1_3: ;no_exit
    lis r4, 21845
    ori r4, r4, 21846
    mulhw r4, r2, r4
    addi r5, r2, -1
    li r6, -1
    srwi r6, r4, 31
    add r4, r4, r6
    mulli r4, r4, 3
    li r6, 1
    subf r2, r4, r2
    cmpwi cr0, r...
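The fall-through problem described in this post can be illustrated with a small model. What follows is a minimal Python sketch, not LLVM code: `Block`, `cfg_edges`, `critical_edges`, and `split_critical_edges` are hypothetical names, and the four-block CFG is invented for illustration rather than taken from the thread. The sketch assumes that a block splitting a branch edge can be appended at the end of the layout with an explicit jump, and that only a block splitting a fall-through edge needs to sit between its two neighbours, so no existing fall-through is redirected.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    branches: list = field(default_factory=list)  # explicit branch targets, by block name
    falls_through: bool = True  # can control run off the end into the next block in layout?


def cfg_edges(layout):
    """Return the set of (src, dst) edges implied by branches and fall-through."""
    edges = set()
    for i, blk in enumerate(layout):
        edges.update((blk.name, tgt) for tgt in blk.branches)
        if blk.falls_through and i + 1 < len(layout):
            edges.add((blk.name, layout[i + 1].name))
    return edges


def critical_edges(layout):
    """An edge is critical if its source has >1 successor and its target has >1 predecessor."""
    edges = cfg_edges(layout)
    n_succ = Counter(src for src, _ in edges)
    n_pred = Counter(dst for _, dst in edges)
    return {(s, d) for s, d in edges if n_succ[s] > 1 and n_pred[d] > 1}


def split_critical_edges(layout):
    """Return a new layout in which every critical edge goes through an empty block."""
    # Blocks that split branch edges are appended at the end, so the last
    # original block must not fall through into them.
    assert not layout[-1].falls_through
    critical = critical_edges(layout)
    result, tail = [], []
    for i, blk in enumerate(layout):
        copy = Block(blk.name, list(blk.branches), blk.falls_through)
        result.append(copy)
        for k, tgt in enumerate(blk.branches):
            if (blk.name, tgt) in critical:
                # Branch edge: retarget the branch; the new block jumps explicitly.
                split = Block(f"{blk.name}.{tgt}.split", [tgt], falls_through=False)
                copy.branches[k] = split.name
                tail.append(split)
        if blk.falls_through and (blk.name, layout[i + 1].name) in critical:
            # Fall-through edge: the new block sits directly between the two blocks.
            result.append(Block(f"{blk.name}.{layout[i + 1].name}.ft"))
    return result + tail


# Hypothetical loop: "head" exits or falls into "body"; "body" loops back or falls into "exit".
layout = [
    Block("entry"),
    Block("head", ["exit"]),
    Block("body", ["head"]),
    Block("exit", falls_through=False),
]
new_layout = split_critical_edges(layout)
print([b.name for b in new_layout])
```

In this model, placing a splitting block right after a block that used to fall through to something else reproduces the symptom from the post; appending it at the end with an explicit jump avoids that.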

[LLVMdev] MI scheduler produce badly code with inline function

2013 Oct 14

2

...t=arm -mfloat-abi=hard -mcpu=cortex-a9
$opt foo.bc -O3 -unroll-count=4 -o foo.opt.bc
$llc foo.opt.bc -o foo.opt.s -march=arm -mcpu=cortex-a9 -enable-misched

(ps. I had checked with debug-pass=structure, so I think they are equivalently)

but the result is different:
You can find the LBB1_4 of foo.s, it always reuses the same reg for computation, but LBB1_4 of foo.opt.s doesn't.

My question is how to just use clang (method A) to achieve B result?
Or i am missing something here?

I really appreciate any help and suggestions.
Thanks
Kuan-Hsu

------- file link -------
foo.c:...

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 06

4

...mov eax, edx | and edx, -306674912
    cmovns eax, r8d | xor eax, edx
    add ecx, 1
    jne .LBB0_3
    jmp .LBB0_4
.LBB0_5:
    ret
crc32le: # @crc32le
    test esi, esi
    je .LBB1_1
    mov eax, -1
.LBB1_4: # =>This Loop Header: Depth=1
    add esi, -1
    movzx ecx, byte ptr [rdi]
    xor eax, ecx
    mov r8d, -8
.LBB1_5: # Parent Loop BB1_4 Depth=1 | # 4 instructions instead of 7, and
    mov edx, eax | # neither r8 nor rcx clobbered!
    sh...
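The loop in this listing has the shape of a bitwise, reflected CRC-32: the immediate -306674912 is 0xEDB88320 read as a signed 32-bit value, the accumulator starts at -1, and the inner counter covers 8 bits per byte. A minimal Python sketch of that computation is given here for orientation. The function name mirrors the label in the listing; the final inversion is an assumption (the common CRC-32 convention), since the snippet is cut off before the function returns.

```python
POLY = -306674912 & 0xFFFFFFFF  # the immediate from the listing, as an unsigned 32-bit value


def crc32le(data: bytes) -> int:
    """Bitwise CRC-32 over `data`, least significant bit first."""
    crc = 0xFFFFFFFF  # corresponds to "mov eax, -1"
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # If the low bit is set, shift and fold in the polynomial;
            # otherwise just shift. The mask makes this branch-free.
            mask = -(crc & 1) & POLY
            crc = (crc >> 1) ^ mask
    return crc ^ 0xFFFFFFFF  # assumed final inversion, not visible in the snippet


print(hex(crc32le(b"123456789")))
```

The masked form computes the same value as a conditional XOR; which instruction sequence a compiler picks for it is what the thread is about.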

[LLVMdev] Critical edges

2006 Jul 09

0

...ops. Could someone help me fixing the
> code below? It is creating assembly like this one below. Block LBB1_9 was
> inserted to break the critical edge between blocks LBB1_3 and LBB1_8. But
> it changes the semantics of the original program, because, before, LBB1_8
> was falling through LBB1_4, and now it is falling on LBB1_9.
>
> LBB1_3: ;no_exit
> lis r4, 21845
> ori r4, r4, 21846
> mulhw r4, r2, r4
> addi r5, r2, -1
> li r6, -1
> srwi r6, r4, 31
> add r4, r4, r6
> mulli r4, r4, 3
> li r6, 1...

[LLVMdev] MI scheduler produce badly code with inline function

2013 Oct 16

3

...rereading the IR can affect the optimizations.
>
> Sorry. I’ve been trying to think of a way to improve this situation.
>
> -Andy
>
> (ps. I had checked with debug-pass=structure, so I think they are
> equivalently)
>
> but the result is different:
> You can find the LBB1_4 of foo.s, it always reuses the same reg for
> computation, but LBB1_4 of foo.opt.s doesn't.
>
> My question is how to just use clang (method A) to achieve B result?
> Or i am missing something here?
>
> I really appreciate any help and suggestions.
> Thanks
>
> Kuan-H...

[LLVMdev] MI scheduler produce badly code with inline function

2013 Oct 16

0

...IR can affect the optimizations.
>
> Sorry. I’ve been trying to think of a way to improve this situation.
>
> -Andy
>
>> (ps. I had checked with debug-pass=structure, so I think they are equivalently)
>>
>> but the result is different:
>> You can find the LBB1_4 of foo.s, it always reuses the same reg for computation, but LBB1_4 of foo.opt.s doesn't.
>>
>> My question is how to just use clang (method A) to achieve B result?
>> Or i am missing something here?
>>
>> I really appreciate any help and suggestions.
>> T...

Expected constant simplification not happening

2016 Feb 11

3

...rcx), %rax
    cmovneq %r15, %rax
    movl $2298949, %esi ## imm = 0x231445
    movq %r12, %rdi
    movq %r14, %rdx
    callq *(%rax)
---

and clang -O3:
---
    movq -16(%r12), %rax
    movl -4(%rax), %ecx
    andl $2298949, %ecx ## imm = 0x231445
    cmpl $2298949, (%rax,%rcx) ## imm = 0x231445
    jne LBB1_4
    leaq 8(%rax,%rcx), %rax
    jmp LBB1_5
    .align 4, 0x90
LBB1_4:
    movq %r15, %rax
LBB1_5:
    movl $2298949, %esi ## imm = 0x231445
    movq %r12, %rdi
    movq %r14, %rdx
    callq *(%rax)
---

As you can see in both cases the constant $2298949 is replicated 3 times. I would have expected something like...

[LLVMdev] MI scheduler produce badly code with inline function

2013 Oct 15

0

...process of serializing and rereading the IR can affect the optimizations.

Sorry. I’ve been trying to think of a way to improve this situation.

-Andy

> (ps. I had checked with debug-pass=structure, so I think they are equivalently)
>
> but the result is different:
> You can find the LBB1_4 of foo.s, it always reuses the same reg for computation, but LBB1_4 of foo.opt.s doesn't.
>
> My question is how to just use clang (method A) to achieve B result?
> Or i am missing something here?
>
> I really appreciate any help and suggestions.
> Thanks
>
> Kuan-Hs...

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 27

2

...| xor eax, edx
>> add ecx, 1
>> jne .LBB0_3
>> jmp .LBB0_4
>> .LBB0_5:
>> ret
>> crc32le: # @crc32le
>> test esi, esi
>> je .LBB1_1
>> mov eax, -1
>> .LBB1_4: # =>This Loop Header: Depth=1
>> add esi, -1
>> movzx ecx, byte ptr [rdi]
>> xor eax, ecx
>> mov r8d, -8
>> .LBB1_5: # Parent Loop BB1_4 Depth=1 | # 4 instructions instead of 7, and
>> mov edx, eax...

[LLVMdev] tail call optimization question

2011 Dec 22

1

...il call instruction:
<snip>
_ack.15: ## @ack.15
Leh_func_begin1:
## BB#0: ## %entry
    pushq %rbx
Ltmp1:
Ltmp2:
    testq %rdi, %rdi
    jle LBB1_3
## BB#1: ## %if.no
    movq %rdi, %rbx
    testq %rsi, %rsi
    jle LBB1_4
## BB#2: ## %if.no2
    decq %rsi
    movq %rbx, %rdi
    callq _ack.15
    movq %rbx, %rdi
    decq %rdi
    movq %rax, %rsi
    popq %rbx
    jmp _ack.15 ## TAILCALL
LBB1_3: ## %if.yes
    incq %rsi
    movq %rsi, %rax
    popq %rbx
    ret
LBB1_4:...
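The control flow in this listing matches the Ackermann function: return n + 1 when m <= 0, otherwise make one ordinary call for ack(m, n - 1) and then jump back into the function with m - 1 (the `jmp _ack.15 ## TAILCALL`). Below is a minimal Python sketch of the same shape, with the tail call written as a loop because Python does not eliminate tail calls. The body of LBB1_4 is elided in the snippet, so the `ack(m - 1, 1)` case is an assumption taken from the textbook definition.

```python
def ack(m: int, n: int) -> int:
    """Ackermann function with the outer (tail) call turned into a loop."""
    while True:
        if m <= 0:
            return n + 1  # LBB1_3: incq %rsi; ret
        if n <= 0:
            # Assumed from the textbook definition; LBB1_4 is cut off in the listing.
            m, n = m - 1, 1
            continue
        n = ack(m, n - 1)  # the remaining real call (callq _ack.15)
        m = m - 1          # then re-enter with m - 1 (jmp _ack.15 ## TAILCALL)


print(ack(2, 3))
```

Only the inner call consumes stack here, which is the effect the tail-call marking has in the generated code.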

Expected constant simplification not happening

2016 Dec 07

1

...---
>
>
> and clang -O3:
> ---
> movq -16(%r12), %rax
> movl -4(%rax), %ecx
> andl $2298949, %ecx ## imm = 0x231445
> cmpl $2298949, (%rax,%rcx) ## imm = 0x231445
> jne LBB1_4
> leaq 8(%rax,%rcx), %rax
> jmp LBB1_5
> .align 4, 0x90
> LBB1_4:
> movq %r15, %rax
> LBB1_5:
> movl $2298949, %esi ## imm = 0x231445
> movq %r12, %rdi
>...

[LLVMdev] MI scheduler produce badly code with inline function

2013 Oct 21

1

...ations.
>>
>> Sorry. I’ve been trying to think of a way to improve this situation.
>>
>> -Andy
>>
>> (ps. I had checked with debug-pass=structure, so I think they are
>> equivalently)
>>
>> but the result is different:
>> You can find the LBB1_4 of foo.s, it always reuses the same reg for
>> computation, but LBB1_4 of foo.opt.s doesn't.
>>
>> My question is how to just use clang (method A) to achieve B result?
>> Or i am missing something here?
>>
>> I really appreciate any help and suggestions.
>...

[LLVMdev] Critical edges

2006 Jul 05

0

> If you don't want critical edges in the machine code CFG, you're going to
> have to write a machine code CFG critical edge splitting pass: LLVM
> doesn't currently have one.
>
> -Chris

Hey guys, I've coded a pass to break the critical edges of the machine control flow graph. The program works fine, but I am sure it is not the right way of implementing it.

[LLVMdev] Critical edges

2006 Jul 04

2

On Tue, 4 Jul 2006, Fernando Magno Quintao Pereira wrote:
> However, it does not remove all the critical edges. I am getting a very
> weird dataflow graph (even without the Break Critical edges pass). The
> dataflow generated by MachineFunction::dump() for the program below is
> given here:
> http://compilers.cs.ucla.edu/fernando/projects/soc/images/loop_no_crit2.pdf
...
> The

[LLVMdev] Problems compiling llvm-gcc4 frontend on x86_64

2007 May 26

0

... call frame_dummy
> .text
> # End of file scope inline assembly
>
>
> .text
> .align 16
> .type __do_global_dtors_aux, @function
> __do_global_dtors_aux:
> subq $8, %rsp
> movq %rbp, (%rsp)
> movq %rsp, %rbp
> cmpb $0, completed.4705(%rip)
> jne .LBB1_4 #UnifiedReturnBlock
> .LBB1_1: #bb9.preheader
> movq p.4704(%rip), %rax
> movq (%rax), %rax
> cmpq $0, %rax
> je .LBB1_3 #bb16
> .LBB1_2: #bb
> addq $4, p.4704(%rip)
> call *%rax
> movq p.4704(%rip), %rax
> movq (%rax), %rax
> cmpq $0, %rax
> jne .LBB1_...

[LLVMdev] Problems compiling llvm-gcc4 frontend on x86_64

2007 May 26

1

... > .text
> > .align 16
> > .type __do_global_dtors_aux, @function
> > __do_global_dtors_aux:
> > subq $8, %rsp
> > movq %rbp, (%rsp)
> > movq %rsp, %rbp
> > cmpb $0, completed.4705(%rip)
> > jne .LBB1_4 #UnifiedReturnBlock
> > .LBB1_1: #bb9.preheader
> > movq p.4704(%rip), %rax
> > movq (%rax), %rax
> > cmpq $0, %rax
> > je .LBB1_3 #bb16
> > .LBB1_2: #bb
> > addq $4, p.4704(%rip)
> > call *%rax ...

[LLVMdev] Problems compiling llvm-gcc4 frontend on x86_64

2007 May 25

3

Hi all, I've run into problems compiling the llvm-gcc frontend on x86_64. Is this not supported, or am I making an error somewhere? The procedure I followed was:

1. Download LLVM 2.0 source as a tarball (from a few days ago, during the testing phase).
2. Download the llvm-gcc4 source today, as a tarball.
3. Extract both.
4. Configure LLVM as: ../src/configure --prefix=`pwd`../install

Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)

2018 Nov 28

2

...jne .LBB0_3
>> >> jmp .LBB0_4
>> >> .LBB0_5:
>> >> ret
>> >> crc32le: # @crc32le
>> >> test esi, esi
>> >> je .LBB1_1
>> >> mov eax, -1
>> >> .LBB1_4: # =>This Loop Header: Depth=1
>> >> add esi, -1
>> >> movzx ecx, byte ptr [rdi]
>> >> xor eax, ecx
>> >> mov r8d, -8
>> >> .LBB1_5: # Parent Loop BB1_4 Depth=1 | # 4 instructions instead of ...

[LLVMdev] regression: double spaced asm output for thumb-2

2009 Jul 17

2

...]
    sub sp, sp, #872
    mov r4, r1
    mov r5, r0
    cmp r0, #2
    ble LBB1_133 @ entry.bb2.i_crit_edge
LBB1_1: @ bb.i
    ldr r0, [r4, #+8]
    bl L_atoi$stub
LBB1_2: @ bb2.i
    str r0, [sp, #+80]
    cmp r5, #1
    ble LBB1_134 @ bb2.i.dealwithargs.exit_crit_edge
LBB1_3: @ bb3.i
    ldr r0, [r4, #+4]
    bl L_atoi$stub
LBB1_4: @ dealwithargs.exit
    mov r4, r0
    str r4, [sp, #+64]
    mov r0, #32
    ldr r5, [sp, #+80]
    mov r1, r5
    bl L___divsi3$stub
    str r0, [sp, #+72]
    mov r0, #64
    mov r1, r5
    bl L___divsi3$stub
    str r0, [sp, #+76]
    ldr r0, LCPI1_30
...

OpenJDK8 failed to work after compiled by LLVM 8 for X86

2018 Sep 11

3

Hi Dimitry, Thanks for your kind response! Thanks for the commit message of Jung's patch, I found that the bug had been fixed in OpenJDK 12 by Zhengyu https://bugs.openjdk.java.net/browse/JDK-8205965 But only backported to 11. So Jung could backport it for OpenJDK 8, thanks a lot! But I argue that the root cause might be in the compiler side, why clang-3.9.1, gcc-6.4.1 couldn't

search for: lbb1_4