thr3ads.net - search: "bb0

2010 Oct 07

2

[LLVMdev] [Q] x86 peephole deficiency

...izer (http://llvm.org/PR8125) and now I am running into a deficiency of the x86 peephole optimizer (or jump-threader?). Here is what I get:

    andl $3, %edi
    je .LBB0_4
# BB#2: # %nz
    # in Loop: Header=BB0_1 Depth=1
    cmpl $2, %edi
    je .LBB0_6
# BB#3: # %nz.non-middle
    # in Loop: Header=BB0_1 Depth=1
    cmpl $2, %edi
    jbe .LBB0_4
# BB#5: # %sw.bb6...

2012 Apr 25

3

[LLVMdev] Not enough optimisations in the SelectionDAG phase?

...; preds = %27, %entry
    %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8
    %29 = icmp slt i32 %28, 0
    br i1 %29, label %27, label %loop.exit

loop.exit: ; preds = %27

llc will generate the following MIPS code:

$BB0_1:
    lui $3, 32800
    ori $3, $3, 1032
    lw $3, 0($3)
    bltz $3, $BB0_1
    nop
# BB#2:

The two operations lui and ori, which are used to calculate the memory address, are actually loop invariants. They are supposed to be moved out of the loop. I thought it might be a limitation of the MIPS backend. Then I t...
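The loop in this excerpt can be sketched in C as follows. This is only an illustration: the function name `poll_until_nonnegative` and the pointer parameter are assumptions, and in the post the address is the constant 2149581832 converted with `inttoptr` rather than an argument. The point is that the address is fixed before the loop, so only the volatile load and the comparison need to repeat.

```c
#include <stdint.h>

/* Hypothetical helper, not from the original post.
 * Spins until the value read through `reg` is non-negative and returns it.
 * The address held in `reg` never changes inside the loop, so computing it
 * is loop-invariant; the volatile load itself must stay in the loop. */
int32_t poll_until_nonnegative(const volatile int32_t *reg)
{
    int32_t value;
    do {
        value = *reg; /* volatile: re-read on every iteration */
    } while (value < 0);
    return value;
}
```

On real hardware the caller would pass the device register's address; passing the address of an ordinary variable is enough to exercise the loop.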

2010 Dec 14

2

[LLVMdev] Branch delay slots broken.

...ot surprising since the code is very similar. If I compile code with this snippet:

    while (n--) *s++ = (char) c;

I get this (for the Microblaze):

    swi r19, r1, 0
    add r3, r0, r0
    cmp r3, r3, r7
    beqid r3, ($BB0_3)
    brid ($BB0_1)
    add r19, r1, r0
    add r3, r5, r0
$BB0_2:
    addi r4, r3, 1
    addi r7, r7, -1
    add r8, r0, r0
    sbi r6, r3, 0
    cmp r8, r8, r7
    bneid r8, ($BB0_2)
    brid ($BB0_3)
    add r3...
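For reference, the one-line loop quoted above can be wrapped in a complete function to reproduce this kind of output. This is a minimal sketch: the function name `fill_bytes` and its signature are assumptions, since the excerpt shows only the loop body.

```c
#include <stddef.h>

/* Hypothetical wrapper around the loop from the post:
 * stores the byte value `c` into the first `n` bytes at `dst`. */
void *fill_bytes(void *dst, int c, size_t n)
{
    char *s = dst;
    while (n--)
        *s++ = (char) c;
    return dst;
}
```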

2010 Oct 07

0

[LLVMdev] [Q] x86 peephole deficiency

...d now I am running into a deficiency of the x86
> peephole optimizer (or jump-threader?). Here is what I get:
>
>
>     andl $3, %edi
>     je .LBB0_4
> # BB#2: # %nz
>     # in Loop: Header=BB0_1 Depth=1
>     cmpl $2, %edi
>     je .LBB0_6
> # BB#3: # %nz.non-middle
>     # in Loop: Header=BB0_1 Depth=1
>     cmpl $2, %edi
>     jbe .LBB0_4
> # BB#5:...

2013 Feb 20

3

[LLVMdev] Is va_arg correct on Mips backend?

...76($sp)
    sw $6, 72($sp)
    sw $5, 68($sp)
    lw $3, %got(__stack_chk_guard)($gp)
    lw $1, 0($3)
    sw $1, 56($sp)
    sw $4, 52($sp)
    sw $zero, 48($sp) // i
    sw $zero, 44($sp) // val
    sw $zero, 40($sp) // sum
    addiu $1, $sp, 68
    sw $1, 16($sp) // arg_ptr1
    sw $zero, 48($sp)
    b $BB0_2
    addiu $2, $zero, 40
$BB0_1: # in Loop: Header=BB0_2 Depth=1
    lw $1, 0($4) // $1 = *arg_ptr
    sw $1, 44($sp) // val
    lw $4, 40($sp) // sum
    addu $1, $4, $1
    sw $1, 40($sp) // sum += val
    lw $1, 48($sp)
    addiu $1, $1, 1
    sw $1, 48($sp)
$BB0_2: # =>This...

2013 Feb 20

0

[LLVMdev] Is va_arg correct on Mips backend?

..., %got(__stack_chk_guard)($gp)
>     lw $1, 0($3)
>     sw $1, 56($sp)
>     sw $4, 52($sp)
>     sw $zero, 48($sp) // i
>     sw $zero, 44($sp) // val
>     sw $zero, 40($sp) // sum
>     addiu $1, $sp, 68
>     sw $1, 16($sp) // arg_ptr1
>     sw $zero, 48($sp)
>     b $BB0_2
>     addiu $2, $zero, 40
> $BB0_1: # in Loop: Header=BB0_2 Depth=1
>     lw $1, 0($4) // $1 = *arg_ptr
>     sw $1, 44($sp) // val
>     lw $4, 40($sp) // sum
>     addu $1, $4, $1
>     sw $1, 40($sp) // sum += val
>     lw $1, 48($sp)
>     addiu $1, $1, 1
>     sw $1, 48($sp)
> $BB0_2:...

2012 Apr 29

0

[LLVMdev] Not enough optimisations in the SelectionDAG phase?

...= %27, %entry
>     %28 = load volatile i32* inttoptr (i64 2149581832 to i32*), align 8
>     %29 = icmp slt i32 %28, 0
>     br i1 %29, label %27, label %loop.exit
>
> loop.exit: ; preds = %27
>
> llc will generate the following MIPS code:
>
> $BB0_1:
>     lui $3, 32800
>     ori $3, $3, 1032
>     lw $3, 0($3)
>     bltz $3, $BB0_1
>     nop
> # BB#2:
>
>
> The two operations lui and ori, which are used to calculate the memory address, are actually loop invariants. They are supposed to be moved out of...

2018 Sep 20

3

Comparing Clang and GCC: only clang stores updated value in each iteration.

...=1
    .text
    .file "testfun.i"
    .globl b # -- Begin function b
    .p2align 4
    .type b,@function
b: # @b
# %bb.0: # %entry
    lrl %r0, a
.LBB0_1: # %do.body
    # =>This Inner Loop Header: Depth=1
    cije %r0, 0, .LBB0_3
# %bb.2: # %if.then
    # in Loop: Header=BB0_1 Depth=1
    ahi ...

2013 Sep 02

0

[LLVMdev] .globl

...# 4-byte Folded Spill
    sw $17, 20($sp) # 4-byte Folded Spill
    sw $16, 16($sp) # 4-byte Folded Spill
$tmp3:
    .cfi_offset 31, -4
$tmp4:
    .cfi_offset 18, -8
$tmp5:
    .cfi_offset 17, -12
$tmp6:
    .cfi_offset 16, -16
    addu $16, $2, $25
    move $17, $4
    lw $18, %call16(foo)($16)
$BB0_1: # %loop
    # =>This Inner Loop Header: Depth=1
    move $25, $18
    jalr $25
    move $gp, $16
    addiu $17, $17, -1
    bnez $17, $BB0_1
    nop
# BB#2: # %exit
    lw $16, 16($sp) # 4-byte Folded Relo...

2012 Apr 29

1

[LLVMdev] Not enough optimisations in the SelectionDAG phase?

...d volatile i32* inttoptr (i64 2149581832 to i32*), align 8
>>     %29 = icmp slt i32 %28, 0
>>     br i1 %29, label %27, label %loop.exit
>>
>> loop.exit: ; preds = %27
>>
>> llc will generate the following MIPS code:
>>
>> $BB0_1:
>>     lui $3, 32800
>>     ori $3, $3, 1032
>>     lw $3, 0($3)
>>     bltz $3, $BB0_1
>>     nop
>> # BB#2:
>>
>>
>> The two operations lui and ori, which are used to calculate the memory address, are actually loop invariants. They are supposed to be moved...

2013 Aug 29

2

[LLVMdev] .globl

I need to be able to emit .globl for the soft float routines used by mips16. The routines are called but there is no .globl definition for them. How can I do this? Background: I have a strange issue that I encountered with mips16 hard float. Part of mips16 hard float is to emit calls to runtime routines with the same signature as usual soft float routines, except that they are implemented

2013 Feb 19

0

[LLVMdev] Is va_arg correct on Mips backend?

Which part of the generated code do you think is not correct? Could you be more specific? I compiled this program with clang and ran it on a mips board. It returns the expected result (21).

On Tue, Feb 19, 2013 at 4:15 AM, Jonathan <gamma_chen at yahoo.com.tw> wrote:
> I checked the Mips backend's compile result for the following C code fragment.
> It does not seem correct. Is it my

2013 Feb 19

2

[LLVMdev] Is va_arg correct on Mips backend?

I checked the Mips backend's compile result for the following C code fragment. It does not seem correct. Is it my misunderstanding, or is it a bug?

//ch8_3.cpp
#include <stdarg.h>

int sum_i(int amount, ...)
{
  int i = 0;
  int val = 0;
  int sum = 0;

  va_list vl;
  va_start(vl, amount);
  for (i = 0; i < amount; i++)
  {
    val = va_arg(vl, int);
    sum += val;
  }
  va_end(vl);
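The excerpt cuts off after `va_end(vl);`. A self-contained version of the same idea might look like the sketch below. The closing `return sum;` is an assumption about how the original function ends, and the loop is condensed slightly.

```c
#include <stdarg.h>

/* Sums `amount` int arguments that follow the count.
 * The return statement is assumed; the excerpt ends before it. */
int sum_i(int amount, ...)
{
    int sum = 0;
    va_list vl;

    va_start(vl, amount);
    for (int i = 0; i < amount; i++) {
        sum += va_arg(vl, int); /* fetch the next int argument */
    }
    va_end(vl);

    return sum;
}
```

Called as `sum_i(3, 1, 2, 3)`, this adds the three trailing arguments. The caller has to pass exactly `amount` values of type `int`, because `va_arg` cannot check the count or the types.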

2017 Jul 01

2

KNL Assembly Code for Matrix Multiplication

...word ptr [rip + .LCPI0_4]
>>>>>     vpbroadcastq zmm6, qword ptr [rip + .LCPI0_5]
>>>>>     kxnorw k1, k0, k0
>>>>>     kshiftrw k1, k1, 8
>>>>>     vpbroadcastq zmm7, qword ptr [rip + .LCPI0_6]
>>>>>     .p2align 4, 0x90
>>>>> .LBB0_1: # %.preheader26
>>>>>     # =>This Loop Header: Depth=1
>>>>>     # Child Loop BB0_2 Depth 2
>>>>> #...

2010 Dec 14

2

[LLVMdev] Branch delay slots broken.

On 12/14/2010 04:28 PM, Wesley Peck wrote:
> On Dec 14, 2010, at 3:46 PM, Richard Pennington wrote:
>> Notice that the label $BB0_1 is missing. If I disable filling in the
>> branch delay slots, I get:
>
> Is this with the latest SVN HEAD version of LLVM or some other version? The delay slot filler and many other things have been updated for the Microblaze backend. In particular, the commit r120095 for the MBlaze ba...

2010 Dec 14

0

[LLVMdev] Branch delay slots broken.

On Dec 14, 2010, at 3:46 PM, Richard Pennington wrote:
> Notice that the label $BB0_1 is missing. If I disable filling in the
> branch delay slots, I get:

Is this with the latest SVN HEAD version of LLVM or some other version? The delay slot filler and many other things have been updated for the Microblaze backend. In particular, the commit r120095 for the MBlaze backend fixed...

2010 Oct 13

2

[LLVMdev] [Q] x86 peephole deficiency

...deficiency of the x86
>> peephole optimizer (or jump-threader?). Here is what I get:
>>
>>
>>     andl $3, %edi
>>     je .LBB0_4
>> # BB#2: # %nz
>>     # in Loop: Header=BB0_1 Depth=1
>>     cmpl $2, %edi
>>     je .LBB0_6
>> # BB#3: # %nz.non-middle
>>     # in Loop: Header=BB0_1 Depth=1
>>     cmpl $2, %edi
>>     jbe...

2014 May 10

6

[LLVMdev] Replacing Platform Specific IR Codes with Generic Implementation and Introducing Macro Facilities

On 10 May 2014, at 13:53, Tim Northover <t.p.northover at gmail.com> wrote:
> It doesn't make sense for everything though, particularly if you want
> target-specific IR to simply not exist. What would you map ARM's
> "ldrex" to on x86?

This isn't a great example. Having load-linked / store-conditional in the IR would make a number of transforms related to

2017 Nov 21

2

question about xray tls data initialization

...# %entry
    .p2align 1, 0x90
.Lxray_sled_0:
    .ascii "\353\t"
    nop word ptr [rax + rax + 512]
    sub rsp, 16
    .seh_stackalloc 16
    .seh_endprologue
    mov dword ptr [rsp + 12], ecx
    mov dword ptr [rsp + 8], 0
    mov dword ptr [rsp + 4], 0
.LBB0_1: # %for.cond
    # =>This Inner Loop Header: Depth=1
    mov eax, dword ptr [rsp + 4]
    cmp eax, dword ptr [rsp + 12]
    jge .LBB0_4
# BB#2: # %for.body...

2016 Oct 15

3

How to remove memcpy

...6, $fp, 1248
    move $4, $16
    addiu $6, $zero, 400
    jalr $25
    move $gp, $17
    lw $1, %got($main.b)($17)
    addiu $5, $1, %lo($main.b)
    lw $25, %call16(memcpy)($17)
    addiu $17, $fp, 848
    move $4, $17
    jalr $25
    addiu $6, $zero, 400
    sw $zero, 820($fp)
    sw $zero, 844($fp)
    addiu $2, $fp, 420
    b $BB0_2
    addiu $3, $fp, 20
$BB0_1:

search for: bb0_1