search for: bb0_2

Displaying 15 results from an estimated 15 matches for "bb0_2".

2013 Feb 20
3
[LLVMdev] Is va_arg correct on Mips backend?
...4 addu $gp, $2, $25 sw $7, 76($sp) sw $6, 72($sp) sw $5, 68($sp) lw $3, %got(__stack_chk_guard)($gp) lw $1, 0($3) sw $1, 56($sp) sw $4, 52($sp) sw $zero, 48($sp) // i sw $zero, 44($sp) // val sw $zero, 40($sp) // sum addiu $1, $sp, 68 sw $1, 16($sp) // arg_ptr1 sw $zero, 48($sp) b $BB0_2 addiu $2, $zero, 40 $BB0_1: # in Loop: Header=BB0_2 Depth=1 lw $1, 0($4) // $1 = *arg_ptr sw $1, 44($sp) // val lw $4, 40($sp) // sum addu $1, $4, $1 sw $1, 40($sp) // sum += val lw $1, 48($sp) addiu $1, $1, 1 sw $1, 48($sp) $BB0_2:...
2013 Feb 20
0
[LLVMdev] Is va_arg correct on Mips backend?
...2($sp) > sw $5, 68($sp) > lw $3, %got(__stack_chk_guard)($gp) > lw $1, 0($3) > sw $1, 56($sp) > sw $4, 52($sp) > sw $zero, 48($sp) // i > sw $zero, 44($sp) // val > sw $zero, 40($sp) // sum > addiu $1, $sp, 68 > sw $1, 16($sp) // arg_ptr1 > sw $zero, 48($sp) > b $BB0_2 > addiu $2, $zero, 40 > $BB0_1: # in Loop: Header=BB0_2 Depth=1 > lw $1, 0($4) // $1 = *arg_ptr > sw $1, 44($sp) // val > lw $4, 40($sp) // sum > addu $1, $4, $1 > sw $1, 40($sp) // sum += val > lw $1, 48($sp) > addiu $1, $1, 1 > sw $1...
2014 Oct 24
3
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
...m is not restricted to the NVPTX64 target. Below is a reduced example: __attribute__((global)) void foo(int n, int *output) { for (int i = 0; i < n; i += 3) { output[i] = i * i; } } Without widening, the loop body in the PTX (a low-level assembly-like language generated by NVPTX64) is: BB0_2: // =>This Inner Loop Header: Depth=1 mul.lo.s32 %r5, %r6, %r6; st.u32 [%rd4], %r5; add.s32 %r6, %r6, 3; add.s64 %rd4, %rd4, 12; setp.lt.s32 %p2, %r6, %r3; @%p2 bra BB0_2; in whi...
2013 Feb 19
0
[LLVMdev] Is va_arg correct on Mips backend?
Which part of the generated code do you think is not correct? Could you be more specific? I compiled this program with clang and ran it on a mips board. It returns the expected result (21). On Tue, Feb 19, 2013 at 4:15 AM, Jonathan <gamma_chen at yahoo.com.tw> wrote: > I checked the Mips backend's compile result for the following C code fragment. > It does not seem correct. Is it my
2013 Feb 19
2
[LLVMdev] Is va_arg correct on Mips backend?
I checked the Mips backend's compile result for the following C code fragment. It does not seem correct. Is it my misunderstanding, or is it a bug? //ch8_3.cpp #include <stdarg.h> int sum_i(int amount, ...) { int i = 0; int val = 0; int sum = 0; va_list vl; va_start(vl, amount); for (i = 0; i < amount; i++) { val = va_arg(vl, int); sum += val; } va_end(vl);
2013 Oct 03
1
[LLVMdev] Help with a Microblaze code generation problem.
...8 swi r3, r19, 24 swi r0, r19, 28 lwi r4, r19, 16 xor r3, r4, r3 lwi r4, r19, 20 or r3, r4, r3 addik r4, r0, 0 addik r5, r0, 1 swi r5, r19, 32 beqid r3, ($BB0_2) swi r4, r19, 36 lwi r3, r19, 36 swi r3, r19, 32 $BB0_2: lwi r3, r19, 32 add r1, r19, r0 lwi r19, r1, 4 rtsd r15, 8 addik r1, r1, 40 .end main Which is very similar to t...
2010 Dec 14
2
[LLVMdev] Branch delay slots broken.
...this snippet: while (n--) *s++ = (char) c; I get this (for the Microblaze): swi r19, r1, 0 add r3, r0, r0 cmp r3, r3, r7 beqid r3, ($BB0_3) brid ($BB0_1) add r19, r1, r0 add r3, r5, r0 $BB0_2: addi r4, r3, 1 addi r7, r7, -1 add r8, r0, r0 sbi r6, r3, 0 cmp r8, r8, r7 bneid r8, ($BB0_2) brid ($BB0_3) add r3, r4, r0 $BB0_3: Notice that the label $BB0_1 is missing. If I disab...
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
...// these 4 lines is crc >>= 1; // rather poor! } return ~crc; } See <https://godbolt.org/z/eYJeWt> (-O1) and <https://godbolt.org/z/zeExHm> (-O2) crc32be: # @crc32be xor eax, eax test esi, esi jne .LBB0_2 jmp .LBB0_5 .LBB0_4: # in Loop: Header=BB0_2 Depth=1 add rdi, 1 test esi, esi je .LBB0_5 .LBB0_2: # =>This Loop Header: Depth=1 add esi, -1 movzx edx, byte ptr [rdi] shl edx, 24 xor edx, eax mov ecx,...
2014 Sep 02
3
[LLVMdev] LICM promoting memory to scalar
...oii // BB#0: // %entry cbz w0, .LBB0_5 // BB#1: // %for.body.lr.ph mov w8, wzr cmp w0, #0 // =0 cinc w9, w0, lt asr w9, w9, #1 adrp x10, globalvar .LBB0_2: // %for.body // =>This Inner Loop Header: Depth=1 cmp w8, w9 b.hs .LBB0_4 // BB#3: // %if.then // in Loop: Header=BB0_2 Depth=1...
2017 Jul 01
2
KNL Assembly Code for Matrix Multiplication
...+ .LCPI0_6] >>>>> .p2align 4, 0x90 >>>>> .LBB0_1: # %.preheader26 >>>>> # =>This Loop Header: Depth=1 >>>>> # Child Loop BB0_2 Depth 2 >>>>> # Child Loop BB0_3 Depth >>>>> 3 >>>>> # Child Loop BB0_5 Depth >>>>> 3 >>>>> xor r11d, r11d >>>>> .p2a...
2014 Sep 02
2
[LLVMdev] LICM promoting memory to scalar
...bz w0, .LBB0_5 >> // BB#1: // %for.body.lr.ph >> mov w8, wzr >> cmp w0, #0 // =0 >> cinc w9, w0, lt >> asr w9, w9, #1 >> adrp x10, globalvar >> .LBB0_2: // %for.body >> // =>This Inner Loop Header: Depth=1 >> cmp w8, w9 >> b.hs .LBB0_4 >> // BB#3: // %if.then >>...
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
...> } >> return ~crc; >> } >> >> See <https://godbolt.org/z/eYJeWt> (-O1) and <https://godbolt.org/z/zeExHm> >> (-O2) >> >> crc32be: # @crc32be >> xor eax, eax >> test esi, esi >> jne .LBB0_2 >> jmp .LBB0_5 >> .LBB0_4: # in Loop: Header=BB0_2 Depth=1 >> add rdi, 1 >> test esi, esi >> je .LBB0_5 >> .LBB0_2: # =>This Loop Header: Depth=1 >> add esi, -1 >> movzx edx, byte ptr [...
2014 Sep 03
3
[LLVMdev] LICM promoting memory to scalar
...oii // BB#0: // %entry cbz w0, .LBB0_5 // BB#1: // %for.body.lr.ph mov w8, wzr cmp w0, #0 // =0 cinc w9, w0, lt asr w9, w9, #1 adrp x10, globalvar .LBB0_2: // %for.body // =>This Inner Loop Header: Depth=1 cmp w8, w9 b.hs .LBB0_4 // BB#3: // %if.then // in Loop: Header=BB0_2 Depth=1...
2016 Oct 15
3
How to remove memcpy
...l16(memcpy)($17) addiu $16, $fp, 1248 move $4, $16 addiu $6, $zero, 400 jalr $25 move $gp, $17 lw $1, %got($main.b)($17) addiu $5, $1, %lo($main.b) lw $25, %call16(memcpy)($17) addiu $17, $fp, 848 move $4, $17 jalr $25 addiu $6, $zero, 400 sw $zero, 820($fp) sw $zero, 844($fp) addiu $2, $fp, 420 b $BB0_2 addiu $3, $fp, 20 $BB0_1:
2018 Nov 28
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
...See <https://godbolt.org/z/eYJeWt> (-O1) and < >> https://godbolt.org/z/zeExHm> >> >> (-O2) >> >> >> >> crc32be: # @crc32be >> >> xor eax, eax >> >> test esi, esi >> >> jne .LBB0_2 >> >> jmp .LBB0_5 >> >> .LBB0_4: # in Loop: Header=BB0_2 Depth=1 >> >> add rdi, 1 >> >> test esi, esi >> >> je .LBB0_5 >> >> .LBB0_2: # =>This Loop Header: Depth=1 >> >>...