thr3ads.net - search: "bb0_5"

2012 Mar 07

2

[LLVMdev] "Machine LICM" for Constants?

...isted. Here's an example using the mips-unknown-unknown target and Clang/LLVM HEAD. From newlib's implementation of strncat:

    #define DETECTNULL(X) (((X) - 0x01010101) & ~(X) & 0x80808080)
    while (!DETECTNULL (*aligned_s1))
      aligned_s1++;

This loop gets lowered under -O3 to:

    $BB0_5:
      lui $3, 32896
      lui $7, 65278
      ori $3, $3, 32896    ###### Materialize 0x80808080
      lw $8, 4($2)
      nop
      and $9, $8, $3
      ori $7, $7, 65279    ###### Materialize -(0x01010101)
      addiu $2, $2, 4
      xor $3, $9, $3
      addu $7, $8, $7
      and $3, $3, $7
      beq $3, $zero, $BB0_5

There are a ton...
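The arithmetic behind the two "Materialize" comments, and the zero-byte test the macro performs, can be checked with a small C sketch. The helper names `lui_ori` and `has_zero_byte` are hypothetical and do not come from the thread. The sketch assumes a 32-bit `int` and models only what a `lui`/`ori` pair computes, not how any compiler pass would hoist it out of the loop.

```c
#include <assert.h>   /* only needed if you add checks like the ones in the note below */
#include <stdint.h>

/* Macro as quoted in the post (from newlib's strncat). */
#define DETECTNULL(X) (((X) - 0x01010101) & ~(X) & 0x80808080)

/* Hypothetical helper: the value a MIPS32 lui/ori pair leaves in a register.
 * lui places the 16-bit immediate in the upper half, ori fills the lower half. */
static uint32_t lui_ori(uint32_t hi16, uint32_t lo16)
{
    return (hi16 << 16) | lo16;
}

/* Hypothetical wrapper: nonzero when any byte of `word` is 0x00. */
static int has_zero_byte(uint32_t word)
{
    return DETECTNULL(word) != 0;
}
```

With these definitions, `lui_ori(32896, 32896)` gives `0x80808080` and `lui_ori(65278, 65279)` gives `0xFEFEFEFF`, which is `-(0x01010101)` as a 32-bit value. Both constants are the same on every iteration, which is why rebuilding them inside `$BB0_5` is redundant work.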

2012 Mar 07

0

[LLVMdev] "Machine LICM" for Constants?

...s-unknown-unknown target and Clang/LLVM
> HEAD. From newlib's implementation of strncat:
>
> #define DETECTNULL(X) (((X) - 0x01010101) & ~(X) & 0x80808080)
> while (!DETECTNULL (*aligned_s1))
>   aligned_s1++;
>
> This loop gets lowered under -O3 to:
>
> $BB0_5:
>   lui $3, 32896
>   lui $7, 65278
>   ori $3, $3, 32896    ###### Materialize 0x80808080
>   lw $8, 4($2)
>   nop
>   and $9, $8, $3
>   ori $7, $7, 65279    ###### Materialize -(0x01010101)
>   addiu $2, $2, 4
>   xor $3, $9, $3
>   addu $7, $8, $7
>   and $3, $3, $...

2013 Feb 20

3

[LLVMdev] Is va_arg correct on Mips backend?

...8($sp)
$BB0_2:                  # =>This Inner Loop Header: Depth=1
    lw $1, 52($sp)
    lw $4, 48($sp)
    slt $1, $4, $1
    beq $1, $zero, $BB0_6
    nop
# BB#3:                  # in Loop: Header=BB0_2 Depth=1
    lw $4, 16($sp)       // arg_ptr1
    sltu $1, $2, $4
    bne $1, $zero, $BB0_5
    nop
# BB#4:                  # in Loop: Header=BB0_2 Depth=1
    addiu $1, $4, 8      // arg_ptr2 + 8
    lw $5, 28($sp)       // arg_ptr2_offset has no initial value
    sw $1, 16($sp)
    b $BB0_1
    addu $4, $5, $4
$BB0_5:                  # in Loop: Header=BB0_2 Depth=1
    lw $4, 24(...

2012 Mar 08

1

[LLVMdev] "Machine LICM" for Constants?

...g/LLVM
>> HEAD. From newlib's implementation of strncat:
>>
>> #define DETECTNULL(X) (((X) - 0x01010101)& ~(X)& 0x80808080)
>> while (!DETECTNULL (*aligned_s1))
>>   aligned_s1++;
>>
>> This loop gets lowered under -O3 to:
>>
>> $BB0_5:
>>   lui $3, 32896
>>   lui $7, 65278
>>   ori $3, $3, 32896    ###### Materialize 0x80808080
>>   lw $8, 4($2)
>>   nop
>>   and $9, $8, $3
>>   ori $7, $7, 65279    ###### Materialize -(0x01010101)
>>   addiu $2, $2, 4
>>   xor $3, $9, $3...

2013 Feb 20

0

[LLVMdev] Is va_arg correct on Mips backend?

...# =>This Inner Loop Header: Depth=1
>     lw $1, 52($sp)
>     lw $4, 48($sp)
>     slt $1, $4, $1
>     beq $1, $zero, $BB0_6
>     nop
> # BB#3:                # in Loop: Header=BB0_2 Depth=1
>     lw $4, 16($sp)     // arg_ptr1
>     sltu $1, $2, $4
>     bne $1, $zero, $BB0_5
>     nop
> # BB#4:                # in Loop: Header=BB0_2 Depth=1
>     addiu $1, $4, 8    // arg_ptr2 + 8
>     lw $5, 28($sp)     // arg_ptr2_offset has no initial value
>     sw $1, 16($sp)
>     b $BB0_1
>     addu $4, $5, $4
> $BB0_5:                # in Loop: H...

2014 May 10

6

[LLVMdev] Replacing Platform Specific IR Codes with Generic Implementation and Introducing Macro Facilities

On 10 May 2014, at 13:53, Tim Northover <t.p.northover at gmail.com> wrote:
> It doesn't make sense for everything though, particularly if you want
> target-specific IR to simply not exist. What would you map ARM's
> "ldrex" to on x86?

This isn't a great example. Having load-linked / store-conditional in the IR would make a number of transforms related to

2011 Jul 06

0

[LLVMdev] code generation removes duplicated instructions

On 6 July 2011 02:31, D S Khudia <daya.khudia at gmail.com> wrote:
> %0 = load i32* %i, align 4
> %HV14_ = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
> %1 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
> %HVCmp7 = icmp ne i32* %1, %HV14_
> br i1 %HVCmp7, label %relExit, label %bb.split
>
> So that HV14_ is a new instruction and I am

2011 Jul 06

2

[LLVMdev] code generation removes duplicated instructions

Hello, I am duplicating a few instructions in a basic block and splitting it. The following is an example.

bb:                      ; preds = %bb1
  %0 = load i32* %i, align 4
  %1 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
  store i32 0, i32* %1, align 4
  %2 = load i32* %i, align 4
  %3 = getelementptr inbounds [100 x i32]* %last_added, i32 0, i32 %2
  store

2011 Jul 06

2

[LLVMdev] code generation removes duplicated instructions

...nds [100 x i32]* %a, i32 0, i32 %0
  %1 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %0
  %HVCmp15 = icmp ne i32* %1, %HV10_
  br i1 %HVCmp15, label %relExit, label %bb.split

x86 asm:

.LBB0_1:                 # %bb
                         # in Loop: Header=BB0_5 Depth=1
    leal 972(%esp), %eax
    movl 568(%esp), %ecx
    imull $4, %ecx, %edx
    addl %eax, %edx
    imull $4, %ecx, %ecx
    addl %eax, %ecx
    cmpl %edx, %ecx
    movl %ecx, 508(%esp)     # 4-byte Spill
    jne .LBB0_88

arm asm:

.LBB0_1:                 @ %bb...

2013 Feb 19

0

[LLVMdev] Is va_arg correct on Mips backend?

Which part of the generated code do you think is not correct? Could you be more specific? I compiled this program with clang and ran it on a mips board. It returns the expected result (21).

On Tue, Feb 19, 2013 at 4:15 AM, Jonathan <gamma_chen at yahoo.com.tw> wrote:
> I check the Mips backend for the following C code fragment compile result.
> It seems not correct. Is it my

2013 Feb 19

2

[LLVMdev] Is va_arg correct on Mips backend?

I checked the Mips backend's compiled output for the following C code fragment. It does not seem correct. Is it my misunderstanding, or is it a bug?

//ch8_3.cpp
#include <stdarg.h>

int sum_i(int amount, ...)
{
  int i = 0;
  int val = 0;
  int sum = 0;
  va_list vl;
  va_start(vl, amount);
  for (i = 0; i < amount; i++)
  {
    val = va_arg(vl, int);
    sum += val;
  }
  va_end(vl);
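Since the quoted fragment is cut off before the function returns, here is a complete, self-contained sketch of the same pattern for reference. The name `sum_ints` is hypothetical, the final `return` is an assumption about how the original ends, and the call shown in the note below is only one possible set of arguments, not the poster's actual test.

```c
#include <assert.h>   /* only needed if you add checks like the one in the note below */
#include <stdarg.h>

/* Hypothetical stand-in for the truncated sum_i: adds up `amount`
 * int arguments passed after the named parameter. */
int sum_ints(int amount, ...)
{
    int sum = 0;
    va_list vl;

    va_start(vl, amount);            /* start reading after `amount` */
    for (int i = 0; i < amount; i++) {
        sum += va_arg(vl, int);      /* fetch the next int argument */
    }
    va_end(vl);

    return sum;
}
```

For example, `sum_ints(6, 1, 2, 3, 4, 5, 6)` evaluates to 21. The reply in this thread also reports 21 as the expected result, though the arguments the original program passed are not visible in the snippet.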

2017 Jul 01

2

KNL Assembly Code for Matrix Multiplication

...der: Depth=1
>>>>> # Child Loop BB0_2 Depth 2
>>>>> # Child Loop BB0_3 Depth
>>>>> 3
>>>>> # Child Loop BB0_5 Depth
>>>>> 3
>>>>> xor r11d, r11d
>>>>> .p2align 4, 0x90
>>>>> .LBB0_2: # %.preheader
>>>>> # Parent Loop BB0_1 Depth=1
>>>>>...

search for: bb0_5