thr3ads.net - search: "i33"

Displaying 20 results from an estimated 54 matches for "i33".

SelectionDAG::LegalizeTypes is very slow in 3.1 version

2016 Sep 27

In 3.1, the backend is very slow to legalize types. Following is the code snippet which may be the culprit: %Result.i.i.i97 = alloca i33, align 8 %Result.i.i.i96 = alloca i33, align 8 %Result.i.i.i95 = alloca i33, align 8 %Result.i.i.i94 = alloca i33, align 8 %Result.i.i.i93 = alloca i33, align 8 %Result.i.i.i92 = alloca i33, align 8 %Result.i.i.i91 = alloca i33, align 8 %Result.i.i.i90 = alloca i33, align 8 %Result.i....

[LLVMdev] Checked arithmetic

2008 Mar 26

..., > Why not define an "add with overflow" intrinsic that returns its value and > overflow bit as an i1? what's the point? We have this today with apint codegen (if you turn on LegalizeTypes). For example, this function define i1 @cc(i32 %x, i32 %y) { %xx = zext i32 %x to i33 %yy = zext i32 %y to i33 %s = add i33 %xx, %yy %tmp = lshr i33 %s, 32 %b = trunc i33 %tmp to i1 ret i1 %b } codegens (on x86-32) to cc: xorl %eax, %eax movl 4(%esp), %ecx addl 8(%esp), %ecx adcl $0, %eax andl $1, %eax ret w...

[LLVMdev] Unrolling loops into constant-time expressions

2010 Nov 23

...} generates: define i32 @loop(i32 %x) nounwind readnone { %1 = icmp sgt i32 %x, 0 br i1 %1, label %bb.nph, label %3 bb.nph: ; preds = %0 %tmp4 = add i32 %x, -1 %tmp6 = add i32 %x, -2 %tmp16 = add i32 %x, -3 %tmp7 = zext i32 %tmp6 to i33 %tmp5 = zext i32 %tmp4 to i33 %tmp17 = zext i32 %tmp16 to i33 %tmp15 = mul i33 %tmp5, %tmp7 %tmp18 = mul i33 %tmp15, %tmp17 %tmp8 = mul i32 %tmp4, %tmp6 %tmp19 = lshr i33 %tmp18, 1 %2 = shl i32 %tmp8, 2 %tmp20 = trunc i33 %tmp19 to i32 %tmp12 = mul i32 %x, 5...

[LLVMdev] Virtual register def doesn't dominate all uses

2014 Oct 24

...i32 %end_loop_index) #1 { entry: %cmp4 = icmp sgt i32 %end_loop_index, 0 br i1 %cmp4, label %for.cond.for.end_crit_edge, label %for.end for.cond.for.end_crit_edge: ; preds = %entry %0 = add i32 %end_loop_index, -2 %1 = add i32 %end_loop_index, -1 %2 = zext i32 %0 to i33 %3 = zext i32 %1 to i33 %4 = mul i33 %3, %2 %5 = lshr i33 %4, 1 %6 = trunc i33 %5 to i32 %7 = add i32 %6, %end_loop_index %8 = add i32 %7, -1 br label %for.end for.end: ; preds = %for.cond.for.end_crit_edge, %entry %sum.0.lcssa = phi i32 [ %...
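The visible arithmetic chain in this snippet (from `%0` through `%8`) can be checked by hand. The following is a minimal sketch in Python that models fixed-width integers with bit masks; the function names and the mask constants are illustrative and do not come from the thread. It suggests why the extra bit matters: the chain evaluates to n(n-1)/2 modulo 2^32, and the halving step needs bit 32 of the product, which a plain i32 multiply would discard.

```python
MASK32 = (1 << 32) - 1
MASK33 = (1 << 33) - 1

def chain_i33(n):
    """Model the visible IR chain, with the multiply done at 33 bits."""
    a = (n - 2) & MASK32             # %0 = add i32 %n, -2
    b = (n - 1) & MASK32             # %1 = add i32 %n, -1
    prod = (a * b) & MASK33          # %2, %3 = zext to i33; %4 = mul i33
    half = (prod >> 1) & MASK32      # %5 = lshr i33 %4, 1; %6 = trunc to i32
    return (half + n - 1) & MASK32   # %7 = add %6, %n; %8 = add %7, -1

def chain_i32(n):
    """Hypothetical variant with the multiply done at 32 bits (bit 32 is lost)."""
    a = (n - 2) & MASK32
    b = (n - 1) & MASK32
    prod = (a * b) & MASK32
    half = prod >> 1
    return (half + n - 1) & MASK32
```

For n = 70000 the product (n-1)(n-2) lies between 2^32 and 2^33, so the two variants disagree, and only the 33-bit one matches n(n-1)/2.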

[LLVMdev] Checked arithmetic

2008 Mar 26

On Tue, 2008-03-25 at 21:18 -0700, Chris Lattner wrote: > On Mar 25, 2008, at 8:25 PM, Jonathan S. Shapiro wrote: > > > In looking at the LLVM reference manual, it is conspicuous that (a) > > the > > IR does not define condition codes, and (b) the IR does not define > > opcodes that return condition results in addition to their > > computational > >

[GlobalISel] Narrowing uneven/non-pow-2 types

2020 Mar 25

...y Basically we have an architecture which only works on 32-bit types and we are starting to hit many of the edge cases. For example this particular question arose because we are seeing the following LLVM-IR, which we cannot legalize with our current legalization rules: %6 = zext i32 %5 to i33 %7 = zext i32 %0 to i33 %8 = mul i33 %6, %7 %9 = lshr i33 %8, 1 %10 = trunc i33 %9 to i32 getActionDefinitionsBuilder(G_MUL) .legalFor({s32}) .clampScalar(0, s32, s32); getActionDefinitionsBuilder(G_LSHR) .legalFor({{s32, s32}}) .clampScalar(1...
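As a rough illustration of what narrowing this particular sequence involves, the sketch below (Python, with fixed-width integers modeled by masks) compares the i33 `mul` / `lshr 1` / `trunc` chain against a version that only uses 32-bit pieces. It is a hand-derived equivalence, not what the GlobalISel legalizer is documented to emit; the function names are hypothetical, and the "high half" stands in for an unsigned multiply-high such as `G_UMULH`.

```python
MASK32 = (1 << 32) - 1
MASK33 = (1 << 33) - 1

def mul_lshr1_trunc_i33(a, b):
    """Reference: zext both i32 inputs to i33, mul, lshr by 1, trunc to i32."""
    prod = ((a & MASK32) * (b & MASK32)) & MASK33
    return (prod >> 1) & MASK32

def mul_lshr1_trunc_narrow(a, b):
    """Same value built from 32-bit pieces only."""
    full = (a & MASK32) * (b & MASK32)
    lo = full & MASK32            # low half: an ordinary 32-bit multiply
    hi = (full >> 32) & MASK32    # high half: unsigned multiply-high
    # Bits 1..31 of the result come from lo; bit 32 of the product is bit 0 of hi.
    return (lo >> 1) | ((hi & 1) << 31)
```

The result only ever needs bits 1 through 32 of the full product, which is why a single bit of the high half is enough.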

how experimental are the llvm.experimental.vector.reduce.* functions?

2019 Feb 09

Something like this should work I think. ; ModuleID = 'test.ll' source_filename = "test.ll" define void @entry(<4 x i32>* %a, <4 x i32>* %b, <4 x i32>* %x) { Entry: %tmp = load <4 x i32>, <4 x i32>* %a, align 16 %tmp1 = load <4 x i32>, <4 x i32>* %b, align 16 %tmp2 = add <4 x i32> %tmp, %tmp1 %tmpsign = icmp slt <4 x

[LLVMdev] Virtual register def doesn't dominate all uses

2014 Oct 29

...p sgt i32 %end_loop_index, 0 >> br i1 %cmp4, label %for.cond.for.end_crit_edge, label %for.end >> >> for.cond.for.end_crit_edge: ; preds = %entry >> %0 = add i32 %end_loop_index, -2 >> %1 = add i32 %end_loop_index, -1 >> %2 = zext i32 %0 to i33 >> %3 = zext i32 %1 to i33 >> %4 = mul i33 %3, %2 >> %5 = lshr i33 %4, 1 >> %6 = trunc i33 %5 to i32 >> %7 = add i32 %6, %end_loop_index >> %8 = add i32 %7, -1 >> br label %for.end >> >> for.end: ; preds...

[LLVMdev] Virtual register def doesn't dominate all uses

2014 Oct 31

...i1 %cmp4, label %for.cond.for.end_crit_edge, label %for.end >>>> >>>> for.cond.for.end_crit_edge: ; preds = %entry >>>> %0 = add i32 %end_loop_index, -2 >>>> %1 = add i32 %end_loop_index, -1 >>>> %2 = zext i32 %0 to i33 >>>> %3 = zext i32 %1 to i33 >>>> %4 = mul i33 %3, %2 >>>> %5 = lshr i33 %4, 1 >>>> %6 = trunc i33 %5 to i32 >>>> %7 = add i32 %6, %end_loop_index >>>> %8 = add i32 %7, -1 >>>> br label %for.end >>>> >...

[LLVMdev] Checked arithmetic

2008 Mar 26

On Wed, 26 Mar 2008, Jonathan S. Shapiro wrote: > I want to background process this for a bit, but it would be helpful to > discuss some approaches first. > > There would appear to be three approaches: > > 1. Introduce a CC register class into the IR. This seems to be a > fairly major overhaul. > > 2. Introduce a set of scalar and fp computation quasi-instructions

[LLVMdev] Checked arithmetic

2008 Mar 26

...'s the point? We have this today with apint codegen (if you turn on > LegalizeTypes). For example, this function The desired code is something like: foo: addl %eax, %ecx jo overflow_happened use(%ecx) etc. -Chris > define i1 @cc(i32 %x, i32 %y) { > %xx = zext i32 %x to i33 > %yy = zext i32 %y to i33 > %s = add i33 %xx, %yy > %tmp = lshr i33 %s, 32 > %b = trunc i33 %tmp to i1 > ret i1 %b > } > > codegens (on x86-32) to > > cc: > xorl %eax, %eax > movl 4(%esp), %ecx > addl 8(%esp), %ecx >...

[LLVMdev] Hoisting elements of array argument into registers

2010 Nov 07

...[4 x i32]* %sp, i64 0, i64 3 store i32 0, i32* %3, align 4 %4 = icmp eq i32 %a, 0 br i1 %4, label %wf.exit, label %bb.nph.i bb.nph.i: ; preds = %entry %.promoted1.i = load i32* %1, align 4 %tmp12.i = add i32 %a, -1 %tmp13.i = zext i32 %tmp12.i to i33 %tmp14.i = add i32 %a, -2 %tmp15.i = zext i32 %tmp14.i to i33 %tmp16.i = mul i33 %tmp13.i, %tmp15.i %tmp17.i = lshr i33 %tmp16.i, 1 %tmp18.i = trunc i33 %tmp17.i to i32 %tmp20.i = mul i32 %.promoted1.i, 5 %tmp21.i = add i32 %tmp20.i, -5 %tmp22.i = mul i32 %tmp21.i, %tmp12.i %tmp9....

[LLVMdev] Add/sub with carry; widening multiply

2007 Nov 21

...round with llvm lately and I was wondering something about the bitcode instructions for basic arithmetic. Is there any plan to provide instructions that perform widening multiply, or add with carry? It might be written as: mulw i32 %lhs %rhs -> i64 ; widening multiply addw i32 %lhs %rhs -> i33 ; widening add addc i32 %lhs, i32 %rhs, i1 %c -> i33 ; add with carry Alternatively, would something like following get reduced to a single multiply and two stores on arch's that support wide multiplies, like x86-32 and ARM? define void @mulw(i32* hidest, i32* lodest, i32 lhs, i32 rhs) {...

[LLVMdev] Virtual register def doesn't dominate all uses

2014 Nov 01

...edge, label %for.end >>>>>> >>>>>> for.cond.for.end_crit_edge: ; preds = %entry >>>>>> %0 = add i32 %end_loop_index, -2 >>>>>> %1 = add i32 %end_loop_index, -1 >>>>>> %2 = zext i32 %0 to i33 >>>>>> %3 = zext i32 %1 to i33 >>>>>> %4 = mul i33 %3, %2 >>>>>> %5 = lshr i33 %4, 1 >>>>>> %6 = trunc i33 %5 to i32 >>>>>> %7 = add i32 %6, %end_loop_index >>>>>> %8 = add i32 %7, -1 >>...

[LLVMdev] Passing return values on the stack & storing arbitrary sized integers

2012 Aug 17

...tions returning e.g. {i128,i1}. > > This isn't very important; you won't run into it compiling C code. OK, fine :-) >> 2. Storing arbitrary sized integers >> >> The testcase "test/CodeGen/Generic/APIntLoadStore.ll" checks for >> loading/storing e.g. i33 integers from/into global variable. The >> questions are the same as regarding feature 1: How important is this >> feature? Is it safe to ignore it? Is there some guide how to implement >> this? > > If you're using the LLVM CodeGen infrastructure and have everything >...

[LLVMdev] Hoisting elements of array argument into registers

2010 Nov 06

I am seeing the wf loop get optimized just fine with llvm 2.8 (and almost as good with head). I'm running on Mac OS X 10.6. I have an apple supplied llvm-gcc and a self compiled llvm 2.8. When I run $ llvm-gcc -emit-llvm -S M.c $ opt -O2 M.s | llvm-dis I see that: 1. Tail recursion has been eliminated from wf 2. The accesses to sp have been promoted to registers 3. The loop has

[LLVMdev] ConstantExpr refactoring

2012 Jul 01

...ats can actually implement that. One other request that isn't in the PR, I'd like whatever replaces GEP to not store "i32 0" vs. "i64 0". Right now "i8* getelementptr ([1 x i8]* @glbl), i32 0, i32 0" is different from "i8* getelement ([1 x i8]* @glbl), i33 0, i33 0", and there's an infinite number of these. We should canonicalize these harder and only produce one Value* here. Nick > > Bug 10368 [1] tells me that ConstantExpr shouldn't automatically fold, > and that this is source of many problems (most notably with traps) and...

how experimental are the llvm.experimental.vector.reduce.* functions?

2019 Feb 09

..., !dbg !57 call void @llvm.dbg.declare(metadata i32* %x, metadata !50, metadata !DIExpression()), !dbg !57 ret void, !dbg !58 } You can see this takes advantage of @llvm.sadd.with.overflow, which is not available with vectors. So here is a different approach (pseudocode): %a_zext = zext %a to i33 # 1 more bit %b_zext = zext %b to i33 # 1 more bit %result_zext = add %a_zext, %b_zext %max_result = @llvm.experimental.vector.reduce.umax(%result_zext) %overflow = icmp %max_result > @max_i32_value %result = trunc %result_zext to i32 You can imagine how this would work for signed integers, rep...
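The pseudocode in this snippet can be modeled lane by lane. The following Python sketch assumes unsigned lanes and non-empty vectors of equal length; the function name and constants are illustrative, and Python's `max` stands in for the `umax` reduction.

```python
MAX_U32 = (1 << 32) - 1
MASK33 = (1 << 33) - 1

def checked_vector_add_u32(a, b):
    """Add two vectors of u32 lanes; report whether any lane overflowed."""
    # zext each lane to 33 bits and add: the sum of two u32 values always fits.
    wide = [((x & MAX_U32) + (y & MAX_U32)) & MASK33 for x, y in zip(a, b)]
    # A single reduction replaces a per-lane overflow flag.
    overflow = max(wide) > MAX_U32
    # Truncate every lane back to 32 bits for the wrapped result.
    result = [w & MAX_U32 for w in wide]
    return result, overflow
```

One reduction over the widened sums is enough because a lane overflowed exactly when its 33-bit sum exceeds the largest 32-bit value.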

[LLVMdev] Checked arithmetic

2008 Mar 26

...epholes. > > 3. Handle CC as a black magic special case, which at least has the > merit of tradition. :-) 4. Do arithmetic in a type with one more bit. For example, suppose you want to know if an i32 add "x+y" will overflow. Extend x and y to 33 bit integers, and do an i33 add. Inspect the upper bit to see if it overflowed. Truncate to 32 bits to get the result. Probably codegen can be taught to implement this as a 32 bit add + inspection of the CC. Ciao, Duncan. PS: In order for codegen to be able to handle i33, you need to turn on the new LegalizeTypes infras...
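Option 4 can be sketched in Python by modeling the 33-bit type with a mask. The unsigned function follows the description directly (extend, add, inspect the upper bit, truncate). The signed variant is an added assumption that is not spelled out in the message: after sign extension, overflow shows up as bits 32 and 31 of the sum disagreeing. All names here are illustrative.

```python
MASK32 = (1 << 32) - 1
MASK33 = (1 << 33) - 1

def uadd_i32_checked(x, y):
    """Unsigned: zext to i33, add, inspect bit 32, truncate to 32 bits."""
    s = ((x & MASK32) + (y & MASK32)) & MASK33
    overflowed = bool((s >> 32) & 1)
    return s & MASK32, overflowed

def sadd_i32_overflows(x, y):
    """Signed: x and y are Python ints in [-2**31, 2**31)."""
    # Masking the exact sum to 33 bits gives the two's complement i33 result.
    s = (x + y) & MASK33
    # The sum fits in i32 exactly when bit 32 repeats the sign bit (bit 31).
    return ((s >> 32) & 1) != ((s >> 31) & 1)
```

For example, `uadd_i32_checked(0xFFFFFFFF, 1)` yields a wrapped result of 0 with the overflow flag set, while `sadd_i32_overflows(-1, 1)` reports no overflow.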

[LLVMdev] Passing return values on the stack & storing arbitrary sized integers

2012 Aug 20

Hi Eli, >>>> 2. Storing arbitrary sized integers >>>> >>>> The testcase "test/CodeGen/Generic/APIntLoadStore.ll" checks for >>>> loading/storing e.g. i33 integers from/into global variable. The >>>> questions are the same as regarding feature 1: How important is this >>>> feature? Is it safe to ignore it? Is there some guide how to implement >>>> this? >>> >>> If you're using the LLVM CodeGen...
