thr3ads.net - search: "lobit"

Displaying 15 results from an estimated 15 matches for "lobit".

[AVR] [MSP430] Code gen improvements for 8 bit and 16 bit targets

2019 Nov 14

...e' InstCombine > transforms that I identified: > > int testSimplifySetCC_0( int x ) // 904 > (InstCombineCasts::transformZExtICmp) > { > return (x & 32) != 0; > } > > define i16 @testSimplifySetCC_0(i16 %x) { > entry: > %and = lshr i16 %x, 5 > %and.lobit = and i16 %and, 1 > ret i16 %and.lobit > } > > > int testSExtICmp_0( int x ) // 1274 (InstCombineCasts::transformSExtICmp) > { > return (x & 32) ? -1 : 0; > } > > define i16 @testSExtICmp_0(i16 %x) { > entry: > %0 = shl i16 %x, 10 > %sext = ashr i1...
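
The first transform quoted above rewrites a bit test as a shift followed by a mask. A minimal C++ sketch of that equivalence follows, assuming a 16-bit `int` as on the AVR and MSP430 targets under discussion. The function names are illustrative only and do not come from the thread or from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Source-level form from the quoted post: is bit 5 (value 32) set?
inline std::uint16_t bitTestCompare(std::uint16_t x) {
  return (x & 32u) != 0 ? 1 : 0;
}

// Shift-and-mask form mirroring the quoted IR:
//   %and = lshr i16 %x, 5
//   %and.lobit = and i16 %and, 1
inline std::uint16_t bitTestShift(std::uint16_t x) {
  return static_cast<std::uint16_t>((x >> 5) & 1u);
}

// Exhaustively check that both forms agree for every 16-bit input.
inline void checkBitTestFormsAgree() {
  for (std::uint32_t v = 0; v <= 0xFFFFu; ++v) {
    const auto x = static_cast<std::uint16_t>(v);
    assert(bitTestCompare(x) == bitTestShift(x));
  }
}
```

Both forms compute the same value. The complaint in the thread concerns cost: a multi-bit shift can be expensive on small targets that lack a barrel shifter.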

[LLVMdev] Correct usage of `llvm.assume` for loop vectorization alignment?

2014 Dec 26

...nsw i64 %4, %dst_y_step br label %x_body x_body: ; preds = %x_body, %entry %y = phi i64 [ 0, %entry ], [ %y_increment, %x_body ] %14 = getelementptr %u8XY* %0, i64 0, i32 6, i64 %y %15 = load i8* %14, align 1, !llvm.mem.parallel_loop_access !1 %.lobit = lshr i8 %15, 7 %16 = getelementptr %u8XY* %3, i64 0, i32 6, i64 %y store i8 %.lobit, i8* %16, align 1, !llvm.mem.parallel_loop_access !1 %y_increment = add nuw nsw i64 %y, 1 %y_postcondition = icmp eq i64 %y_increment, %13 br i1 %y_postcondition, label %y_exit, label %x_body, !llvm.loop...

[cfe-dev] CFG simplification question, and preservation of branching in the original code

2019 Sep 30

...actually the front-end that does such undesired optimisations sometimes, not only the LLVM back-end. This is in part why I am saying this is not right. See copied again the IR code that gets generated for the C code that I posted before. This IR code, including the presence of expensive shifts ( %a.lobit = lshr i32 %a, 31) is generated when -mllvm -phi-node-folding-threshold=1 is specified in the command line, or when the Target implements getOperationCost(unsigned Opcode, Type *Ty, Type *OpTy) to return TCC_Expensive for operator types that are bigger than the default target register size. > ...

[cfe-dev] CFG simplification question, and preservation of branching in the original code

2019 Sep 30

...llowing code > > int cmpge32_0(long a) { > return a>=0; > } > > Compiled for the MSP430 with -O1 or -Os results in the following: > > ; Function Attrs: norecurse nounwind readnone > define dso_local i16 @cmpge32_0(i32 %a) local_unnamed_addr #0 { > entry: > %a.lobit = lshr i32 %a, 31 > %0 = trunc i32 %a.lobit to i16 > %.not = xor i16 %0, 1 > ret i16 %.not > } > > The backend then turns this into the following totally suboptimal code: > > cmpge32_0: > mov r13, r12 > inv r12 > swpb r12 > mov.b r12, r12 > clrc > rrc...
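
The IR quoted above computes `a >= 0` by extracting the sign bit and inverting it. A small C++ sketch of that equivalence follows, assuming the MSP430 type sizes implied by the snippet (32-bit `long`, 16-bit `int`). The function names are illustrative and are not taken from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Source-level form: a 32-bit signed comparison returning a 16-bit result.
inline std::int16_t cmpgeCompare(std::int32_t a) {
  return a >= 0 ? 1 : 0;
}

// Sign-bit form mirroring the quoted IR, one statement per instruction.
inline std::int16_t cmpgeSignBit(std::int32_t a) {
  const std::uint32_t lobit = static_cast<std::uint32_t>(a) >> 31;  // lshr i32 %a, 31
  const std::uint16_t narrowed = static_cast<std::uint16_t>(lobit); // trunc i32 to i16
  return static_cast<std::int16_t>(narrowed ^ 1u);                  // xor i16 %0, 1
}

// Spot-check the boundary values where the sign bit changes.
inline void checkCmpgeFormsAgree() {
  const std::int32_t samples[] = {INT32_MIN, -1, 0, 1, INT32_MAX};
  for (std::int32_t a : samples) {
    assert(cmpgeCompare(a) == cmpgeSignBit(a));
  }
}
```

The logical shift by 31 is the step the poster objects to, since a 16-bit target has to synthesize it from several instructions.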

[AVR] [MSP430] Code gen improvements for 8 bit and 16 bit targets

2019 Nov 13

As before, I'm not convinced that we want to allow target-based enable/disable in instcombine for performance. That undermines having a target-independent canonical form in the 1st place. It's not clear to me what the remaining motivating cases look like. If you could post those here or as bugs, I think you'd have a better chance of finding an answer. Let's take a minimal example

[cfe-dev] CFG simplification question, and preservation of branching in the original code

2019 Sep 29

...> > This gets compiled into > > ; Function Attrs: norecurse nounwind readnone > define dso_local i32 @test(i32 %a, i32 %b) local_unnamed_addr #0 { > entry: > %cmp = icmp slt i32 %a, 0 > %sub = sub nsw i32 0, %a > %a.addr.0 = select i1 %cmp, i32 %sub, i32 %a > %a.lobit = lshr i32 %a, 31 > %0 = trunc i32 %a.lobit to i16 > %cmp1 = icmp slt i32 %b, 0 > br i1 %cmp1, label %if.then2, label %if.end4 > > if.then2: ; preds = %entry > %sub3 = sub nsw i32 0, %b > %1 = xor i16 %0, 1 > br label %if.e...

[cfe-dev] CFG simplification question, and preservation of branching in the original code

2019 Oct 01

...wing code > > int cmpge32_0(long a) { > return a>=0; > } > > Compiled for the MSP430 with -O1 or -Os results in the following: > > ; Function Attrs: norecurse nounwind readnone > define dso_local i16 @cmpge32_0(i32 %a) local_unnamed_addr #0 { > entry: > %a.lobit = lshr i32 %a, 31 > %0 = trunc i32 %a.lobit to i16 > %.not = xor i16 %0, 1 > ret i16 %.not > } > > The backend then turns this into the following totally suboptimal code: > > cmpge32_0: > mov r13, r12 > inv r12 > swpb r12 > mov.b r12, r12 > clrc ...

[cfe-dev] CFG simplification question, and preservation of branching in the original code

2019 Oct 03

...ong a) { >> return a>=0; >> } >> >> Compiled for the MSP430 with -O1 or -Os results in the following: >> >> ; Function Attrs: norecurse nounwind readnone >> define dso_local i16 @cmpge32_0(i32 %a) local_unnamed_addr #0 { >> entry: >> %a.lobit = lshr i32 %a, 31 >> %0 = trunc i32 %a.lobit to i16 >> %.not = xor i16 %0, 1 >> ret i16 %.not >> } >> >> The backend then turns this into the following totally suboptimal code: >> >> cmpge32_0: >> mov r13, r12 >> inv r12 >>...

[LLVMdev] APInt::getBitsSet

2008 Feb 11

APInt::getBitsSet's loBit and hiBit arguments describe a range that's inclusive on both ends. Would anyone mind if we change it to be a "half-open" range, meaning exclusive on the high end? Currently every caller (including several new ones in some code I'm writing right now) does a subtract by one...
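
The difference between the two range conventions can be illustrated with a short C++ sketch on plain 64-bit integers. Both helpers below are hypothetical stand-ins written for this illustration; they are not the LLVM `APInt` API and make no claim about its current signature or behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not LLVM's): set bits loBit..hiBit, inclusive on both ends.
inline std::uint64_t maskInclusive(unsigned loBit, unsigned hiBit) {
  assert(loBit <= hiBit && hiBit < 64);
  const unsigned width = hiBit - loBit + 1;
  const std::uint64_t ones = width == 64 ? ~0ULL : ((1ULL << width) - 1);
  return ones << loBit;
}

// Hypothetical helper (not LLVM's): set bits loBit..hiBit-1, a half-open range.
inline std::uint64_t maskHalfOpen(unsigned loBit, unsigned hiBit) {
  assert(loBit <= hiBit && hiBit <= 64);
  const unsigned width = hiBit - loBit;
  if (width == 0) return 0;  // an empty range is expressible only in this form
  const std::uint64_t ones = width == 64 ? ~0ULL : ((1ULL << width) - 1);
  return ones << loBit;
}
```

With the inclusive form, a caller that holds a start bit and a width has to pass `lo + width - 1`, which matches the "subtract by one" the post describes. With the half-open form the same caller passes `lo + width` directly, so `maskInclusive(4, 7)` and `maskHalfOpen(4, 8)` describe the same four bits.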

[LLVMdev] LiveIntervals analysis problem

2013 Feb 14

...reds = %for.inc.6.i.i79.i.i.i %134 = load i16* %incdec.ptr.7.i55.i.i, align 2, !tbaa !5 %phitmp.i80.i.i.i = icmp eq i16 %134, 0 br i1 %phitmp.i80.i.i.i, label %eisneg.exit142.i.i.i, label %if.end.i.i.i.i.i eisneg.exit142.i.i.i: ; preds = %for.inc.7.i.i81.i.i.i %.lobit.i.i.i.i = lshr i16 %79, 15 %.lobit.i138.i.i.i = lshr i16 %98, 15 %cmp.i.i.i = icmp eq i16 %.lobit.i.i.i.i, %.lobit.i138.i.i.i br i1 %cmp.i.i.i, label %if.then12.i.i.i, label %if.end.i.i.i.i.i if.then12.i.i.i: ; preds = %eisneg.exit142.i.i.i store i16 0, i16...

[RFC] jump threading on std::pair<int, bool>

2018 Mar 08

....then.i if.then.i: ; preds = %entry %call1.i = tail call signext i32 @_Z5dummyi(i32 signext %v) %retval.sroa.0.0.insert.ext.i.i = zext i32 %call1.i to i64 br label %_ZL6calleei.exit if.else.i: ; preds = %entry %.lobit.i = lshr i32 %v, 31 %0 = zext i32 %.lobit.i to i64 %retval.sroa.2.0.insert.shift.i8.i = shl nuw nsw i64 %0, 32 %retval.sroa.0.0.insert.ext.i9.i = zext i32 %v to i64 br label %_ZL6calleei.exit _ZL6calleei.exit: ; preds = %if.then.i, %if.else.i %.sink = phi...

[cfe-dev] CFG simplification question, and preservation of branching in the original code

2019 Sep 25

Changing the order of the checks in CodeGenPrepare::optimizeSelectInst() sounds good to me. But you may need to go further for optimum performance. For example, we may be canonicalizing math/logic IR patterns into 'select' such as in the recent: https://reviews.llvm.org/D67799 So if you want those to become ALU ops again rather than branches, then you need to do the transform later in

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 26

Inline George > On Jan 26, 2015, at 1:05 PM, Daniel Berlin <dberlin at dberlin.org> wrote: > > George, given that, can you just build constexpr handling (it's not as easy as you think) as a separate function and have it use it in the right places? Will do. :) > FWIW, my current list of CFLAA issues is: > > 1. Unknown values (results from ptrtoint, incoming

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 26

George, given that, can you just build constexpr handling (it's not as easy as you think) as a separate function and have it use it in the right places? FWIW, my current list of CFLAA issues is: 1. Unknown values (results from ptrtoint, incoming pointers, etc) are not treated as unknown. These should be done through graph edge (so that they can be one way, otherwise, you will unify

[LLVMdev] question about enabling cfl-aa and collecting a57 numbers

2015 Jan 30

...he store. target datalayout = "e-p:64:64:64" @@ -14,8 +15,9 @@ target datalayout = "e-p:64:64:64" @endianness_test = global i64 1, align 8 define i32 @signbit(double %x) nounwind { +; FIXME: This would be ret i32 0 if CFLAA could prove PartialAlias ; CFLAA: ret i32 %tmp5.lobit -; CHECK: ret i32 0 +; CHECK: ret i32 0 entry: %u = alloca %union.anon, align 8 %tmp9 = getelementptr inbounds %union.anon* %u, i64 0, i32 0 diff --git a/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll b/test/Analysis/CFLAliasAnalysis/gep-signed-arithmetic.ll index a0195d7..557bc40...
