search for: lshr

Displaying 20 results from an estimated 259 matches for "lshr".

2012 Aug 24
2
[LLVMdev] Stop opt from producing 832 bit integer?
...:64:64-n32" * *target triple = "mipsel-unknown-linux"* *@.str = private unnamed_addr constant [16 x i8] c"%.2d %.2d %.2d\0A\00", align 1* *define void @gsm_print(i8* nocapture %c) nounwind {* *entry:* * %0 = load i8* %c, align 1* * %conv = zext i8 %0 to i32* * %shr13 = lshr i32 %conv, 2* * %1 = zext i32 %shr13 to i832* * %2 = shl nuw nsw i832 %1, 304* * %shr314 = lshr i32 %conv, 1* * %and4 = and i32 %shr314, 7* * %3 = zext i32 %shr314 to i832* * %4 = shl nuw nsw i832 %3, 352* * %shr815 = lshr i32 %conv, 3* * %5 = zext i32 %shr815 to i832* * %6 = shl nuw nsw i...
2012 Apr 16
5
[LLVMdev] InstCombine adds bit masks, confuses self, others
...into straightforward code and figures out the 0 return value: shrl $2, %edi movl %edi, (%rsi) addl %edi, %edi movl %edi, 4(%rsi) movl $0, %eax ret LLVM optimizes the code: $ clang -O -S -o- small.c -emit-llvm define i32 @f(i32 %a, i32* nocapture %p) nounwind uwtable ssp { entry: %div = lshr i32 %a, 2 store i32 %div, i32* %p, align 4, !tbaa !0 %0 = lshr i32 %a, 1 %add = and i32 %0, 2147483646 %arrayidx1 = getelementptr inbounds i32* %p, i64 1 store i32 %add, i32* %arrayidx1, align 4, !tbaa !0 %1 = lshr i32 %a, 1 %mul = and i32 %1, 2147483646 %sub = sub i32 %add, %mul...
2018 May 23
0
RFC: should CVP always narrow the width of lshr?
...1106601, > We overlooked these questions in the specific case of div/rem > (D44102) because we assumed that narrower div/rem are always > better for analysis and codegen <cut> Now, there is a caveat. If <C> is a constant and a power of two, instcombine will transform the udiv to lshr, and the urem to and. And the CVP pass does not narrow the width of those opcodes. And instcombine won't narrow them either, because the zext has multiple uses / that would create more instructions, which is illegal for instcombine. So [1] may or may not be handled properly... There are multiple solu...
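For reference, the instcombine canonicalization described here, sketched on i32 (hypothetical function name, not from the thread):

  ; sketch: udiv/urem by a power-of-two constant become lshr/and
  define i32 @div_rem_pow2(i32 %x) {
  entry:
    %d = udiv i32 %x, 16    ; instcombine folds this to: lshr i32 %x, 4
    %r = urem i32 %x, 16    ; instcombine folds this to: and i32 %x, 15
    %s = add i32 %d, %r
    ret i32 %s
  }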
2015 Sep 01
3
anyone want to help tune up computeKnownBits()?
..., %2 infer %3 known from LLVM: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx known from Souper: 000000000000000000000000000000000000000000000000000000000000000x -------------------------------------------------------------------- if the big end of a word contains some zeros, lshr can't make them go away: %0:i64 = var %1:i64 = lshr 233:i64, %0 infer %1 known from LLVM: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx known from Souper: 00000000000000000000000000000000000000000000000000000000xxxxxxxx ---------------------------------------------------...
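The second result restated as plain IR (a sketch with a hypothetical function name): 233 occupies only the low 8 bits, and lshr only moves bits toward the LSB, so the 56 high zero bits stay zero no matter what %n is:

  define i64 @high_zeros(i64 %n) {
  entry:
    %r = lshr i64 233, %n    ; result <= 233, so bits 8..63 are known zero
    ret i64 %r
  }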
2015 Jan 28
4
[LLVMdev] RFC: Proposal for Poison Semantics
On Tue, Jan 27, 2015 at 8:32 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > > > Correct me if I am wrong but we are talking about transforming: > > %maybe_poison = add nuw i32 %a, %b > > %x = zext i32 %maybe_poison to i64 > > %y = lshr i64 %x, 32 > > > > To: > > %za = zext i32 %a to i64 > > %zb = zext i32 %b to i64 > > %x = add nuw i64 %za, %zb > > %y = lshr i64 %x, 32 > > > > ? > > > > If so, this seems fine in the model given by the RFC. > > > > I...
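What makes the example subtle (a sketch of the widened form, not from the thread): after the transform, %y is exactly the carry out of a 32-bit add, so it is nonzero precisely in the executions where the original add nuw would have overflowed and produced poison:

  ; sketch: the widened form, with the carry spelled out
  define i64 @widened(i32 %a, i32 %b) {
  entry:
    %za = zext i32 %a to i64
    %zb = zext i32 %b to i64
    %x = add nuw i64 %za, %zb
    %y = lshr i64 %x, 32    ; carry bit of the 32-bit add %a + %b
    ret i64 %y
  }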
2012 Apr 17
0
[LLVMdev] InstCombine adds bit masks, confuses self, others
...how best to fix this. If possible, InstCombine's canonicalization shouldn't hide arithmetic progressions behind bit masks. At least, it seems these transformations should be disabled unless (X >> C).hasOneUse(). They aren't exactly optimizations. > > This: > >  %div = lshr i32 %a, 2 >  store i32 %div, i32* %p, align 4, !tbaa !0 >  %add = shl nuw nsw i32 %div, 1 > > is better than this: > >  %div = lshr i32 %a, 2 >  store i32 %div, i32* %p, align 4, !tbaa !0 >  %0 = lshr i32 %a, 1 >  %add = and i32 %0, 2147483646 I think we could try your h...
2015 Jan 28
2
[LLVMdev] RFC: Proposal for Poison Semantics
...; >> Hi David, > >> > >> I spent some time thinking about poison semantics this way, but here > >> is where I always get stuck: > >> > >> Consider the IR fragment > >> > >> %x = zext i32 %maybe_poison to i64 > >> %y = lshr i64 %x, 32 > >> %ptr = gep %global, %y > >> store 42 to %ptr > >> > >> If %maybe_poison is poison, then is %y poison? For all i32 values of > >> %maybe_poison, %y is i64 0, so in some sense you can determine the > >> value %y without looking...
2017 May 15
2
Disabling DAGCombine's specific optimization
Hi Vivek, You could work around this by creating a custom ISD node, e.g. MyTargetISD::MyLSHR, with the same type as the general ISD::LSHR. This custom node will then be ignored by the generic DAGCombiner. Convert ISD::LSHR to MyTargetISD::MyLSHR in DAGCombine, optimise it as you see fit, convert it back or lower it directly. I've done this for ISD::CONCAT_VECTORS to avoid an inconveni...
2015 Sep 08
2
UB and known bits
...recision in the known bits. These rules fire on examples like the ones below. Do we have a set of rules that clients of known bits need to follow to avoid unsoundness? I remember Nuno and/or David Majnemer saying something about this but I don't have it handy. John %0:i32 = var %1:i32 = lshr %0, 1:i32 %2:i32 = addnw 1:i32, %1 infer %2 known from Souper: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx known from compiler: 0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx llvm is stronger %0:i32 = var (000000000000000xxxxxxxxxxxxxxxxx) %1:i32 = and 65535:i32, %0 %2:i16 = var %3:i32 = zext %2 %4:i32 = mulnw %1, %3...
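The reasoning behind the stronger compiler result in the first example, restated (sketch, hypothetical function name): lshr by 1 clears the sign bit, and an nsw add of 1 to a non-negative value can only set the sign bit by overflowing, which would be poison:

  define i32 @top_bit(i32 %x) {
  entry:
    %s = lshr i32 %x, 1           ; sign bit of %s is known zero
    %a = add nuw nsw i32 %s, 1    ; nsw: wrapping into the sign bit is poison,
    ret i32 %a                    ; so the sign bit may be assumed still zero
  }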
2012 Apr 17
3
[LLVMdev] InstCombine adds bit masks, confuses self, others
...InstCombine's > canonicalization shouldn't hide arithmetic progressions behind bit masks. > At least, it seems these transformations should be disabled unless (X >> C).hasOneUse(). They aren't exactly optimizations. > > > > This: > > > > %div = lshr i32 %a, 2 > > store i32 %div, i32* %p, align 4, !tbaa !0 > > %add = shl nuw nsw i32 %div, 1 > > > > is better than this: > > > > %div = lshr i32 %a, 2 > > store i32 %div, i32* %p, align 4, !tbaa !0 > > %0 = lshr i32 %a, 1 > > %add = and i3...
2020 Apr 05
3
Branch is not optimized because of right shift
Hi, > I think the IR in both of your examples makes things harder for the compiler than expected from the original C source. Note that both versions are from clang with -O2. The first is with version 9.0 and the second is with the trunk. > but in the branch only %0 is used. Sinking the lshr too early made the analysis harder. Yes, exactly! That's what I figured too. > The version in https://godbolt.org/z/_ipKhb is probably the easiest for analysis (basically the original C source code built with `clang -O0 -S -emit-llvm`, followed by running `opt -mem2reg`). There’s a patch...
2007 Aug 22
1
[LLVMdev] Shifting by too many bits
The documentation for SHL, LSHR, and ASHR is unclear. What is the result of shifting by the number of bits in the left operand? For example, <result> = shl i32 1, 32 <result> = ashr i32 1, 32 <result> = lshr i32 1, 32
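For reference, the current LangRef wording answers this: if the shift amount is equal to or larger than the bit width of the first operand, shl, lshr, and ashr all return a poison value, so all three examples are undefined:

  ; sketch: each shift amount equals the bit width, so each result is poison
  define void @overshift() {
  entry:
    %a = shl i32 1, 32
    %b = ashr i32 1, 32
    %c = lshr i32 1, 32
    ret void
  }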
2007 Oct 03
2
[LLVMdev] Array Slicing?
...ointer arithmetic. On a related note, can I convert a pointer-to-int to a pointer-to-array-of-1-int and vice versa? BTW, I sent another post to this e-mail address, but never received a reply: ---------------------------------- Subject: Shifting by too many bits Body: The documentation for SHL, LSHR, and ASHR is unclear. What is the result of shifting by the number of bits in the left operand? For example, <result> = shl i32 1, 32 <result> = ashr i32 1, 32 <result> = lshr i32 1, 32 ---------------------------------- Regards, Jon
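On the pointer question, a sketch in the typed-pointer syntax of the era (hypothetical function, not from the thread): a bitcast converts between i32* and [1 x i32]* in both directions, since both point at the same bytes:

  define i32 @first_elt(i32* %p) {
  entry:
    %arr = bitcast i32* %p to [1 x i32]*
    %elt = getelementptr [1 x i32]* %arr, i64 0, i64 0
    %v = load i32* %elt
    ret i32 %v
  }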
2010 Sep 01
0
[LLVMdev] equivalent IR, different asm
...causes a crash in WebKit. > I suspect the usage of registers is wrong, can someone take a look? The difference is that there is a right shift after the multiply, before the divide. In IR, the difference is: %5 = mul nsw i32 %4, %tmp1 ; <i32> [#uses=1] %btmp3 = lshr i64 %1, 32 ; <i64> [#uses=1] %btmp4 = trunc i64 %btmp3 to i32 ; <i32> [#uses=1] %6 = sdiv i32 %5, %btmp4 ; <i32> [#uses=1] vs: %5 = mul nsw i32 %4, %tmp1 ; <i32> [#uses=1] ; rem...
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
...ve been looking at the way LLVM optimizes code before forwarding it to the backend I develop for my company and while building define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 { entry: %conv = zext i32 %x to i64 %conv1 = zext i32 %y to i64 %mul = mul nuw i64 %conv1, %conv %shr = lshr i64 %mul, 32 %xor = xor i64 %shr, %mul %conv2 = trunc i64 %xor to i32 ret i32 %conv2 } I came upon the following optimization (during instcombine): IC: Visiting: %mul = mul nuw i64 %conv, %conv1 IC: Visiting: %shr = lshr i64 %mul, 32 IC: Visiting: %conv2 = trunc i64 %shr to i32 IC:...
2015 Jan 19
6
[LLVMdev] X86TargetLowering::LowerToBT
...and I have some questions. This IR matches and then X86TargetLowering::LowerToBT is called: %and = and i64 %shl, %val ; (val & (1 << index)) != 0 ; bit test with a register index This IR does not match and so X86TargetLowering::LowerToBT is not called: %and = lshr i64 %val, 25 ; (val & (1 << 25)) != 0 ; bit test with an immediate index %conv = and i64 %and, 1 Let's back that up a bit. Clang emits this IR. These expressions start out life in C as an AND with a left-shifted masking bit, and are then converted into IR as...
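The two shapes being contrasted, written out as whole functions (a sketch with hypothetical names):

  ; matches LowerToBT: and of a variable left-shifted one
  define i1 @bt_reg(i64 %val, i64 %index) {
  entry:
    %shl = shl i64 1, %index
    %and = and i64 %shl, %val
    %tobool = icmp ne i64 %and, 0
    ret i1 %tobool
  }

  ; does not match: instcombine already folded the immediate-index test
  ; into lshr+and
  define i1 @bt_imm(i64 %val) {
  entry:
    %shr = lshr i64 %val, 25
    %and = and i64 %shr, 1
    %tobool = icmp ne i64 %and, 0
    ret i1 %tobool
  }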
2010 Jan 29
2
[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
...t.float3 = type { float, float, float } define void @test(double %a.0, float %a.1, %struct.float3* nocapture %res) nounwind noinline { entry: %tmp8 = bitcast double %a.0 to i64 ; <i64> [#uses=1] %tmp9 = zext i64 %tmp8 to i96 ; <i96> [#uses=1] %tmp1 = lshr i96 %tmp9, 32 ; <i96> [#uses=1] %tmp2 = trunc i96 %tmp1 to i32 ; <i32> [#uses=1] %tmp3 = bitcast i32 %tmp2 to float ; <float> [#uses=1] %0 = getelementptr inbounds %struct.float3* %res, i64 0, i32 1 ; <float*> [#uses=1...
2016 Dec 09
0
BSWAP matching in codegen
..._tree(i32 %x) { > > %byte0 = and i32 %x, 255 ; 0x000000ff > > %byte1 = and i32 %x, 65280 ; 0x0000ff00 > > %byte2 = and i32 %x, 16711680 ; 0x00ff0000 > > %byte3 = and i32 %x, 4278190080 ; 0xff000000 > > %tmp0 = shl i32 %byte0, 8 > > %tmp1 = lshr i32 %byte1, 8 > > %tmp2 = shl i32 %byte2, 8 > > %tmp3 = lshr i32 %byte3, 8 > > %or0 = or i32 %tmp0, %tmp1 > > %or1 = or i32 %tmp2, %tmp3 > > %result = or i32 %or0, %or1 > > ret i32 %result > > } > > I’m still investigating exactly how it’s...
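Worked out, the tree above swaps the two bytes inside each 16-bit half (it is not a full 32-bit bswap); an equivalent two-shift form (sketch, hypothetical name):

  define i32 @halfword_bswap(i32 %x) {
  entry:
    %hi = shl i32 %x, 8
    %lo = lshr i32 %x, 8
    %m1 = and i32 %hi, 4278255360    ; 0xff00ff00
    %m0 = and i32 %lo, 16711935      ; 0x00ff00ff
    %r = or i32 %m1, %m0
    ret i32 %r
  }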
2016 Dec 22
2
struct bitfield regression between 3.6 and 3.9 (using -O0)
...ng not just the single bit but the whole value in r0 (an 8-bit value) against 1. If we insert a logical AND with '1' to mask r0 just prior to the compare it works fine. And as it turns out, the LLVM IR generated using -O0 and -emit-llvm does have the AND included: ... %bf.lshr = lshr i8 %bf.load4, 1 %bf.clear5 = and i8 %bf.lshr, 1 %bf.cast = zext i8 %bf.clear5 to i32 %cmp = icmp eq i32 %bf.cast, 1 br i1 %cmp, label %if.then, label %if.else (compiled with: clang -O0 -emit-llvm -S failing.c -o failing.ll) I reran passing -debug to llc to see what's happen...
2017 Jun 15
9
About CodeGen quality
...gned int b : 8; unsigned int c : 8; unsigned int d : 8; unsigned int e; } We want to read S->b, for example. The size of struct S is 64 bits, and it seems LLVM treats it as i64. Below is the IR corresponding to S->b, IIRC: %0 = load i64, i64* %ptr, align 4 %1 = lshr i64 %0, 8 %2 = and i64 %1, 255 Our target doesn't support load i64, so we have the following code in XXXISelLowering.cpp: setOperationAction(ISD::LOAD, MVT::i64, Custom); We transform load i64 to load v2i32 during type legalization. During op legalization, load v2i32 is found to be unaligned (4 vs. 8), so s...
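One assumed workaround at the IR level, sketched (hypothetical, and only correct for a little-endian layout): read S->b through a narrower load so the unsupported i64 load never appears:

  ; sketch: i32 load of the low word, then the same lshr+and
  define i32 @read_b(i64* %S) {
  entry:
    %p = bitcast i64* %S to i32*
    %w = load i32, i32* %p, align 4
    %sh = lshr i32 %w, 8
    %b = and i32 %sh, 255
    ret i32 %b
  }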