search for: lshr

Displaying 20 results from an estimated 259 matches for "lshr".

2012 Aug 24
2
[LLVMdev] Stop opt from producing 832 bit integer?
...:64:64-n32" * *target triple = "mipsel-unknown-linux"* *@.str = private unnamed_addr constant [16 x i8] c"%.2d %.2d %.2d\0A\00", align 1* *define void @gsm_print(i8* nocapture %c) nounwind {* *entry:* * %0 = load i8* %c, align 1* * %conv = zext i8 %0 to i32* * %shr13 = lshr i32 %conv, 2* * %1 = zext i32 %shr13 to i832* * %2 = shl nuw nsw i832 %1, 304* * %shr314 = lshr i32 %conv, 1* * %and4 = and i32 %shr314, 7* * %3 = zext i32 %shr314 to i832* * %4 = shl nuw nsw i832 %3, 352* * %shr815 = lshr i32 %conv, 3* * %5 = zext i32 %shr815 to i832* * %6 = shl nuw nsw i...
2012 Apr 16
5
[LLVMdev] InstCombine adds bit masks, confuses self, others
...into straightforward code and figures out the 0 return value: shrl $2, %edi movl %edi, (%rsi) addl %edi, %edi movl %edi, 4(%rsi) movl $0, %eax ret LLVM optimizes the code: $ clang -O -S -o- small.c -emit-llvm define i32 @f(i32 %a, i32* nocapture %p) nounwind uwtable ssp { entry: %div = lshr i32 %a, 2 store i32 %div, i32* %p, align 4, !tbaa !0 %0 = lshr i32 %a, 1 %add = and i32 %0, 2147483646 %arrayidx1 = getelementptr inbounds i32* %p, i64 1 store i32 %add, i32* %arrayidx1, align 4, !tbaa !0 %1 = lshr i32 %a, 1 %mul = and i32 %1, 2147483646 %sub = sub i32 %add, %mul...
2018 May 23
0
RFC: should CVP always narrow the width of lshr?
...1106601, > We overlooked these questions in the specific case of div/rem > (D44102) because we assumed that narrower div/rem are always > better for analysis and codegen <cut> Now, there is a caveat. If <C> is a constant and a power of two, instcombine will transform the udiv to lshr, and the urem to and. And the CVP pass does not narrow the width of those opcodes. And instcombine won't narrow them either, because the zext has multiple uses / that would create more instructions, which is illegal for instcombine. So [1] may or may not be handled properly... There are multiple solu...
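For reference, the instcombine canonicalization described here, sketched on i32 (hypothetical function name, not from the thread):

  ; sketch: udiv/urem by a power-of-two constant become lshr/and
  define i32 @div_rem_pow2(i32 %x) {
  entry:
    %d = udiv i32 %x, 16    ; instcombine folds this to: lshr i32 %x, 4
    %r = urem i32 %x, 16    ; instcombine folds this to: and i32 %x, 15
    %s = add i32 %d, %r
    ret i32 %s
  }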
2015 Sep 01
3
anyone want to help tune up computeKnownBits()?
..., %2 infer %3 known from LLVM: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx known from Souper: 000000000000000000000000000000000000000000000000000000000000000x -------------------------------------------------------------------- if the big end of a word contains some zeros, lshr can't make them go away: %0:i64 = var %1:i64 = lshr 233:i64, %0 infer %1 known from LLVM: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx known from Souper: 00000000000000000000000000000000000000000000000000000000xxxxxxxx ---------------------------------------------------...
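The second result restated as plain IR (a sketch with a hypothetical function name): 233 occupies only the low 8 bits, and lshr only moves bits toward the LSB, so the 56 high zero bits stay zero no matter what %n is:

  define i64 @high_zeros(i64 %n) {
  entry:
    %r = lshr i64 233, %n    ; result <= 233, so bits 8..63 are known zero
    ret i64 %r
  }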
2015 Jan 28
4
[LLVMdev] RFC: Proposal for Poison Semantics
On Tue, Jan 27, 2015 at 8:32 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > > > Correct me if I am wrong but we are talking about transforming: > > %maybe_poison = add nuw i32 %a, %b > > %x = zext i32 %maybe_poison to i64 > > %y = lshr i64 %x, 32 > > > > To: > > %za = zext i32 %a to i64 > > %zb = zext i32 %b to i64 > > %x = add nuw i64 %za, %zb > > %y = lshr i64 %x, 32 > > > > ? > > > > If so, this seems fine in the model given by the RFC. > > > > I...
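What makes the example subtle (a sketch of the widened form, not from the thread): after the transform, %y is exactly the carry out of a 32-bit add, so it is nonzero precisely in the executions where the original add nuw would have overflowed and produced poison:

  ; sketch: the widened form, with the carry spelled out
  define i64 @widened(i32 %a, i32 %b) {
  entry:
    %za = zext i32 %a to i64
    %zb = zext i32 %b to i64
    %x = add nuw i64 %za, %zb
    %y = lshr i64 %x, 32    ; carry bit of the 32-bit add %a + %b
    ret i64 %y
  }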
2012 Apr 17
0
[LLVMdev] InstCombine adds bit masks, confuses self, others
...how best to fix this. If possible, InstCombine's canonicalization shouldn't hide arithmetic progressions behind bit masks. At least, it seems these transformations should be disabled unless (X >> C).hasOneUse(). They aren't exactly optimizations. > > This: > >  %div = lshr i32 %a, 2 >  store i32 %div, i32* %p, align 4, !tbaa !0 >  %add = shl nuw nsw i32 %div, 1 > > is better than this: > >  %div = lshr i32 %a, 2 >  store i32 %div, i32* %p, align 4, !tbaa !0 >  %0 = lshr i32 %a, 1 >  %add = and i32 %0, 2147483646 I think we could try your h...
2015 Jan 28
2
[LLVMdev] RFC: Proposal for Poison Semantics
...; >> Hi David, > >> > >> I spent some time thinking about poison semantics this way, but here > >> is where I always get stuck: > >> > >> Consider the IR fragment > >> > >> %x = zext i32 %maybe_poison to i64 > >> %y = lshr i64 %x, 32 > >> %ptr = gep %global, %y > >> store 42 to %ptr > >> > >> If %maybe_poison is poison, then is %y poison? For all i32 values of > >> %maybe_poison, %y is i64 0, so in some sense you can determine the > >> value %y without looking...
2017 May 15
2
Disabling DAGCombine's specific optimization
Hi Vivek, You could work around this by creating a custom ISD node, e.g. MyTargetISD::MyLSHR, with the same type as the general ISD::LSHR. This custom node will then be ignored by the generic DAGCombiner. Convert ISD::LSHR to MyTargetISD::MyLSHR in DAGCombine, optimise it as you see fit, convert it back or lower it directly. I've done this for ISD::CONCAT_VECTORS to avoid an inconveni...
2015 Sep 08
2
UB and known bits
...recision in the known bits. These rules fire on examples like the ones below. Do we have a set of rules that clients of known bits need to follow to avoid unsoundness? I remember Nuno and/or David Majnemer saying something about this but I don't have it handy. John %0:i32 = var %1:i32 = lshr %0, 1:i32 %2:i32 = addnw 1:i32, %1 infer %2 known from Souper: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx known from compiler: 0xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx llvm is stronger %0:i32 = var (000000000000000xxxxxxxxxxxxxxxxx) %1:i32 = and 65535:i32, %0 %2:i16 = var %3:i32 = zext %2 %4:i32 = mulnw %1, %3...
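The reasoning behind the stronger compiler result in the first example, restated (sketch, hypothetical function name): lshr by 1 clears the sign bit, and an nsw add of 1 to a non-negative value can only set the sign bit by overflowing, which would be poison:

  define i32 @top_bit(i32 %x) {
  entry:
    %s = lshr i32 %x, 1           ; sign bit of %s is known zero
    %a = add nuw nsw i32 %s, 1    ; nsw: wrapping into the sign bit is poison,
    ret i32 %a                    ; so the sign bit may be assumed still zero
  }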
2012 Apr 17
3
[LLVMdev] InstCombine adds bit masks, confuses self, others
...InstCombine's > canonicalization shouldn't hide arithmetic progressions behind bit masks. > At least, it seems these transformations should be disabled unless (X >> C).hasOneUse(). They aren't exactly optimizations. > > > > This: > > > > %div = lshr i32 %a, 2 > > store i32 %div, i32* %p, align 4, !tbaa !0 > > %add = shl nuw nsw i32 %div, 1 > > > > is better than this: > > > > %div = lshr i32 %a, 2 > > store i32 %div, i32* %p, align 4, !tbaa !0 > > %0 = lshr i32 %a, 1 > > %add = and i3...
2020 Apr 05
3
Branch is not optimized because of right shift
Hi, > I think the IR in both of your examples makes things harder for the compiler than expected from the original C source. Note that both versions are from clang with -O2. The first is with version 9.0 and the second is with the trunk. > but in the branch only %0 is used. Sinking the lshr too early made the analysis harder. Yes, exactly! That's what I figured too. > The version in https://godbolt.org/z/_ipKhb is probably the easiest for analysis (basically the original C source code built with `clang -O0 -S -emit-llvm`, followed by running `opt -mem2reg`). There’s a patch...
2007 Aug 22
1
[LLVMdev] Shifting by too many bits
The documentation for SHL, LSHR, and ASHR is unclear. What is the result of shifting by the number of bits in the left operand? For example, <result> = shl i32 1, 32 <result> = ashr i32 1, 32 <result> = lshr i32 1, 32
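For reference, the current LangRef wording answers this: if the shift amount is equal to or larger than the bit width of the first operand, shl, lshr, and ashr all return a poison value, so all three examples are undefined:

  ; sketch: each shift amount equals the bit width, so each result is poison
  define void @overshift() {
  entry:
    %a = shl i32 1, 32
    %b = ashr i32 1, 32
    %c = lshr i32 1, 32
    ret void
  }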
2007 Oct 03
2
[LLVMdev] Array Slicing?
...ointer arithmetic. On a related note, can I convert a pointer-to-int to a pointer-to-array-of-1-int and vice versa? BTW, I sent another post to this e-mail address, but never received a reply: ---------------------------------- Subject: Shifting by too many bits Body: The documentation for SHL, LSHR, and ASHR is unclear. What is the result of shifting by the number of bits in the left operand? For example, <result> = shl i32 1, 32 <result> = ashr i32 1, 32 <result> = lshr i32 1, 32 ---------------------------------- Regards, Jon
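On the pointer question, a sketch in the typed-pointer syntax of the era (hypothetical function, not from the thread): a bitcast converts between i32* and [1 x i32]* in both directions, since both point at the same bytes:

  define i32 @first_elt(i32* %p) {
  entry:
    %arr = bitcast i32* %p to [1 x i32]*
    %elt = getelementptr [1 x i32]* %arr, i64 0, i64 0
    %v = load i32* %elt
    ret i32 %v
  }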
2010 Sep 01
0
[LLVMdev] equivalent IR, different asm
...causes a crash in WebKit. > I suspect the usage of registers is wrong, can someone take a look? The difference is that there is a right shift after the multiply, before the divide. In IR, the difference is: %5 = mul nsw i32 %4, %tmp1 ; <i32> [#uses=1] %btmp3 = lshr i64 %1, 32 ; <i64> [#uses=1] %btmp4 = trunc i64 %btmp3 to i32 ; <i32> [#uses=1] %6 = sdiv i32 %5, %btmp4 ; <i32> [#uses=1] vs: %5 = mul nsw i32 %4, %tmp1 ; <i32> [#uses=1] ; rem...
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
...ve been looking at the way LLVM optimizes code before forwarding it to the backend I develop for my company and while building define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 { entry: %conv = zext i32 %x to i64 %conv1 = zext i32 %y to i64 %mul = mul nuw i64 %conv1, %conv %shr = lshr i64 %mul, 32 %xor = xor i64 %shr, %mul %conv2 = trunc i64 %xor to i32 ret i32 %conv2 } I came upon the following optimization (during instcombine): IC: Visiting: %mul = mul nuw i64 %conv, %conv1 IC: Visiting: %shr = lshr i64 %mul, 32 IC: Visiting: %conv2 = trunc i64 %shr to i32 IC:...
2015 Jan 19
6
[LLVMdev] X86TargetLowering::LowerToBT
...and I have some questions. This IR matches and then X86TargetLowering::LowerToBT is called: %and = and i64 %shl, %val ; (val & (1 << index)) != 0 ; bit test with a register index This IR does not match and so X86TargetLowering::LowerToBT is not called: %and = lshr i64 %val, 25 ; (val & (1 << 25)) != 0 ; bit test with an immediate index %conv = and i64 %and, 1 Let's back that up a bit. Clang emits this IR. These expressions start out life in C as an AND with a left-shifted masking bit, and are then converted into IR as...
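The two shapes being contrasted, written out as whole functions (a sketch with hypothetical names):

  ; matches LowerToBT: and of a variable left-shifted one
  define i1 @bt_reg(i64 %val, i64 %index) {
  entry:
    %shl = shl i64 1, %index
    %and = and i64 %shl, %val
    %tobool = icmp ne i64 %and, 0
    ret i1 %tobool
  }

  ; does not match: instcombine already folded the immediate-index test
  ; into lshr+and
  define i1 @bt_imm(i64 %val) {
  entry:
    %shr = lshr i64 %val, 25
    %and = and i64 %shr, 1
    %tobool = icmp ne i64 %and, 0
    ret i1 %tobool
  }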
2010 Jan 29
2
[LLVMdev] 64bit MRV problem: { float, float, float} -> { double, float }
...t.float3 = type { float, float, float } define void @test(double %a.0, float %a.1, %struct.float3* nocapture %res) nounwind noinline { entry: %tmp8 = bitcast double %a.0 to i64 ; <i64> [#uses=1] %tmp9 = zext i64 %tmp8 to i96 ; <i96> [#uses=1] %tmp1 = lshr i96 %tmp9, 32 ; <i96> [#uses=1] %tmp2 = trunc i96 %tmp1 to i32 ; <i32> [#uses=1] %tmp3 = bitcast i32 %tmp2 to float ; <float> [#uses=1] %0 = getelementptr inbounds %struct.float3* %res, i64 0, i32 1 ; <float*> [#uses=1...
2016 Dec 09
0
BSWAP matching in codegen
..._tree(i32 %x) { > > %byte0 = and i32 %x, 255 ; 0x000000ff > > %byte1 = and i32 %x, 65280 ; 0x0000ff00 > > %byte2 = and i32 %x, 16711680 ; 0x00ff0000 > > %byte3 = and i32 %x, 4278190080 ; 0xff000000 > > %tmp0 = shl i32 %byte0, 8 > > %tmp1 = lshr i32 %byte1, 8 > > %tmp2 = shl i32 %byte2, 8 > > %tmp3 = lshr i32 %byte3, 8 > > %or0 = or i32 %tmp0, %tmp1 > > %or1 = or i32 %tmp2, %tmp3 > > %result = or i32 %or0, %or1 > > ret i32 %result > > } > > I’m still investigating exactly how it’s...
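Worked out, the tree above swaps the two bytes inside each 16-bit half (it is not a full 32-bit bswap); an equivalent two-shift form (sketch, hypothetical name):

  define i32 @halfword_bswap(i32 %x) {
  entry:
    %hi = shl i32 %x, 8
    %lo = lshr i32 %x, 8
    %m1 = and i32 %hi, 4278255360    ; 0xff00ff00
    %m0 = and i32 %lo, 16711935      ; 0x00ff00ff
    %r = or i32 %m1, %m0
    ret i32 %r
  }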
2016 Dec 22
2
struct bitfield regression between 3.6 and 3.9 (using -O0)
...ng not just the single bit but the whole value in r0 (an 8-bit value) against 1. If we insert a logical AND with '1' to mask r0 just prior to the compare it works fine. And as it turns out, the LLVM IR generated using -O0 and -emit-llvm does have the AND included: ... %bf.lshr = lshr i8 %bf.load4, 1 %bf.clear5 = and i8 %bf.lshr, 1 %bf.cast = zext i8 %bf.clear5 to i32 %cmp = icmp eq i32 %bf.cast, 1 br i1 %cmp, label %if.then, label %if.else (compiled with: clang -O0 -emit-llvm -S failing.c -o failing.ll) I reran passing -debug to llc to see what's happen...
2017 Jun 15
9
About CodeGen quality
...gned int b : 8; unsigned int c : 8; unsigned int d : 8; unsigned int e; } We want to read S->b, for example. The size of struct S is 64 bits, and it seems LLVM treats it as i64. Below is the IR corresponding to S->b, IIRC: %0 = load i64, i64* %ptr, align 4 %1 = lshr i64 %0, 8 %2 = and i64 %1, 255 Our target doesn't support load i64, so we have the following code in XXXISelLowering.cpp: setOperationAction(ISD::LOAD, MVT::i64, Custom); We transform load i64 to load v2i32 during type legalization. During op legalization, load v2i32 is found to be unaligned (4 vs. 8), so s...
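One assumed workaround at the IR level, sketched (hypothetical, and only correct for a little-endian layout): read S->b through a narrower load so the unsupported i64 load never appears:

  ; sketch: i32 load of the low word, then the same lshr+and
  define i32 @read_b(i64* %S) {
  entry:
    %p = bitcast i64* %S to i32*
    %w = load i32, i32* %p, align 4
    %sh = lshr i32 %w, 8
    %b = and i32 %sh, 255
    ret i32 %b
  }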