thr3ads.net - search: "shl"

Displaying 20 results from an estimated 623 matches for "shl".

2011 Jul 26

[LLVMdev] XOR Optimization

...tput of O3. when I run "opt -std-compile-opts" on the -O3 optimized code, things get even more weird, it outputs the following code: while.body: ; preds = %while.body, %entry %indvar = phi i32 [ 0, %entry ], [ %indvar.next.3, %while.body ] %tmp = shl i32 %indvar, 2 %0 = lshr i32 %indvar, 3 %shr = and i32 %0, 134217727 %rem = and i32 %tmp, 16 %shl = shl i32 1, %rem %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr %tmp6 = load i32* %arrayidx, align 4 %rem.1 = or i32 %rem, 1 %shl.1 = shl i32 1, %rem.1 %rem.2 = or i32 %re...

2011 Jul 26

[LLVMdev] XOR Optimization

Hi Daniel, > Precisely. The code generated by unrolling can be folded into a single XOR and > SHL. And even if it was not inside a loop, it can still be optimized. What I > want to know is: is there any optimization supposed to optimize this code, but > for some reason it thinks it is not possible, or there is no optimization for > that situation at all? it could be a phase ordering...

2011 Jul 27

[LLVMdev] XOR Optimization

...opt -std-compile-opts" on the -O3 optimized code, things get > even more weird, it outputs the following code: > > while.body: ; preds = %while.body, > %entry > %indvar = phi i32 [ 0, %entry ], [ %indvar.next.3, %while.body ] > %tmp = shl i32 %indvar, 2 > %0 = lshr i32 %indvar, 3 > %shr = and i32 %0, 134217727 > %rem = and i32 %tmp, 16 > %shl = shl i32 1, %rem > %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr > %tmp6 = load i32* %arrayidx, align 4 > %rem.1 = or i32 %rem, 1 > %shl.1...

2016 Aug 23

How to describe the RegisterInfo?

...pressing register alias in RegisterInfo.td file. I am not sure whether I understand it correctly. My first trial was like below(to make things simple, I remove some WORD/QWORD register class): let Namespace = "IntelGPU" in { foreach Index = 0-15 in { def sub#Index : SubRegIndex<32, !shl(Index, 5)>; } } class IntelGPUReg<string n, bits<13> regIdx> : Register<n> { bits<2> HStride; bits<1> regFile; let Namespace = "IntelGPU"; let HWEncoding{12-0} = regIdx; let HWEncoding{15} = regFile; } // here I define the whole 4096 byte r...

2017 Mar 04

Why ISel Shifts operations can only be expanded for Value type vector ?

...mail.com> wrote: > Why you can't still expand it through MUL with a Custom lowering? Or am I > missing something? > > Yes, we can, but a problem occurs when we know that it is a shift by a constant value: if we return ISD::MUL with a constant immediate operand, then LLVM will convert it to SHL again because the constant is a power of 2. Thus it creates a loop. So we may add a target-specific ISD node and lower it to a mul instruction. --Vivek > Thanks. > > On Fri, Mar 3, 2017 at 12:21 PM, vivek pandya via llvm-dev < > llvm-dev at lists.llvm.org > ...

2017 Mar 03

Why ISel Shifts operations can only be expanded for Value type vector ?

Hello LLVM Devs, I am working on a target on which no SHL instruction is available. So I wanted to expand it through MUL. But currently it is only possible to expand SHL for vector types. One possible reason I can think of is that LLVM tries to optimize MUL to SHL in certain cases, and that can make the compiler go into a loop or end up generating wrong code....

2017 Mar 04

Why ISel Shifts operations can only be expanded for Value type vector ?

On Sat, Mar 4, 2017 at 1:19 PM, Bruce Hoult <bruce at hoult.org> wrote: > If your target does not have SHL then why don't you simply disable > converting MUL to SHL? > > MUL is converted to SHL by target-independent passes when the second operand is a power of 2. -Vivek > > On Sat, Mar 4, 2017 at 8:22 AM, vivek pandya via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > ...
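
The arithmetic behind the MUL/SHL round trip discussed in this thread can be sketched as follows, assuming 32-bit wrapping integers: multiplying by a power-of-two constant gives the same bits as shifting left by that constant's log2, which is why a lowering of SHL to MUL can be turned back into SHL. This is a Python illustration only; `is_power_of_two` and `mul_as_shl` are hypothetical names, not LLVM APIs.

```python
def is_power_of_two(c: int) -> bool:
    """True when c has exactly one bit set."""
    return c > 0 and (c & (c - 1)) == 0


def mul_as_shl(x: int, c: int, bits: int = 32):
    """Compute x * c as a left shift when c is a power of two.

    Returns None when c is not a power of two, i.e. when the
    rewrite does not apply. Arithmetic wraps at `bits` bits.
    """
    if not is_power_of_two(c):
        return None
    mask = (1 << bits) - 1
    shift = c.bit_length() - 1  # log2(c) for a power of two
    return (x << shift) & mask


if __name__ == "__main__":
    x = 0x12345678
    # Both forms agree modulo 2**32.
    print(mul_as_shl(x, 16) == (x * 16) & 0xFFFFFFFF)
```

The sketch says nothing about how a backend should break the cycle; the thread's own suggestion is a target-specific node that is lowered to a mul instruction.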

2013 Oct 30

[LLVMdev] Optimization bug - spurious shift in partial word test

In the situation where a partial word is tested, let's say > 0, by shifting left to get the sign bit into the msb and testing, LLVM is inserting a spurious right shift instruction. For example this IR: ... %0 = load i64* %a.addr, align 8 %shl = shl i64 %0, 28 %cmp = icmp sgt i64 %shl, 0 ... results in ... shlq $28, %rdi sarq $28, %rdi ; <<< spurious shift testq %rdi, %rdi GCC doesn't have this problem. It just emits the shift and test. The reason appears to be that the instruction combining pa...
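
A small sketch of why the extra `sarq` is redundant for this test, assuming 64-bit two's-complement values: after a left shift by 28 the low 28 bits are zero, so an arithmetic right shift by 28 changes neither the sign nor whether the value is zero, and the signed `> 0` comparison gives the same answer either way. The helper names below are hypothetical and model the instructions in Python rather than reproducing any compiler code.

```python
BITS = 64
MASK = (1 << BITS) - 1


def to_signed(v: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement integer."""
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def shl(v: int, n: int) -> int:
    """Logical left shift, wrapping at 64 bits."""
    return (v << n) & MASK


def sar(v: int, n: int) -> int:
    """Arithmetic right shift on a 64-bit pattern."""
    return (to_signed(v) >> n) & MASK


def gt_zero_shift_only(x: int) -> bool:
    """Model of: shlq $28; testq (signed > 0)."""
    return to_signed(shl(x, 28)) > 0


def gt_zero_shift_then_sar(x: int) -> bool:
    """Model of: shlq $28; sarq $28; testq (signed > 0)."""
    return to_signed(sar(shl(x, 28), 28)) > 0


if __name__ == "__main__":
    samples = [0, 1, MASK, 1 << 35, (1 << 35) - 1, 1 << 36]
    print(all(gt_zero_shift_only(x) == gt_zero_shift_then_sar(x)
              for x in samples))
```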

2016 Nov 08

RFC: Killing undef and spreading poison

...am stores 8 bits, and leaves the remaining 24 bits > uninitialized. It then loads 16 bits, half initialized to %v, half > uninitialized. SROA transforms the above function to: > > define i16 @g(i8 %in) { > %v = add nsw i8 127, %in > %1 = zext i8 %v to i16 > %2 = shl i16 %1, 8 > %3 = and i16 undef, 255 > %4 = or i16 %3, %2 > ret i16 %4 > } This program above returns i16 poison only if "shl i16 poison, 8" is a full value poison. Whether that is the case today or not is anybody's guess (as Eli says, we should not rely too...

2011 Jul 26

[LLVMdev] XOR Optimization

...erates a code like this: > > > entry: > br label %while.body > > while.body: ; preds = %while.body, > %entry > %0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ] > %shr = lshr i32 %0, 5 > %rem = and i32 %0, 28 > %shl = shl i32 1, %rem > %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr > %tmp6 = load i32* %arrayidx, align 4 > %xor = xor i32 %tmp6, %shl > %rem.1 = or i32 %rem, 1 > %shl.1 = shl i32 1, %rem.1 > %xor.1 = xor i32 %xor, %shl.1 > %rem.2 = or i32 %rem,...

2015 Apr 06

[LLVMdev] inconsistent wording in the LangRef regarding "shl nsw"

...is present, then the shift produces a poison value if it shifts out any bits that disagree with the resultant sign bit." ... (1) followed by "As such, NUW/NSW have the same semantics as they would if the shift were expressed as a mul instruction with the same nsw/nuw bits in (mul %op1, (shl 1, %op2))." ... (2) But by (1) "shl i8 1, i8 7" sign overflows (since it shifts out only zeros, but the result has the sign bit set) but "mul i8 1, i8 -128" does not sign overflow (by the usual definition of sign-overflow), so this violates (2). InstCombine already has a...
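
The mismatch described in this snippet can be checked with a short sketch, assuming 8-bit two's-complement values. `shl_nsw_overflows` follows wording (1) as quoted (a shifted-out bit disagrees with the resulting sign bit), and `mul_nsw_overflows` uses the usual definition of signed overflow for the equivalent multiply from wording (2). Both function names are hypothetical; this models the two sentences, not any LLVM implementation.

```python
def to_signed(v: int, bits: int = 8) -> int:
    """Interpret a bit pattern as a two's-complement integer."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def shl_nsw_overflows(x: int, n: int, bits: int = 8) -> bool:
    """Wording (1): overflow if any shifted-out bit differs from
    the sign bit of the result. Requires 0 <= n < bits."""
    mask = (1 << bits) - 1
    full = (x & mask) << n
    result = full & mask
    sign = (result >> (bits - 1)) & 1
    shifted_out = full >> bits            # the n bits that left the value
    expected = ((1 << n) - 1) if sign else 0
    return shifted_out != expected


def mul_nsw_overflows(a: int, b: int, bits: int = 8) -> bool:
    """Usual signed overflow: the exact product does not fit."""
    product = to_signed(a, bits) * to_signed(b, bits)
    return not (-(1 << (bits - 1)) <= product < (1 << (bits - 1)))


if __name__ == "__main__":
    # shl i8 1, 7 versus mul i8 1, -128 (bit pattern 0x80)
    print(shl_nsw_overflows(1, 7), mul_nsw_overflows(1, 1 << 7))
```

Under these two readings the shift is flagged while the multiply is not, which is the inconsistency the post points out.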

2016 Nov 09

RFC: Killing undef and spreading poison

...t; > uninitialized. It then loads 16 bits, half initialized to %v, half >> > uninitialized. SROA transforms the above function to: >> > >> > define i16 @g(i8 %in) { >> > %v = add nsw i8 127, %in >> > %1 = zext i8 %v to i16 >> > %2 = shl i16 %1, 8 >> > %3 = and i16 undef, 255 >> > %4 = or i16 %3, %2 >> > ret i16 %4 >> > } >> >> This program above returns i16 poison only if "shl i16 poison, 8" is a >> full value poison. Whether that is the case today or not i...

2013 Nov 09

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

Dear All, I am trying to custom lower 32-bit ISD::SHL and SHR in a backend for 6502 family CPUs. The particular subtarget has 16-bit registers at most, so a 32-bit result is not legal. Normally, if you mark this as "Legal" or "Expand", then it will expand the node into more nodes as follows in an example: shl i32 %a , 2 => h...
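
As a generic illustration of building a 32-bit left shift from 16-bit pieces, here is a sketch in Python. It is not the expansion from the post (which is cut off above) and not LLVM's legalizer output; it assumes the value is held as a low and a high 16-bit half and a shift amount of 0 to 31, and the names `shl32_via_16` and `shl32` are hypothetical.

```python
def shl32_via_16(lo: int, hi: int, n: int):
    """Shift the 32-bit value (hi:lo) left by n using 16-bit operations.

    lo and hi are 16-bit halves; n must be in 0..31.
    Returns the new (lo, hi) pair.
    """
    if n == 0:
        return lo, hi
    if n < 16:
        # Bits leaving the top of lo move into the bottom of hi.
        new_hi = ((hi << n) | (lo >> (16 - n))) & 0xFFFF
        new_lo = (lo << n) & 0xFFFF
    else:
        # Everything in lo moves into hi; lo becomes zero.
        new_hi = (lo << (n - 16)) & 0xFFFF
        new_lo = 0
    return new_lo, new_hi


def shl32(x: int, n: int) -> int:
    """Convenience wrapper: split, shift, and recombine."""
    lo, hi = shl32_via_16(x & 0xFFFF, (x >> 16) & 0xFFFF, n)
    return (hi << 16) | lo


if __name__ == "__main__":
    x = 0x0001ABCD
    print(hex(shl32(x, 2)), hex((x << 2) & 0xFFFFFFFF))
```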

2004 Feb 13

Encoding into MONO (delphi)

...conversion (from Aleksandr Shamray), but it doesn't work when I'd like to convert a *.RAW into a mono *.ogg file. vorbis_encode_init_vbr(vi, 1, 44100, 0.5); //because of the mono the program stops at line: //* uninterleave samples */ . . buffer[1][i] := smallInt((pArray(@readbuffer)[i shl 2 + 3] shl 8) or pArray(@readbuffer)[i shl 2 + 2]) / 32768; . . Why????? Please, help me! Thank You!...

2013 Nov 10

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

...n on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts. > > I defined pseudo-instructions in my InstrInfo.td that implemented 8 and 16-bit SHL, SRL and SRA with one variant of each for the shift count in a register and another for an immediate shift count. Some shifts by immediate values and all shifts by non-immediate values need to be implemented using a loop so I used a custom inserter. I added a pass to expand the remaining shift pseu...

2011 Jul 26

[LLVMdev] XOR optimization

...itnumb); bit_addr++; } The -O3 set of optimizations generates a code like this: entry: br label %while.body while.body: ; preds = %while.body, %entry %0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ] %shr = lshr i32 %0, 5 %rem = and i32 %0, 28 %shl = shl i32 1, %rem %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr %tmp6 = load i32* %arrayidx, align 4 %xor = xor i32 %tmp6, %shl %rem.1 = or i32 %rem, 1 %shl.1 = shl i32 1, %rem.1 %xor.1 = xor i32 %xor, %shl.1 %rem.2 = or i32 %rem, 2 %shl.2 = shl i32 1, %rem.2 %xor.2 =...
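
The folding that the replies in this thread ask about can be illustrated with a short sketch, assuming 32-bit words and an unroll factor of 4 as in the IR above, where `%rem` is masked with 28 and is therefore a multiple of 4. XORing the word with `1 << rem`, `1 << (rem | 1)`, `1 << (rem | 2)` and `1 << (rem | 3)` equals a single XOR with `0xF << rem`. The function names are hypothetical and this is not an LLVM pass.

```python
MASK32 = 0xFFFFFFFF


def xor_unrolled(word: int, rem: int) -> int:
    """Four separate single-bit XORs, as in the unrolled loop body."""
    for k in range(4):
        word ^= (1 << (rem | k)) & MASK32
    return word


def xor_folded(word: int, rem: int) -> int:
    """One shift and one XOR covering the same four bits."""
    return word ^ ((0xF << rem) & MASK32)


if __name__ == "__main__":
    word = 0xCAFEBABE
    # rem takes the values 0, 4, ..., 28 because of the `and ..., 28` mask.
    print(all(xor_unrolled(word, rem) == xor_folded(word, rem)
              for rem in range(0, 32, 4)))
```

The identity relies on the low two bits of `rem` being zero, so that `rem | k` equals `rem + k` for k in 0..3.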

2013 Nov 10

[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter

...rt rotation on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts. I defined pseudo-instructions in my InstrInfo.td that implemented 8 and 16-bit SHL, SRL and SRA with one variant of each for the shift count in a register and another for an immediate shift count. Some shifts by immediate values and all shifts by non-immediate values need to be implemented using a loop so I used a custom inserter. I added a pass to expand the remaining shift pseu...

2017 Jul 14

error:Ran out of lanemask bits to represent subregister

Do your 32768 registers also have sub registers? I can't tell you exactly what to change. I'm not familiar with the code. I would just be running grep or something. ~Craig On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com> wrote: > Thank you so much. I think there is no issue with my definitions since i > have to use larger registers i.e 65536 bit

2008 Jul 17

[LLVMdev] ComputeMaskedBits Bug

...s. But I am not very well versed in this area of LLVM and need some more eyes. This is in the 2.3 release, though it looks like the relevant pieces operate the same way in trunk. I have the following add recurrence: %r849 = select i1 %r848, i64 0, i64 %r847 ; <i64> [#uses=10] %r862 = shl i64 %r849, 3 ; <i64> [#uses=3] %"$SR_S112.0" = phi i64 [ %r862, %"file alignment.f, line 79, in reduction loop at depth 0, bb64" ], [ %r1874, %"file alignment.f, line 79, in loop at depth 1, bb77" ] ; <i64> [#uses=2] %r1874 = add i64 %r851, %"$SR...

2011 Jul 26

[LLVMdev] XOR Optimization

...entry: > > br label %while.body > > > > while.body: ; preds = %while.body, > > %entry > > %0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ] > > %shr = lshr i32 %0, 5 > > %rem = and i32 %0, 28 > > %shl = shl i32 1, %rem > > %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr > > %tmp6 = load i32* %arrayidx, align 4 > > %xor = xor i32 %tmp6, %shl > > %rem.1 = or i32 %rem, 1 > > %shl.1 = shl i32 1, %rem.1 > > %xor.1 = xor i32 %xor, %shl.1...
