Displaying 20 results from an estimated 623 matches for "shl".
2011 Jul 26
0
[LLVMdev] XOR Optimization
...tput of O3.
when I run "opt -std-compile-opts" on the -O3 optimized code, things get
even more weird, it outputs the following code:
while.body: ; preds = %while.body,
%entry
%indvar = phi i32 [ 0, %entry ], [ %indvar.next.3, %while.body ]
%tmp = shl i32 %indvar, 2
%0 = lshr i32 %indvar, 3
%shr = and i32 %0, 134217727
%rem = and i32 %tmp, 16
%shl = shl i32 1, %rem
%arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr
%tmp6 = load i32* %arrayidx, align 4
%rem.1 = or i32 %rem, 1
%shl.1 = shl i32 1, %rem.1
%rem.2 = or i32 %re...
2011 Jul 26
2
[LLVMdev] XOR Optimization
Hi Daniel,
> Precisely. The code generated by unrolling can be folded into a single XOR and
> SHL. And even if it was not inside a loop, it can still be optimized. What I
> want to know is: is there any optimization supposed to optimize this code, but
> for some reason it thinks it is not possible, or there is no optimization for
> that situation at all?
it could be a phase ordering...
2011 Jul 27
2
[LLVMdev] XOR Optimization
...opt -std-compile-opts" on the -O3 optimized code, things get
> even more weird, it outputs the following code:
>
> while.body: ; preds = %while.body,
> %entry
> %indvar = phi i32 [ 0, %entry ], [ %indvar.next.3, %while.body ]
> %tmp = shl i32 %indvar, 2
> %0 = lshr i32 %indvar, 3
> %shr = and i32 %0, 134217727
> %rem = and i32 %tmp, 16
> %shl = shl i32 1, %rem
> %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr
> %tmp6 = load i32* %arrayidx, align 4
> %rem.1 = or i32 %rem, 1
> %shl.1...
2016 Aug 23
2
How to describe the RegisterInfo?
...pressing register alias in
RegisterInfo.td file. I am not sure whether I understand it correctly. My
first attempt is shown below (to keep things simple, I removed some WORD/QWORD
register classes):
let Namespace = "IntelGPU" in {
foreach Index = 0-15 in {
def sub#Index : SubRegIndex<32, !shl(Index, 5)>;
}
}
class IntelGPUReg<string n, bits<13> regIdx> : Register<n> {
bits<2> HStride;
bits<1> regFile;
let Namespace = "IntelGPU";
let HWEncoding{12-0} = regIdx;
let HWEncoding{15} = regFile;
}
// here I define the whole 4096 byte r...
2017 Mar 04
7
Why ISel Shifts operations can only be expanded for Value type vector ?
...mail.com> wrote:
> Why you can't still expand it through MUL with a Custom lowering? Or am I
> missing something?
>
> Yes, we can, but a problem occurs when we know that it is a shift by a
constant: if we then return ISD::MUL with a constant immediate operand, LLVM
will convert it back to SHL because the constant is a power of 2. This
creates a loop.
So we may add a target-specific ISD node and lower it to a mul instruction.
--Vivek
> Thanks.
>
> On Fri, Mar 3, 2017 at 12:21 PM, vivek pandya via llvm-dev <
> llvm-dev at lists.llvm.org ...
2017 Mar 03
3
Why ISel Shifts operations can only be expanded for Value type vector ?
Hello LLVM Devs,
I am working on a target on which no SHL instruction is available, so I
wanted to expand it through MUL. But currently it is only possible to
expand SHL for vector types.
One possible reason I can think of is that LLVM tries to optimize MUL to
SHL in certain cases, and that can make the compiler go into a loop or end up
generating wrong code....
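The arithmetic fact behind this concern can be checked with a small model. The sketch below is not LLVM code: the helper names (shl32, mul32, shift_amount) are made up for illustration, and values are masked to 32 bits to imitate i32 wrap-around. It only shows why a multiply by a power-of-two constant and a left shift are interchangeable, which is why a lowering that turns SHL into MUL can be undone by a combine that turns MUL back into SHL.

```python
MASK32 = 0xFFFFFFFF  # model i32 wrap-around arithmetic


def shl32(x, k):
    """Left shift of a 32-bit value, discarding the bits shifted out."""
    return (x << k) & MASK32


def mul32(x, y):
    """32-bit multiply with wrap-around."""
    return (x * y) & MASK32


def is_power_of_two(c):
    """True for 1, 2, 4, 8, ..."""
    return c > 0 and (c & (c - 1)) == 0


def shift_amount(c):
    """log2 of a power-of-two constant."""
    assert is_power_of_two(c)
    return c.bit_length() - 1


if __name__ == "__main__":
    x = 0x12345678
    for k in range(32):
        # x * 2**k and x << k agree for every shift amount in range.
        assert mul32(x, 1 << k) == shl32(x, k)
    print("mul by 2**k matches shl by k for k in 0..31")
```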
2017 Mar 04
2
Why ISel Shifts operations can only be expanded for Value type vector ?
On Sat, Mar 4, 2017 at 1:19 PM, Bruce Hoult <bruce at hoult.org> wrote:
> If your target does not have SHL then why don't you simply disable
> converting MUL to SHL?
>
> MUL is converted to SHL by target-independent passes when the second operand
is a power of 2.
-Vivek
>
> On Sat, Mar 4, 2017 at 8:22 AM, vivek pandya via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> ...
2013 Oct 30
1
[LLVMdev] Optimization bug - spurious shift in partial word test
In the situation where a partial word is tested (let's say > 0) by shifting
left to get the sign bit into the msb and then testing, LLVM inserts a
spurious right shift instruction.
For example this IR:
...
%0 = load i64* %a.addr, align 8
%shl = shl i64 %0, 28
%cmp = icmp sgt i64 %shl, 0
...
results in
...
shlq $28, %rdi
sarq $28, %rdi ; <<< spurious shift
testq %rdi, %rdi
GCC doesn't have this problem. It just emits the shift and test.
The reason appears to be that the instruction combining pa...
2016 Nov 08
2
RFC: Killing undef and spreading poison
...am stores 8 bits, and leaves the remaining 24 bits
> uninitialized. It then loads 16 bits, half initialized to %v, half
> uninitialized. SROA transforms the above function to:
>
> define i16 @g(i8 %in) {
> %v = add nsw i8 127, %in
> %1 = zext i8 %v to i16
> %2 = shl i16 %1, 8
> %3 = and i16 undef, 255
> %4 = or i16 %3, %2
> ret i16 %4
> }
This program above returns i16 poison only if "shl i16 poison, 8" is a
full value poison. Whether that is the case today or not is anybody's
guess (as Eli says, we should not rely too...
2011 Jul 26
2
[LLVMdev] XOR Optimization
...erates a code like this:
>
>
> entry:
> br label %while.body
>
> while.body: ; preds = %while.body,
> %entry
> %0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ]
> %shr = lshr i32 %0, 5
> %rem = and i32 %0, 28
> %shl = shl i32 1, %rem
> %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr
> %tmp6 = load i32* %arrayidx, align 4
> %xor = xor i32 %tmp6, %shl
> %rem.1 = or i32 %rem, 1
> %shl.1 = shl i32 1, %rem.1
> %xor.1 = xor i32 %xor, %shl.1
> %rem.2 = or i32 %rem,...
2015 Apr 06
2
[LLVMdev] inconsistent wording in the LangRef regarding "shl nsw"
...is present, then the shift produces a poison value
if it shifts out any bits that disagree with the resultant sign bit."
... (1)
followed by
"As such, NUW/NSW have the same semantics as they would if the shift
were expressed as a mul instruction with the same nsw/nuw bits in (mul
%op1, (shl 1, %op2))." ... (2)
But by (1) "shl i8 1, i8 7" sign overflows (since it shifts out only
zeros, but the result has the sign bit set) but "mul i8 1, i8 -128"
does not sign overflow (by the usual definition of sign-overflow), so
this violates (2).
InstCombine already has a...
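The mismatch described in this post can be reproduced with a small i8 model. The sketch below is an illustration under stated assumptions: the function names are made up, rule (1) is implemented literally as quoted (poison if any shifted-out bit differs from the result's sign bit), and "sign overflow" for mul uses the usual definition (the exact product does not fit in the signed range).

```python
def to_signed(v, bits=8):
    """Interpret the low `bits` bits of v as two's complement."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def shl_nsw_is_poison_rule1(x, k, bits=8):
    """Rule (1): poison if any shifted-out bit disagrees with the result's sign bit."""
    assert 0 <= k < bits
    mask = (1 << bits) - 1
    ux = x & mask
    result = (ux << k) & mask
    sign = result >> (bits - 1)
    shifted_out = ux >> (bits - k)            # the top k bits of x
    expected = ((1 << k) - 1) if sign else 0  # k copies of the sign bit
    return shifted_out != expected


def mul_nsw_overflows(a, b, bits=8):
    """Usual signed-overflow check: the exact product must fit in `bits` bits."""
    product = a * b
    return not (-(1 << (bits - 1)) <= product <= (1 << (bits - 1)) - 1)


if __name__ == "__main__":
    # shl i8 1, 7 shifts out only zeros, but the result 0x80 has its sign bit set.
    print(shl_nsw_is_poison_rule1(1, 7))
    # Rule (2) rewrites it as mul i8 1, (shl 1, 7), that is mul i8 1, -128.
    print(mul_nsw_overflows(1, to_signed(1 << 7)))
```

Under these assumptions the first check reports poison and the second reports no overflow, which is the disagreement between (1) and (2) that the post points out.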
2016 Nov 09
4
RFC: Killing undef and spreading poison
...> > uninitialized. It then loads 16 bits, half initialized to %v, half
>> > uninitialized. SROA transforms the above function to:
>> >
>> > define i16 @g(i8 %in) {
>> > %v = add nsw i8 127, %in
>> > %1 = zext i8 %v to i16
>> > %2 = shl i16 %1, 8
>> > %3 = and i16 undef, 255
>> > %4 = or i16 %3, %2
>> > ret i16 %4
>> > }
>>
>> This program above returns i16 poison only if "shl i16 poison, 8" is a
>> full value poison. Whether that is the case today or not i...
2013 Nov 09
2
[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
Dear All,
I am trying to custom lower 32-bit ISD::SHL and SHR in a backend for 6502
family CPUs. The particular subtarget has 16-bit registers at most, so a
32-bit result is not legal. Normally, if you mark this as "Legal" or
"Expand", then it will expand the node into more nodes, as in the following
example:
shl i32 %a , 2
=> h...
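For readers unfamiliar with the general idea, a 32-bit left shift by a constant can be built from 16-bit pieces. The sketch below is a generic illustration and not the expansion from the post (which is cut off above): the function name shl32_via_16 and the (hi, lo) pair representation are assumptions made for this example.

```python
M16 = 0xFFFF


def shl32_via_16(hi, lo, n):
    """Shift the 32-bit value (hi:lo) left by n using only 16-bit operations."""
    assert 0 <= n < 32
    if n == 0:
        return hi, lo
    if n >= 16:
        # Everything in the high half is shifted out; the low half moves up.
        return (lo << (n - 16)) & M16, 0
    # Bits leaving the top of the low half enter the bottom of the high half.
    new_hi = ((hi << n) | (lo >> (16 - n))) & M16
    new_lo = (lo << n) & M16
    return new_hi, new_lo


if __name__ == "__main__":
    value = 0x1234ABCD
    hi, lo = value >> 16, value & M16
    new_hi, new_lo = shl32_via_16(hi, lo, 2)
    assert ((new_hi << 16) | new_lo) == (value << 2) & 0xFFFFFFFF
    print(hex(new_hi), hex(new_lo))
```

A variable shift amount would need either a loop or a libcall on a target without a barrel shifter; this sketch only covers a shift amount known at compile time.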
2004 Feb 13
10
Encoding into MONO (delphi)
...conversion (from Aleksandr Shamray),
but it doesn't work when I'd like to convert a *.RAW into a
mono *.ogg file.
vorbis_encode_init_vbr(vi, 1, 44100, 0.5); //because of the mono
the program stops at line:
//* uninterleave samples */
.
.
buffer[1][i] := smallInt((pArray(@readbuffer)[i shl 2 + 3] shl 8) or pArray(@readbuffer)[i shl 2 + 2]) / 32768;
.
.
Why?????
Please, help me!
Thank You!
--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepa...
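A likely cause, offered as an assumption since the surrounding code is not shown: the quoted line is the stereo pattern (4 bytes per frame, second channel written to buffer[1]), whereas with one channel only buffer[0] exists and each frame is 2 bytes. The sketch below is in Python rather than Delphi and uses made-up names; it assumes the .RAW file holds signed 16-bit little-endian samples, which is what the byte assembly and the division by 32768 in the quoted line suggest.

```python
import struct


def mono_pcm16_to_floats(raw):
    """Convert signed 16-bit little-endian mono PCM bytes to floats in [-1, 1)."""
    count = len(raw) // 2                      # 2 bytes per mono frame, not 4
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    return [s / 32768.0 for s in samples]


def mono_sample_by_hand(raw, i):
    """Same conversion for sample i, written with explicit shifts."""
    value = (raw[2 * i + 1] << 8) | raw[2 * i]  # high byte, then low byte
    if value >= 0x8000:                         # reinterpret as signed 16-bit
        value -= 0x10000
    return value / 32768.0


if __name__ == "__main__":
    data = bytes([0x00, 0x80, 0xFF, 0x7F, 0x00, 0x00])
    floats = mono_pcm16_to_floats(data)
    assert floats == [mono_sample_by_hand(data, i) for i in range(3)]
    print(floats)
```

In the Delphi code this would correspond to writing only buffer[0][i] and indexing the read buffer with i shl 1 instead of i shl 2.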
2013 Nov 10
2
[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
...n on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts.
>
> I defined pseudo-instructions in my InstrInfo.td that implemented 8 and 16-bit SHL, SRL and SRA with one variant of each for the shift count in a register and another for an immediate shift count. Some shifts by immediate values and all shifts by non-immediate values need to be implemented using a loop so I used a custom inserter. I added a pass to expand the remaining shift pseu...
2011 Jul 26
2
[LLVMdev] XOR optimization
...itnumb);
bit_addr++;
}
The -O3 set of optimizations generates code like this:
entry:
br label %while.body
while.body: ; preds = %while.body,
%entry
%0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ]
%shr = lshr i32 %0, 5
%rem = and i32 %0, 28
%shl = shl i32 1, %rem
%arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr
%tmp6 = load i32* %arrayidx, align 4
%xor = xor i32 %tmp6, %shl
%rem.1 = or i32 %rem, 1
%shl.1 = shl i32 1, %rem.1
%xor.1 = xor i32 %xor, %shl.1
%rem.2 = or i32 %rem, 2
%shl.2 = shl i32 1, %rem.2
%xor.2 =...
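The folding discussed in this thread (a single XOR and SHL instead of a chain of single-bit XORs) can be checked with a small model. The sketch below is an illustration under assumptions: it takes the unroll factor to be 4, so it includes a fourth step with rem | 3 that is not visible in the truncated excerpt, and the function names are made up. Because %rem = and i32 %0, 28 is always a multiple of 4, the bits rem, rem|1, rem|2 and rem|3 are four adjacent bits, and toggling them one by one equals one XOR with 15 << rem.

```python
MASK32 = 0xFFFFFFFF


def unrolled(word, rem):
    """Four single-bit XORs, as in the unrolled loop body (fourth step assumed)."""
    for j in range(4):
        word ^= (1 << (rem | j)) & MASK32
    return word


def folded(word, rem):
    """One SHL and one XOR toggling the same four bits."""
    return word ^ ((0xF << rem) & MASK32)


if __name__ == "__main__":
    word = 0xDEADBEEF
    for i in range(64):
        rem = i & 28  # same mask as %rem = and i32 %0, 28
        assert unrolled(word, rem) == folded(word, rem)
    print("unrolled and folded forms agree")
```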
2013 Nov 10
0
[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
...rt rotation on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts.
I defined pseudo-instructions in my InstrInfo.td that implemented 8 and 16-bit SHL, SRL and SRA with one variant of each for the shift count in a register and another for an immediate shift count. Some shifts by immediate values and all shifts by non-immediate values need to be implemented using a loop so I used a custom inserter. I added a pass to expand the remaining shift pseu...
2017 Jul 14
3
error:Ran out of lanemask bits to represent subregister
Do your 32768 registers also have sub registers?
I can't tell you exactly what to change. I'm not familiar with the code. I
would just be running grep or something.
~Craig
On Fri, Jul 14, 2017 at 10:23 AM, hameeza ahmed <hahmed2305 at gmail.com>
wrote:
> Thank you so much. I think there is no issue with my definitions since i
> have to use larger registers i.e 65536 bit
2008 Jul 17
2
[LLVMdev] ComputeMaskedBits Bug
...s. But I am not very well versed in this area of LLVM and
need some more eyes. This is in the 2.3 release, though it looks like the
relevant pieces operate the same way in trunk.
I have the following add recurrence:
%r849 = select i1 %r848, i64 0, i64 %r847 ; <i64> [#uses=10]
%r862 = shl i64 %r849, 3 ; <i64> [#uses=3]
%"$SR_S112.0" = phi i64 [ %r862, %"file alignment.f, line 79, in reduction
loop at depth 0, bb64" ], [ %r1874, %"file alignment.f, line 79, in loop at
depth 1, bb77" ] ; <i64> [#uses=2]
%r1874 = add i64 %r851, %"$SR...
2011 Jul 26
0
[LLVMdev] XOR Optimization
...entry:
> > br label %while.body
> >
> > while.body: ; preds = %while.body,
> > %entry
> > %0 = phi i32 [ 0, %entry ], [ %inc.3, %while.body ]
> > %shr = lshr i32 %0, 5
> > %rem = and i32 %0, 28
> > %shl = shl i32 1, %rem
> > %arrayidx = getelementptr inbounds i32* %bitmap, i32 %shr
> > %tmp6 = load i32* %arrayidx, align 4
> > %xor = xor i32 %tmp6, %shl
> > %rem.1 = or i32 %rem, 1
> > %shl.1 = shl i32 1, %rem.1
> > %xor.1 = xor i32 %xor, %shl.1...