Displaying 20 results from an estimated 39 matches for "conv2".
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
...re
forwarding it to the backend I develop for my company and while building
define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 {
entry:
%conv = zext i32 %x to i64
%conv1 = zext i32 %y to i64
%mul = mul nuw i64 %conv1, %conv
%shr = lshr i64 %mul, 32
%xor = xor i64 %shr, %mul
%conv2 = trunc i64 %xor to i32
ret i32 %conv2
}
I came upon the following optimization (during instcombine):
IC: Visiting: %mul = mul nuw i64 %conv, %conv1
IC: Visiting: %shr = lshr i64 %mul, 32
IC: Visiting: %conv2 = trunc i64 %shr to i32
IC: Visiting: %conv3 = trunc i64 %mul to i32
IC: Visi...
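For readers following the IR in this snippet, a minimal C sketch of what `@test_extract_subreg_func` computes may help. It assumes `i32` and `i64` map to `uint32_t` and `uint64_t`. The helper `xor_of_halves` is a hypothetical name, not something from the thread; it only illustrates that truncating the xor gives the same value as xoring the two truncated halves, which is the shape the visible `IC: Visiting` lines suggest. It does not claim to reproduce what InstCombine finally emits, since the log is cut off.

```c
#include <stdint.h>

/* Direct transcription of the IR: widen, multiply, shift, xor, truncate. */
uint32_t test_extract_subreg_func(uint32_t x, uint32_t y) {
    uint64_t mul = (uint64_t)y * (uint64_t)x; /* %mul: a 32x32 product fits in 64 bits */
    uint64_t shr = mul >> 32;                  /* %shr */
    return (uint32_t)(shr ^ mul);              /* %xor, then %conv2 */
}

/* Hypothetical helper: truncate each operand first, then xor. */
uint32_t xor_of_halves(uint32_t x, uint32_t y) {
    uint64_t mul = (uint64_t)y * (uint64_t)x;
    uint32_t hi = (uint32_t)(mul >> 32); /* high 32 bits of the product */
    uint32_t lo = (uint32_t)mul;         /* low 32 bits of the product */
    return hi ^ lo;
}
```

Both functions should agree for every input, because truncation to 32 bits distributes over a bitwise xor.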
2012 Dec 20
2
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...'s really happening.
Ignore my previous statements concerning %add :)
Again, given:
05: for.body: ; preds = %entry,
%for.body
06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
08: %conv2 = and i32 %result.03, 255
09: %add = add nsw i32 %conv2, 3
10: %inc = add nsw i32 %j.04, 1
11: %cmp = icmp slt i32 %inc, 8000
12: br i1 %cmp, label %for.body, label %for.end
LLVM executes the following:
01: createSCEV(%conv2 = and i32 %result.03, 255)
02: calls getSCEV(%result.03)...
2012 Dec 21
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...ous statements concerning %add :)
>
> Again, given:
>
> 05: for.body: ; preds = %entry,
> %for.body
> 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
> 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
> 08: %conv2 = and i32 %result.03, 255
> 09: %add = add nsw i32 %conv2, 3
> 10: %inc = add nsw i32 %j.04, 1
> 11: %cmp = icmp slt i32 %inc, 8000
> 12: br i1 %cmp, label %for.body, label %for.end
>
> LLVM executes the following:
>
> 01: createSCEV(%conv2 = and i32 %result.03,...
2012 Dec 10
3
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...01: define signext i8 @foo() nounwind readnone {
02: entry:
03: br label %for.body
04:
05: for.body: ; preds = %entry,
%for.body
06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
08: %conv2 = and i32 %result.03, 255
09: %add = add nsw i32 %conv2, 3
10: %inc = add nsw i32 %j.04, 1
11: %cmp = icmp slt i32 %inc, 8000
12: br i1 %cmp, label %for.body, label %for.end
13:
14: for.end: ; preds = %for.body
15: %conv1 = trunc i32 %add to i8
16:...
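The loop in this snippet can be simulated in C to see why the `and ... 255` matters. This is a rough sketch under stated assumptions: `foo_sim` is a hypothetical name, and it assumes the function goes on to return `%conv1`, which the truncated snippet does not show. The low byte is returned as `uint8_t` to avoid implementation-defined signed narrowing; the IR's `signext i8` return would present the same byte as a signed value.

```c
#include <stdint.h>

/* Simulates the IR loop: result = (result & 255) + 3, run 8000 times. */
uint8_t foo_sim(void) {
    int32_t result = 0; /* %result.03 */
    int32_t add = 0;    /* %add */
    for (int32_t j = 0; j < 8000; ++j) { /* %j.04 runs from 0 to 7999 */
        int32_t conv2 = result & 255;    /* %conv2 = and i32 %result.03, 255 */
        add = conv2 + 3;                 /* %add = add nsw i32 %conv2, 3 */
        result = add;                    /* back edge of the phi */
    }
    return (uint8_t)add; /* %conv1 = trunc i32 %add to i8 (assumed return value) */
}
```

Because the mask keeps only the low 8 bits, the low byte of `%add` after n iterations equals 3n modulo 256, so the recurrence is affine only once the wrap-around is taken into account.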
2012 Dec 18
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
On Tue, Dec 18, 2012 at 9:56 AM, Matthew Curtis <mcurtis at codeaurora.org> wrote:
>
> Here's how I'm evaluating the expression (in my head):
>
> 00: Add(ZeroExtend(Truncate(Minus(AddRec(Start=0,Step=3)[n],3), i8), i32),3)
> |
> 01: Add(ZeroExtend(Truncate(Minus(AddRec(Start=0,Step=3)[0],3), i8), i32),3)
>
2012 Dec 18
2
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...>> 03: br label %for.body
>> 04:
>> 05: for.body: ; preds = %entry,
>> %for.body
>> 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
>> 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
>> 08: %conv2 = and i32 %result.03, 255
>> 09: %add = add nsw i32 %conv2, 3
>> 10: %inc = add nsw i32 %j.04, 1
>> 11: %cmp = icmp slt i32 %inc, 8000
>> 12: br i1 %cmp, label %for.body, label %for.end
>> 13:
>> 14: for.end: ; pre...
2012 Mar 01
3
[LLVMdev] Aliasing bug or feature?
...ign extends ***
define void @test() nounwind {
entry:
store i8 0, i8* @s, align 1, !tbaa !0
%0 = load i8** @p, align 4, !tbaa !2
%1 = load i8* %0, align 1, !tbaa !0
%conv = zext i8 %1 to i32
%arrayidx1 = getelementptr inbounds i8* %0, i32 1
%2 = load i8* %arrayidx1, align 1, !tbaa !0
%conv2 = zext i8 %2 to i32
%3 = load i8** @q, align 4, !tbaa !2 <<< Can this load be bypassed by the
store below?
%4 = load i8* %3, align 1, !tbaa !0
%conv5 = zext i8 %4 to i32
%add = add i32 %conv2, %conv
%add7 = add i32 %add, %conv5
%conv8 = trunc i32 %add7 to i8
store i8 %conv8,...
2012 Feb 28
1
[LLVMdev] How to vectorize a vector type cast?
...(i32 %in.coerce) nounwind uwtable readnone {
entry:
%0 = bitcast i32 %in.coerce to <4 x i8>
%1 = extractelement <4 x i8> %0, i32 0
%conv = uitofp i8 %1 to float
%vecinit = insertelement <4 x float> undef, float %conv, i32 0
%2 = extractelement <4 x i8> %0, i32 1
%conv2 = uitofp i8 %2 to float
%vecinit3 = insertelement <4 x float> %vecinit, float %conv2, i32 1
%3 = extractelement <4 x i8> %0, i32 2
%conv4 = uitofp i8 %3 to float
%vecinit5 = insertelement <4 x float> %vecinit3, float %conv4, i32 2
%4 = extractelement <4 x i8> %0, i...
2012 Mar 01
0
[LLVMdev] Aliasing bug or feature?
...() nounwind {
> entry:
> store i8 0, i8* @s, align 1, !tbaa !0
> %0 = load i8** @p, align 4, !tbaa !2
> %1 = load i8* %0, align 1, !tbaa !0
> %conv = zext i8 %1 to i32
> %arrayidx1 = getelementptr inbounds i8* %0, i32 1
> %2 = load i8* %arrayidx1, align 1, !tbaa !0
> %conv2 = zext i8 %2 to i32
> %3 = load i8** @q, align 4, !tbaa !2 <<< Can this load be bypassed by the
> store below?
> %4 = load i8* %3, align 1, !tbaa !0
> %conv5 = zext i8 %4 to i32
> %add = add i32 %conv2, %conv
> %add7 = add i32 %add, %conv5
> %conv8 = trunc i32 %a...
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
...th PPC).
But what I am proposing here is actually handling something like this:
define dso_local <2 x double> @test(<2 x i64> %a) {
entry:
%vecext = extractelement <2 x i64> %a, i32 0
%vecext1 = extractelement <2 x i64> %a, i32 1
%conv = sitofp i64 %vecext to double
%conv2 = sitofp i64 %vecext1 to double
%vecinit = insertelement <2 x double> undef, double %conv, i32 0
%vecinit3 = insertelement <2 x double> %vecinit, double %conv2, i32 1
ret <2 x double> %vecinit3
}
With this type conversion, InstCombine will actually simplify this as
expected....
2009 Feb 20
1
smoothing 2D vector field
Hi all,
is there a function / package in R that provides a function like
Matlab's conv2 or filter2 for smoothing a vector- / velocity- field.
I unfortunately could not find anything.
Thanks a lot.
2011 Sep 08
1
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
...t, align 4
%y.addr = alloca float, align 4
store float* %result, float** %result.addr, align 8
store float %x, float* %x.addr, align 4
store float %y, float* %y.addr, align 4
%tmp = load float* %x.addr, align 4
%conv = fpext float %tmp to double
%tmp1 = load float* %y.addr, align 4
%conv2 = fpext float %tmp1 to double
%div = fdiv double %conv, %conv2
%conv3 = fptrunc double %div to float
%tmp4 = load float** %result.addr, align 8
store float %conv3, float* %tmp4
ret void
}
-----
With optimisations turned on:
-----
define void @dpdiv(float* nocapture %result, float %x, fl...
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
...ling something like this:
>> define dso_local <2 x double> @test(<2 x i64> %a) {
>> entry:
>> %vecext = extractelement <2 x i64> %a, i32 0
>> %vecext1 = extractelement <2 x i64> %a, i32 1
>> %conv = sitofp i64 %vecext to double
>> %conv2 = sitofp i64 %vecext1 to double
>> %vecinit = insertelement <2 x double> undef, double %conv, i32 0
>> %vecinit3 = insertelement <2 x double> %vecinit, double %conv2, i32 1
>> ret <2 x double> %vecinit3
>> }
>> With this type conversion, InstCom...
2012 Dec 17
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...readnone {
> 02: entry:
> 03: br label %for.body
> 04:
> 05: for.body: ; preds = %entry,
> %for.body
> 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
> 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ]
> 08: %conv2 = and i32 %result.03, 255
> 09: %add = add nsw i32 %conv2, 3
> 10: %inc = add nsw i32 %j.04, 1
> 11: %cmp = icmp slt i32 %inc, 8000
> 12: br i1 %cmp, label %for.body, label %for.end
> 13:
> 14: for.end: ; preds = %for.body
> 15:...
2018 Aug 06
2
Lowering ISD::TRUNCATE
...entry:
%val1.addr = alloca i8, align 1
store i8 %val1, i8* %val1.addr, align 1
%0 = load i8, i8* %val1.addr, align 1
%conv = zext i8 %0 to i16
%1 = load i8, i8* %val1.addr, align 1
%conv1 = zext i8 %1 to i16
%add = add nsw i16 %conv, %conv1
%conv2 = trunc i16 %add to i8
ret i8 %conv2
}
I looked into the X86 backend, which has a Z80-like register design,
i.e. being able to access the subregs AL (and AH) from AX directly,
without any specific truncation operation necessary. But, to be honest,
I do not really understand from the...
2012 Oct 08
3
[LLVMdev] Multiply i8 operands promotes to i32
...or MUL_I16 in order to do the correct lowering?
Thanks in advance,
Pedro
P.S: I add C code and corresponding LLVM code.
C code:
void
(const u_int16_t in_data, u_int16_t* out)
{
u_int8_t kk = in_data&0xFF;
u_int16_t kk16 = kk * kk;
*out = kk16;
}
LLVM:
%1 = load i8* %kk, align 1
%conv2 = zext i8 %1 to i32
%2 = load i8* %kk, align 1
%conv3 = zext i8 %2 to i32
%mul = mul nsw i32 %conv2, %conv3
%conv4 = trunc i32 %mul to i16
store i16 %conv4, i16* %kk16, align 2
--
Pedro Malagón - Profesor ayudante
91 549 57 00 - ext. 4220
Departamento de Ingeniería Electrónica
Escuela T...
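The `zext` to `i32`, `mul i32`, `trunc` to `i16` sequence in the IR of this message matches C's integer promotion rules: operands narrower than `int` are promoted to `int` before the multiply. Here is a small sketch, assuming a target where `int` is 32 bits. The function name `square_low_byte` is hypothetical, because the name is missing from the snippet, and `<stdint.h>` types stand in for the `u_int8_t`/`u_int16_t` typedefs used in the original.

```c
#include <stdint.h>

/* In C, the product of two uint8_t values has type int, not uint8_t. */
_Static_assert(sizeof((uint8_t)1 * (uint8_t)1) == sizeof(int),
               "uint8_t operands are promoted to int");

/* Hypothetical name; mirrors the body of the C code in the message. */
void square_low_byte(uint16_t in_data, uint16_t *out) {
    uint8_t kk = in_data & 0xFF;
    /* kk * kk is evaluated as int, then narrowed to 16 bits on assignment. */
    uint16_t kk16 = (uint16_t)(kk * kk);
    *out = kk16;
}
```

Since the largest possible product is 255 * 255 = 65025, the result always fits in 16 bits, so a backend could in principle use a narrower multiply without changing the stored value.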
2016 Jul 27
2
Remove zext-unfolding from InstCombine
...generates for `foo` and `goo` just before they are passed to InstCombine:
```
define signext i8 @foo_before_InstCombine(i8 signext %a, i8 signext %b) local_unnamed_addr #0 {
entry:
%conv = sext i8 %a to i32
%and = and i32 %conv, 1
%cmp = icmp eq i32 %and, 0
%conv1 = zext i1 %cmp to i32
%conv2 = sext i8 %b to i32
%cmp3 = icmp eq i32 %conv2, 0
%conv4 = zext i1 %cmp3 to i32
%or = or i32 %conv1, %conv4
%conv5 = trunc i32 %or to i8
ret i8 %conv5
}
; Function Attrs: nounwind ssp uwtable
define signext i8 @goo_before_InstCombine(i8 signext %a, i8 signext %b) local_unnamed_addr #0 {...
2013 Oct 09
4
[LLVMdev] Related constant folding of floating point values
...d float* %a, align 4
%conv = fpext float %0 to double
%sub = fsub double %conv, 8.100000e+00
%cmp = fcmp oge double %sub, 0x3E8000000102F4FD
br i1 %cmp, label %if.then, label %lor.lhs.false
lor.lhs.false: ; preds = %entry
%1 = load float* %a, align 4
%conv2 = fpext float %1 to double
%sub3 = fsub double %conv2, 8.100000e+00
%cmp4 = fcmp ole double %sub3, 0xBE8000000102F4FD
br i1 %cmp4, label %if.then, label %if.else
...
during the transformation the %conv is replaced with "double
0x4020333340000000" and then the result of comparison i...
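The constant `0x4020333340000000` quoted in this message is the bit pattern obtained when the `float` nearest to 8.1 is widened to `double`, which differs from the `double` literal `8.1` used in the `fsub`. The sketch below checks this, assuming IEEE-754 binary32 and binary64 and assuming that `%a` holds `8.1f` (consistent with the quoted constant, though the snippet does not show the store). The helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE-754 bits of a double. */
uint64_t double_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits); /* well-defined way to reinterpret the bytes */
    return bits;
}

/* Bits of 8.1f after widening to double, i.e. the fpext in the IR. */
uint64_t widened_8_1f_bits(void) {
    volatile float a = 8.1f; /* volatile: read the rounded float from memory */
    return double_bits((double)a);
}
```

On such a platform `widened_8_1f_bits()` gives `0x4020333340000000`, whereas `double_bits(8.1)` gives a different pattern, so `(double)8.1f - 8.1` is a small nonzero value rather than zero.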
2008 May 02
1
Speedups with Ra and jit
...s is using Ra with R-2.7.0.
> conv1 <- function(a, b) {
> ### with Ra and jit
require(jit)
jit(1)
ab <- numeric(length(a)+length(b)-1)
for(i in 1:length(a))
for(j in 1:length(b))
ab[i+j-1] <- ab[i+j-1] + a[i]*b[j]
ab
}
>
> conv2 <- function(a, b) {
> ### with just Ra
ab <- numeric(length(a)+length(b)-1)
for(i in 1:length(a))
for(j in 1:length(b))
ab[i+j-1] <- ab[i+j-1] + a[i]*b[j]
ab
}
>
> x <- 1:2000
> y <- 1:500
> system.time(tst1 <- conv1(x, y))...
2012 Feb 17
0
[LLVMdev] Folding an insertelt chain
On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote:
> Hello,
>
> I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere