search for: conv2

Displaying 20 results from an estimated 39 matches for "conv2".

2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
...re forwarding it to the backend I develop for my company and while building define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 { entry: %conv = zext i32 %x to i64 %conv1 = zext i32 %y to i64 %mul = mul nuw i64 %conv1, %conv %shr = lshr i64 %mul, 32 %xor = xor i64 %shr, %mul %conv2 = trunc i64 %xor to i32 ret i32 %conv2 } I came upon the following optimization (during instcombine): *IC: Visiting: %mul = mul nuw i64 %conv, %conv1 IC: Visiting: %shr = lshr i64 %mul, 32 IC: Visiting: %conv2 = trunc i64 %shr to i32 IC: Visiting: %conv3 = trunc i64 %mul to i32 IC: Visi...
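For readers who prefer C to IR, the function in this snippet can be sketched roughly as below. This is an assumption about what the source might look like, not code from the thread. The helper `xor_of_truncs` is a hypothetical name for the shape the InstCombine log hints at (truncating `%shr` and `%mul` separately before the xor). The sketch only shows that the two shapes agree on the returned value, because truncation distributes over xor. It does not address whatever the poster found questionable.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the IR in the snippet: widen both
 * operands, multiply, xor the high half into the product, truncate. */
uint32_t test_extract_subreg_func(uint32_t x, uint32_t y) {
    uint64_t mul = (uint64_t)y * (uint64_t)x; /* mul nuw i64 %conv1, %conv */
    uint64_t shr = mul >> 32;                 /* lshr i64 %mul, 32 */
    uint64_t xr = shr ^ mul;                  /* xor i64 %shr, %mul */
    return (uint32_t)xr;                      /* trunc i64 %xor to i32 */
}

/* Hypothetical helper: truncate each 64-bit value first, then xor the
 * two 32-bit halves. The low 32 bits of (a ^ b) equal the xor of the
 * low 32 bits of a and b, so the result matches the function above. */
uint32_t xor_of_truncs(uint32_t x, uint32_t y) {
    uint64_t mul = (uint64_t)x * (uint64_t)y;
    return (uint32_t)(mul >> 32) ^ (uint32_t)mul;
}
```

Calling both functions with the same arguments should return the same 32-bit value for every input.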
2012 Dec 20
2
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...'s really happening. Ignore my previous statements concerning %add :) Again, given: 05: for.body: ; preds = %entry, %for.body 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ] 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ] 08: %conv2 = and i32 %result.03, 255 09: %add = add nsw i32 %conv2, 3 10: %inc = add nsw i32 %j.04, 1 11: %cmp = icmp slt i32 %inc, 8000 12: br i1 %cmp, label %for.body, label %for.end LLVM executes the following: 01: createSCEV(%conv2 = and i32 %result.03, 255) 02: calls getSCEV(%result.03)...
2012 Dec 21
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...ous statements concerning %add :) > > Again, given: > > 05: for.body: ; preds = %entry, > %for.body > 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ] > 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ] > 08: %conv2 = and i32 %result.03, 255 > 09: %add = add nsw i32 %conv2, 3 > 10: %inc = add nsw i32 %j.04, 1 > 11: %cmp = icmp slt i32 %inc, 8000 > 12: br i1 %cmp, label %for.body, label %for.end > > LLVM executes the following: > > 01: createSCEV(%conv2 = and i32 %result.03,...
2012 Dec 10
3
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...01: define signext i8 @foo() nounwind readnone { 02: entry: 03: br label %for.body 04: 05: for.body: ; preds = %entry, %for.body 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ] 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ] 08: %conv2 = and i32 %result.03, 255 09: %add = add nsw i32 %conv2, 3 10: %inc = add nsw i32 %j.04, 1 11: %cmp = icmp slt i32 %inc, 8000 12: br i1 %cmp, label %for.body, label %for.end 13: 14: for.end: ; preds = %for.body 15: %conv1 = trunc i32 %add to i8 16:...
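The loop in this snippet can be written in C roughly as follows. This is a minimal sketch reconstructed from the IR alone, so the variable types and the source shape are assumptions. The helper `low_bits_after` is hypothetical and not part of the patch under discussion. It gives the closed form of the low 8 bits, which is the kind of answer an analysis that sees through the `and ... 255` (that is, zext(trunc(IV))) step could derive without running the loop.

```c
/* Hypothetical C source for the loop: mask the running value to 8 bits,
 * then add the step of 3, for 8000 iterations. */
signed char foo(void) {
    int result = 0;
    for (int j = 0; j < 8000; j++)
        result = (result & 255) + 3; /* %conv2 = and ..., 255; %add = add ..., 3 */
    return (signed char)result;      /* %conv1 = trunc i32 %add to i8 */
}

/* Hypothetical helper: after n iterations the low 8 bits of the running
 * value are (3 * n) mod 256, because masking to 8 bits before each add
 * does not change the low 8 bits of the sum. */
unsigned char low_bits_after(unsigned n) {
    return (unsigned char)(3u * n);
}
```

Comparing `(unsigned char)foo()` with `low_bits_after(8000)` is a quick way to check that the closed form and the loop agree.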
2012 Dec 18
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
On Tue, Dec 18, 2012 at 9:56 AM, Matthew Curtis <mcurtis at codeaurora.org> wrote: > > Here's how I'm evaluating the expression (in my head): > > 00: Add(ZeroExtend(Truncate(Minus(AddRec(Start=0,Step=3)[n],3), i8), i32),3) > | > 01: Add(ZeroExtend(Truncate(Minus(AddRec(Start=0,Step=3)[0],3), i8), i32),3) >
2012 Dec 18
2
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...>> 03: br label %for.body >> 04: >> 05: for.body: ; preds = %entry, >> %for.body >> 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ] >> 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ] >> 08: %conv2 = and i32 %result.03, 255 >> 09: %add = add nsw i32 %conv2, 3 >> 10: %inc = add nsw i32 %j.04, 1 >> 11: %cmp = icmp slt i32 %inc, 8000 >> 12: br i1 %cmp, label %for.body, label %for.end >> 13: >> 14: for.end: ; pre...
2012 Mar 01
3
[LLVMdev] Aliasing bug or feature?
...ign extends *** define void @test() nounwind { entry: store i8 0, i8* @s, align 1, !tbaa !0 %0 = load i8** @p, align 4, !tbaa !2 %1 = load i8* %0, align 1, !tbaa !0 %conv = zext i8 %1 to i32 %arrayidx1 = getelementptr inbounds i8* %0, i32 1 %2 = load i8* %arrayidx1, align 1, !tbaa !0 %conv2 = zext i8 %2 to i32 %3 = load i8** @q, align 4, !tbaa !2 <<< Can this load be bypassed by the store below? %4 = load i8* %3, align 1, !tbaa !0 %conv5 = zext i8 %4 to i32 %add = add i32 %conv2, %conv %add7 = add i32 %add, %conv5 %conv8 = trunc i32 %add7 to i8 store i8 %conv8,...
2012 Feb 28
1
[LLVMdev] How to vectorize a vector type cast?
...(i32 %in.coerce) nounwind uwtable readnone { entry: %0 = bitcast i32 %in.coerce to <4 x i8> %1 = extractelement <4 x i8> %0, i32 0 %conv = uitofp i8 %1 to float %vecinit = insertelement <4 x float> undef, float %conv, i32 0 %2 = extractelement <4 x i8> %0, i32 1 %conv2 = uitofp i8 %2 to float %vecinit3 = insertelement <4 x float> %vecinit, float %conv2, i32 1 %3 = extractelement <4 x i8> %0, i32 2 %conv4 = uitofp i8 %3 to float %vecinit5 = insertelement <4 x float> %vecinit3, float %conv4, i32 2 %4 = extractelement <4 x i8> %0, i...
2012 Mar 01
0
[LLVMdev] Aliasing bug or feature?
...() nounwind { > entry: >  store i8 0, i8* @s, align 1, !tbaa !0 >  %0 = load i8** @p, align 4, !tbaa !2 >  %1 = load i8* %0, align 1, !tbaa !0 >  %conv = zext i8 %1 to i32 >  %arrayidx1 = getelementptr inbounds i8* %0, i32 1 >  %2 = load i8* %arrayidx1, align 1, !tbaa !0 >  %conv2 = zext i8 %2 to i32 >  %3 = load i8** @q, align 4, !tbaa !2 <<< Can this load be bypassed by the > store below? >  %4 = load i8* %3, align 1, !tbaa !0 >  %conv5 = zext i8 %4 to i32 >  %add = add i32 %conv2, %conv >  %add7 = add i32 %add, %conv5 >  %conv8 = trunc i32 %a...
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
...th PPC). But what I am proposing here is actually handling something like this: define dso_local <2 x double> @test(<2 x i64> %a) { entry: %vecext = extractelement <2 x i64> %a, i32 0 %vecext1 = extractelement <2 x i64> %a, i32 1 %conv = sitofp i64 %vecext to double %conv2 = sitofp i64 %vecext1 to double %vecinit = insertelement <2 x double> undef, double %conv, i32 0 %vecinit3 = insertelement <2 x double> %vecinit, double %conv2, i32 1 ret <2 x double> %vecinit3 } With this type conversion, InstCombine will actually simplify this as expected....
2009 Feb 20
1
smoothing 2D vector field
Hi all, is there a function / package in R that provides a function like Matlab's conv2 or filter2 for smoothing a vector- / velocity- field. I unfortunately could not find anything. Thanks a lot.
2011 Sep 08
1
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
...t, align 4 %y.addr = alloca float, align 4 store float* %result, float** %result.addr, align 8 store float %x, float* %x.addr, align 4 store float %y, float* %y.addr, align 4 %tmp = load float* %x.addr, align 4 %conv = fpext float %tmp to double %tmp1 = load float* %y.addr, align 4 %conv2 = fpext float %tmp1 to double %div = fdiv double %conv, %conv2 %conv3 = fptrunc double %div to float %tmp4 = load float** %result.addr, align 8 store float %conv3, float* %tmp4 ret void } ----- With optimisations turned on: ----- define void @dpdiv(float* nocapture %result, float %x, fl...
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
...ling something like this: >> define dso_local <2 x double> @test(<2 x i64> %a) { >> entry: >> %vecext = extractelement <2 x i64> %a, i32 0 >> %vecext1 = extractelement <2 x i64> %a, i32 1 >> %conv = sitofp i64 %vecext to double >> %conv2 = sitofp i64 %vecext1 to double >> %vecinit = insertelement <2 x double> undef, double %conv, i32 0 >> %vecinit3 = insertelement <2 x double> %vecinit, double %conv2, i32 1 >> ret <2 x double> %vecinit3 >> } >> With this type conversion, InstCom...
2012 Dec 17
0
[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)
...readnone { > 02: entry: > 03: br label %for.body > 04: > 05: for.body: ; preds = %entry, > %for.body > 06: %j.04 = phi i32 [ 0, %entry ], [ %inc, %for.body ] > 07: %result.03 = phi i32 [ 0, %entry ], [ %add, %for.body ] > 08: %conv2 = and i32 %result.03, 255 > 09: %add = add nsw i32 %conv2, 3 > 10: %inc = add nsw i32 %j.04, 1 > 11: %cmp = icmp slt i32 %inc, 8000 > 12: br i1 %cmp, label %for.body, label %for.end > 13: > 14: for.end: ; preds = %for.body > 15:...
2018 Aug 06
2
Lowering ISD::TRUNCATE
...entry: %val1.addr = alloca i8, align 1 store i8 %val1, i8* %val1.addr, align 1 %0 = load i8, i8* %val1.addr, align 1 %conv = zext i8 %0 to i16 %1 = load i8, i8* %val1.addr, align 1 %conv1 = zext i8 %1 to i16 %add = add nsw i16 %conv, %conv1 %conv2 = trunc i16 %add to i8 ret i8 %conv2 } I looked into the X86 backend, which has a Z80-like register design, i.e. being able to access the subregs AL (and AH) from AX directly, without any specific truncation operation necessary. But, to be honest, I do not really understand from the...
2012 Oct 08
3
[LLVMdev] Multiply i8 operands promotes to i32
...or MUL_I16 in order to do the correct lowering? Thanks in advance, Pedro P.S: I add C code and corresponding LLVM code. C code: void (const u_int16_t in_data, u_int16_t* out) { u_int8_t kk = in_data&0xFF; u_int16_t kk16 = kk * kk; *out = kk16; } LLVM: %1 = load i8* %kk, align 1 %conv2 = zext i8 %1 to i32 %2 = load i8* %kk, align 1 %conv3 = zext i8 %2 to i32 %mul = mul nsw i32 %conv2, %conv3 %conv4 = trunc i32 %mul to i16 store i16 %conv4, i16* %kk16, align 2 -- Pedro Malagón - Profesor ayudante 91 549 57 00 - ext. 4220 Departamento de Ingeniería Electrónica Escuela T...
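The widening to i32 seen in this snippet follows from C's integer promotion rules: operands narrower than `int` are promoted to `int` before arithmetic. The sketch below restates the snippet's C code with standard `<stdint.h>` types. The function name `square_low_byte` is hypothetical (the snippet's function has no name), and the explicit cast on the product is added here only to make the narrowing visible.

```c
#include <stdint.h>

/* Hypothetical named version of the snippet's function. */
void square_low_byte(uint16_t in_data, uint16_t *out) {
    uint8_t kk = in_data & 0xFF;
    /* Both uint8_t operands are promoted to int before the multiply,
     * which is why the front end emits zext to i32 and an i32 mul on
     * targets where int is 32 bits wide. */
    uint16_t kk16 = (uint16_t)(kk * kk);
    *out = kk16;
}
```

The expression `kk * kk` has type `int`, so `sizeof(kk * kk)` equals `sizeof(int)` rather than 1.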
2016 Jul 27
2
Remove zext-unfolding from InstCombine
...generates for `foo` and `goo` just before they are passed to InstCombine: ``` define signext i8 @foo_before_InstCombine(i8 signext %a, i8 signext %b) local_unnamed_addr #0 { entry: %conv = sext i8 %a to i32 %and = and i32 %conv, 1 %cmp = icmp eq i32 %and, 0 %conv1 = zext i1 %cmp to i32 %conv2 = sext i8 %b to i32 %cmp3 = icmp eq i32 %conv2, 0 %conv4 = zext i1 %cmp3 to i32 %or = or i32 %conv1, %conv4 %conv5 = trunc i32 %or to i8 ret i8 %conv5 } ; Function Attrs: nounwind ssp uwtable define signext i8 @goo_before_InstCombine(i8 signext %a, i8 signext %b) local_unnamed_addr #0 {...
2013 Oct 09
4
[LLVMdev] Related constant folding of floating point values
...d float* %a, align 4 %conv = fpext float %0 to double %sub = fsub double %conv, 8.100000e+00 %cmp = fcmp oge double %sub, 0x3E8000000102F4FD br i1 %cmp, label %if.then, label %lor.lhs.false lor.lhs.false: ; preds = %entry %1 = load float* %a, align 4 %conv2 = fpext float %1 to double %sub3 = fsub double %conv2, 8.100000e+00 %cmp4 = fcmp ole double %sub3, 0xBE8000000102F4FD br i1 %cmp4, label %if.then, label %if.else ... during the transformation the %conv is replaced with "double 0x4020333340000000" and then the result of comparison i...
2008 May 02
1
Speedups with Ra and jit
...s is using Ra with R-2.7.0. > conv1 <- function(a, b) { > ### with Ra and jit require(jit) jit(1) ab <- numeric(length(a)+length(b)-1) for(i in 1:length(a)) for(j in 1:length(b)) ab[i+j-1] <- ab[i+j-1] + a[i]*b[j] ab } > > conv2 <- function(a, b) { > ### with just Ra ab <- numeric(length(a)+length(b)-1) for(i in 1:length(a)) for(j in 1:length(b)) ab[i+j-1] <- ab[i+j-1] + a[i]*b[j] ab } > > x <- 1:2000 > y <- 1:500 > system.time(tst1 <- conv1(x, y))...
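For comparison, the nested loop that both R functions time is a plain full one-dimensional convolution. A minimal C sketch of the same computation follows. The name `conv_full` and its signature are assumptions made for illustration and come from neither Ra nor the thread. Indices are zero-based, so R's `ab[i+j-1]` becomes `ab[i + j]`.

```c
#include <stddef.h>

/* Full convolution of a (length na) with b (length nb).
 * Assumes na >= 1, nb >= 1, and that ab has room for na + nb - 1 values. */
void conv_full(const double *a, size_t na,
               const double *b, size_t nb, double *ab) {
    /* Zero the output, like numeric(length(a) + length(b) - 1) in R. */
    for (size_t k = 0; k + 1 < na + nb; k++)
        ab[k] = 0.0;
    /* Accumulate every pairwise product into its output slot. */
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            ab[i + j] += a[i] * b[j];
}
```

With the inputs from the snippet (`x <- 1:2000`, `y <- 1:500`) the output would hold 2499 values.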
2012 Feb 17
0
[LLVMdev] Folding an insertelt chain
On Feb 17, 2012, at 12:50 AM, Ivan Llopard wrote: > Hello, > > I've added a little combining operation in DAGCombiner to fold a chain of insertelt nodes if that chain is proved to fully overwrite the very first source vector. In which case, I supposed a build_vector is better. It seems to be safe but I don't know if it is correctly implemented or if it is already done somewhere