Displaying 14 results from an estimated 14 matches for "conv3".
2011 Sep 08
1
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
...** %result.addr, align 8
store float %x, float* %x.addr, align 4
store float %y, float* %y.addr, align 4
%tmp = load float* %x.addr, align 4
%conv = fpext float %tmp to double
%tmp1 = load float* %y.addr, align 4
%conv2 = fpext float %tmp1 to double
%div = fdiv double %conv, %conv2
%conv3 = fptrunc double %div to float
%tmp4 = load float** %result.addr, align 8
store float %conv3, float* %tmp4
ret void
}
-----
With optimisations turned on:
-----
define void @dpdiv(float* nocapture %result, float %x, float %y) nounwind uwtable {
entry:
%conv3 = fdiv float %x, %y
store flo...
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
...%xor = xor i64 %shr, %mul
%conv2 = trunc i64 %xor to i32
ret i32 %conv2
}
I came upon the following optimization (during instcombine):
IC: Visiting: %mul = mul nuw i64 %conv, %conv1
IC: Visiting: %shr = lshr i64 %mul, 32
IC: Visiting: %conv2 = trunc i64 %shr to i32
IC: Visiting: %conv3 = trunc i64 %mul to i32
IC: Visiting: %xor = xor i32 %conv3, %conv2
IC: ADD: %xor6 = xor i64 %mul, %shr
IC: Old = %xor = xor i32 %conv3, %conv2
New = <badref> = trunc i64 %xor6 to i32
which seems to be performed by SDValue
DAGCombiner::SimplifyBinOpWithSameOpcodeHands(SDNode *...
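The rewrite shown in the log replaces an xor of two truncated values with a single truncation of a 64-bit xor. A minimal Python sketch, assuming plain unsigned wrap-around semantics and using hypothetical helper names that are not LLVM APIs, checks that the two forms produce the same low 32 bits. It only illustrates value equivalence and makes no claim about whether the transformation is profitable on a given target.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def trunc_i64_to_i32(value):
    """Model `trunc i64 ... to i32`: keep the low 32 bits."""
    return value & MASK32


def trunc_then_xor(mul, shr):
    """Old form: xor i32 (trunc %mul), (trunc %shr)."""
    return trunc_i64_to_i32(mul) ^ trunc_i64_to_i32(shr)


def xor_then_trunc(mul, shr):
    """New form: trunc (xor i64 %mul, %shr) to i32."""
    return trunc_i64_to_i32((mul ^ shr) & MASK64)


if __name__ == "__main__":
    # Illustrative input only; %shr is modelled as lshr i64 %mul, 32.
    mul = 0x123456789ABCDEF0
    shr = mul >> 32
    print(trunc_then_xor(mul, shr) == xor_then_trunc(mul, shr))
```

Because xor acts on each bit independently, dropping the high bits before or after the xor gives the same result.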
2011 Sep 08
0
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
Peter,
Is there a way to make this flag globally available? Metadata can be fairly expensive to handle at each node when in many cases it is a global flag and not a per operation flag.
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Robert Quill
> Sent: Thursday, September 08, 2011 3:24 AM
> To: Peter
2012 Oct 08
3
[LLVMdev] Multiply i8 operands promotes to i32
...advance,
Pedro
P.S.: I have added the C code and the corresponding LLVM code.
C code:
void
(const u_int16_t in_data, u_int16_t* out)
{
u_int8_t kk = in_data&0xFF;
u_int16_t kk16 = kk * kk;
*out = kk16;
}
LLVM:
%1 = load i8* %kk, align 1
%conv2 = zext i8 %1 to i32
%2 = load i8* %kk, align 1
%conv3 = zext i8 %2 to i32
%mul = mul nsw i32 %conv2, %conv3
%conv4 = trunc i32 %mul to i16
store i16 %conv4, i16* %kk16, align 2
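The zext/mul/trunc sequence in the IR can be modelled in a few lines. The following Python sketch uses hypothetical helper names and assumes the widening reflects C's integer promotions, under which 8-bit operands are promoted to `int` before the multiply. It contrasts the widened product with one computed in 8 bits only:

```python
def zext_i8_to_i32(value):
    """Model `zext i8 ... to i32`: the value is unchanged, only the width grows."""
    return value & 0xFF


def trunc_i32_to_i16(value):
    """Model `trunc i32 ... to i16`: keep the low 16 bits."""
    return value & 0xFFFF


def square_low_byte(in_data):
    """Follow the IR: widen both operands, multiply in 32 bits, truncate to 16."""
    kk = in_data & 0xFF
    conv2 = zext_i8_to_i32(kk)
    conv3 = zext_i8_to_i32(kk)
    mul = conv2 * conv3  # at most 255 * 255, so it fits in 32 bits
    return trunc_i32_to_i16(mul)


def square_in_8_bits(in_data):
    """Hypothetical narrow multiply that keeps only 8 bits of the product."""
    kk = in_data & 0xFF
    return (kk * kk) & 0xFF


if __name__ == "__main__":
    print(square_low_byte(0x01FF), square_in_8_bits(0x01FF))
```

For a low byte of 255 the widened path keeps the full product, while the 8-bit path loses the upper bits, which is why a narrower multiply cannot be substituted without also accounting for the high half of the result.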
--
Pedro Malagón - Profesor ayudante
91 549 57 00 - ext. 4220
Departamento de Ingeniería Electrónica
Escuela Técnica Superior de Ingenieros de Telecomunicación
Universi...
2011 Sep 08
1
[LLVMdev] [cfe-dev] Proposal: floating point accuracy metadata (OpenCL related)
Hi Peter,
This sounds like a really good idea. One thing that did occur to me
though from an OpenCL point of view is that ULP accuracy requirements
can differ for embedded and full profile so that may need to be handled
somehow.
Thanks,
Rob
On Wed, 2011-09-07 at 21:55 +0100, Peter Collingbourne wrote:
> Hi,
>
> This is my proposal to add floating point accuracy support to LLVM.
>
2008 May 02
1
Speedups with Ra and jit
...0.00 0.55
> system.time(tst2 <- conv2(x, y))
user system elapsed
9.49 0.00 9.56
> all.equal(tst1, tst2)
[1] TRUE
>
> 9.56/0.55
[1] 17.38182
>
However for this example you can achieve speed-ups like that or better
just using vectorised code intelligently:
> conv3 <- local({
conv <- function(a, b, na, nb) {
r <- numeric(na + nb -1)
ij <- 1:nb
for(e in a) {
r[ij] <- r[ij] + e*b
ij <- ij + 1
}
r
}
function(a, b) {
na <- length(a)
nb <- len...
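The visible part of the inner `conv` function adds a scaled copy of `b` into a sliding window of the result, once per element of `a`. A rough Python rendering of that loop follows, with a hypothetical function name; it covers only the lines shown in the snippet and does not attempt to reproduce the truncated wrapper function.

```python
def conv_shift_add(a, b):
    """Convolve two sequences by adding e * b into a window that shifts by one per e."""
    na, nb = len(a), len(b)
    r = [0.0] * (na + nb - 1)
    for offset, e in enumerate(a):
        # Mirrors r[ij] <- r[ij] + e*b followed by ij <- ij + 1 in the R code.
        window = r[offset:offset + nb]
        r[offset:offset + nb] = [x + e * y for x, y in zip(window, b)]
    return r


if __name__ == "__main__":
    print(conv_shift_add([1, 2, 3], [1, 1]))
```

The outer loop still runs once per element of `a`, but each iteration updates a whole window at once, which is the part the R version leaves to vectorised arithmetic.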
2011 Feb 07
2
[LLVMdev] Promoting i16 load to i32
...oca i16, align 2
%y.addr = alloca i16, align 2
store i16 %x, i16* %x.addr, align 2
store i16 %y, i16* %y.addr, align 2
%tmp = load i16* %x.addr, align 2
%conv = zext i16 %tmp to i32
%tmp1 = load i16* %y.addr, align 2
%conv2 = zext i16 %tmp1 to i32
%add = add nsw i32 %conv, %conv2
%conv3 = trunc i32 %add to i16
ret i16 %conv3
}
Upon compiling I get this failed assertion:
llc: LegalizeDAG.cpp:1309:
llvm::SDValue<unnamed>::SelectionDAGLegalize::LegalizeOp(llvm::SDValue):
Assertion `0 && "This action is not supported yet!"' failed.
I initially expected...
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
...lign 4
%6 = load float* %arrayidx1, align 4
store float %6, float* %y, align 4
%7 = load float* %arrayidx2, align 4
store float %7, float* %z, align 4
%8 = load float* %x, align 4
%conv = fpext float %8 to double
%mul = fmul double %conv, 6.700000e-01
%9 = load float* %y, align 4
%conv3 = fpext float %9 to double
%mul4 = fmul double %conv3, 1.700000e-01
%add = fadd double %mul, %mul4
%10 = load float* %z, align 4
%conv5 = fpext float %10 to double
%mul6 = fmul double %conv5, 1.600000e-01
%add7 = fadd double %add, %mul6
%conv8 = fptrunc double %add7 to float
store f...
2009 Apr 20
4
[LLVMdev] Unnecessary moves after sign-extension in 2-address target
...{
return (signed char) a + (signed short) b + c;
}
I get this IR:
define i32 @sext(i32 %a, i32 %b, i32 %c) nounwind readnone {
entry:
%conv = trunc i32 %a to i8 ; <i8> [#uses=1]
%conv1 = sext i8 %conv to i32 ; <i32> [#uses=1]
%conv3 = trunc i32 %b to i16 ; <i16> [#uses=1]
%conv4 = sext i16 %conv3 to i32 ; <i32> [#uses=1]
%add = add i32 %conv1, %c ; <i32> [#uses=1]
%add6 = add i32 %add, %conv4 ; <i32> [#uses=1]
ret i32 %add6
}
And this not-...
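The trunc/sext pairs in the IR above implement the two C casts before the additions. A small Python sketch, using hypothetical helper names and assuming two's-complement wrap-around for the final 32-bit result, follows the same steps:

```python
def trunc(value, bits):
    """Model `trunc`: keep the low `bits` bits."""
    return value & ((1 << bits) - 1)


def sext(value, bits):
    """Model `sext`: reinterpret the low `bits` bits as a signed integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def sext_sum(a, b, c):
    """Follow the IR: (signed char) a + (signed short) b + c, as an i32."""
    conv1 = sext(trunc(a, 8), 8)    # %conv, %conv1
    conv4 = sext(trunc(b, 16), 16)  # %conv3, %conv4
    add = conv1 + c                 # %add
    add6 = add + conv4              # %add6
    return sext(trunc(add6, 32), 32)


if __name__ == "__main__":
    print(sext_sum(0xFF, 0xFFFF, 10))
```

With these inputs both casts yield -1, so the sum is 8.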
2013 Nov 12
0
[LLVMdev] What's the Alias Analysis does clang use ?
...* %arrayidx1, align 4
> store float %6, float* %y, align 4
> %7 = load float* %arrayidx2, align 4
> store float %7, float* %z, align 4
> %8 = load float* %x, align 4
> %conv = fpext float %8 to double
> %mul = fmul double %conv, 6.700000e-01
> %9 = load float* %y, align 4
> %conv3 = fpext float %9 to double
> %mul4 = fmul double %conv3, 1.700000e-01
> %add = fadd double %mul, %mul4
> %10 = load float* %z, align 4
> %conv5 = fpext float %10 to double
> %mul6 = fmul double %conv5, 1.600000e-01
> %add7 = fadd double %add, %mul6
> %conv8 = fptrunc double %ad...
2010 Sep 14
1
[LLVMdev] global type legalization?
Returning to an old discussion here....
On Aug 18, 2010, at 10:42 AM, Chris Lattner wrote:
> On Aug 18, 2010, at 10:27 AM, Bob Wilson wrote:
>>> I tend to think that it isn't worth the compile time to try to microoptimize out every compare, but I could be convinced otherwise if there are important use cases we're failing to handle. I also do think that whole-function
2011 Jan 04
4
[LLVMdev] Bug in MachineInstr::isIdenticalTo
...53 = extractelement <4 x i32> %format, i32 1 ; <i32> [#uses=1]
switch i32 %tmp53, label %if.then [
i32 1, label %switch.case55
i32 2, label %switch.case61
]
switch.case55: ; preds = %switch.case
%arrayidx = getelementptr i8 addrspace(1)* %conv3, i32 %tmp22 ; <i8 addrspace(1)*> [#uses=1]
%tmp59 = extractelement <4 x i32> %9, i32 0 ; <i32> [#uses=1]
%conv60 = trunc i32 %tmp59 to i8 ; <i8> [#uses=1]
store i8 %conv60, i8 addrspace(1)* %arrayidx
ret void
switch.case61:...
2011 Jan 04
0
[LLVMdev] Bug in MachineInstr::isIdenticalTo
...2> %format, i32 1 ; <i32> [#uses=1]
> switch i32 %tmp53, label %if.then [
> i32 1, label %switch.case55
> i32 2, label %switch.case61
> ]
> switch.case55: ; preds = %switch.case
> %arrayidx = getelementptr i8 addrspace(1)* %conv3, i32 %tmp22 ; <i8 addrspace(1)*> [#uses=1]
> %tmp59 = extractelement <4 x i32> %9, i32 0 ; <i32> [#uses=1]
> %conv60 = trunc i32 %tmp59 to i8 ; <i8> [#uses=1]
> store i8 %conv60, i8 addrspace(1)* %arrayidx
> ret void
> switch.case61:...
2013 Feb 14
1
[LLVMdev] LiveIntervals analysis problem
...nv2.i.i.i.i.i = trunc i32 %or.i.i.i.i.i to i16
br label %if.end.i126.i.i.i.i
if.end.i126.i.i.i.i: ; preds = %if.then.i.i.i.i.i, %for.body.i.i.i.i.i
%bits.1.i.i.i.i.i = phi i16 [ %conv2.i.i.i.i.i, %if.then.i.i.i.i.i ], [ %bits.024.i.i.i.i.i, %for.body.i.i.i.i.i ]
%conv3.i.i.i.i.i = zext i16 %194 to i32
%shl.i.i.i.i.i = shl nuw nsw i32 %conv3.i.i.i.i.i, 1
%conv5.i.i.i.i.i = zext i16 %bits.1.i.i.i.i.i to i32
%and6.i.i.i.i.i = lshr i32 %conv5.i.i.i.i.i, 1
%and6.lobit.i.i.i.i.i = and i32 %and6.i.i.i.i.i, 1
%storemerge.in.i.i.i.i.i = or i32 %and6.lobit.i.i.i....